RTP Payload Format for Versatile Video Coding (VVC)
draft-ietf-avtcore-rtp-vvc-18

Yes

Murray Kucherawy

No Objection

Erik Kline
(Alvaro Retana)
(Andrew Alston)
(Robert Wilton)

Note: This ballot was opened for revision 16 and is now closed.

Murray Kucherawy
Yes
Zaheduzzaman Sarker
(was Discuss) Yes
Comment (2022-08-18) Sent
Thanks for addressing my comments and resolving the discuss.

For future reference, I am copying here the discuss point that was resolved:

    Section 7.3.2.3 says that the sprop-max-don-diff and sprop-depack-buf-bytes parameters should be interpreted differently from their usual interpretation under RFC 3264. This is a significant change and easy to miss. The section also does not use any normative text to enforce the change.
Erik Kline
No Objection
Francesca Palombini
(was Discuss) No Objection
Comment (2022-06-20 for -16) Sent
# ART AD Review of draft-ietf-avtcore-rtp-vvc-16

cc @fpalombini

Thank you for the work on this document, and for addressing my previous DISCUSSes.

Thanks for posting the media-types review request https://mailarchive.ietf.org/arch/msg/media-types/iinskT_KIviiCsmnL32ql4PuQfU/ and thanks to Martin Dürst for his review. 

Re the IANA registration: in recent years we have preferred to use "IETF" for change controller, to indicate that this comes from a consensus document, as documented in https://datatracker.ietf.org/doc/html/draft-leiba-ietf-iana-registrations-00. So in this case I would suggest using "IETF <avtcore@ietf.org>". I am ok with no changes for the other review comments, thanks for the replies.

Francesca

## Comments

### DONL and NALU size in figures 5 and 6

Section 4.3.2:
```
   The first aggregation unit in an AP consists of a conditional 16-bit
   DONL field (in network byte order) followed by a 16-bit unsigned size
   information (in network byte order) that indicates the size of the
```
This indicates that DONL is a 16-bit field, but in Figure 5 DONL appears to be 24 bits.

```
   An aggregation unit that is not the first aggregation unit in an AP
   will be followed immediately by a 16-bit unsigned size information
   (in network byte order) that indicates the size of the NAL unit in
```

Same for the NALU size: 16 bits in the paragraph above, but 24 bits in figure 6.

EDIT: from the authors - "Aggregation units can start and end at octet boundaries.  We tried to emphasize that by having the first octet in the 32-bit dword belonging to something else.  That’s why there’s the colon between bit 7 and bit 8.  The colon signifies the start and end of the aggregation unit. " I suggest adding a sentence clarifying the above to avoid confusing the reader.
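
To illustrate the 16-bit size fields discussed above, here is a minimal sketch of splitting an AP payload into NAL units. It assumes the conditional DONL field is absent and that the AP begins with a 2-byte payload header; both are assumptions of this example, and the function name `split_aggregation_units` is hypothetical rather than taken from the draft.

```python
import struct
from typing import List

PAYLOAD_HDR_LEN = 2  # assumption: 2-byte payload header precedes the aggregation units


def split_aggregation_units(ap_payload: bytes) -> List[bytes]:
    """Split an AP payload into NAL units, assuming no DONL field is present."""
    offset = PAYLOAD_HDR_LEN
    nal_units = []
    while offset < len(ap_payload):
        if offset + 2 > len(ap_payload):
            raise ValueError("truncated NALU size field")
        # 16-bit unsigned size in network byte order, as quoted from Section 4.3.2
        (size,) = struct.unpack_from("!H", ap_payload, offset)
        offset += 2
        if size == 0 or offset + size > len(ap_payload):
            raise ValueError("NALU size does not fit in the remaining payload")
        nal_units.append(ap_payload[offset:offset + size])
        offset += size
    return nal_units
```

Each aggregation unit here starts and ends on an octet boundary, which is the point the authors' colon notation is meant to convey.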

### Values from \[VVC\] undefined

In section 3.1.1, there are a number of values that are not defined: GDR_NUT, CRA_NUT, IDR_W_RADL, IDR_N_LP. I understand these come from \[VVC\] and are reported as is, however they make the text harder to parse since no reference to these values is given.

### Wrong reference

Section 4.3:
```
      header.  This payload structure is specified in Section 4.4.1.
```
4.4.1 should be 4.3.1.

### sprop-max-don-diff

sprop-max-don-diff appears first in section 4.3.1 - it would be good to add a reference to 7.2, where its meaning is defined.

### Base 64

In Section 7.2, Base64 is used - please specify if the encoding follows "Base 64 Encoding" (Section 4) or "Base 64 Encoding with URL and Filename Safe Alphabet" (Section 5) of RFC 4648. (This can easily be done in one sentence, rather than repeated every time base64 is mentioned).
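
As a small illustration of why the choice matters, the two RFC 4648 alphabets differ in their last two characters, so the same bytes can encode to different strings. The input bytes below are arbitrary and chosen only to trigger the difference.

```python
import base64

raw = bytes([0xFB, 0xFF])  # arbitrary input that maps to alphabet indices 62 and 63

# RFC 4648 Section 4: standard alphabet, uses '+' and '/'
standard = base64.b64encode(raw).decode("ascii")

# RFC 4648 Section 5: URL and filename safe alphabet, uses '-' and '_'
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=
```

A receiver decoding with the wrong alphabet would reject or misread such a value, which is why a single clarifying sentence in the draft would help.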

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF]. You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues.

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments
Paul Wouters
(was Discuss) No Objection
Comment (2022-07-08 for -17) Sent
My DISCUSSes were addressed in version -17 (see  https://mailarchive.ietf.org/arch/msg/avt/RZr6jWX-S1u6k0efT1Gm-YAbXZw/)

Old DISCUSSes:

Please be aware that this document is far outside my area of expertise,
and my comments might make no sense. Please do not be nervous to tell
me I am wrong - likely I am....

#1

   [VVC] is particularly vulnerable to such
   attacks, as it is extremely simple to generate datagrams containing
   NAL units that affect the decoding process of many future NAL units.
   Therefore, the usage of data origin authentication and data integrity
   protection of at least the RTP packet is RECOMMENDED, for example,
   with SRTP [RFC3711].

If something is "particularly vulnerable", why are its security
countermeasures only RECOMMENDED instead of REQUIRED? Is there a real-world
use case where this vulnerable protocol should continue despite the
threat without these countermeasures?

#2

Media-Aware Network Elements (MANEs) are briefly mentioned in the Security
Considerations, but it is unclear to me how a user can opt in or opt out
of using these or how the user could even evaluate a MANE for trustworthiness.
Does a user even know if there is a MANE?
And especially combining the two issues, if a MANE can rewrite the SEI,
would it not mean that it could attack a user with malicious data that
appears trusted?

#3

In the IANA Considerations, it points to another section. It is customary
to just make this section stand on its own with clear and explicit instructions
for IANA so they do not need to read or understand large parts of the document.

COMMENTS:

   forbidden_zero_bit.  Required to be zero in VVC.  Note that the
   inclusion of this bit in the NAL unit header was to enable
   transport of VVC video over MPEG-2 transport systems (avoidance of
   start code emulations) [MPEG2S].  In the context of this memo the
   value 1 may be used

So it MUST be zero but MAY be 1? A bit odd for a "forbidden_zero_bit".
Also, what is "this memo"? Does it mean this document or does it mean [MPEG2S] ?
(also, "forbidden zero" kind of reads like "forbidden to be zero" which is
the opposite of what is meant)

PayloadHdr appears without value in Figure 3 and with "(Type=28)" in Figure 4.
Does the first occurrence without value also use type=28? If so, can this be added? If not, can the non-28 value be added?
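
For readers following the question, here is a hedged sketch of how a receiver might read the Type field from the payload header. It assumes the header reuses a 2-byte NAL unit header layout with the 5-bit Type in the upper bits of the second octet; that layout is an assumption of this example. The values 28 (aggregation packet) and 29 (fragmentation unit) are the ones mentioned in the reviews on this page, and treating every other value as a single NAL unit packet is likewise an assumption.

```python
def payload_header_type(payload: bytes) -> int:
    """Return the 5-bit Type field, assuming it sits in bits 7..3 of the second octet."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the assumed 2-byte payload header")
    return (payload[1] >> 3) & 0x1F


def classify(payload: bytes) -> str:
    """Classify a payload by its Type value (hypothetical helper)."""
    packet_type = payload_header_type(payload)
    if packet_type == 28:
        return "aggregation packet"
    if packet_type == 29:
        return "fragmentation unit"
    # Assumption: any other value is the type of the single NAL unit carried.
    return "single NAL unit packet"
```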

What does the ":" character denote in Figures 5, 6 and 8?

  Fragments of the same NAL unit MUST be sent in consecutive
   order with ascending RTP sequence numbers (with no other RTP packets
   within the same RTP stream being sent between the first and last
   fragment).

Why is this? I would say the RTP seq numbers would allow the target to
order the packets, which it has to do anyway if the network causes re-ordering.
Why then, can the host not do this? Eg if it has two crypto modules to
independently encrypt these packets without needing to sync sending them to
ensure this requirement?

    Security considerations:
    See Section 9 of RFC XXXX.

Does XXXX refer to this document? eg [RFC TBD1]? Or is this a placeholder
for another RFC that was forgotten and needs fixing? I think it is
for this document since it has Security Considerations in Section 9.

Section 7.3.2 starts with "This section describes the negotiation of
unicast messages" but it also describes using multicast, so it is a
bit contradictory.

   Congestion control for RTP SHALL be used

I always find MUST clearer than SHALL. This might be a non-native English speaker
issue contaminated by Gandalf's speech. The same paragraph uses "MUST monitor"
and not "SHALL monitor", so it would be better to use either MUST or SHALL for both
of these.
Roman Danyliw
No Objection
Comment (2022-06-14 for -16) Sent
** Section 1.1.2.

The decoding capability information includes parameters that stay
   constant for the lifetime of a VVC bitstream, which in IETF terms can
   translate to a session.

I appreciate the clarity.  Would it be possible to reframe this to be explicit on the definition of this “IETF session” for a reader that might not know what that means?

** Section 4.3.1.  What is the Type value in the PayloadHdr of this Single NAL Unit Packet?  Section 4.3.2 which describes APs says its type value is 28.  Section 4.3.3 which describes fragments says its type is 29.  [VVC] is behind a paywall so I am unable to check the Table 5 per the reference in Section 1.1.4.

** Section 4.3.1.  An early reference to sprop-max-don-diff defined later in the document would be very helpful.

** Section 4.3.3.
   FuType: 5 bits

      The field FuType MUST be equal to the field Type of the fragmented
      NAL unit.

What is the reference for the possible values of this field?

** Section 9.  Thank you for discussing the trade-offs with deploying the MANE.  Since E2E security was noted for deployments without the MANE, please also be explicit about what including the MANE means.

OLD
   To be allowed to perform
   such operations, a MANE is required to be a trusted entity that is
   included in the security context establishment. 

NEW
To be allowed to perform such operations, a MANE is required to be a trusted entity that is included in the security context establishment.  
This on-path inclusion of the MANE forgoes end-to-end security guarantees for the end points.
Éric Vyncke
No Objection
Comment (2022-06-12 for -16) Sent
# Éric Vyncke, INT AD, review of draft-ietf-avtcore-rtp-vvc-16
CC @evyncke

Thank you for the work put into this document.

Please find below some non-blocking COMMENT points (but replies would be appreciated even if only for my own education).

Special thanks to Bernard Aboba for the shepherd's write-up including the WG consensus, but I miss the justification of the intended status. 

I hope that this helps to improve the document,

Regards,

-éric

## COMMENTS

### Section 1

Even if the content was a little over my head, it is a nice introduction.

### Section 4.3.2
```
   An AP MUST carry at least two aggregation units and can carry as many
   aggregation units as necessary; however, the total amount of data in
   an AP obviously MUST fit into an IP packet, and the size SHOULD be
   chosen so that the resulting IP packet is smaller than the MTU size
   so to avoid IP layer fragmentation.
```

I am afraid that I do not fully understand the "MUST fit" and "size SHOULD be chosen" because having a size smaller than the MTU is the only way to fit in one IP packet. Should the 2nd "SHOULD" be a "MUST" ?
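
The two limits in the quoted text can be read as different bounds: the hard size limit of an IP packet versus the path MTU. A minimal sketch of that distinction follows, assuming IPv4 without options, UDP transport, a 12-byte fixed RTP header with no CSRC entries or extensions, a 2-byte payload header, and no DONL fields. The function names and the 1500-byte default MTU are illustrative assumptions.

```python
IPV4_HEADER_LEN = 20       # IPv4 header without options
UDP_HEADER_LEN = 8
RTP_HEADER_LEN = 12        # fixed RTP header, no CSRC list or extension
PAYLOAD_HDR_LEN = 2        # assumption: 2-byte AP payload header
NALU_SIZE_FIELD_LEN = 2    # 16-bit size field per aggregation unit
MAX_IPV4_PACKET = 65535    # bound imposed by the 16-bit IPv4 Total Length field


def ap_packet_size(nalu_lengths):
    """Total IPv4 packet size for an AP carrying NAL units of the given lengths."""
    if len(nalu_lengths) < 2:
        raise ValueError("an AP carries at least two aggregation units")
    ap_payload = PAYLOAD_HDR_LEN + sum(NALU_SIZE_FIELD_LEN + n for n in nalu_lengths)
    return IPV4_HEADER_LEN + UDP_HEADER_LEN + RTP_HEADER_LEN + ap_payload


def fits_in_ip_packet(nalu_lengths):
    """The 'MUST fit into an IP packet' bound."""
    return ap_packet_size(nalu_lengths) <= MAX_IPV4_PACKET


def avoids_fragmentation(nalu_lengths, mtu=1500):
    """The 'SHOULD be smaller than the MTU' bound (mtu value is an assumption)."""
    return ap_packet_size(nalu_lengths) <= mtu
```

For example, two NAL units of 700 and 800 bytes fit into one IP packet but exceed a 1500-byte MTU, so the packet would be fragmented at the IP layer.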

### Section 4.3.3
```
   Fragments of the same NAL unit MUST be sent in consecutive
   order with ascending RTP sequence numbers (with no other RTP packets
   within the same RTP stream being sent between the first and last
   fragment).
```

What is the expected behaviour of the receiver when packets with fragments are received out of order? Section 6 does not seem to cover this.
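
One possible receiver-side check, given only as a sketch and not as behaviour specified by the draft: because fragments use consecutive RTP sequence numbers, a receiver that knows the sequence numbers of the first and last fragment can tell which ones are still missing, regardless of arrival order. How the first and last fragments are identified is assumed here, and `missing_fragments` is a hypothetical helper.

```python
def missing_fragments(first_seq, last_seq, received):
    """Return the sequence numbers between first_seq and last_seq not yet received.

    RTP sequence numbers are 16 bits wide, so the range may wrap from 65535 to 0.
    """
    count = ((last_seq - first_seq) & 0xFFFF) + 1
    expected = [(first_seq + i) & 0xFFFF for i in range(count)]
    return [seq for seq in expected if seq not in received]


# Fragments arrived out of order, and one is still outstanding across the wrap.
print(missing_fragments(65534, 1, {1, 65534, 0}))  # [65535]
```

An empty result means every fragment is present and the NAL unit can be reassembled in sequence-number order.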

### Section 7.1

`The receiver MUST ignore any parameter unspecified in this memo.` Should this document leave the door open for other documents to update or augment this specification?


## Notes

This review is in the ["IETF Comments" Markdown format][ICMF]. You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues. 

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments
Alvaro Retana Former IESG member
No Objection
No Objection (for -16) Not sent
Andrew Alston Former IESG member
No Objection
No Objection (for -16) Not sent
Robert Wilton Former IESG member
No Objection
No Objection (for -16) Not sent