AVTCORE Working Group                                              Y. He
Internet-Draft                                                    W. Zia
Intended status: Standards Track                                Qualcomm
Expires: 24 December 2022                                    C. Herglotz
                                                                     FAU
                                                             E. Francois
                                                            InterDigital
                                                            22 June 2022


        RTP Control Protocol (RTCP) Messages for Green Metadata
                draft-he-avtcore-rtcp-green-metadata-00

Abstract

   This memo describes an RTCP feedback message format for ISO/IEC
   International Standard 23001-11, known as Energy Efficient Media
   Consumption (Green Metadata), developed by ISO/IEC JTC 1/SC 29/WG 3
   (MPEG Systems).  The RTCP payload format specified in this document
   enables receivers to provide feedback to senders and thus allows
   short-term adaptation and feedback-based energy-efficient mechanisms
   to be implemented.  The payload format has broad applicability in
   real-time video communication services.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 24 December 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.






He, et al.              Expires 24 December 2022                [Page 1]


Internet-Draft      RTCP Messages for Green Metadata           June 2022


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Abbreviations . . . . . . . . . . . . . . . . . . . . . . . .   3
   4.  Format of RTCP Feedback Messages  . . . . . . . . . . . . . .   3
     4.1.  Temporal-Spatial Resolution Request . . . . . . . . . . .   4
       4.1.1.  Message format  . . . . . . . . . . . . . . . . . . .   5
       4.1.2.  Semantics . . . . . . . . . . . . . . . . . . . . . .   6
       4.1.3.  Timing Rules  . . . . . . . . . . . . . . . . . . . .   6
       4.1.4.  Handling of Message in Mixers and Translators . . . .   6
     4.2.  Temporal-Spatial Resolution Notification (TSRN) . . . . .   7
       4.2.1.  Message format  . . . . . . . . . . . . . . . . . . .   7
       4.2.2.  Semantics . . . . . . . . . . . . . . . . . . . . . .   8
       4.2.3.  Timing Rules  . . . . . . . . . . . . . . . . . . . .   8
       4.2.4.  Handling of TSRN in Mixers and Translators  . . . . .   9
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   9
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   9
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   9
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .   9
     7.2.  Informative References  . . . . . . . . . . . . . . . . .  10
   Appendix A.  Change History . . . . . . . . . . . . . . . . . . .  10
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10

1.  Introduction

   The ISO/IEC 23001-11 specification, Energy Efficient Media
   Consumption (Green Metadata) [GreenMetadata], specifies metadata that
   facilitates reduction of energy usage during media consumption.  Two
   main types of metadata are defined in the specification.  The first
   type consists of metadata generated by a video encoder, which
   provides information about the decoding complexity of the delivered
   bitstream and about the quality of the decoded content.  This first
   type of metadata is conveyed via the supplemental enhancement
   information (SEI) message mechanism specified in the video coding
   standards ITU-T Recommendation H.264 and ISO/IEC 14496-10 [AVC],
   H.265 and ISO/IEC 23008-2 [HEVC], and H.266 and ISO/IEC 23090-3
   [VVC].

   The second type consists of metadata generated by a decoder and
   conveyed as feedback to the encoder in order to adapt the decoder's
   energy consumption.  This document focuses on this second type of
   metadata, which is conveyed as an extension of RTCP feedback messages
   [RFC4585].  The feedback in the second type of metadata specified in
   ISO/IEC 23001-11 [GreenMetadata] includes a decoder operations
   reduction request, a coding tools configuration request, and a
   spatial and temporal scaling request.  This document defines a new
   RTCP payload format for the spatial and temporal resolution request
   and notification feedback messages.

2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Abbreviations

   AVPF: The extended RTP profile for RTCP-based feedback

   FCI: Feedback Control Information [RFC4585]

   FMT: Feedback Message Type [RFC4585]

   PSFB: Payload-specific FB message [RFC4585]

   TSRR: Temporal-Spatial Resolution Request

   TSRN: Temporal-Spatial Resolution Notification

4.  Format of RTCP Feedback Messages

   This document extends the RTCP feedback messages defined in the RTP/
   AVPF [RFC4585] and [RFC5104] by defining a Green Metadata feedback
   message.  The message can be used by the receiver to inform the
   sender of the desirable coding spatial resolution and temporal
   resolution (frame rate) of the bitstream delivered, and by the sender
   to indicate the coding spatial and temporal resolution it will use
   henceforth.

   The RTCP Green Metadata feedback message follows a message format
   similar to that of the RTCP Temporal-Spatial Trade-off Request and
   Notification [RFC5104].  The message may be sent in a regular full
   compound RTCP packet or in an early RTCP packet, as per the RTP/AVPF
   rules.

   RTP/AVPF [RFC4585] and [RFC5104] together define seven
   payload-specific feedback messages and one application layer feedback
   message.  This document specifies two additional payload-specific
   feedback messages: Temporal-Spatial Resolution Request (TSRR) and
   Temporal-Spatial Resolution Notification (TSRN).  All are identified
   by means of the feedback message type (FMT) parameter as follows:

   Assigned in [RFC4585]:

   1: Picture Loss Indication (PLI)

   2: Slice Loss Indication (SLI)

   3: Reference Picture Selection Indication (RPSI)

   15: Application layer FB message

   31: reserved for future expansion of the number space

   Assigned in [RFC5104]:

   4: Full Intra Request (FIR) Command

   5: Temporal-Spatial Trade-off Request (TSTR)

   6: Temporal-Spatial Trade-off Notification (TSTN)

   7: Video Back Channel Message (VBCM)

   Assigned in this document:

   8: Temporal-Spatial Resolution Request (TSRR)

   9: Temporal-Spatial Resolution Notification (TSRN)

   Unassigned:

   0: unassigned

   10-14: unassigned

   16-30: unassigned
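
   As an illustration only, the FMT assignments listed above can be
   captured in a small lookup table.  The following Python sketch is
   not part of this specification.  The names PsfbFmt and describe_fmt
   are hypothetical, PT_PSFB = 206 is the PSFB packet type from
   [RFC4585], and the values 8 and 9 are the ones proposed in this
   document; they are not confirmed by IANA (Section 6 is still a
   placeholder).

```python
from enum import IntEnum

# RTCP packet type for payload-specific feedback (PSFB), per RFC 4585.
PT_PSFB = 206


class PsfbFmt(IntEnum):
    """FMT values for PT=PSFB as listed in this section (hypothetical name)."""
    PLI = 1    # Picture Loss Indication (RFC 4585)
    SLI = 2    # Slice Loss Indication (RFC 4585)
    RPSI = 3   # Reference Picture Selection Indication (RFC 4585)
    FIR = 4    # Full Intra Request (RFC 5104)
    TSTR = 5   # Temporal-Spatial Trade-off Request (RFC 5104)
    TSTN = 6   # Temporal-Spatial Trade-off Notification (RFC 5104)
    VBCM = 7   # Video Back Channel Message (RFC 5104)
    TSRR = 8   # Proposed in this document, not an IANA assignment
    TSRN = 9   # Proposed in this document, not an IANA assignment
    AFB = 15   # Application layer FB message (RFC 4585)


def describe_fmt(fmt: int) -> str:
    """Classify a 5-bit FMT value according to the list above."""
    if not 0 <= fmt <= 31:
        raise ValueError("FMT is a 5-bit field")
    if fmt == 31:
        return "reserved"
    try:
        return PsfbFmt(fmt).name
    except ValueError:
        return "unassigned"
```

   A receiver could use such a table to skip feedback messages whose
   FMT value it does not recognize.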

4.1.  Temporal-Spatial Resolution Request

   The TSRR feedback message is identified by RTCP packet type value
   PT=PSFB and FMT=8.

   The FCI field MUST contain one or more TSRR FCI entries.

4.1.1.  Message format

   The content of the FCI entry for the Temporal-Spatial Resolution
   Request is depicted in Figure 1.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Seq nr.     |         Reserved          |   Frame Rate      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Picture Width         |    Picture Height         |0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            Syntax of an FCI Entry in the TSRR Message

                                  Figure 1

   SSRC (32 bits): The Synchronization Source (SSRC) of the media sender
   that is requested to apply the frame rate and picture resolution.

   Seq nr. (8 bits): Request sequence number.  The sequence number space
   is unique for each pairing of the SSRC of the request source and the
   SSRC of the request target.  The sequence number SHALL be increased
   by 1 modulo 256 for each new request.  A repetition SHALL NOT
   increase the sequence number.  The initial value is arbitrary.

   Reserved (14 bits): All bits SHALL be set to 0 by the sender and
   SHALL be ignored on reception.

   Frame Rate (10 bits): frames_per_second.  This field specifies the
   frame rate as defined in clause 5.3 of [GreenMetadata].  The value is
   an integer between 0 and 1023 that indicates the requested coding
   frame rate.  A Frame Rate of 0 indicates that no video bitstream is
   requested.

   Picture Width (14 bits): pic_width_in_luma_samples.  This field
   specifies the picture width as defined in clause 5.3 of
   [GreenMetadata].  The value is an integer between 0 and 16383 that
   indicates the requested coding picture width in units of luma
   samples.  A Picture Width of 0 indicates that no video bitstream is
   requested.

   Picture Height (14 bits): pic_height_in_luma_samples.  This field
   specifies the picture height as defined in clause 5.3 of
   [GreenMetadata].  The value is an integer between 0 and 16383 that
   indicates the requested coding picture height in units of luma
   samples.  A Picture Height of 0 indicates that no video bitstream is
   requested.
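
   The three 32-bit words of a TSRR FCI entry can be assembled with
   plain shifts and masks.  The following Python sketch is one possible
   reading of the layout, not a normative encoder.  It assumes that
   Picture Height occupies 14 bits, as stated in its field description,
   so that the last 4 bits of the third word are zero padding.  The
   function names pack_tsrr_fci and unpack_tsrr_fci are hypothetical.

```python
import struct


def pack_tsrr_fci(ssrc: int, seq_nr: int, frame_rate: int,
                  width: int, height: int) -> bytes:
    """Encode one 12-byte FCI entry in network byte order."""
    if not 0 <= ssrc <= 0xFFFFFFFF:
        raise ValueError("SSRC is a 32-bit field")
    if not 0 <= seq_nr <= 0xFF:
        raise ValueError("Seq nr. is an 8-bit field")
    if not 0 <= frame_rate <= 0x3FF:
        raise ValueError("Frame Rate is a 10-bit field (0..1023)")
    if not 0 <= width <= 0x3FFF or not 0 <= height <= 0x3FFF:
        raise ValueError("picture dimensions are 14-bit fields (0..16383)")
    # Word 2: Seq nr. (8) | Reserved (14, all zero) | Frame Rate (10)
    word2 = (seq_nr << 24) | frame_rate
    # Word 3: Picture Width (14) | Picture Height (14) | padding (4, assumed)
    word3 = (width << 18) | (height << 4)
    return struct.pack("!III", ssrc, word2, word3)


def unpack_tsrr_fci(entry: bytes) -> dict:
    """Decode one FCI entry; reserved and padding bits are ignored."""
    if len(entry) != 12:
        raise ValueError("an FCI entry is exactly 12 bytes")
    ssrc, word2, word3 = struct.unpack("!III", entry)
    return {
        "ssrc": ssrc,
        "seq_nr": word2 >> 24,
        "frame_rate": word2 & 0x3FF,
        "width": word3 >> 18,
        "height": (word3 >> 4) & 0x3FFF,
    }
```

   For example, pack_tsrr_fci(0x12345678, 7, 30, 1280, 720) encodes a
   request for 1280x720 at 30 frames per second, and unpack_tsrr_fci
   recovers the same five values from the resulting 12 bytes.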

4.1.2.  Semantics

   A decoder can suggest a temporal-spatial resolution by sending a TSRR
   message to an encoder.  If the encoder is capable of adjusting its
   temporal-spatial resolution, it SHOULD take into account the received
   TSRR message for future coding of pictures.  A value of 0 for any of
   Frame Rate, Picture Width, or Picture Height indicates that no video
   bitstream is requested.

   The reaction to the reception of more than one TSRR message by a
   media sender from different media receivers is left open to the
   implementation.  The selected Frame Rate, Picture Width, and Picture
   Height SHALL be communicated to the media receivers by means of the
   TSRN message (see Section 4.2).

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of packet sender" field
   indicates the source of the request, and the "SSRC of media source"
   is not used and SHALL be set to 0.  The SSRCs of the media senders to
   which the TSRR applies are in the corresponding FCI entries.

   A TSRR message MAY contain requests to multiple media senders, using
   one FCI entry per target media sender.
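
   The header rules above can be combined with the FCI entries into a
   complete TSRR packet.  The Python sketch below is illustrative only.
   It assumes the common feedback header of section 6.1 of [RFC4585]
   (version 2, 5-bit FMT, PT=PSFB=206, length in 32-bit words minus
   one), no padding, and an FCI layout in which Picture Height uses 14
   bits followed by 4 zero bits.  The name build_tsrr is hypothetical,
   and range checks are replaced by masking for brevity.

```python
import struct

PT_PSFB = 206   # payload-specific feedback packet type (RFC 4585)
FMT_TSRR = 8    # value proposed in this document


def build_tsrr(sender_ssrc: int, requests: list) -> bytes:
    """Build one TSRR packet.

    requests holds (target_ssrc, seq_nr, frame_rate, width, height)
    tuples, one per targeted media sender.
    """
    if not requests:
        raise ValueError("the FCI field needs at least one entry")
    fci = b""
    for target_ssrc, seq_nr, frame_rate, width, height in requests:
        word2 = ((seq_nr & 0xFF) << 24) | (frame_rate & 0x3FF)
        word3 = ((width & 0x3FFF) << 18) | ((height & 0x3FFF) << 4)
        fci += struct.pack("!III", target_ssrc, word2, word3)
    first_byte = (2 << 6) | FMT_TSRR          # V=2, P=0, FMT=8
    length = (12 + len(fci)) // 4 - 1         # 32-bit words minus one
    # "SSRC of media source" is not used by TSRR and is set to 0.
    header = struct.pack("!BBHII", first_byte, PT_PSFB, length,
                         sender_ssrc, 0)
    return header + fci
```

   With two targets the packet is 36 bytes long and the length field
   carries the value 8, that is, nine 32-bit words minus one.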

4.1.3.  Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585].
   This request message is not time critical and SHOULD be sent using
   regular RTCP timing.  The message MAY be sent with early or immediate
   feedback timing only if it is known that the user interface requires
   quick feedback.

4.1.4.  Handling of Message in Mixers and Translators

   A mixer or media translator that encodes content sent to the session
   participant issuing the TSRR SHALL consider the request to determine
   if it can fulfill it by changing its own encoding parameters.  A
   media translator unable to fulfill the request MAY forward the
   request unaltered towards the media sender.  A mixer encoding for
   multiple session participants will need to consider the joint needs
   of these participants before generating a TSRR on its own behalf
   towards the media sender.

4.2.  Temporal-Spatial Resolution Notification (TSRN)

   The TSRN message is identified by RTCP packet type value PT=PSFB and
   FMT=9.

   The FCI field SHALL contain one or more TSRN FCI entries.

4.2.1.  Message format

   The content of the FCI entry for the Temporal-Spatial Resolution
   Notification is depicted in Figure 2.


    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Seq nr.     |         Reserved          |   Frame Rate      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Picture Width         |    Picture Height         |0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            Syntax of an FCI Entry in the TSRN Message

                                  Figure 2

   SSRC (32 bits): The Synchronization Source (SSRC) of the source of
   the TSRR that resulted in this notification.

   Seq nr. (8 bits): The sequence number value from the TSRR that is
   being acknowledged.

   Reserved (14 bits): All bits SHALL be set to 0 by the sender and
   SHALL be ignored on reception.

   Frame Rate (10 bits): The frame rate the media sender is using
   henceforth.

   Picture Width (14 bits): The coding picture width the media sender is
   using henceforth.

   Picture Height (14 bits): The coding picture height the media sender
   is using henceforth.

   Note that the returned values (Frame Rate, Picture Width, and Picture
   Height) may differ from the requested ones, for example, in cases
   where a media encoder cannot change its frame rate or picture
   resolution, or when pre-recorded content is used.

4.2.2.  Semantics

   This feedback message is used to acknowledge the reception of a TSRR.
   For each TSRR received targeted at the session participant, a TSRN
   FCI entry SHALL be sent in a TSRN feedback message.  A single TSRN
   message MAY acknowledge multiple requests using multiple FCI entries.
   The Frame Rate, Picture Width, and Picture Height values included
   SHALL be the same in all FCI entries of the TSRN message.  Including
   an FCI entry for each requestor allows each requesting entity to
   determine that the media sender received the request.  The
   notification SHALL also be sent in response to TSRR repetitions
   received.  If the request receiver has received TSRRs with several
   different sequence numbers from a single requestor, it SHALL respond
   only to the request with
   the highest (modulo 256) sequence number.  Note that the highest
   sequence number may be a smaller integer value due to the wrapping of
   the field.  Appendix A.1 of [RFC3550] has an algorithm for keeping
   track of the highest received sequence number for RTP packets; it
   could be adapted for this usage.
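
   One simple way to decide which request is the "highest (modulo 256)"
   is a circular comparison on the 8-bit sequence number.  The Python
   sketch below is not the algorithm of Appendix A.1 of [RFC3550] but a
   simpler substitute in the style of serial number arithmetic.  It
   assumes that the sequence numbers outstanding from one requestor
   span fewer than 128 values.  The names seq_newer and
   LatestRequestTracker are hypothetical.

```python
def seq_newer(a: int, b: int) -> bool:
    """Return True if 8-bit sequence number a is ahead of b.

    Values are compared on a circle of size 256, so 2 is ahead of 250.
    """
    return a != b and ((a - b) & 0xFF) < 0x80


class LatestRequestTracker:
    """Remember the highest TSRR sequence number seen per requestor."""

    def __init__(self) -> None:
        self.latest = {}  # requestor SSRC -> highest sequence number

    def update(self, requester_ssrc: int, seq_nr: int) -> bool:
        """Record a received TSRR.

        Returns True if the request is the newest one from this
        requestor, or a repetition of it, and so should be answered
        with a TSRN.  Returns False for a request that is older than
        one already seen.
        """
        current = self.latest.get(requester_ssrc)
        if (current is None or seq_nr == current
                or seq_newer(seq_nr, current)):
            self.latest[requester_ssrc] = seq_nr
            return True
        return False
```

   With this rule a request numbered 1 that arrives after one numbered
   254 is treated as newer, which matches the wrapping behaviour
   described above.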

   The TSRN SHALL include the Temporal-Spatial Resolution Frame Rate,
   Picture Width and Picture Height that will be used as a result of the
   request.  This is not necessarily the same Frame Rate, Picture Width
   and Picture Height as requested, as the media sender may need to
   aggregate requests from several requesting session participants.  It
   may also have some other policies or rules that limit the selection.

   Within the common packet header for feedback messages (as defined in
   section 6.1 of [RFC4585]), the "SSRC of packet sender" field
   indicates the source of the Notification, and the "SSRC of media
   source" is not used and SHALL be set to 0.  The SSRCs of the
   requesting entities to which the Notification applies are in the
   corresponding FCI entries.
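
   On the requesting side, a received TSRN can be split into its FCI
   entries so that each requestor finds the entry carrying its own SSRC
   and sequence number.  The Python sketch below is illustrative only.
   It assumes the common feedback header of section 6.1 of [RFC4585]
   with PT=PSFB=206, no padding (P=0), and an FCI layout in which
   Picture Height uses 14 bits followed by 4 zero bits.  The names
   TsrnEntry and parse_tsrn are hypothetical.

```python
import struct
from typing import List, NamedTuple, Tuple

PT_PSFB = 206   # payload-specific feedback packet type (RFC 4585)
FMT_TSRN = 9    # value proposed in this document


class TsrnEntry(NamedTuple):
    requester_ssrc: int   # source of the TSRR being acknowledged
    seq_nr: int           # sequence number copied from that TSRR
    frame_rate: int
    width: int
    height: int


def parse_tsrn(packet: bytes) -> Tuple[int, List[TsrnEntry]]:
    """Return (SSRC of packet sender, FCI entries) for one TSRN packet."""
    # 12 header bytes plus at least one 12-byte FCI entry.
    if len(packet) < 24 or (len(packet) - 12) % 12 != 0:
        raise ValueError("unexpected packet size")
    first, pt, length, sender_ssrc, _media_ssrc = struct.unpack(
        "!BBHII", packet[:12])
    if first >> 6 != 2 or pt != PT_PSFB or first & 0x1F != FMT_TSRN:
        raise ValueError("not a TSRN feedback message")
    if (length + 1) * 4 != len(packet):
        raise ValueError("length field does not match packet size")
    entries = []
    for offset in range(12, len(packet), 12):
        ssrc, word2, word3 = struct.unpack(
            "!III", packet[offset:offset + 12])
        # Reserved bits in word 2 and padding bits in word 3 are ignored.
        entries.append(TsrnEntry(ssrc, word2 >> 24, word2 & 0x3FF,
                                 word3 >> 18, (word3 >> 4) & 0x3FFF))
    return sender_ssrc, entries
```

   A requestor would then compare the notified Frame Rate, Picture
   Width, and Picture Height with the values it asked for, since the
   media sender is not required to grant the request as sent.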

4.2.3.  Timing Rules

   The timing follows the rules outlined in section 3 of [RFC4585].
   This acknowledgement message is not extremely time critical and
   SHOULD be sent using regular RTCP timing.

4.2.4.  Handling of TSRN in Mixers and Translators

   A mixer or translator that acts upon a TSRR SHALL also send the
   corresponding TSRN.  In cases where it needs to forward a TSRR
   itself, the notification message MAY be delayed until the forwarded
   TSRR has been responded to.

5.  Security Considerations

   The defined messages have certain properties that have security
   implications.  These must be addressed and taken into account by
   users of this protocol.

   Spoofed or maliciously created feedback messages of the type defined
   in this specification can have the following implications:

   *  severely reduced picture resolution due to false TSRR messages
      that set the picture width and height to very low values;

   *  severely reduced frame rate due to false TSRR messages that set
      the frame rate to a very low value.

   To prevent these attacks, authentication and integrity protection
   need to be applied to the feedback messages.  This can be
   accomplished against threats external to the current RTP session
   using the RTP profile that combines Secure RTP [SRTP] and AVPF into
   SAVPF [SAVPF].  In the mixer cases, separate security contexts and
   filtering can be applied between the mixer and the participants, thus
   protecting other users on the mixer from a misbehaving participant.

6.  IANA Considerations

   Placeholder

7.  References

7.1.  Normative References

   [GreenMetadata]
              "ISO/IEC DIS 23001-11, Information technology - MPEG
              Systems Technologies - Part 11: Energy-Efficient Media
              Consumption (Green Metadata)", 2022,
              <https://www.iso.org/standard/73674.html>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              DOI 10.17487/RFC4585, July 2006,
              <https://www.rfc-editor.org/info/rfc4585>.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
              February 2008, <https://www.rfc-editor.org/info/rfc5104>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

7.2.  Informative References

   [AVC]      "Advanced video coding, ITU-T Recommendation H.264", 2021,
              <https://www.itu.int/rec/T-REC-H.264>.

   [HEVC]     "High efficiency video coding, ITU-T Recommendation
              H.265", 2021, <https://www.itu.int/rec/T-REC-H.265>.

   [SAVPF]    Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
              2008, <https://datatracker.ietf.org/doc/pdf/rfc5124>.

   [SRTP]     Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004,
              <https://datatracker.ietf.org/doc/pdf/rfc3711>.

   [VVC]      "Versatile Video Coding, ITU-T Recommendation H.266",
              2022, <http://www.itu.int/rec/T-REC-H.266>.

Appendix A.  Change History

   To RFC Editor: PLEASE REMOVE THIS SECTION BEFORE PUBLICATION

   draft-he-avtcore-rtcp-green-metadata-00 ........ initial version

Authors' Addresses

   Yong He
   Qualcomm
   5775 Morehouse Drive
   San Diego,  92121
   United States of America
   Email: yong.he@qti.qualcomm.com


   Waqar Zia
   Qualcomm
   Anzinger Str. 13
   81671 Munich
   Germany
   Email: wzia@qti.qualcomm.com


   Christian Herglotz
   FAU
   Schlossplatz 4
   91054 Erlangen
   Germany
   Email: christian.herglotz@fau.de


   Edouard Francois
   InterDigital
   975 Avenue des Champs Blancs
   35576 Cesson-Sevigne
   France
   Email: edouard.francois@interdigital.com