Video BFrame RTP Header Extension
draft-deping-avtcore-video-bframe-00

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Expired".
Author li
Last updated 2022-07-25
avtcore                                                            D. Li
Internet-Draft                                                 ByteDance
Intended status: Standards Track                            26 July 2022
Expires: 27 January 2023

                   Video BFrame RTP Header Extension
                  draft-deping-avtcore-video-bframe-00

Abstract

   This document describes an RTP header extension that conveys decoding
   time information for video streams containing bi-directionally
   predicted frames (B-frames).  The extension carries the composition
   time (CTS) so that a receiver can decode the video frames in the
   correct order.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 January 2023.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Li                       Expires 27 January 2023                [Page 1]
Internet-Draft                     VBF                         July 2022

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  RTP header extension format . . . . . . . . . . . . . . . . .   3
     3.1.  Video rtp sender  . . . . . . . . . . . . . . . . . . . .   4
     3.2.  Video rtp receiver  . . . . . . . . . . . . . . . . . . .   4
     3.3.  Usage considerations  . . . . . . . . . . . . . . . . . .   4
   4.  Session Description Protocol (SDP) Signaling  . . . . . . . .   5
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   5
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   5
   7.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   5
   8.  Normative References  . . . . . . . . . . . . . . . . . . . .   5
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   6

1.  Introduction

   The H.264 and HEVC video codecs are widely used in RTP-based systems.
   These codecs support I-frames, P-frames, and B-frames.  Most RTP
   systems do not support B-frames, although B-frames are widely used in
   streaming systems.  With the rapid deployment of Real Time
   Communication (RTC) in low-latency streaming scenarios, support for
   bi-directionally predicted frames in RTP-based systems is necessary.

   Video streams carry timing information, including timestamps, so
   that a decoder knows how to handle the content properly.  The DTS
   (Decoding TimeStamp) indicates when a frame has to be decoded, while
   the PTS (Presentation TimeStamp) indicates when a frame has to be
   presented.  This difference becomes important when using B-frames,
   which can reference frames in the past as well as frames in the
   future.  A decoder therefore has to decode some frames that are
   presented later before it can use them as references.  The decoder
   needs the DTS when B-frames exist, whereas the RTP timestamp
   reflects only the presentation time (PTS).  This document specifies
   an RTP header extension that allows video RTP senders to deliver the
   CTS (CompositionTime) to RTP receivers.

   The CTS value is PTS minus DTS.  Therefore, the RTP receiver obtains
   the DTS value by subtracting the CTS value from the RTP timestamp.
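
   The relationship between PTS, DTS, and CTS can be illustrated with
   a small worked example.  The sketch below assumes a hypothetical
   25 fps group of pictures with two B-frames; the frame names and
   timestamp values are illustrative only and do not come from this
   memo.

```python
# Hypothetical 25 fps group of pictures, listed in decoding order.
# Timestamps are in 90 kHz ticks (3600 ticks per frame); the frame
# names and values are illustrative assumptions, not from this memo.
FRAMES = [
    ("I0", 3600, 0),       # (name, PTS, DTS)
    ("P3", 14400, 3600),
    ("B1", 7200, 7200),
    ("B2", 10800, 10800),
]


def composition_time(pts, dts):
    """CTS is defined as PTS minus DTS."""
    return pts - dts


def decoding_time(pts, cts):
    """Invert the definition: DTS is PTS minus CTS."""
    return pts - cts


CTS_VALUES = [composition_time(pts, dts) for _, pts, dts in FRAMES]

for (name, pts, dts), cts in zip(FRAMES, CTS_VALUES):
    # Each DTS is recoverable from the PTS and the CTS alone.
    assert decoding_time(pts, cts) == dts
    print(name, "PTS", pts, "CTS", cts, "DTS", dts)
```

   In this example the P-frame is decoded before the two B-frames
   that reference it but is presented after them, so it carries the
   largest CTS.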

   This new header extension uses the general mechanism for RTP header
   extensions as described in ([RFC5285]).  When a video frame spans
   several RTP packets, the RTP sender only needs to add the CTS to the
   first RTP packet, which reduces overhead.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   *  RTP: Real-time Transport Protocol (RFC 3550)

   *  RTCP: RTP Control Protocol (RFC 3550)

   *  RTCP RR: RTCP Receiver Report

   *  RTCP SR: RTCP Sender Report

   *  SDP: Session Description Protocol (RFC 4566)

   *  Clock Rate: The multiplier used to convert from a wallclock value
      in seconds to an equivalent RTP timestamp value (without the fixed
      random offset).  Note that RFC 3550 uses various terms like "clock
      frequency", "media clock rate", "timestamp unit", "timestamp
      frequency", and "RTP timestamp clock rate" as synonymous to clock
      rate.

   *  RTP Sender: A logical network element that sends RTP packets,
      sends RTCP SR packets, and receives RTCP reception report blocks.

   *  RTP Receiver: A logical network element that receives RTP packets,
      receives RTCP SR packets, and sends RTCP reception report blocks.

   *  RTC: Real Time Communication

   *  PTS: Video Presentation TimeStamp

   *  DTS: Video Decoding TimeStamp

   *  CTS: Video CompositionTime

3.  RTP header extension format

   The general RTP payload format follows the RTP header format
   ([RFC3550]) and generic RTP header extensions ([RFC8285]).  The RTP
   header extension MAY be encoded using the one-byte header or the
   two-byte header as described in ([RFC8285]).  The two-byte header
   format is used as an example in this memo.

   The following RTP header extension is RECOMMENDED.  The ID is
   assigned per ([RFC8285]), and the format is shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      ID       |     Len=2     |              cts              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 1: extension format

   ID: the extension ID.

   cts: PTS minus DTS, divided by 90; that is, the composition time
   offset in milliseconds at the 90 kHz video clock rate.
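
   A minimal sketch of building and parsing one extension element in
   the two-byte header form follows.  It assumes that cts is an
   unsigned 16-bit value in network byte order and that the division
   by 90 truncates; this memo states neither, and the function names
   are hypothetical.

```python
import struct
from typing import Tuple

TICKS_PER_MS = 90  # RTP timestamp ticks per millisecond at 90 kHz


def pack_cts_element(ext_id: int, pts: int, dts: int) -> bytes:
    """Build one two-byte-header element: ID, length, 16-bit cts."""
    if not 1 <= ext_id <= 255:
        raise ValueError("two-byte header IDs are 1..255")
    # Assumption: truncating division, result in milliseconds.
    cts_ms = (pts - dts) // TICKS_PER_MS
    if not 0 <= cts_ms <= 0xFFFF:
        raise ValueError("cts does not fit in 16 bits")
    # In the two-byte form the length field counts the data bytes (2).
    return struct.pack("!BBH", ext_id, 2, cts_ms)


def parse_cts_element(data: bytes) -> Tuple[int, int]:
    """Return (ext_id, cts_ms) from a four-byte element."""
    ext_id, length, cts_ms = struct.unpack("!BBH", data[:4])
    if length != 2:
        raise ValueError("unexpected extension length")
    return ext_id, cts_ms


element = pack_cts_element(19, pts=14400, dts=3600)
print(element.hex(), parse_cts_element(element))
```

   The ID value 19 is taken from the SDP example in Section 4.  A
   complete implementation would also emit the surrounding RTP header
   extension block and padding defined in ([RFC8285]).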

3.1.  Video rtp sender

   The video sender MAY be a video client or a middlebox that performs
   RTP switching.  A video client MAY encode video with B-frames; in
   that case it SHOULD add this RTP header extension in its RTP
   packetization module.  When a video frame spans multiple RTP
   packets, adding the extension only to the first RTP packet is
   RECOMMENDED, which reduces overhead.  A middlebox MAY translate RTMP
   or other streaming video protocols into RTP streams; it SHOULD add
   this header extension when the streaming video contains B-frames.
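
   The sender behaviour described above can be sketched as follows.
   This is a simplified illustration: packets are modelled as
   dictionaries, the frame is split into fixed-size chunks rather than
   using a codec-specific payload format, and the function name and
   the 1200-byte chunk size are assumptions.

```python
TICKS_PER_MS = 90  # RTP timestamp ticks per millisecond at 90 kHz


def packetize(frame_payload: bytes, pts: int, dts: int, mtu: int = 1200):
    """Split one frame into packets; only the first one carries cts."""
    chunks = [frame_payload[i:i + mtu]
              for i in range(0, len(frame_payload), mtu)] or [b""]
    packets = []
    for index, chunk in enumerate(chunks):
        packets.append({
            # Every packet of a frame shares the same RTP timestamp.
            "timestamp": pts,
            # The marker bit flags the last packet of the frame.
            "marker": index == len(chunks) - 1,
            # The extension value is attached to the first packet only.
            "cts_ms": (pts - dts) // TICKS_PER_MS if index == 0 else None,
            "payload": chunk,
        })
    return packets


packets = packetize(b"x" * 3000, pts=14400, dts=3600)
print([p["cts_ms"] for p in packets])
```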

3.2.  Video rtp receiver

   The video RTP receiver is a client that decodes video.  It SHOULD
   extract the cts value when this extension is present and calculate
   the DTS value from the RTP timestamp (PTS) and the CTS:

   DTS = PTS - CTS * 90

   Here, 90 is the number of RTP timestamp ticks per millisecond at the
   90 kHz video clock rate.  The video receiver reconstructs each frame
   and puts it into the jitter buffer.  The decoder MUST decode frames
   in DTS order, and the video render module MUST render the decoded
   frames in PTS order, where the PTS comes from the RTP timestamp.
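
   The receiver-side calculation and the two orderings can be sketched
   as follows.  The frame list is hypothetical, and the sort ignores
   RTP timestamp wrap-around, which a real jitter buffer would have to
   handle.

```python
TICKS_PER_MS = 90          # RTP timestamp ticks per millisecond at 90 kHz
RTP_TS_MODULUS = 1 << 32   # RTP timestamps are 32 bits wide


def dts_from_rtp(rtp_timestamp: int, cts_ms: int) -> int:
    """DTS = PTS - CTS * 90, kept within the 32-bit timestamp range."""
    return (rtp_timestamp - cts_ms * TICKS_PER_MS) % RTP_TS_MODULUS


# Hypothetical reassembled frames in arrival order: (name, PTS, cts).
RECEIVED = [
    ("B1", 7200, 0),
    ("I0", 3600, 40),
    ("B2", 10800, 0),
    ("P3", 14400, 120),
]

# Decode in DTS order.
decode_order = [name for name, pts, cts in
                sorted(RECEIVED, key=lambda f: dts_from_rtp(f[1], f[2]))]

# Render in PTS order, where the PTS is the RTP timestamp.
render_order = [name for name, pts, cts in
                sorted(RECEIVED, key=lambda f: f[1])]

print("decode:", decode_order)
print("render:", render_order)
```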

3.3.  Usage considerations

   In practice, a receiver that decodes video might not support
   B-frames.  So that such a receiver can successfully decode an
   incoming video stream, it is RECOMMENDED that an RTP middlebox
   discard B-frames when the stream from the video RTP sender contains
   them.  The decoder at the endpoint SHOULD indicate whether it
   supports B-frames in the SDP payload-format-specific parameters
   (a=fmtp) and follow the Offer/Answer procedure described in
   ([RFC8285]).

4.  Session Description Protocol (SDP) Signaling

   The URI for declaring this header extension in an extmap attribute is
   "urn:ietf:params:rtp-hdrext:CompositionTime".  It does not contain
   any extension attributes.  It follows the standard mechanism
   described in ([RFC8285]).  An example attribute line in SDP:

   a=extmap:19 uri:ietf:rtc:rtp-hdrext:video:CompositionTime;
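
   A receiver has to learn which local ID was negotiated for this
   extension before it can find the cts value in a packet.  The
   following sketch looks up that ID in an SDP description.  The
   function name is hypothetical, the URI is passed in by the caller,
   and a trailing semicolon such as the one in the example line above
   is tolerated.

```python
from typing import Optional


def find_extmap_id(sdp: str, uri: str) -> Optional[int]:
    """Return the extmap ID mapped to uri, or None if it is absent."""
    prefix = "a=extmap:"
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) < 2:
            continue
        # The value may carry an optional direction, e.g. "19/sendonly".
        value = fields[0].split("/")[0]
        if value.isdigit() and fields[1].rstrip(";") == uri:
            return int(value)
    return None


EXAMPLE_URI = "uri:ietf:rtc:rtp-hdrext:video:CompositionTime"
EXAMPLE_SDP = "a=extmap:19 uri:ietf:rtc:rtp-hdrext:video:CompositionTime;"
print(find_extmap_id(EXAMPLE_SDP, EXAMPLE_URI))
```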

5.  Security Considerations

   The security considerations of the RTP specification ([RFC3550]) and
   the general mechanism for RTP header extensions ([RFC8285]) apply.
   All the security considerations of RTP topologies ([RFC7667])
   ([RFC7201]) for these two types of RTP intermediaries are also
   applicable to this header extension.

   Security considerations for SDP are described in the corresponding
   section of ([RFC8866]).  In the Secure Real-time Transport Protocol
   (SRTP) ([RFC3711]), RTP header extensions are authenticated but not
   encrypted.  When this header extension is used, cts values are
   therefore visible on a frame-by-frame basis to an attacker passively
   observing the video stream.  In scenarios where this is a concern,
   additional mechanisms MUST be used to protect the confidentiality of
   the header extension.  Such a mechanism could be header extension
   encryption ([RFC6904]) or a lower-level security and authentication
   mechanism such as IPsec ([RFC4301]).

6.  IANA Considerations

   IANA has registered the following entry in the "RTP Compact Header
   Extensions" registry:

   Extension URI: uri:ietf:rtc:rtp-hdrext:video:CompositionTime

   Description: video B frame compositionTime

   Contact: lideping.byter@bytedance.com

7.  Acknowledgements

8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004,
              <https://www.rfc-editor.org/info/rfc3711>.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
              December 2005, <https://www.rfc-editor.org/info/rfc4301>.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
              2008, <https://www.rfc-editor.org/info/rfc5285>.

   [RFC6904]  Lennox, J., "Encryption of Header Extensions in the Secure
              Real-time Transport Protocol (SRTP)", RFC 6904,
              DOI 10.17487/RFC6904, April 2013,
              <https://www.rfc-editor.org/info/rfc6904>.

   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
              <https://www.rfc-editor.org/info/rfc7201>.

   [RFC7667]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667,
              DOI 10.17487/RFC7667, November 2015,
              <https://www.rfc-editor.org/info/rfc7667>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8285]  Singer, D., Desineni, H., and R. Even, Ed., "A General
              Mechanism for RTP Header Extensions", RFC 8285,
              DOI 10.17487/RFC8285, October 2017,
              <https://www.rfc-editor.org/info/rfc8285>.

   [RFC8866]  Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP:
              Session Description Protocol", RFC 8866,
              DOI 10.17487/RFC8866, January 2021,
              <https://www.rfc-editor.org/info/rfc8866>.

Author's Address

   Deping Li
   ByteDance
   Email: lideping.byter@bytedance.com
