Internet Engineering Task Force                 Audio-Video Transport WG
INTERNET-DRAFT                                                  D. Tynan
draft-tynan-rtp-bt656-00.txt                              Claddagh Films
                                                         July 28th, 1997
                                                         Expires: 2/2/98

                RTP Payload Format for BT.656-3 Encoding

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced or obsoleted by other documents at any
   time.  It is not appropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on (Africa), (Europe), (Pacific Rim), (US East Coast), or (US West Coast).

   Distribution of this document is unlimited.

Abstract

          This document specifies the RTP payload format for
          encapsulating ITU Recommendation BT.656-3 video streams in the
          Real-Time Transport Protocol (RTP).  Each RTP packet contains
          one scan line as defined by ITU Recommendation BT.601-5, and
          includes decoding and positioning information.

1. Introduction

   This document describes a scheme to packetize a BT.656-3 video stream
   for transport using RTP [1].  A BT.656-3 video stream is defined by
   ITU-R Recommendation BT.656-3 [2], as a means of interconnecting
   digital television equipment operating on the 525-line or 625-line
   standards, and complying with the 4:2:2 encoding parameters as
   defined in ITU-R Recommendation BT.601-5 (formerly CCIR-601) [3],
   Part A.

Tynan                                                           [Page 1]

Internet-Draft        draft-tynan-rtp-bt656-00.txt       July 28th, 1997

   RTP is defined by the Internet Engineering Task Force (IETF) to
   provide end-to-end network transport functions suitable for
   applications transmitting real-time data over multicast or unicast
   network services.  The complete specification of RTP for a particular
   application will require a profile specification document [4], a
   payload format specification, and an RTP protocol document [1].  This
   document is intended to serve as the payload format specification for
   BT.656-3 video streams.

2. Definitions

   For the purposes of this document, the following definitions apply:

   Cb, Cr: An 8-bit coded color-difference sample (as per BT.601-5 [3]).
   Each color-difference value has 225 quantization levels in the centre
   part of the quantization scale with zero corresponding to 128.

   Y: An 8-bit coded luminance sample (as per BT.601-5 [3]).  Each
   luminance value has 220 quantization levels with the black level
   corresponding to level 16 and the peak white level corresponding to
   level 235.

   SAV, EAV: Video timing reference codes which appear at the start and
   end of a BT.656 scan line.

3. Payload Design

   ITU Recommendation BT.656-3 defines a schema for the digital
   interconnection of television video signals in conjunction with
   BT.601-5 which defines the digital representation of the original
   analog signal.  While BT.601-5 refers to images with or without color
   subsampling, the interconnection standard (BT.656-3) specifically
   requires 4:2:2 subsampling.  This specification also requires 4:2:2
   subsampling such that the luminance stream occupies twice the
   bandwidth of each of the two color-difference streams.  For normal
   4:3 aspect ratio images, this results in 720 luminance samples per
   scan line, and 360 samples of each of the two chrominance channels.
   The total number of 8-bit samples per scan line then, is 1440.

   Due to the lack of any form of video compression within BT.656-3, the
   resultant video stream can be considered ``broadcast quality''.
   However, such a stream requires approximately 20 megabytes per second
   of network bandwidth.  In order to bound packet size and to simplify
   the packetization of the stream, each video scan line is carried in
   its own RTP packet.
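
   As an informal check of the bandwidth figure above, the following
   sketch multiplies the 1440 octets per scan line by the number of
   active lines per frame and the frame rate.  The active line counts
   are derived from the line ranges given in Section 5; the frame rates
   (25 and 30000/1001 frames per second) and the function name are
   assumptions made for illustration and are not defined by this
   document.  RTP and payload header overhead is ignored.

```python
from fractions import Fraction

# Y + Cb + Cr octets per scan line at the normal sample rate (S=0).
SAMPLES_PER_LINE = 720 + 360 + 360


def active_bytes_per_second(active_lines, frame_rate):
    """Octets of active video per second, excluding all headers."""
    return SAMPLES_PER_LINE * active_lines * Fraction(frame_rate)


# 625/50: lines 23-310 and 336-623 give 576 active lines per frame.
pal_rate = active_bytes_per_second(576, 25)  # 20,736,000 octets/s

# 525/60: lines 10-263 and 273-525 give 507 active lines per frame.
# The 30000/1001 frame rate is an assumption, not taken from this document.
ntsc_rate = active_bytes_per_second(507, Fraction(30000, 1001))

print(float(pal_rate) / 1e6, float(ntsc_rate) / 1e6)
```

   Both results fall slightly above 20 million octets per second, which
   is consistent with the approximate figure quoted above.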

   To allow for scan line synchronization, each packet includes certain
   flag bits (as defined in BT.656-3) and a unique scan line number.
   The SAV and EAV timing reference codes are removed.  Furthermore, no
   line blanking samples are included, so no ancillary data can be
   included in the line blanking period.  It is the responsibility of
   the receiver to generate the timing reference codes, and to insert
   the correct number of line blanking samples.
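
   A receiver regenerating the timing reference codes might proceed
   along the lines of the sketch below.  The four-octet layout (0xFF,
   0x00, 0x00, XY) and the protection bits in XY follow the author's
   reading of BT.656 and are not reproduced from this document; they
   should be verified against the Recommendation itself.  The function
   name is hypothetical, and insertion of the line blanking samples is
   not shown.

```python
def timing_reference_code(f, v, h):
    """Return a 4-octet timing reference code.

    f, v: field and vertical-interval bits (0 or 1).
    h:    0 for SAV (start of active video), 1 for EAV (end of active video).
    The bit layout of the fourth octet is an assumption based on BT.656.
    """
    # Protection bits, each the XOR of a subset of F, V and H.
    p3 = v ^ h
    p2 = f ^ h
    p1 = f ^ v
    p0 = f ^ v ^ h
    xy = (0x80 | (f << 6) | (v << 5) | (h << 4)
          | (p3 << 3) | (p2 << 2) | (p1 << 1) | p0)
    return bytes([0xFF, 0x00, 0x00, xy])


sav = timing_reference_code(f=0, v=0, h=0)
eav = timing_reference_code(f=0, v=0, h=1)
```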

   Similarly, to save on bandwidth, there is no requirement that the
   frame blanking samples be provided.  However, it is possible to
   include frame blanking samples if such samples contain relevant
   information, such as a vertical-interval time code, or teletext
   data.  In the absence of frame blanking samples, the receiver must
   generate black levels as described below, to complete the correct
   number of scan lines per field.  If frame blanking samples are
   provided, they must be copied without modification into the resultant
   BT.656-3 stream.

   Scan lines must be sent in sequential order.  Missing scan lines can
   be replaced by the same scan line of the previous frame, the last
   scan line received, or true black at the discretion of the receiver.
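
   One possible concealment policy is sketched below.  The order of
   preference (previous frame, then last line received, then true
   black) is only an assumption; this document leaves the choice to the
   receiver.  The black pattern is the 0x80, 0x10, 0x80, 0x10 sequence
   given in Section 5, and all names are hypothetical.

```python
# Cb, Y, Cr, Y octets representing true black.
BLACK = bytes([0x80, 0x10, 0x80, 0x10])


def black_line(luma_samples=720):
    """Payload for one scan line of true black."""
    return BLACK * (luma_samples // 2)


def conceal(sl, previous_frame, last_received, luma_samples=720):
    """Choose a replacement payload for missing scan line number sl.

    previous_frame: dict mapping scan line number to payload octets.
    last_received:  payload of the most recently received line, or None.
    """
    if sl in previous_frame:
        return previous_frame[sl]          # same line of the previous frame
    if last_received is not None:
        return last_received               # repeat the last line received
    return black_line(luma_samples)        # fall back to true black
```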

4. Usage of RTP

   Due to the unreliable nature of the RTP protocol, and the lack of an
   orderly delivery mechanism, each packet contains enough information
   to form a single scan line without reference to prior scan lines or
   prior frames.  In addition to the RTP header, a fixed length payload
   header is included in each packet.  This header is four octets in
   length.

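    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           RTP Header                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Payload Header                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Payload Data                         |
   |                                .                              |
   |                                .                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+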
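
   The three-part layout above could be assembled as in the following
   sketch.  The fixed RTP header fields follow RFC 1889 [1] (version 2,
   no padding, no extension, no CSRC entries).  The payload type value
   96 used in the example is purely hypothetical, since this document
   does not assign one, and the function names are illustrative only.

```python
import struct


def rtp_fixed_header(marker, payload_type, seq, timestamp, ssrc):
    """Build the 12-octet fixed RTP header in network byte order."""
    byte0 = 2 << 6                                  # V=2, P=0, X=0, CC=0
    byte1 = ((marker & 1) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def build_packet(rtp_header, payload_header, payload_data):
    """Concatenate the RTP header, payload header and payload data."""
    return rtp_header + payload_header + payload_data


# Hypothetical values: PT 96, last line of a frame (marker set).
header = rtp_fixed_header(marker=1, payload_type=96, seq=1,
                          timestamp=3600, ssrc=0x1234)
packet = build_packet(header, bytes(4), bytes(1440))
```

   For a 720 sample scan line, the resulting packet is 12 + 4 + 1440
   octets long before any lower-layer headers are added.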

4.1. RTP Header usage

   Each RTP packet starts with a fixed RTP header.  The following fields
   of the RTP fixed header are used for BT.656-3 encapsulation:

   Marker bit (M): The Marker bit of the RTP header is set to 1 when the
   current packet carries the final scan line of the current frame, and
   to 0 otherwise.

   Payload Type (PT): The Payload Type shall specify a BT.656-3 video
   payload format.

   Timestamp: The RTP Timestamp encodes the sampling instant of the
   video frame currently being rendered.  All scan line packets within
   the same frame will have the same timestamp.  The timestamp should
   refer to the 'Ov' field synchronization point of the first (even)
   field.  For BT.656-3 payloads, the RTP timestamp is based on a 90kHz
   clock.
5. Payload Header

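   The payload header is a fixed four-octet header broken down as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | N |S|F|V|    Reserved (Must Be 0)     |    Scan Line (SL)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+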

   N: 2 bits
   This field indicates the type of frame encoding within the payload.
   Currently only two types of encoding are defined.  If this field is
   01, then an NTSC frame encoding (525 line/60 fields per second) is
   specified.  If the field is 00, then a PAL frame (625 line/50 fields
   per second) is specified.

   S: 1 bit
   Sample rate.  When 1, the sample rate is 18MHz.  Otherwise, the
   normal sample rate of 13.5MHz is used.  In the high sample rate case
   (S=1), the scan line width is 1144 samples for NTSC frames (N=1) or
   1152 samples for PAL frames (N=0).  With the normal sample rate
   (S=0), the scan line width is always 720 samples regardless of
   encoding.  High sample rates are used only for 16:9 systems.

   F: 1 bit
   When 0, indicates the first field of a frame (line 1 through 312
   inclusive for N=0, and line 4 through 265 inclusive for N=1).  Any
   other scan line is considered a component of the second field, and
   the F bit will be set to 1.

   V: 1 bit
   When 1, indicates that the scan line is part of the vertical
   interval.  Should always be 0 unless frame blanking data is sent, in
   which case the V bit should be set to 1 for scan lines which do not
   form an integral part of the image.  For receivers which do not
   receive scan lines during the vertical interval, BT.656 vertical
   interval data should be generated by repeating the 4-byte sequence
   0x80, 0x10, 0x80, 0x10, representing a true black level.

   Reserved: 15 bits
   Reserved for future use, must be set to zero.

   Scan Line (SL): 12 bits
   Indicates the scan line encapsulated in the payload.  Valid values
   range from 1 through 625 inclusive.  If no frame blanking data is
   being transmitted, only scan lines 23 through 310 inclusive, and
   lines 336 through 623 inclusive should be sent in the case of N=0.
   For 525/60 encoding (N=1), scan lines 10 through 263 inclusive and
   lines 273 through 525 should be transmitted.
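
   A sender needs to derive the F bit and decide which lines to
   transmit from the scan line number.  The sketch below encodes the
   line ranges stated in the field descriptions above.  The function
   and table names are hypothetical.

```python
def field_bit(n, sl):
    """F bit for scan line sl: 0 for the first field, 1 for the second."""
    if n == 0:                      # 625/50: first field is lines 1-312
        first = 1 <= sl <= 312
    else:                           # 525/60: first field is lines 4-265
        first = 4 <= sl <= 265
    return 0 if first else 1


# Lines sent when no frame blanking data is transmitted, keyed by N.
ACTIVE_RANGES = {
    0: ((23, 310), (336, 623)),     # 625/50
    1: ((10, 263), (273, 525)),     # 525/60
}


def is_active_line(n, sl):
    """True if scan line sl carries picture data for encoding n."""
    return any(lo <= sl <= hi for lo, hi in ACTIVE_RANGES[n])


active_625 = sum(is_active_line(0, sl) for sl in range(1, 626))
```

   For the 625/50 ranges, this yields 576 active lines per frame.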

   If a receiver is generating a BT.656-3 data stream directly from this
   stream, the F and V bits must be copied from the header rather than
   being generated implicitly from the scan line number.  In the event
   of a conflict, the F and V bits have precedence.
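
   The payload header can be packed into and extracted from a single
   32-bit word, as in the following sketch.  It assumes network byte
   order and that bit 0 in the diagram is the most significant bit, as
   is conventional for RTP; the function names are hypothetical.

```python
import struct


def pack_payload_header(n, s, f, v, sl):
    """Encode N, S, F, V and the scan line number into four octets."""
    if not 1 <= sl <= 625:
        raise ValueError("scan line must be in the range 1..625")
    word = (((n & 0x3) << 30) | ((s & 1) << 29) | ((f & 1) << 28)
            | ((v & 1) << 27) | (sl & 0xFFF))   # reserved bits stay zero
    return struct.pack("!I", word)


def unpack_payload_header(data):
    """Decode the first four octets of a payload into its fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "n": word >> 30,
        "s": (word >> 29) & 1,
        "f": (word >> 28) & 1,
        "v": (word >> 27) & 1,
        "reserved": (word >> 12) & 0x7FFF,
        "sl": word & 0xFFF,
    }


hdr = pack_payload_header(n=1, s=0, f=0, v=0, sl=23)
fields = unpack_payload_header(hdr)
```

   A receiver would take F and V from the decoded fields rather than
   recomputing them from the scan line number.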

6. Payload Format

   In keeping with the 4:2:2 color subsampling of BT.656 and BT.601, for
   each pair of color-difference samples, two luminance samples will be
   transmitted.  As per BT.656, the format for transmission shall be Cb,
   Y, Cr, Y.  The following is a representation of a 720 sample packet:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   |      Cb0      |      Y0       |      Cr0      |      Y1       |
   |      Cb1      |      Y2       |      Cr1      |      Y3       |
   |     Cb359     |     Y718      |     Cr359     |     Y719      |

   1144 and 1152 sample packets should increase the packet size
   accordingly while maintaining the sample order.
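
   The sample ordering can be illustrated with the following sketch,
   which interleaves separate luminance and color-difference sequences
   into the Cb, Y, Cr, Y order.  Holding the samples in three separate
   sequences is an assumption about the sender's internal
   representation, and the function name is hypothetical.

```python
def interleave_422(y, cb, cr):
    """Interleave 4:2:2 samples into Cb, Y, Cr, Y payload order."""
    if len(y) != 2 * len(cb) or len(cb) != len(cr):
        raise ValueError("expected two Y samples per Cb/Cr pair")
    out = bytearray()
    for i in range(len(cb)):
        out += bytes((cb[i], y[2 * i], cr[i], y[2 * i + 1]))
    return bytes(out)


# A 720 sample line of true black: Y=16, Cb=Cr=128.
line = interleave_422(bytes([16]) * 720,
                      bytes([128]) * 360,
                      bytes([128]) * 360)
```

   The same function applies unchanged to the wider scan lines used at
   the high sample rate, since only the sequence lengths differ.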

7. References

[1]   RTP: A Transport Protocol for Real-Time Applications, H.
     Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RFC 1889.
[2]   Interfaces for Digital Component Video Signals in 525-Line and
     625-Line Television Systems operating at the 4:2:2 Level of
     Recommendation ITU-R BT.601 (Part A), ITU-R Recommendation
     BT.656-3, 1995.
[3]   Studio Encoding Parameters of Digital Television for Standard 4:3
     and Wide-Screen 16:9 Aspect Ratios, ITU-R Recommendation BT.601-5,
     1995.
[4]   RTP Profile for Audio and Video Conferences with Minimal Control,
     H. Schulzrinne, RFC 1890.

8. Author's Address

   Dermot Tynan
   Claddagh Films Limited
   3 White Oaks
   Clybaun Road

   Tel: +353 91 529944
