INTERNET-DRAFT Eric Edwards
draft-ietf-avt-rtp-jpeg2000-04.txt Satoshi Futemma
Nobuyoshi Tomita
Eisaburo Itakura
Sony Corporation
October 27, 2003
Expires: April 26, 2004
RTP Payload Format for JPEG 2000 Video Streams
Status of this Memo
This document is an Internet-Draft and is subject to all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference materials or to cite them other than as "work in
progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Drafts Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document describes a payload format for transporting JPEG
2000 video streams using RTP (Real-time Transport Protocol). JPEG
2000 video streams are formed as a continuous series of JPEG 2000
still images. This payload format will allow for JPEG 2000's
scalability and robustness to be maximized in streaming
applications.
Table of Contents
1. Introduction .......................................... 2
1.1 Conventions Used in this Document ..................... 3
2. JPEG 2000 Video Features .............................. 4
3. Payload Design ........................................ 4
4. Payload Format ........................................ 4
4.1 RTP fixed header usage ................................ 4
4.2 RTP Payload header format ............................. 5
5. RTP Packetization ..................................... 7
5.1 Non-intelligent mode .................................. 8
5.2 Intelligent mode ..................................... 9
6. Scalable Delivery and Priority field .................. 10
Edwards, et al. [Page 1]
INTERNET-DRAFT draft-ietf-avt-rtp-jpeg2000-04.txt October 27, 2003
6.1 Priority mapping table ................................ 11
6.1.1 Default priority mapping .............................. 11
6.1.2 User-defined priority table ........................... 11
6.2 Sender's Action ..................................... 12
6.3 Receiver's Action ..................................... 12
7. JPEG 2000 main header compensation .................... 13
7.1 Sender processing ..................................... 13
7.2 Receiver processing ................................... 14
8. Optional Payload Header ............................... 14
9. Security Consideration ................................ 15
10. IANA Consideration .................................... 16
10.1 MIME Registration ..................................... 16
10.2 SDP Parameters ........................................ 17
11. Intellectual Property Right Statement ................. 17
12. Informative Appendix - Recommended Practices .......... 18
13. References ........................................... 18
14. Authors' Addresses .................................... 19
15. Full Copyright Statement .............................. 20
1. Introduction
This document specifies payload formats for JPEG 2000 video
streams over the Real-time Transport Protocol (RTP). JPEG 2000 is
an ISO/IEC International Standard developed for next-generation
still image encoding. Its basic encoding technology is described
in [1][6].
Part 3 of the JPEG 2000 standard defines Motion JPEG 2000 [6].
However, Part 3 defines only the file format, not the
transmission format for streaming on the Internet. For this
reason, it is necessary to define the RTP format for JPEG 2000
video streams.
JPEG 2000 supports many powerful features that are not supported
in the current JPEG standard [1][7][8]:
o Higher compression efficiency than JPEG with less visual
loss especially at extreme compression ratios.
o A single codestream that offers both lossy and superior
lossless compression.
o Robust transmission over noisy environments.
o Progressive transmission by pixel accuracy (SNR Scalability)
and resolution.
o Random codestream access and processing.
The JPEG 2000 algorithm is briefly explained below. Fig. 1
shows a block diagram of the JPEG 2000 encoding method.
                                               +-----+
                                               | ROI |
                                               +-----+
                                                  |
                                                  V
              +----------+   +----------+   +------------+
              |DC, comp. |   | Wavelet  |   |            |
raw image ==> |transform-|==>|transform-|==>|Quantization|==+
              |  ation   |   |  ation   |   |            |  |
              +----------+   +----------+   +------------+  |
                                                            |
             +-----------+   +----------+   +------------+  |
             |           |   |          |   |            |  |
JPEG 2000 <==|   Data    |<==|Arithmetic|<==|Coefficient |<=+
codestream   | Ordering  |   |  coding  |   |bit modeling|
             +-----------+   +----------+   +------------+
Fig. 1: Block diagram of the JPEG 2000 encoder
Each color component or tile is transformed into wavelet
coefficients. The component or tile is sub-sampled into various
levels, usually vertically and horizontally, from the high
frequencies (which contain all the sharp details) to the low
frequencies (which contain all the flat areas). Quantization is
performed on the coefficients within each sub-band: each wavelet
coefficient is divided by the quantization step size and the
result is truncated.
After quantization, code blocks are formed within the
precincts within the tiles. Precincts are a finer separation than
tiles, and code blocks are the smallest separation of the image
data. Entropy coding is performed within each code block, and the
coefficients are arithmetically encoded by bit plane. After the
coefficients of all code blocks have been coded into a short bit
stream, a header is added, turning it into a packet. The header
has all the information needed to decompress the packet into code
blocks. A group of packets is called a layer.
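The quantization step described above (divide by the step size,
then truncate) can be illustrated with a minimal Python sketch.
The function name and the sample values are assumptions made for
this illustration only; they are not part of the JPEG 2000
standard text.

```python
import math


def quantize(coefficient, step_size):
    """Divide a wavelet coefficient by the step size and truncate toward zero."""
    return math.trunc(coefficient / step_size)
```

For instance, with a step size of 2.0, a coefficient of 7.9 maps
to index 3 and a coefficient of -7.9 maps to index -3.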
The standard has four ways to transmit and decode a compressed
image: by resolution, quality, position, or component. Packets
can be ordered in any way to maximize these features.
This is only to serve as an introduction to JPEG 2000 and to aid
in understanding the rest of this document. Further details of
the encoder can be found in various texts on JPEG 2000 [1].
To decompress a JPEG 2000 codestream, one would follow the
reverse order of the encoding order, minus the quantization step.
Describing this procedure in detail is outside the scope of this
document. Please refer to various JPEG 2000 texts for
details [1].
1.1 Conventions Used in this Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described in RFC2119
[2].
2. JPEG 2000 Video Features
JPEG 2000 video streams are formed as a continuous series of JPEG
2000 still images. The previously described features of JPEG 2000
can be used effectively in a streaming application. A JPEG 2000
video stream has the following merits:
In JPEG 2000, SNR is improved dramatically over classic JPEG at
low bit rates.
This is a full intra format (each frame is independently
compressed) and therefore has a low encoding and decoding delay.
JPEG 2000 has flexible and accurate rate control. This is
suitable for traffic control and congestion control during network
transmission.
JPEG 2000 can provide its own codestream error resilience markers
to aid in codestream recovery.
3. Payload Design
To provide a payload format that exploits the JPEG 2000 video
stream, described in the previous section, the following must be
taken into consideration:
- Provisions for packet loss
On the Internet, 5% packet loss is common and this percentage
may sometimes come to 20% or more. To split JPEG 2000 video
streams into RTP packets, efficient packetization of the code
stream is required to minimize the effects of problems in
decoding due to missing code-blocks in error-prone
environments. If the main header is lost in transmission, the
image cannot be decoded. Accordingly, a system to compensate
for the loss of the main header is required.
- A packetizing scheme that exploits JPEG 2000 functionality.
A packetizing scheme that lets an image be transmitted
progressively and reconstructed progressively by the receiver
using JPEG 2000 functionality would be very powerful. It
would allow for maximizing performance over various network
conditions and variations in computing power of receiving
platforms.
4. Payload Format
4.1 RTP fixed header usage
For each RTP packet, the RTP fixed header is followed by the JPEG
2000 payload header, which is followed by JPEG 2000 codestream.
The RTP header fields that have a meaning specific to a JPEG
2000 video stream are described as follows:
Marker bit (M): The marker bit of the RTP fixed header MUST be set
to 1 on the last RTP packet of a video frame; otherwise, it
must be 0. When transmission is performed by multiple RTP
sessions, the bit is set in the last packet of the frame in each
session.
Payload type (PT): The payload type is dynamically assigned by
means outside the scope of this document. A payload type in the
dynamic range shall be chosen by means of an out-of-band
signaling protocol (e.g., RTSP or SIP).
Timestamp: The RTP timestamp uses a 90 kHz clock. The same
timestamp must appear in each fragment of a given frame. When a
JPEG 2000 image is interlaced, the odd field and the
corresponding even field have the same timestamp. The initial
value of the timestamp is random to make known-plaintext attacks
on encryption more difficult, even if the source itself does not
encrypt, as the packets may flow through a translator that does.
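As a rough illustration of the marker bit and timestamp rules
above, the following Python sketch assigns a timestamp and a
marker bit to every fragment of one frame. The function name, the
integer frame rate, and the caller-supplied initial timestamp are
assumptions made for this example; a real sender would pick the
initial timestamp at random as described above.

```python
RTP_CLOCK_HZ = 90000  # the 90 kHz RTP timestamp clock stated above


def stamp_fragments(frame_index, fragment_count, frames_per_second,
                    initial_timestamp):
    """Return (timestamp, marker) pairs for every fragment of one frame.

    All fragments of a frame share one timestamp, and only the last
    fragment carries marker bit 1. The timestamp wraps at 32 bits.
    Assumes an integer frame rate (hypothetical simplification).
    """
    ticks_per_frame = RTP_CLOCK_HZ // frames_per_second
    timestamp = (initial_timestamp + frame_index * ticks_per_frame) % (1 << 32)
    return [(timestamp, 1 if i == fragment_count - 1 else 0)
            for i in range(fragment_count)]
```

At 30 frames per second each frame advances the timestamp by 3000
ticks, so the third frame (index 2) of a stream that started at
timestamp 1000 is stamped 7000.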
4.2 RTP Payload header format
The RTP payload header format for JPEG 2000 video stream is as
follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|X|E|MHF|mh_id|T|   priority    |          tile number          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| reserved  |tp |                fragment offset                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fig. 2: RTP payload header format for JPEG 2000
X : 1 bit
Extension bit flag. This bit MUST be set to 1 when a JPEG
2000 optional payload header follows this JPEG 2000 payload
header; otherwise, it MUST be set to 0. The
details of optional payload headers are described in Section 8
of this document.
E : 1 bit
Enable bit flag. If this bit is set to 1, it indicates the
"intelligent packetization" described in Section 5.2. If the E
bit is 0, it indicates "non-intelligent packetization", and a
receiver MUST ignore all payload header information
other than the extension bit flag and the fragment offset.
MHF (Main Header Flag) : 2 bits
MHF shows whether the main header is packed into the RTP
packet or not. When the main header exists in the RTP packet,
the sender MUST set the first bit to 1; otherwise, this field
MUST be set to 0. If the first bit is 1, the second bit is
valid, and if the last part of the main header is included
(either whole or fragmented), the sender MUST set the second
bit to 1. In other words, this field is either 3 (=0b11) or
2 (=0b10) if the main header exists in the RTP packet, and
0 otherwise. MHF usage is summarized below:
+----+-------------------------------------------------------+
|MHF | Description |
+----+-------------------------------------------------------+
| 00 | no main header is packed at all |
| 01 | reserved for future use. |
| 10 | the fragmented main header (not last part) is packed. |
| 11 | a whole main header or the last part of the |
| | fragmented main header is packed. |
+----+-------------------------------------------------------+
Table 1: MHF usage values
The receiver checks MHF to determine the main header range and
may perform main header compensation described in Section 7 if
the main header is lost.
mh_id : 3 bits
Main header identification value. This is used for JPEG
2000 main header recovery. The same mh_id is used as long as
the coding parameters described in the main header remain
unchanged. The initial value of mh_id is random. The mh_id value
must increase by 1 every time a new main header is
transmitted. Once the mh_id value would exceed 7, it must
roll over and start at 1 again. Usage of this field is
described in Section 7 of this document. This field is only
valid when the E bit is 1. If the E bit is 0, then this field
SHOULD be zero.
T (Tile field invalidation flag) : 1 bit
The T bit indicates whether the tile number field is invalid.
A sender MUST set the T bit to 1 when the tile number field is
invalid.
There are two cases where the tile number field is invalid.
One is the case that an RTP packet holds only the JPEG 2000
main header. In this case, a sender cannot set any number in
the tile number field because no JPEG 2000 tile-part bitstream
is included in the RTP packet.
The other case is that multiple tile-part bitstreams are
packed together in an RTP packet. In general, it is advisable
to pack only one tile-part bitstream in an RTP packet, but if
the tile-part length is very small, it is more efficient to pack
multiple tile-parts together in one RTP packet. In this case it
is meaningless to assign a number (e.g., the smallest tile number)
because the number is intended to make decoding an arbitrary tile
easy, which does not work when multiple tile-parts are
combined in a single packet. Therefore, the T bit indication is
needed.
priority : 8 bits
The priority field indicates the importance of the JPEG 2000
packet included in the payload. Typically, a higher priority
is set for RTP packets that carry the JPEG 2000 packets
containing the lower sub-bands.
tile number : 16 bits
This field shows the tile number that a bitstream belongs to
only when the T bit is 0. A receiver can easily decode an
arbitrary tile by checking this field. If T bit is set to 1, a
receiver MUST ignore this field.
tp (type) : 2 bits
This field indicates how a JPEG 2000 image is scanned
(progressive or interlaced).
0: The image is progressively scanned. On a computer monitor,
it should be displayed as-is at the specified width and
height in the JPEG 2000 main header.
1: The image is an odd field of an interlaced video signal.
The height specified in the JPEG 2000 main header is half
of the height of the entire displayed image. In a
receiver, an odd field should be de-interlaced with the
even field following it so that lines from each image can
alternate.
2: The image is an even field of an interlaced video signal.
3: The image is a single field from an interlaced video
signal, intended to be displayed full frame as if it were
received as both the odd and even fields of the frame. On a
computer monitor, each line in the image should be
displayed twice, doubling the height of the image.
fragment offset : 24 bits
This value must be set to the byte offset, within the JPEG 2000
codestream, of the data contained in this RTP packet.
JPEG 2000 frames are typically larger than the underlying
network's maximum transmission unit (MTU); therefore, frames
might be fragmented into several packets. The fragment offset is
the offset in bytes of the current packet's data from the start
of the JPEG 2000 codestream. This field helps the receiver to
reassemble the JPEG 2000 codestream.
To perform scalable video delivery by using multiple RTP
sessions, the offset value from the first byte of the same
frame is set for fragment offset. In scalable video delivery
using multiple RTP sessions, the fragment offset may therefore
not start at 0 in some RTP sessions, even if the packet is the
first one of the frame.
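The field layout of Fig. 2 can be sketched in Python as a pair of
pack and unpack helpers. This is a minimal illustration, not a
reference implementation: the function names and the returned
dictionary keys are assumptions, bit 0 of the diagram is taken to
be the most significant bit (the usual convention for such
diagrams), and the reserved bits are written as zero, which this
document does not state explicitly.

```python
import struct


def pack_payload_header(x, e, mhf, mh_id, t, priority, tile_number, tp,
                        fragment_offset):
    """Pack the fields of Fig. 2 into 8 octets in network byte order."""
    if not (0 <= mhf <= 3 and 0 <= mh_id <= 7 and 0 <= tp <= 3):
        raise ValueError("field out of range")
    if not (0 <= priority <= 0xFF and 0 <= tile_number <= 0xFFFF
            and 0 <= fragment_offset <= 0xFFFFFF):
        raise ValueError("field out of range")
    first_octet = (x & 1) << 7 | (e & 1) << 6 | mhf << 4 | mh_id << 1 | (t & 1)
    # Second 32-bit word: 6 reserved bits (assumed 0), 2-bit tp, 24-bit offset.
    second_word = tp << 24 | fragment_offset
    return struct.pack("!BBHI", first_octet, priority, tile_number, second_word)


def unpack_payload_header(data):
    """Unpack the first 8 octets of an RTP payload into a dict of fields."""
    first_octet, priority, tile_number, second_word = struct.unpack(
        "!BBHI", data[:8])
    return {
        "x": first_octet >> 7,
        "e": (first_octet >> 6) & 1,
        "mhf": (first_octet >> 4) & 3,
        "mh_id": (first_octet >> 1) & 7,
        "t": first_octet & 1,
        "priority": priority,
        "tile_number": tile_number,
        "tp": (second_word >> 24) & 3,
        "fragment_offset": second_word & 0xFFFFFF,
    }
```

A receiver operating in non-intelligent mode (E = 0) would read
only the "x" and "fragment_offset" entries of the result.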
5. RTP Packetization
As shown in Fig. 3, a JPEG 2000 codestream consists of the
main header beginning with the SOC marker, one or more tiles (only
one tile if there is no tile division), and the EOC marker to
indicate the end of the codestream. Each tile consists of a
tile-part header that starts with the SOT marker and ends with
the SOD marker, and a bitstream (a series of jp2-packets).
      +-- +------------+
Main  |   |    SOC     |  Required as the first marker.
header|   +------------+
      |   |    main    |  Main header marker segments
      +-- +------------+
      |   |    SOT     |  Required at the beginning of each
Tile- |   +------------+  tile-part header.
part  |   |   T0,TP0   |  Tile 0, tile-part 0 header marker
header|   +------------+  segments
      |   |    SOD     |  Required at the end of each tile-part
      +-- +------------+  header
          | bitstream  |  Tile-part bitstream
      +-- +------------+
      |   |    SOT     |
Tile- |   +------------+
part  |   |   T1,TP0   |
header|   +------------+
      |   |    SOD     |
      +-- +------------+
          | bitstream  |
          +------------+
          |    EOC     |  Required as the last marker in the code
          +------------+  stream
Fig. 3: Construction of the JPEG 2000 codestream
Two packetization modes can be used for a JPEG 2000 RTP packet:
non-intelligent mode and intelligent mode.
A sender is allowed to packetize the JPEG 2000 codestream in
either mode, but MUST NOT change the mode within the same JPEG
2000 codestream. A sender may implement only one mode, but a
receiver MUST interpret both modes. A receiver identifies the
packetization mode from the E bit flag in the payload header in
order to process the RTP packet properly.
In both modes, a sender usually partitions the JPEG 2000
codestream in such a way that IP fragmentation never occurs. Any
packet larger than the MTU size might be fragmented into multiple
smaller IP packets by the IP layer. If one of those fragments is
lost during transmission, a receiver might not be able to
reassemble the IP packet, and the whole fragmented packet is then
treated as lost.
5.1 Non-intelligent mode
This mode is intended for a thin sender, which has insufficient
CPU power to parse the JPEG 2000 codestream syntax and to
partition the codestream per jp2-packet.
In this mode, a sender segments the JPEG 2000 codestream into
RTP packets at arbitrary lengths, and the E bit flag in the
payload header MUST be set to 0.
Typically, a sender fragments a JPEG 2000 codestream into
fixed-length fragments. An example of this packetization is below:
+------+-------+-------------------------------+
|RTP |payload| JPEG 2000 codestream fragment |
|header|header | |
+------+-------+-------------------------------+
+------+-------+-------------------------------+
|RTP |payload| JPEG 2000 codestream fragment |
|header|header | |
+------+-------+-------------------------------+
...
+------+-------+-------------------------------+
|RTP |payload| JPEG 2000 codestream fragment |
|header|header | |
+------+-------+-------------------------------+
Fig. 4: Example of non-intelligent mode packetization
A receiver recognizes that the codestream is packetized in
non-intelligent mode by checking the E bit flag; RTP packets with
the same RTP timestamp are then de-packetized into the JPEG 2000
codestream using the fragment offset in the payload header.
In this mode, the X bit and the fragment offset are interpreted
and all other fields are ignored.
If a receiver receives RTP packets in both modes with the same
RTP timestamp, then it SHOULD ignore all RTP packets
with that timestamp.
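Non-intelligent mode packetization and the matching reassembly by
fragment offset can be sketched as follows. The function names
are hypothetical, and the fragment size is assumed to be the room
left in a packet after the RTP header and the payload header.

```python
def fragment_codestream(codestream, max_fragment_size):
    """Split a codestream into (fragment_offset, data) pairs of fixed length."""
    if max_fragment_size <= 0:
        raise ValueError("max_fragment_size must be positive")
    return [(offset, codestream[offset:offset + max_fragment_size])
            for offset in range(0, len(codestream), max_fragment_size)]


def reassemble_codestream(fragments):
    """Rebuild a codestream from the (fragment_offset, data) pairs of a frame.

    Fragments may arrive in any order. Returns None when a gap between
    consecutive offsets shows that a fragment is missing.
    """
    codestream = bytearray()
    for offset, data in sorted(fragments):
        if offset != len(codestream):
            return None  # gap or overlap: treat the frame as incomplete
        codestream += data
    return bytes(codestream)
```

This sketch cannot detect a missing final fragment on its own; a
receiver would also check that the packet carrying the marker bit
has arrived.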
5.2 Intelligent mode
In this mode, a new concept for a packetization unit is
introduced. A packetization unit is defined as either a JPEG 2000
main header, a tile-part header, or a jp2-packet.
First, a sender divides the JPEG 2000 codestream into
packetization units by parsing the codestream or by getting
indexing information from the encoder, and then packs the
packetization units into RTP packets. A sender can put an
arbitrary number of packetization units into an RTP packet, but it
MUST preserve the codestream order. An example of this kind of
RTP packet format is below:
+------+-------+---------------+---------------+
|RTP |payload| packetization | packetization |
|header|header | unit | unit |
+------+-------+---------------+---------------+
Fig. 5 An Example of RTP packet format with multiple
packetization units
Sometimes, packetization units may not be 32-bit aligned, so
additional padding octets are needed. In an RTP packet with
multiple packetization units, any required padding MUST be added
at the end of the concatenated packetization units.
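The padding rule above amounts to a small calculation on the
total length of the concatenated units. In the sketch below, the
function name and the use of zero-valued padding octets are
assumptions; this document does not specify the padding value.

```python
def pad_to_32_bits(units):
    """Concatenate packetization units and pad the end to a 4-octet boundary."""
    body = b"".join(units)
    padding_length = -len(body) % 4  # 0..3 octets needed to reach alignment
    return body + b"\x00" * padding_length
```

Two units of 3 and 2 octets give 5 octets of data, so 3 padding
octets are appended after the second unit, not between the units.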
If a packetization unit is larger than the MTU size, it can be
fragmented. A fragment of a packetization unit MUST NOT be
packed with the succeeding packetization unit in the same RTP
packet. An example of this
kind of RTP packet format is below:
+------+-------+-----------------------------+
|RTP |payload| packetization unit fragment |
|header|header | |
+------+-------+-----------------------------+
+------+-------+-----------------------------+
|RTP |payload| packetization unit fragment |
|header|header | |
+------+-------+-----------------------------+
...
+------+-------+-----------------------------+
|RTP |payload| packetization unit fragment |
|header|header | |
+------+-------+-----------------------------+
Fig. 6 An Example of RTP packet format with a fragmented
packetization unit
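The intelligent mode packing rules of this section (codestream
order is preserved, several whole units may share a packet, and a
fragment never shares a packet with the succeeding unit) can be
sketched as below. The function name and the max_payload_size
parameter, meaning the room left after the RTP and payload
headers, are assumptions of this example.

```python
def pack_units(units, max_payload_size):
    """Group packetization units into RTP payloads, in codestream order.

    Whole units are packed together while they fit. A unit larger than
    max_payload_size is split into fragments, and each fragment travels
    alone, so a fragment never shares a payload with the next unit.
    """
    payloads, current = [], b""
    for unit in units:
        if len(unit) > max_payload_size:
            if current:
                payloads.append(current)
                current = b""
            for start in range(0, len(unit), max_payload_size):
                payloads.append(unit[start:start + max_payload_size])
        elif len(current) + len(unit) > max_payload_size:
            payloads.append(current)
            current = unit
        else:
            current += unit
    if current:
        payloads.append(current)
    return payloads
```

Concatenating the returned payloads always reproduces the input
units in their original order.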
6. Scalable Delivery and Priority field
The JPEG 2000 codestream has rich functionality built into it so
decoders can easily handle scalable delivery or progressive
transmission. Progressive transmission that allows images to be
reconstructed with increasing pixel accuracy or spatial resolution
is essential for many applications. This feature allows the
reconstruction of images with different resolutions and pixel
accuracy, as needed or desired, for different target devices. The
largest image source devices can provide a code stream that is
easily processed for the smallest image display device.
The jp2-packets contain all compressed image data from a
specific layer, a specific component, a specific resolution level,
and a specific precinct. The order in which these jp2-packets are
found in the codestream is called the "progression order". The
ordering of the jp2-packets can progress along four axes: layer,
component, resolution level and precinct.
Providing a priority field to indicate the importance of the data
contained in a given RTP packet makes it possible to exploit the
progressive and scalable functions of JPEG 2000.
The lower the priority value, the higher the priority. In
other words, the priority value 0 is the highest priority and 255
is the lowest priority. The priority value 0 is defined as a
special priority for the headers (the main header or tile-part
header). When any header (the main header or a tile-part header)
is packed into the RTP packet, the sender MUST set the priority
value to 0.
6.1 Priority mapping table
For the progression order, the priority value for each jp2-packet
is given by the priority mapping table. There are two types of
priority mapping: default priority mapping and user-defined
priority mapping. In principle, the priority mapping table is
negotiated between the sender and the receiver through external
protocols (such as RTSP or SIP), which are not within the scope of
this document. However, in some environments, such as a multicast
video-conference environment, it might be difficult to negotiate
the priority mapping table between senders and receivers. The
default priority mapping is defined for such situations. The
receiver interprets the priority as a user-defined priority value
only when the priority mapping table has been negotiated;
otherwise, the receiver interprets it as the default priority.
6.1.1 Default priority mapping
The JPEG 2000 codestream is ordered in some progression order, and
in most cases the earlier jp2-packets are more important
than the later ones.
In default priority mapping, the priority value is defined as the
jp2-packet sequence number, in which the first jp2-packet in a
tile MUST be assigned the value 1. For every successive jp2-packet
this number is incremented by one. When the maximum number (=255)
is reached, the number remains at 255. The jp2-packet sequence
number is also hinted by Nsop of the SOP marker segment (Annex
A.8.1 [1]) in the JPEG 2000 codestream.
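The default priority mapping reduces to a saturating counter. A
minimal sketch, in which the function name and the 0-based index
argument are assumptions of this example:

```python
def default_priority(packet_index_in_tile):
    """Default priority of a jp2-packet, given its 0-based index in its tile."""
    return min(packet_index_in_tile + 1, 255)
```

The first jp2-packet of a tile (index 0) gets priority 1, and
every jp2-packet from index 254 onward gets priority 255.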
6.1.2 User-defined priority table
The user-defined priority table is freely defined by users, but
priority value 0 MUST be used for the headers (the main header and
tile-part headers).
For example, in an LRCP-order codestream with 3 layers and 3
resolutions, the user-defined priority table can be defined as
below (the format is not significant). Four priority levels are
defined in this example.
Priority 1: L=0, R=0, C=any, P=any
Priority 2: L=0, R=1-2, C=any, P=any
Priority 3: L=1, R=any, C=any, P=any
Priority 4: L=2, R=any, C=any, P=any
As another example, the resolution-based priority table can be
defined as below:
Priority 1: R=0, L=0, C=any, P=any
Priority 2: R=0, L=1-2, C=any, P=any
Priority 3: R=1, L=any, C=any, P=any
Priority 4: R=2, L=any, C=any, P=any
As another example, the component-based priority table can be
defined as below:
Priority 1: C=0, L=0, R=0, P=any
Priority 2: C=0, L=0, R=any, P=any
C=0, L=any, R=0, P=any
Priority 3: C=1-2, L=any, R=any, P=any
To change the priority-mapping table, a new priority-mapping table
must be sent from the sender to the receiver as needed.
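Since the table format is not significant, one possible in-memory
representation of the first (layer-based) example table is shown
below. The rule encoding, the ANY wildcard, and the function name
are all assumptions chosen for this sketch.

```python
ANY = None  # wildcard: matches every value on that axis

# Each rule: (priority, layers, resolutions, components, precincts).
# Rules are tried in order and the first match wins.
LAYER_BASED_TABLE = [
    (1, range(0, 1), range(0, 1), ANY, ANY),
    (2, range(0, 1), range(1, 3), ANY, ANY),
    (3, range(1, 2), ANY, ANY, ANY),
    (4, range(2, 3), ANY, ANY, ANY),
]


def lookup_priority(table, layer, resolution, component, precinct):
    """Return the priority of a jp2-packet, or None if no rule matches."""
    coordinates = (layer, resolution, component, precinct)
    for priority, *rule in table:
        if all(allowed is ANY or value in allowed
               for allowed, value in zip(rule, coordinates)):
            return priority
    return None
```

Priority value 0 does not appear in the table because it is
reserved for the main header and tile-part headers.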
6.2 Sender's Action
A priority value is given in accordance with the priority mapping
table. If multiple jp2-packets are packed into the same RTP
packet, the lowest priority value among them (that is, the
highest priority) is set.
Accordingly, a sender can transmit each priority using a separate
RTP session. For example, in layered multicast a sender
can transmit each priority through its own multicast group.
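The sender's choice of the priority field for one RTP packet can
be sketched as follows. The function and parameter names are
hypothetical; the rule that headers force the value 0 is taken
from Section 6.

```python
HEADER_PRIORITY = 0  # reserved for the main header and tile-part headers


def rtp_packet_priority(jp2_packet_priorities, contains_header=False):
    """Priority value for an RTP packet carrying the given jp2-packets."""
    if contains_header:
        return HEADER_PRIORITY
    return min(jp2_packet_priorities)
```

A packet that carries jp2-packets with priorities 3, 1, and 2 is
therefore sent with priority value 1.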
6.3 Receiver's Action
Progressive transmission that allows images to be reconstructed
with increasing pixel accuracy or spatial resolution is essential
for many applications. This feature allows the reconstruction of
images with different resolutions and pixel accuracy, as needed or
desired, for different target devices. The image architecture
provides for the efficient delivery of image data in many
applications such as client/server applications. The receiver
should decode packets above a certain priority to obtain maximum
performance depending on the receiver's platform.
The receiver can determine on its own (using or not using the
mapping table or other variables) which priority values of RTP
packets it should decode.
For example, when a less powerful CPU is used or the terminal has
only a low-resolution display, decoding only RTP packets whose
priority value is below a certain threshold permits obtaining
optimal performance.
If any high-priority RTP packet is not received when a packet loss
occurs, frame(s) can be skipped because no significant visual loss
may be perceived even if decoding can be successfully performed.
When the priority value cannot be interpreted or is unexpected,
a receiver MUST ignore the priority field of this RTP packet.
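A receiver that limits decoding by priority could filter packets
as in the sketch below. The function name, the pair
representation of a packet, and the inclusive threshold are
assumptions of this example; how a receiver picks the threshold
is left to the implementation.

```python
def select_packets(packets, max_priority_value):
    """Keep the RTP packets whose priority value is at most the threshold.

    `packets` is an iterable of (priority, payload) pairs. Packets with
    priority 0 (headers) always pass because the threshold is non-negative.
    """
    if max_priority_value < 0:
        raise ValueError("max_priority_value must be non-negative")
    return [(priority, payload) for priority, payload in packets
            if priority <= max_priority_value]
```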
7. JPEG 2000 main header compensation
The JPEG 2000 main header has various encoding parameters. A
decoder decodes the JPEG 2000 codestream by using the parameters
described in the JPEG 2000 main header. If an RTP packet carrying
the JPEG 2000 main header is lost, the corresponding JPEG 2000
codestream cannot be decoded, even if all of the following RTP
packets have been successfully received.
Recovery of a main header that has been lost is very simple
with this procedure. In the case of JPEG 2000 video, it is common
that encode parameters do not vary greatly from frame to
frame. Even if the RTP packet including the main header of a frame
has been dropped, decoding may be performed by using the
main header of the previous frame if this previous frame was
encoded with the same encode parameters.
The mh_id field of the payload header is used to recognize whether
the encoding parameters of the main header are the same as the
encoding parameters of the previous frame. The same mh_id value is
set in every RTP packet of the same frame. The mh_id and the encode
parameters are not associated with each other 1:1; rather, mh_id is
used to recognize whether the encode parameters of the previous
frame are the same or not.
The mh_id field value SHOULD be saved from previous frames to be
used to recover the current frame's main header, if lost. If the
mh_id of the current frame has the same value as the mh_id value
of the previous frame, the previous frame's main header SHOULD be
used to decode the current frame, in case of a lost header.
The sender MUST increment mh_id when parameters in the header
change and send a new main header accordingly.
The receiver MAY use the mh_id and MAY retain the header for such
compensation.
7.1 Sender processing
The sender must transmit RTP packets with the same mh_id value
unless the encode parameters are different from the previous
frame. The encode parameters are the fixed information marker
segment (SIZ marker) and functional marker segments (COD, COC,
RGN, QCD, QCC, and POC) specified in JPEG 2000 Part 1 Annex A [1].
If the encode parameters have been changed, the sender
transmitting RTP packets MUST increment the mh_id value by one.
The initial mh_id value should be 1. When the mh_id value exceeds
7, the value MUST return to 1 again.
If the mh_id field is set to 0, the receiver MUST NOT save the
main header and MUST NOT compensate for lost headers using the
above method.
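The sender-side update of mh_id described in this section can be
sketched as a small helper. The function and parameter names are
hypothetical; how the sender detects a parameter change (for
example, by comparing the marker segments listed above) is left
out of the sketch.

```python
def next_mh_id(current_mh_id, parameters_changed):
    """Return the mh_id for the next frame; values cycle through 1..7."""
    if not 1 <= current_mh_id <= 7:
        raise ValueError("mh_id 0 disables compensation; valid range is 1..7")
    if not parameters_changed:
        return current_mh_id
    return 1 if current_mh_id == 7 else current_mh_id + 1
```

Because the value returns to 1 rather than 0 after 7, an
incrementing sender never produces the value 0, which tells the
receiver not to save the main header.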
7.2 Receiver processing
When the receiver has received the main header correctly, the RTP
sequence number, the mh_id and main header should be saved except
when the mh_id value is 0. Only the last main header that was
received correctly SHOULD be saved. That is, if there has been a
saved main header, the previous one is deleted and the new main
header is saved.
When the main header is not received, the receiver compares the
current mh_id value (this mh_id can be known by receiving at least
one RTP packet) with the saved mh_id value. When the values are
the same, decoding may be performed by using the saved main
header.
Knowing whether the main header is lost or not may be difficult,
especially when the main header is fragmented.
In all cases, the main header will start with fragment offset = 0.
In the case of fragmented main header, only the first fragment
will have the fragment offset = 0.
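The receiver processing of this section can be sketched as a
cache that holds only the last correctly received main header.
The class and method names are assumptions of this example, and
the caller is assumed to have already reassembled a complete main
header (MHF and fragment offset handling are not shown).

```python
class MainHeaderCache:
    """Keeps only the most recent correctly received main header."""

    def __init__(self):
        self.sequence_number = None
        self.mh_id = None
        self.main_header = None

    def save(self, sequence_number, mh_id, main_header):
        """Store a complete main header; a header with mh_id 0 is not saved."""
        if mh_id == 0:
            return
        self.sequence_number = sequence_number
        self.mh_id = mh_id
        self.main_header = main_header

    def recover(self, current_mh_id):
        """Return the saved main header if its mh_id matches, else None."""
        if current_mh_id != 0 and current_mh_id == self.mh_id:
            return self.main_header
        return None
```

When recover() returns None, the frame whose main header was lost
cannot be decoded by this method.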
8. Optional Payload Header
An optional payload header is intended for sending application
specific data. When X bit in the payload header is set, an
optional payload header follows the payload header. The JPEG 2000
video stream payload comes after the optional payload header.
When the X bit in the payload header is set, a receiver MUST
process the optional payload header. An optional payload header
that a receiver cannot recognize MUST be skipped using its
specified length.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   optype    |X|            length             |               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
|                 option specific format .....                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fig. 7 : JPEG 2000 video stream optional payload header generic
format
Optype : 7 bits
Optype describes the optional payload header type. Optypes
0-63 are reserved as fixed, well-known mappings to be defined
by future revisions of this document. Optypes 64-127 can be
freely used for an application's own definition. Options that
become fully tested and widely used shall be registered with
the Internet Assigned Numbers Authority (IANA).
X : 1 bit
Further extension bit. This must be set to 1 if another
optional payload header follows this optional payload header;
otherwise it must be set to 0.
When the extension bit of the optional header is 1, another
optional payload header MUST come immediately after this
optional payload header.
length : 16 bits
This value must be the length of the optional header in bytes
(including the optype, X, and length fields). The receiver
shall perform processing for the optional header when the
extension bit of the JPEG 2000 payload header is 1.
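As an illustration of the generic format above, the following sketch
walks a chain of optional payload headers and skips each one by its
length field, which also covers optypes the receiver does not
recognize. It rests on two assumptions not stated explicitly in this
section: that optype occupies the most significant 7 bits of the
first octet with X in the least significant bit (following the bit
numbering of Fig. 7), and that length is carried in network byte
order. The function name parse_optional_headers is hypothetical.

```python
import struct
from typing import List, Tuple

HEADER_LEN = 3  # one octet for optype and X, plus the 16-bit length field


def parse_optional_headers(buf: bytes) -> Tuple[List[Tuple[int, bytes]], int]:
    """Parse optional payload headers at the start of buf.

    Call this only when the X bit of the JPEG 2000 payload header is set.
    Returns (headers, consumed): headers is a list of (optype, body) pairs,
    and consumed is the offset at which the JPEG 2000 payload begins.
    """
    headers = []
    offset = 0
    more = True
    while more:
        if len(buf) - offset < HEADER_LEN:
            raise ValueError("truncated optional payload header")
        # Assumption: network byte order for the 16-bit length field.
        first, length = struct.unpack_from("!BH", buf, offset)
        optype = first >> 1        # upper 7 bits of the first octet
        more = bool(first & 0x01)  # X: another optional header follows
        # length includes the optype, X and length fields themselves.
        if length < HEADER_LEN or offset + length > len(buf):
            raise ValueError("invalid optional payload header length")
        headers.append((optype, buf[offset + HEADER_LEN:offset + length]))
        offset += length  # skipping by length works for unknown optypes too
    return headers, offset
```

The caller can dispatch on the returned optype values and ignore
those it does not understand; the JPEG 2000 video stream payload
starts at the returned offset.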
9. Security Consideration
RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP
specification [3]. This implies that confidentiality of the media
streams is achieved by encryption. Because the data compression
used with this payload format is applied end-to-end, encryption
may be performed on the compressed data so there is no conflict
between the two operations.
A potential denial-of-service threat exists for data encodings
using compression techniques that have non-uniform receiver-end
computational load. The attacker can inject pathological
datagrams into the stream which are complex to decode and cause
the receiver to be overloaded. However, JPEG 2000 coding does not
exhibit any significant non-uniformity.
If QoS enhanced service is used, RTP receivers SHOULD monitor
packet loss to ensure that the service that was requested is
actually being delivered. If it is not, then they SHOULD assume
that they are receiving best-effort service and behave accordingly.
If best-effort service is being used, users of this payload format
MUST monitor packet loss to ensure that the packet loss rate is
within acceptable parameters. Packet loss is considered
acceptable if a TCP flow across the same network path,
experiencing the same network conditions, would achieve an average
throughput, measured on a reasonable timescale, that is not less
than the RTP flow is achieving. This condition can be satisfied
by implementing congestion control mechanisms to adapt the
transmission rate (or the number of layers subscribed for a
layered multicast session), or by arranging for a receiver to
leave the session if the loss rate is unacceptably high.
As with any IP-based protocol, in some circumstances a receiver
may be overloaded simply by the receipt of too many packets,
either desired or undesired. Network-layer authentication may be
used to discard packets from undesired sources, but the processing
cost of the authentication itself may be too high. In a multicast
environment, pruning of specific sources may be implemented in
future versions of IGMP [9] and in multicast routing protocols to
allow a receiver to select which sources are allowed to reach it.
10. IANA Consideration
10.1 MIME Registration
This document defines a new RTP payload name and associated MIME
type, jpeg2000. The MIME registration form for JPEG 2000 video
stream is enclosed below:
MIME media type name: video
MIME subtype name: jpeg2000
Required parameters: none
Optional parameters: none
Encoding considerations:
JPEG 2000 video stream can be transmitted
with RTP as specified in RFC XXXX.
Security considerations: see section 9 of RFC XXXX.
Interoperability considerations:
JPEG 2000 video stream is a sequence of JPEG 2000 still
images. An implementation compliant with [1] can decode and
attempt to display the encoded JPEG 2000 video stream.
Published specification: ISO/IEC 15444-1, RFC XXXX
Applications which use this media type:
video streaming and communication.
Additional information: none
Magic number(s): none
File extension(s): none
Macintosh File Type Code(s): none
Person & email address to contact for further information:
Eric Edwards
Email: Eric.Edwards@am.sony.com
Intended usage: COMMON
Author/Change controller:
Eric Edwards
Email: Eric.Edwards@am.sony.com
10.2 SDP Parameters
The MIME media type video/jpeg2000 string is mapped to fields in
the Session Description Protocol (SDP) [4] as follows:
o The media name in the "m=" line of SDP MUST be video.
o The encoding name in the "a=rtpmap" line of SDP MUST be jpeg2000
(the MIME subtype).
o The clock rate in the "a=rtpmap" line MUST be 90000.
Therefore, an example of media representation in SDP is as
follows:
m=video 49170/2 RTP/AVP 98
a=rtpmap:98 jpeg2000/90000
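For illustration, the mapping above can be produced programmatically
as in the following sketch. The function names, and the port and
payload type values used in the usage note, are assumptions taken
from the example above rather than requirements. The
ticks_per_frame helper simply divides the 90000 Hz clock rate by a
caller-supplied frame rate, which is a hypothetical input not
defined in this section.

```python
CLOCK_RATE = 90000  # clock rate required in the "a=rtpmap" line


def jpeg2000_sdp_lines(port: int, payload_type: int, num_ports: int = 1) -> list:
    """Build the "m=" and "a=rtpmap" lines for a video/jpeg2000 stream."""
    media_port = f"{port}/{num_ports}" if num_ports > 1 else str(port)
    return [
        f"m=video {media_port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} jpeg2000/{CLOCK_RATE}",
    ]


def ticks_per_frame(frame_rate: float) -> int:
    """RTP timestamp ticks between consecutive frames at the given frame rate."""
    return round(CLOCK_RATE / frame_rate)
```

Calling jpeg2000_sdp_lines(49170, 98, 2) yields the two lines of the
example above. With a frame rate of 30 frames per second,
ticks_per_frame(30) gives 3000 timestamp ticks per frame.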
11. Intellectual Property Right Statement
The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; neither does it represent
that it has made any effort to identify any such rights.
Information on the IETF's procedures with respect to rights in
standards-track and standards-related documentation can be found
in BCP-11. Copies of claims of rights made available for
publication and any assurances of licenses to be made available,
or the result of an attempt made to obtain a general license or
permission for the use of such proprietary rights by implementors
or users of this specification can be obtained from the IETF
Secretariat.
The IETF invites any interested party to bring to its attention
any copyrights, patents or patent applications, or other
proprietary rights which may cover technology that may be required
to practice this standard. Please address the information to the
IETF Executive Director.
The IETF has been notified of intellectual property rights claimed
in regard to some or all of the specification contained in this
document. For more information consult the online list of claimed
rights.
12. Informative Appendix - Recommended Practices
As the JPEG 2000 coding standard is highly flexible, many
different but compliant data streams can be produced and still be
labeled as a JPEG 2000 data stream.
The following is a set of recommendations set forth from our
experience in developing JPEG 2000 and this payload
specification. Implementations of this standard must handle all
possibilities mentioned in this specification. The following is a
listing of items an implementation could optimize.
Error Resilience Markers
The use of error resilience markers in the JPEG 2000 data
stream is highly recommended in all situations. Error
recovery with these markers helps the decoder and saves
external resources. Such markers include RESET, RESTART, and
ERTERM.
YPbPr Color space
The YPbPr color space provides the greatest amount of
compression in color with respect to the human visual
system. When used with JPEG 2000, the usage of this color
space can provide excellent visual results at extreme bit
rates.
Progression Ordering
JPEG 2000 offers many different ways to order the final code
stream to optimize the transfer with the presentation. The
most useful orderings in our use cases have been layer
progression and resolution progression.
Tiling and Packets
JPEG 2000 packets are formed regardless of the encoding
method. The encoder has little control over the size of these
JPEG 2000 packets, as they may be large or small.
Tiling splits the image up into smaller areas, and each is
encoded separately. With tiles, the JPEG 2000 packet sizes
are also reduced. When using tiling, almost all JPEG 2000
packets are an acceptable size (i.e., smaller than the MTU
size of most networks).
13. References
Normative References
[1] ISO/IEC JTC1/SC29, ISO/IEC 15444-1 "Information technology -
JPEG 2000 image coding system - Part 1: Core coding system",
July 2002.
[2] S. Bradner, "Key words for use in RFCs to Indicate Requirement
Levels", BCP14, RFC2119, March 1997.
[3] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson,
"RTP: A Transport Protocol for Real Time Applications", RFC
1889, January 1996.
[4] M. Handley and V. Jacobson, "SDP: Session Description
Protocol", RFC 2327, April 1998.
Informative References
[5] ISO/IEC JTC1/SC29/WG1, "JPEG2000 Part I Final Committee Draft
Version 1.0", http://www.jpeg.org/public/fcd15444-1.pdf, March
2000.
[6] ISO/IEC JTC1/SC29/WG1, "Motion JPEG 2000 Final Committee Draft
1.0", http://www.jpeg.org/public/fcd15444-3.doc, March, 2001.
[7] ISO/IEC JTC1/SC29/WG1, "JPEG2000 requirements and profiles
version 6.3", draft in progress,
http://www.jpeg.org/public/wg1n1803.pdf
[8] Diego Santa-Cruz, Touradj Ebrahimi, Joel Askelof, Mathias
Larsson and Charilaos Christopoulos, "JPEG 2000 still image
coding versus other standards", In Proc. of SPIE's 45th annual
meeting, Application of Digital Image Processing XXIII,
vol.4115, pp.446-454, July 2000.
[9] Deering, S., "Host Extensions for IP Multicasting", STD 5,
RFC 1112, August 1989.
14. Authors' Addresses
Eric Edwards
Sony Corporation
Media Processing Division
Platform Technology Center of America
3300 Zanker Road, MD: SJ2C4
San Jose, CA 95134
Phone: +1 408 955 6462
Fax: +1 408 955 5724
Email: Eric.Edwards@am.sony.com
Satoshi Futemma/Nobuyoshi Tomita/Eisaburo Itakura
Sony Corporation
6-7-35 Kitashinagawa Shinagawa-ku
Tokyo 141-0001 JAPAN
Phone: +81 3 5448 3096
Fax: +81 3 5448 4622
Email: {satosi-f|n-tomita|itakura}@sm.sony.co.jp
15. Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise
explain it or assist in its implementation may be prepared, copied,
published and distributed, in whole or in part, without
restriction of any kind, provided that the above copyright notice
and this paragraph are included on all such copies and derivative
works. However, this document itself may not be modified in any
way, such as by removing the copyright notice or references to the
Internet Society or other Internet organizations, except as needed
for the purpose of developing Internet standards in which case the
procedures for copyrights defined in the Internet Standards
process must be followed, or as required to translate it into
languages other than English.
The limited permissions granted above are perpetual and will not
be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on
an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.