Internet Engineering Task Force                G. Liebl, T. Stockhammer
Internet Draft                          LNT, Munich Univ. of Technology
Document: draft-ietf-avt-uxp-00.txt               M. Wagner, J. Pandel,
February 2001                           G. Baese, M. Nguyen, F. Burkert
Expires: August 2001                                 Siemens AG, Munich
An RTP Payload Format for Erasure-Resilient Transmission of Progressive
Multimedia Streams
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts. Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
1. Abstract
This document specifies an efficient way to ensure erasure-resilient
transmission of progressively encoded multimedia sources via RTP
using Reed-Solomon codes. The level of erasure protection can be
explicitly adapted to the importance of the respective parts in the
source stream, thus allowing a graceful degradation of application
quality with increasing packet loss rate on the network. Hence, this
type of unequal erasure protection (UXP) scheme is intended to cope
with the rapidly varying channel conditions on wireless access links
to the Internet backbone. Nevertheless, backward compatibility to
currently standardized non-progressive multimedia codecs is ensured,
since equal erasure protection (EXP) represents a subset of generic
UXP. By defining a comparably simple payload format, the proposed
scheme can be easily integrated into the existing framework for RTP.
Liebl, Stockhammer, Wagner, Pandel, Baese, Nguyen, Burkert [Page 1]
Internet Draft Unequal Erasure Protection February 2001
2. Conventions used in this document
The following terms are used throughout this document:
1.) Message block: a higher layer transport unit (e.g. an IP
packet), that enters/leaves the segmentation/reassembly stage at the
interface to wireless data link layers.
2.) Segment: denotes a link layer transport unit.
3.) CRC: Cyclic Redundancy Check, usually added to transport units
at the sender to detect the existence of erroneous bits in a
transport unit at the receiver.
4.) Segmentation/Reassembly Process: If the size of the transport
units at the link layer is smaller than that at the upper layers,
message blocks have to be split up into several parts, i.e.
segments, which are then transmitted successively over the link. If
no segment is lost, the original message block can be restored at
the receiving entity (reassembly).
5.) Quality-of-service: application-dependent criterion to define a
certain desired operation point.
6.) Codec: denotes a functional pair consisting of a source encoding
unit at the sender and a corresponding source decoding unit at the
receiver; usually standardized for different multimedia applications
like audio or video.
7.) Progressive source coding: results in a stream of coded data
whose distinct elements are of different importance to the
reconstruction process at the decoder. Elements are commonly ordered
from highest to lowest importance, where later elements depend on
earlier ones.
8.) Reed-Solomon (RS) code: belongs to the class of linear nonbinary
block codes, and is uniquely specified by the block length n, the
number of parity symbols t, and the symbol alphabet.
9.) n: is a variable, which denotes both the block length of a RS
codeword, and the number of columns in a TB (see 15).
10.) k: is a variable, which denotes the number of information
symbols in a RS codeword.
11.) t: is a variable, which denotes the number of parity symbols in
a RS codeword.
12.) Erasure: When a packet is lost during transmission, an erasure
is said to have happened. Since the position of the erased packet in
a sequence is usually known, a corresponding erasure marker can be
set at the receiving entity.
13.) Base layer: comprises the first and most important elements in
a progressively encoded bitstream, without which all subsequent
information is useless.
14.) Enhancement layer: comprises one or more sets of the less
important subsequent elements in a progressively encoded bitstream.
A specific enhancement layer can be decoded, if and only if the base
layer and all previous enhancement layer data (of higher importance)
is available.
15.) Transmission block (TB): denotes a memory array of L rows and n
columns. Each row of a TB represents a RS codeword, whereas each
column represents the payload of an RTP packet.
16.) L: is a variable, which denotes both the number of rows in a TB
and the payload length of an RTP packet in bytes.
17.) Unequal erasure protection (UXP): denotes a specific strategy
which varies the level of erasure protection across a TB according
to a given redundancy profile.
18.) Equal erasure protection (EXP): is a subset of UXP, for which
the level of erasure protection is kept constant across a TB.
19.) Redundancy profile: describes the size of the different erasure
protection classes in a TB, i.e. the number of rows (codewords) per
class.
20.) Erasure protection class: contains a set of rows (codewords) of
the TB with the same erasure correction capability.
21.) i: is a variable, which denotes the number of parity bytes for
each row in erasure protection class i.
22.) CA_i: is a variable, which denotes the set of rows contained in
erasure protection class i.
23.) A_i: is a variable, which denotes the total number of rows
contained in erasure protection class i, i.e. the cardinality of
CA_i.
24.) T: is a variable, which denotes the number of parity bytes for
each row in the highest erasure protection class (with respect to
application data) in a TB.
25.) AV: denotes the erasure protection vector of length (T+1) used
to describe a certain redundancy profile.
26.) DP: descriptor used for in-band signaling of the erasure
protection vector.
27.) Stuffing: insertion of predefined symbol patterns. Stuffing is
performed, if the information part of an erasure protection class
cannot be filled completely with (application) payload data.
28.) Interleaver: performs the spreading of a codeword, i.e. a row
in the TB, over n successive packets, such that the probability of
an erasure burst in a codeword is kept small.
29.) UXP header: is the additional header information contained in
each RTP packet after UXP has been applied.
30.) X: denotes a currently not used extension field of 1 bit in the
UXP header.
31.) P: is a variable which denotes the number of parity symbols per
row used to protect the in-band signaling of the redundancy profile.
32.) ceil(.): denotes the ceiling function, i.e. rounding up to the
next integer.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 [].
3. Introduction
Due to the increasing popularity of high-quality multimedia
applications over the Internet and the high level of public
acceptance of existing mobile communication systems, there is a
strong demand for a future combination of these two techniques: One
possible scenario consists of an integrated communication
environment, where users can set up multimedia connections anytime
and anywhere via radio access links to the Internet.
For this reason, several packet-oriented transmission modes have
been proposed for next generation wireless standards like EGPRS
(Enhanced General Packet Radio Service) or UMTS (Universal Mobile
Telecommunications System), which are mostly based on the same
principle: Long message blocks, i.e. IP packets, that enter the
wireless part of the network are split up into segments of desired
length, which can be multiplexed onto link layer packets of fixed
size. The latter are then transmitted sequentially over the wireless
link, reassembled, and passed on to the next network element.
However, compared to the rather benign channel characteristics on
today's fixed networks, wireless links suffer from severe fading,
noise, and interference conditions in general, thus resulting in a
comparably high residual bit error rate after detection and
decoding. By use of efficient CRC-mechanisms, these bit errors are
usually detected with very high probability, and every corrupted
segment, i.e. one that contains at least one erroneous bit, is
discarded to prevent error propagation through the network. But if
only one single segment is missing at the reassembly stage, the
upper layer IP packet cannot be reconstructed anymore. The result is
a significant increase in packet loss rate at IP level.
Since most multimedia applications can only recover from a very
limited number of lost message blocks, it is vitally necessary to
keep packet loss at IP level within a certain acceptable range
depending on the individual quality-of-service requirements.
However, due to the delay constraints typically imposed by most
audio or video codecs, the use of ARQ-schemes is often prohibited
both at link level and at transport level. In addition,
retransmission strategies cannot be applied to any broadcast or
multicast scenarios. Thus, forward erasure correction strategies
have to be considered, which provide a simple means to reconstruct
the content of lost packets at the receiver from the redundancy that
has been spread out over a certain number of subsequent packets.
There already exist some previous studies and proposals regarding
erasure-resilient packet transmission, of which the most important
one with respect to RTP is described in [1]. Since most of them are
based on the assumption that all parts in a message block are
equally important to the receiver, i.e. the respective application
cannot operate on partly complete blocks, they were optimized with
respect to assigning equal erasure protection over the whole message
block. However, recent developments both in audio and video coding
have introduced the notion of progressively encoded source streams,
for which unequal erasure protection strategies seem to be more
promising, as will be explained in more detail below. Although
the scheme defined in [1] is in principle capable of supporting some
kind of unequal erasure protection, possible implementations seem to
be quite complex with respect to the gain in performance. Finally,
in [1] it is assumed that subsequent RTP packets can have variable
length, which would cause significant segmentation overhead at the
link layer of almost all wireless systems.
This document defines a payload format for RTP, such that different
elements in a progressively encoded multimedia stream can be
protected against packet erasures according to their respective
quality-of-service requirement. The general principle, including the
use of Reed-Solomon codes together with an appropriate interleaving
scheme for adding redundancy, follows the ideas already presented in
[2], but allows for finer granularity in the structure of the
progressive source stream. The proposed scheme is generic in that it
(1) is independent of the type of multimedia stream, be it audio or
video, and (2) can be adapted to varying transmission quality very
quickly by use of in-band signaling.
4. Reed-Solomon Codes
Reed-Solomon (RS) codes are a special class of linear nonbinary
block codes, which are known to offer maximum erasure correction
capability with minimum amount of redundancy.
An arbitrary t-erasure-correcting (n,k) RS code defined over Galois
field GF(q) has the following parameters [3]:
- Block length: n=q-1
- No. of information symbols in a codeword: k
- No. of parity-check symbols in a codeword: n-k=t
- Minimum distance: d=t+1
In what follows, only systematic RS codes over GF(2^8) shall be
considered, i.e. the symbols of interest can be directly related to
a tuple of eight bits, which is commonly called a byte in packet
transmission. The principal structure of a codeword is shown in
Fig. 1.
By shortening the initial (n=255,n-t) RS code, any desired (n',n'-t)
RS code for a given erasure correction capability t may be obtained.

                       block of n bytes
                      <----------------->
                      +-+-+-+-+-+-+-+-+-+
                      |&|&|&|&|&|&|&|*|*|
                      +-+-+-+-+-+-+-+-+-+
                      <------------><--->
                          k=n-t       t
                       (&:info)  (*:parity)

            Fig. 1: Structure of a systematic RS codeword
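
The erasure correction property stated above can be illustrated with a
small sketch. It is not the encoder intended by this document: for
brevity it uses an evaluation-form Reed-Solomon construction (the k
info bytes fix a polynomial of degree less than k, and the t parity
bytes are further evaluations of it), and the primitive polynomial
0x11D as well as all function names are assumptions made for this
example. It still shows the property that matters here: a systematic
(n,k) codeword over GF(2^8) can be restored from any k of its n bytes.

```python
# GF(2^8) log/antilog tables. The primitive polynomial 0x11D is an
# assumption; the draft does not name one.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by b (b must be nonzero)."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return EXP[LOG[a] - LOG[b] + 255]


def interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (position, value) points (Lagrange)."""
    result = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = gf_mul(num, x ^ xm)    # subtraction is XOR
                den = gf_mul(den, xj ^ xm)
        result ^= gf_mul(yj, gf_div(num, den))
    return result


def rs_encode(info, t):
    """Systematic encoding: k info bytes followed by t parity bytes."""
    k = len(info)
    if k + t > 256:
        raise ValueError("codeword too long for GF(2^8)")
    points = list(enumerate(info))
    return list(info) + [interpolate(points, x) for x in range(k, k + t)]


def rs_decode_erasures(received, k):
    """received holds byte values, with None marking erased positions.
    Returns the k info bytes, or None if more than n-k bytes are erased."""
    known = [(x, y) for x, y in enumerate(received) if y is not None]
    if len(known) < k:
        return None
    return [interpolate(known[:k], x) for x in range(k)]
```

With the layout of Fig. 1 (k=7, t=2), any two erased bytes can be
restored, while a third erasure makes the codeword undecodable.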
5. Progressive Source Coding
If the output of a multimedia codec, be it audio or video, is said
to be progressive, the encoded bitstream must consist of several
distinct elements, often organized in separate layers. The latter
shall be defined via their relative importance with respect to the
quality of the reconstruction process at the receiver. Hence, there
exists at least one layer, often called base layer, without which
reconstruction fails entirely, whereas all the other layers, often
called enhancement layers, just help to continually improve the
quality. Consequently, the different layers shall be mapped on the
bitstream in decreasing order of importance, i.e. the base layer
data is followed by the various enhancement layers.
An example can be found in the fine granular scalability modes which
have been proposed to various standardization bodies like MPEG-4 [4]
or ITU (H.26L) [5], where the resolution of the scaling process in
the progressive source encoder is as low as one symbol in the
enhancement layer.
From the above definition, it is quite obvious that the most
important base layer data must be protected as strongly as possible
against packet loss during transmission. However, the protection of
the enhancement layers could be continually lowered, since a loss at
this stage has only minor consequences for the reconstruction
process. Thus, by using a suitable unequal erasure protection
strategy across the message block, which contains the progressively
encoded source stream, the overhead due to redundancy spent per
block is reduced. Furthermore, if channel conditions get worse
during transmission, only more and more enhancement layers are lost,
i.e. a graceful degradation in application quality at the receiver
is achieved [6].
6. General Structure of UXP schemes
Fig. 1 already illustrated the structure of a systematic codeword,
which shall be represented by a single row and n successive columns
that contain the information and the parity bytes. This structure
shall now be extended by forming a transmission block (TB)
consisting of L codewords of length n bytes each, which amounts to a
total of L rows and n columns [7]: Each column shall represent the
payload of an RTP packet, i.e. the whole data of a TB is transmitted
via a sequence of n RTP packets all carrying a payload of length L
bytes.
The value of L should be chosen in such a way that the whole length
of the resulting IP packet (i.e. RTP payload plus sum of UXP, RTP,
UDP, and IP header) equals a multiple of the segment size on the
wireless link to avoid stuffing at the data link layer.
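
As a rough sketch of this sizing rule, the payload length L can be
derived from the link-layer segment size. The header sizes below are
assumptions (IPv4 without options, a fixed 12-byte RTP header without
CSRC entries or extension, no header compression); only the 2-byte UXP
header is taken from this document. The function name and the segment
size in the usage note are hypothetical.

```python
IPV4_HEADER = 20   # bytes, IPv4 without options (assumption)
UDP_HEADER = 8     # bytes
RTP_HEADER = 12    # bytes, fixed RTP header only (assumption)
UXP_HEADER = 2     # bytes, see Section 7.2


def payload_length(segment_size, segments_per_packet,
                   ip_header=IPV4_HEADER):
    """Return L such that the whole IP packet fills exactly
    segments_per_packet link-layer segments."""
    overhead = ip_header + UDP_HEADER + RTP_HEADER + UXP_HEADER
    length = segments_per_packet * segment_size - overhead
    if length <= 0:
        raise ValueError("packet too small to carry any payload")
    return length
```

For a hypothetical segment size of 50 bytes and two segments per
packet, this gives L=58, so that the 100-byte IP packet needs no
stuffing at the data link layer.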
As depicted in Fig. 2, the rows of the block shall be partitioned
into T+1 different classes CA_i, where i=0...T, such that each class
contains exactly A_i=|CA_i| consecutive rows of the matrix, where
the A_i have to satisfy the following relationship:
A_0+A_1+...+A_T=L

                        Transmission Block (TB)

                                    T
                                <------->
                   /\ +-+-+-+-+-+-+-+-+-+ /\
                   |  |&|&|&|&|&|*|*|*|*| |
                   |  +-+-+-+-+-+-+-+-+-+ |  A_T=3
                   |  |&|&|&|&|&|*|*|*|*| |
                   |  +-+-+-+-+-+-+-+-+-+ |
       L bytes     |  |&|&|&|&|&|*|*|*|*| \/
       payload     |  +-+-+-+-+-+-+-+-+-+ /\
       per packet  |  |%|%|%|%|%|%|*|*|*| |  A_(T-1)=1
                   |  +-+-+-+-+-+-+-+-+-+ \/
                   |  |$|$|$|$|$|$|$|*|*| .
                   |  +-+-+-+-+-+-+-+-+-+ .
                   |  |º|º|º|º|º|º|º|º|*| .
                   |  +-+-+-+-+-+-+-+-+-+ /\
                   |  |#|#|#|#|#|#|#|#|#| |  A_0=1
                   \/ +-+-+-+-+-+-+-+-+-+ \/
                      <----------------->
                           n packets

   &,%,$,º,# : info bytes belonging to a certain source coding layer
               in decreasing order of importance
   *         : parity bytes gained from Reed-Solomon coding

   Fig. 2: General structure for coding with unequal erasure protection

Furthermore, all rows in a particular class CA_i shall contain
exactly the same number of parity bytes, which is equal to the index
i of the class. For each row in a certain class CA_i, the same (n,n-
i) RS code shall be applied.
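
The bookkeeping implied by these two constraints can be sketched as
follows. The function names are hypothetical; the profile is written as
a list (A_0, A_1, ..., A_T), so that a row in class CA_i carries n-i
info bytes and i parity bytes.

```python
def check_profile(av, L):
    """Verify that A_0 + A_1 + ... + A_T equals the number of rows L."""
    if any(a < 0 for a in av):
        raise ValueError("class sizes must not be negative")
    if sum(av) != L:
        raise ValueError("class sizes do not add up to L")


def info_capacity(av, n):
    """Number of info bytes that fit into a TB with n columns."""
    if len(av) - 1 > n:
        raise ValueError("T must not exceed the block length n")
    return sum(a * (n - i) for i, a in enumerate(av))


def parity_overhead(av):
    """Number of parity bytes spent in the TB."""
    return sum(a * i for i, a in enumerate(av))
```

For the profile drawn in Fig. 2 (n=9, T=4, AV=(1,1,1,1,3), L=7), the
TB carries 45 info bytes and 18 parity bytes. Protecting all seven
rows like the base layer would leave room for only 35 info bytes.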
As can be observed from Fig. 2, class CA_T contains the largest
number of parity bytes per row, i.e. offers the highest erasure
protection capability in the block. Consequently, all base layer
data must be assigned to class CA_T, where the value of T should be
chosen according to the desired outage threshold of the base layer
given a certain packet erasure rate on the link.
All other classes CA_(T-1)...CA_0 shall be sequentially filled with
enhancement layer data in decreasing order of importance, where the
optimal choice for the size of each class (0 or more rows), i.e. the
structure of the redundancy profile, should depend on the quality-
of-service requirements for the various layers.
The following set of rules contains a compact description of all the
operations that must be performed for each transmission block:
1.) The total number of columns n of the TB shall be chosen
according to the actual delay constraints of the application.
2.) The maximum erasure correction capability T should be chosen
according to the desired outage threshold of the base layer given
the actual packet erasure rate on the link.
3.) The redundancy profile for the rest of the TB should depend on
the size and number of the various layers in the progressive source
stream, as well as the desired probability of successful decoding
for each of them (quality-of-service requirement).
4.) Beginning with the base layer, each layer in the progressive
source stream shall be assigned to exactly one class CA_T...CA_0 in
decreasing order of importance.
5.) For each nonempty class CA_i, i=T...0, the following steps have
to be performed:
a) All rows of this specific class shall be filled from left to
right and top to bottom with data bytes of the corresponding layer.
If the size of the layer is less than the available space for this
class, the empty positions may be filled with the first bytes of the
next layer (in decreasing order of importance), such that there is
no overhead due to stuffing.
b) For each row in the class, the required i parity-check bytes are
computed using the same (n,n-i) RS code, and filled into the empty
positions at the end of the row. Thus, every row in the class
constitutes a valid codeword of the chosen RS code.
6.) If the total length of the progressively encoded source stream
exceeds the number of available info byte positions in the TB for
the chosen redundancy profile, the final bytes of the least
important enhancement layer shall be cut off until the remaining
parts fit completely into the TB.
7.) If the total length of the progressively encoded source stream
is less than the number of available info byte positions in the TB
for the chosen redundancy profile, byte-stuffing shall be applied to
the empty positions in the last class such that the stuffing value
does not influence the performance of the multimedia decoder at the
receiver.
8.) After having filled the whole TB with information and parity
bytes, each column is read out byte-wise from top to bottom and
mapped onto the payload part of one and only one RTP packet.
9.) The n resulting RTP packets shall be transmitted successively to
the remote host, starting with the leftmost one.
10.) At the corresponding protocol entity at the remote host, the
payload of all successfully received RTP packets belonging to the
same sending TB shall be filled into a similar receiving TB column-
wise from top to bottom and left to right.
11.) For every erased packet of a received TB, the respective column
in the TB shall be filled with a suitable erasure marker.
12.) Given the redundancy profile assigned by the sender, for each
row a decoding operation shall be performed by applying any suitable
algorithm for erasure decoding.
13.) For all rows for which the decoding operation has been
successful, the reconstructed data bytes are read out from left to
right and top to bottom, and appended to the reconstructed version
of the progressive data stream.
14.) For all rows for which the decoding operation has not been
successful, a sufficient number of suitable dummy symbols may be
added to the reconstructed data stream to inform the source decoder
about the missing symbols.
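
The row filling of rules 5 and 7, the column-wise readout of rule 8,
and the receiver-side reassembly of rules 10 and 11 can be sketched as
follows. This is a layout illustration only: all names are
hypothetical, the parity computation is passed in as a function, and
the stub used here is NOT a Reed-Solomon encoder (it merely reserves
the i parity positions of each row).

```python
def build_tb(stream, av, n, parity_fn, stuffing=0x00):
    """Fill a TB row by row, starting with class CA_T (rule 5).
    Returns a list of L rows, each a list of n byte values. Source
    bytes beyond the TB capacity are cut off (rule 6)."""
    data = list(stream)
    rows, pos = [], 0
    for i in range(len(av) - 1, -1, -1):           # CA_T ... CA_0
        for _ in range(av[i]):
            info = data[pos:pos + n - i]
            pos += n - i
            info += [stuffing] * (n - i - len(info))   # rule 7
            rows.append(info + list(parity_fn(info, i)))
    return rows


def to_packets(tb):
    """Rule 8: every column becomes the payload of one RTP packet."""
    return [bytes(column) for column in zip(*tb)]


def from_packets(received, n, L):
    """Rules 10 and 11: received maps column index to payload. A
    missing column is filled with the erasure marker None."""
    columns = [list(received[c]) if c in received else [None] * L
               for c in range(n)]
    return [list(row) for row in zip(*columns)]


def stub_parity(info, i):
    """Placeholder for an (n,n-i) RS encoder: i zero bytes."""
    return [0] * i
```

Dropping one packet erases the same column position in every row of
the receiving TB, which is the interleaving effect the rules rely on.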
One can easily realize that the above rules describe an interleaver,
i.e. at the sender a single codeword of a TB is spread out over n
successive packets. Thus, each codeword of a transmitted TB
experiences the same number of erasures at exactly the same
positions.
Two important conclusions can be drawn from this:
a) Since the same RS code is applied to all rows contained in a
specific class, either all of them can be correctly decoded or not.
Hence, there exist no partly decodable classes at the receiver.
b) If decoding is successful for a certain class CA_i, all the
classes CA_(i+1)...CA_T can also be decoded, since they are
protected by at least one more parity byte per row. Together with
rule 4, this ensures that whenever an enhancement layer is
decodable, the base layer it depends on can also be reconstructed.
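
Both conclusions follow from a single comparison, sketched here with
hypothetical names: since every row of a TB sees the same number e of
erased columns, a class CA_i is decodable exactly when i >= e.

```python
def decodable_classes(av, lost_packets):
    """Indices i of all nonempty classes CA_i that can still be
    decoded when lost_packets of the n packets of a TB are erased."""
    return [i for i, a in enumerate(av) if a > 0 and i >= lost_packets]


def recovered_info_bytes(av, n, lost_packets):
    """Info bytes available to the source decoder after decoding."""
    return sum(av[i] * (n - i) for i in decodable_classes(av, lost_packets))
```

For the profile of Fig. 2 (n=9, AV=(1,1,1,1,3)), two lost packets
leave classes CA_2, CA_3, and CA_4 decodable, i.e. 28 of the 45 info
bytes, while more than four lost packets leave nothing decodable.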
Given the maximum erasure protection value T, the redundancy profile
for a TB of size (L x n) shall be denoted by a so-called erasure
protection vector AV of length (T+1), where
AV:=(A_0,A_1,...,A_(T-1),A_T)
From the above definition, it is easy to realize that the trivial
cases of no erasure protection and EXP are a subset of UXP:
a) no erasure protection at all: all application data is mapped onto
class CA_0, i.e. AV=(L,0,0,...,0).
b) EXP: all application data is mapped onto class CA_T, i.e.
AV=(0,0,...,0,A_T=L).
Hence, backward compatibility to currently standardized non-
progressive multimedia codecs is definitely achieved.
7. RTP payload structure
For every packet whose payload results from reading out a column of
the TB, the RTP header must be followed by a UXP header.
7.1. Specific settings in the RTP header
The timestamp of each RTP packet resulting from reading out a TB is
set to the time instant when the first byte of the progressive
source data stream has been written into the TB. This results in the
TS value being the same for all RTP packets belonging to a specific
TB.
The payload type is of dynamic type, and obtained through out-of-
band signaling similar to [1]. The signaling protocol must establish
a payload length to be associated with the payload type value. End
systems that cannot recognize a payload type must discard packets
carrying it.
All other fields in the RTP header are set to those values proposed
for regular multimedia transmission using the same source codecs,
but no erasure protection scheme enabled.
7.2. Structure of the UXP header
The UXP header shall consist of 2 octets, and is shown in Fig. 3:

    0                   1 1 1 1 1 1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X|  block PT   | block length n|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Fig. 3: Proposed UXP header

The fields in the header shall be defined as follows:
- X (bit 0): extension bit, reserved for future enhancements,
  currently not in use; default value: 0
- block PT (bits 1-7): regular RTP payload type to indicate the
  primary source encoding of the media
- block length n (bits 8-15): indicates the total number of RTP
  packets resulting from one TB (which equals the number of columns
  of the TB)
Based on the RTP sequence number and the repetition of the block
length n in each UXP header, the receiving entity is able to
recognize both TB boundaries and the actual position of lost packets
in the TB. Furthermore, the specific choice of equal TS values for
all RTP packets belonging to a TB allows for overcoming possible
sequence number overflow.
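
A minimal sketch of packing and parsing the two header octets is given
below. It assumes the usual convention that bit 0 in Fig. 3 is the
most significant bit of the first octet; the function names and the
payload type value used in the usage note are hypothetical.

```python
import struct


def pack_uxp_header(block_pt, n, x=0):
    """Build the 2-octet UXP header: X (1 bit), block PT (7 bits),
    block length n (8 bits), in network byte order."""
    if x not in (0, 1):
        raise ValueError("X is a single bit")
    if not 0 <= block_pt <= 127:
        raise ValueError("block PT must fit into 7 bits")
    if not 0 <= n <= 255:
        raise ValueError("block length n must fit into 8 bits")
    return struct.pack("!H", (x << 15) | (block_pt << 8) | n)


def parse_uxp_header(data):
    """Split the first two octets of data into the header fields."""
    (word,) = struct.unpack("!H", data[:2])
    return {"x": word >> 15,
            "block_pt": (word >> 8) & 0x7F,
            "n": word & 0xFF}
```

For example, a block PT of 96 and a block length of n=20 yield the
octets 0x60 0x14.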
7.3. In-band signaling of the structure of the redundancy profile
To enable a dynamic adaptation to varying link conditions, the
actual redundancy profile used for a specific TB must be signaled to
the receiving entity. Since out-of-band signaling either results in
excessive additional control traffic, or prevents quick changes of
the profile between successive TBs, an in-band signaling procedure
is desired.
Since the decoding process cannot be applied to any of the erasure
protection classes without knowledge of the correct redundancy
profile, the profile has to be protected at least as strongly as the
base layer data against packet loss. Therefore, a new class CA_P is
added to the beginning of the TB, where the number of parity symbols
is by default set to the following value:
P=ceil(n/2)
Hence, up to 50% of the RTP packets can be lost before the
redundancy profile can no longer be recovered. This seems to be a
reasonable value for the lowest point of operation over a lossy
link. Alternatively, P may be explicitly signaled during session
setup by means of SDP or the H.245 protocol.
Consequently, since all other classes must have equal or less
erasure protection capability, the maximum allowable value for class
CA_T is now limited to T<=P.
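
The default value of P and the resulting bound on T can be written
down directly. The function names are hypothetical, and the range
check on n merely reflects the 8-bit block length field of the UXP
header.

```python
def default_descriptor_parity(n):
    """Default P = ceil(n/2), computed in integer arithmetic."""
    if not 1 <= n <= 255:
        raise ValueError("n must fit into the 8-bit block length field")
    return (n + 1) // 2


def check_max_protection(T, P):
    """Enforce T <= P, i.e. no class is protected more strongly than
    the descriptor row."""
    if T > P:
        raise ValueError("T must not exceed P")
    return T
```

For n=20 this gives P=10, and for the nine-column block of Fig. 4 it
gives P=5, which leaves n-P=4 positions for descriptors.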
The signaling of the erasure protection vector is accomplished by
means of descriptors. For each class CA_i with A_i>0, there is a
descriptor DP_i providing information about the size of class CA_i
(i.e. the value of A_i) and establishing a relationship between the
erasure protection of class CA_i and that of the first preceding
class CA_(i+j) with A_(i+j)>0, where j>0. A descriptor DP_i is
mapped onto one byte, which is sub-divided into two half-bytes (i.e.
the higher and the lower four bits). The first half-byte is of type
unsigned and contains the 4-bit representation of the decimal value
A_i. The second half-byte is of type signed and contains the
difference in erasure protection between class CA_i and class
CA_(i+j), i.e. the signed 4-bit representation of the decimal value
-j. The examples in this section imply a sign-and-magnitude
representation, e.g. -1 is encoded as 0x9. Note that the erasure
protection P and the size A_P=1 of class CA_P are fixed.
Thus, the data to be filled into class CA_P shall consist of a
sequence of descriptors, where the number of descriptors is given by
the number of protection classes CA_i, 0<=i<=T, with A_i>0. When the
number of necessary descriptors exceeds the n-P information
positions, the remaining descriptors are assigned to the next non-
empty class CA_i providing the highest erasure protection. If the
number of descriptors is less than n-P, however, empty positions in
class CA_P may be filled up with the first bytes of the base layer
to avoid stuffing.
The transition from descriptors to payload data need not be
signaled to the decoder, since it can be determined by the decoder
through evaluation of the decoded descriptors and the a-priori
knowledge of the length L of the transmission block TB.
Nevertheless, it can also be signaled explicitly by the otherwise
unused descriptor 0x00.
The complete structure of the TB is now depicted in Fig. 4.

                        Transmission Block (TB)

                                   P
                              <--------->
                   /\ +-+-+-+-+-+-+-+-+-+ /\
                   |  |?|?|?|?|*|*|*|*|*| |  A_P=1
                   |  +-+-+-+-+-+-+-+-+-+ \/
                   |  |&|&|&|&|&|*|*|*|*| /\
                   |  +-+-+-+-+-+-+-+-+-+ |  A_T=3
                   |  |&|&|&|&|&|*|*|*|*| |
                   |  +-+-+-+-+-+-+-+-+-+ |
       L bytes     |  |&|&|&|&|&|*|*|*|*| \/
       payload     |  +-+-+-+-+-+-+-+-+-+ /\
       per packet  |  |%|%|%|%|%|%|*|*|*| |  A_(T-1)=1
                   |  +-+-+-+-+-+-+-+-+-+ \/
                   |  |$|$|$|$|$|$|$|*|*| .
                   |  +-+-+-+-+-+-+-+-+-+ .
                   |  |º|º|º|º|º|º|º|º|*| .
                   |  +-+-+-+-+-+-+-+-+-+ /\
                   |  |#|#|#|#|#|#|#|#|#| |  A_0=1
                   \/ +-+-+-+-+-+-+-+-+-+ \/
                      <----------------->
                           n packets

   ?         : descriptors for in-band signaling of the redundancy
               profile
   &,%,$,º,# : info bytes belonging to a certain source coding layer
               in decreasing order of importance
   *         : parity bytes gained from Reed-Solomon coding

   Fig. 4: General structure for UXP with in-band signaling of the
           redundancy profile

The following simple example is meant to illustrate the idea behind
using descriptors: Let an erasure protection vector of length T+1=7
be given as follows:
AV=(A_0,A_1,...,A_5,A_6)=(7,0,2,2,0,3,10)
Hence, the length L of the TB (including one row for the
descriptors) is equal to 7+2+2+3+10+1=25 (rows/bytes). If the width
is assumed to be equal to 20 (columns/packets), then the erasure
protection of the descriptors is P=ceil(20/2)=10.
The corresponding sequence of descriptors can be written as
DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A),
where the values of the descriptors are given in hexadecimal
notation.
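
The descriptor values of this example can be reproduced with the
following sketch. It assumes that the signed half-byte uses a
sign-and-magnitude representation (sign in the most significant bit),
which is what the values above imply but is not stated explicitly. The
function names are hypothetical, and class sizes above 15 or
protection steps above 7 are simply rejected here.

```python
def to_sign_magnitude4(value):
    """Encode an integer in -7..7 as a 4-bit sign-and-magnitude
    nibble (assumed representation)."""
    if abs(value) > 7:
        raise ValueError("difference does not fit into 4 bits")
    return (0x8 | -value) if value < 0 else value


def encode_descriptors(av, P):
    """Build one descriptor byte per nonempty class, from CA_T down
    to CA_0. The high nibble holds A_i, the low nibble the change in
    protection relative to the preceding nonempty class (CA_P for the
    first descriptor)."""
    out = []
    previous = P
    for i in range(len(av) - 1, -1, -1):
        if av[i] == 0:
            continue
        if av[i] > 15:
            raise ValueError("A_i does not fit into 4 bits")
        out.append((av[i] << 4) | to_sign_magnitude4(i - previous))
        previous = i
    return bytes(out)
```

Calling encode_descriptors((7,0,2,2,0,3,10), 10) returns the five
bytes 0xAC, 0x39, 0x2A, 0x29, 0x7A listed above.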
Optional Concatenation of Transmission Blocks:
The following procedure may be applied if a single message block
would be too short to achieve an efficient mapping to a transmission
block with respect to the fixed payload length L and the desired
number of packets n. For example, intra-coded video frames (I-
frames) are usually much larger than the following predicted ones
(P-frames). In this case, a certain number z of successive small
message blocks should each be mapped to a transmission block with
length L(y) and width n, such that L(1)+L(2)+...+L(z)=L.
The resulting transmission blocks can then be easily concatenated to
form a super-TB of size L x n.
Since the second half-byte of the descriptors is of type signed, we
are able to signal both decreasing and increasing erasure protection
profiles within one single sequence of descriptors at the beginning
of the super-TB.
Again, we will give a simple example to illustrate this idea: Let
the erasure protection vectors for two concatenated TBs be given as
follows:
AV1=(A1_0,A1_1,...,A1_5,A1_6)=(0,0,2,2,0,3,10),
AV2=(A2_0,A2_1,...,A2_5,A2_6)=(0,0,2,2,0,3,10).
Hence, two single identical TBs will be concatenated to form a
super-TB of length L=2*(2+2+3+10)+1=35 (rows/bytes). If the width is
again assumed to be equal to 20 (columns/packets), then the erasure
protection of the descriptors is P=10. The corresponding sequence of
descriptors can now be written as
DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29),
where the values of the descriptors are given in hexadecimal
notation.
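
On the receiving side, the descriptor sequence can be turned back into
a list of (erasure protection, number of rows) pairs, as sketched
below. The sign-and-magnitude reading of the second half-byte is an
assumption inferred from the example values, and the function names
are hypothetical. Parsing stops once the announced rows add up to L-1
(all rows except the descriptor row) or when the explicit end marker
0x00 is found.

```python
def from_sign_magnitude4(nibble):
    """Decode a 4-bit sign-and-magnitude value (assumed format)."""
    magnitude = nibble & 0x7
    return -magnitude if nibble & 0x8 else magnitude


def decode_descriptors(info_bytes, P, L):
    """Return [(protection, rows), ...] in transmission order.
    info_bytes are the decoded info bytes of class CA_P, which may be
    followed by payload data."""
    classes = []
    protection = P
    rows_left = L - 1            # the descriptor row counts as one row
    for byte in info_bytes:
        if rows_left <= 0 or byte == 0x00:
            break
        rows = byte >> 4
        protection += from_sign_magnitude4(byte & 0x0F)
        classes.append((protection, rows))
        rows_left -= rows
    return classes
```

For the super-TB above (P=10, L=35), the fifth descriptor 0xA4 raises
the protection from 2 back to 6, which marks the start of the second
concatenated TB.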
8. Security Considerations
The payload of the RTP packets consists of an interleaved
multimedia and parity stream. Therefore, it is reasonable to
encrypt the resulting stream with one key rather than using
different keys for multimedia and parity data. It should also be
noted that encrypting the multimedia data without encrypting
the parity data could enable known-plaintext attacks.
The number of parity bytes per TB should be chosen carefully if the
packet loss is due to network congestion. If the number of parity
bytes per TB is raised to cope with increasing packet loss, this can
lead to increasing network congestion. Therefore, the number of
parity bytes per TB MUST NOT be significantly increased as packet
loss increases due to network congestion.
9. References
[1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for
Generic Forward Error Correction", Request for Comments 2733,
Internet Engineering Task Force, Dec. 1999.
[2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan,
"Priority encoding transmission", IEEE Trans. Inform. Theory, vol.
42, no. 6, pp. 1737-1744, Nov. 1996.
[3] Shu Lin and Daniel J. Costello, Error Control Coding:
Fundamentals and Applications, Prentice-Hall, Inc., Englewood
Cliffs, N.J., 1983.
[4] W. Li, "Fine Granularity Scalability Using Bit-Plane Coding of
DCT Coefficients", ISO/IEC JTC1/SC29/WG11, Doc. MPEG98/M4204, Dec.
1998.
[5] G. Blaettermann, G. Heising, and D. Marpe, "A Quality Scalable
Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May 2000.
[6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V
coding for lossy packet networks - a principle approach", Tech.
Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999.
[7] Guenther Liebl, "Modeling, theoretical analysis, and coding for
wireless packet erasure channels", Diploma Thesis, Inst. for
Communications Engineering, Munich University of Technology, 1999.
10. Acknowledgments
Many thanks to Thomas Stockhammer, who initially came up with the
idea of unequal erasure protection to improve progressive video
transmission over lossy networks.
11. Authors' Addresses
Guenther Liebl, Thomas Stockhammer
Institute for Communications Engineering (LNT)
Munich University of Technology
D-80290 Munich
Germany
Email: {liebl,tom}@lnt.e-technik.tu-muenchen.de
Minh-Ha Nguyen, Frank Burkert
Siemens AG - ICM D MP RD MCH 83/81
D-81675 Munich
Germany
Email: {minhha.nguyen,frank.burkert}@mch.siemens.de
Marcel Wagner, Juergen Pandel, Gero Baese
Siemens AG - Corporate Technology CT IC 2
D-81730 Munich
Germany
Email: {marcel.wagner,juergen.pandel,gero.baese}@mchp.siemens.de
Full Copyright Statement
Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF INFORMATION HEREIN
WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.