MMUSIC Working Group M. Willekens
Internet-Draft Devoteam Telecom & Media
Intended status: Informational M. Garcia-Martin
Expires: January 13, 2009 Ericsson
P. Xu
Huawei Technologies
July 12, 2008
Multiple Packetization Times in the Session Description Protocol (SDP):
Problem Statement, Requirements & Solution
draft-garcia-mmusic-multiple-ptimes-problem-03.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 13, 2009.
Abstract
This document provides a problem statement and requirements with
respect to the presence of a single packetization time (ptime/
maxptime) attribute in SDP media descriptions that contain several
media formats (audio codecs). Furthermore, a best common practice
solution for the use of 'ptime/maxptime' is proposed based on
'static', 'dynamic' and 'indicated' values.  Some methods already
proposed as ad hoc solutions, as well as background information, are
included in the appendices.
Willekens, et al. Expires January 13, 2009 [Page 1]
Internet-Draft Multiple ptime in SDP July 2008
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 4
3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. BCP solution proposal . . . . . . . . . . . . . . . . . . . . 6
4.1. Sending party RTP voice payload . . . . . . . . . . . . . 6
4.1.1. ptime(s) - Static . . . . . . . . . . . . . . . . . . 7
4.1.2. ptime(d) - Dynamic . . . . . . . . . . . . . . . . . . 7
4.1.3. ptime(i) - Indicated . . . . . . . . . . . . . . . . . 7
4.1.4. ptime/maxptime algorithm . . . . . . . . . . . . . . . 8
4.1.5. Algorithm and examples . . . . . . . . . . . . . . . . 9
4.1.5.1. Codec independent parameters . . . . . . . . . . . 9
4.1.5.2. Codec dependent parameters . . . . . . . . . . . . 10
4.1.5.3. Pseudocode algorithm . . . . . . . . . . . . . . . 10
4.1.5.4. Pseudocode examples . . . . . . . . . . . . . . . 10
4.2. Receiving party RTP voice payload . . . . . . . . . . . . 11
4.3. Procedures for the SDP offer/answer . . . . . . . . . . . 11
4.3.1. Procedures for an SDP offerer . . . . . . . . . . . . 11
4.3.2. Procedures for an SDP answerer . . . . . . . . . . . . 12
4.4. Advantages . . . . . . . . . . . . . . . . . . . . . . . . 12
5. Conclusion and next steps . . . . . . . . . . . . . . . . . . 12
6. Security Considerations . . . . . . . . . . . . . . . . . . . 13
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
8.1. Normative References . . . . . . . . . . . . . . . . . . . 13
8.2. Informative References . . . . . . . . . . . . . . . . . . 13
Appendix A. Related RFCs for ptime . . . . . . . . . . . . . . . 15
Appendix B. Ad-hoc solutions for multiple ptime . . . . . . . . . 17
B.1. Method 1 . . . . . . . . . . . . . . . . . . . . . . . . . 18
B.2. Method 2 . . . . . . . . . . . . . . . . . . . . . . . . . 18
B.3. Method 3 . . . . . . . . . . . . . . . . . . . . . . . . . 19
B.4. Method 4 . . . . . . . . . . . . . . . . . . . . . . . . . 19
B.5. Method 5 . . . . . . . . . . . . . . . . . . . . . . . . . 20
B.6. Method 6 . . . . . . . . . . . . . . . . . . . . . . . . . 20
B.7. Method 7 . . . . . . . . . . . . . . . . . . . . . . . . . 20
B.8. Method 8 . . . . . . . . . . . . . . . . . . . . . . . . . 21
B.9. Method 9 . . . . . . . . . . . . . . . . . . . . . . . . . 21
B.10. Method 10 . . . . . . . . . . . . . . . . . . . . . . . . 22
B.11. Method 11 . . . . . . . . . . . . . . . . . . . . . . . . 22
Appendix C. Background info . . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 26
Intellectual Property and Copyright Statements . . . . . . . . . . 28
1. Introduction
"Session Description Protocol" (SDP) [RFC4566] provides a protocol to
describe multimedia sessions for the purposes of session
announcement, session invitation, and other forms of multimedia
session initiation. A session description in SDP includes the
session name and purpose, the media comprising the session,
information needed to receive the media (addresses, ports, formats,
etc.) and some other information.
In the SDP media description part, the m-line contains the media type
(e.g. audio), a transport port, a transport protocol (e.g. RTP/AVP)
and a media format description which depends on the transport
protocol.
For the transport protocol RTP/AVP or RTP/SAVP, the media format sub-
field can contain a list of RTP payload type numbers. See "RTP
Profile for Audio and Video Conferences with Minimal Control"
[RFC3551], Table 4.
For example: "m=audio 49232 RTP/AVP 3 15 18" indicates the audio
encoders GSM, G728, and G729.
Further, the media description part can contain additional attribute
lines that complement or modify the media description line. Of
interest for this memo, are the 'ptime' and 'maxptime' attributes.
According to [RFC4566], the 'ptime' attribute gives the length of
time in milliseconds represented by the media in a packet, and the
'maxptime' gives the maximum amount of media that can be encapsulated
in each packet, expressed as time in milliseconds. These attributes
modify the whole media description line, which can contain an
extensive list of payload types. In other words, these attributes
are not specific to a given codec.
[RFC4566] also indicates that it should not be necessary to know
'ptime' to decode RTP or vat audio since the 'ptime' attribute is
intended as a recommendation for the encoding/packetization of audio.
However, once more, the existing 'ptime' attribute defines the
desired packetization time for all the payload types defined in the
corresponding media description line.
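For illustration, the media description from the earlier example,
with a single 'ptime' and 'maxptime' applying to all three codecs
(the attribute values here are illustrative, not normative):

   m=audio 49232 RTP/AVP 3 15 18
   a=ptime:20
   a=maxptime:60

Here 20 ms happens to be an integer multiple of the frame sizes of
GSM (20 ms), G728 (2.5 ms) and G729 (10 ms), but nothing in SDP
allows a different desired packetization time per payload type.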
End-devices can sometimes be configured with different codecs and for
each codec a different packetization time can be indicated. However,
there is no clear way to exchange this type of information between
different user agents and this can result in lower voice quality,
network problems or performance problems in the end-devices.
2. Problem Statement
The packetization time is an important parameter which helps in
reducing the packet overhead. Many voice codecs define a certain
frame length used to determine the coded voice filter parameters and
try to find a certain trade-off between the perceived voice quality,
measured by the Mean Opinion Score (MOS), and the required bit rate.
When a packet oriented network is used for the transfer, the packet
header induces an additional overhead. As such, it makes sense to
combine different voice frame data in one packet, up to a Maximum
Transmission Unit (MTU), to find a good balance between the required
network resources, end-device resources and the perceived voice
quality, which is influenced by packet loss, packet delay and
jitter.  When the
packet size decreases, the bandwidth efficiency is reduced. When the
packet size increases, the packetization delay can have a negative
impact on the perceived voice quality.
The "RTP Profile for Audio and Video Conferences with Minimal
Control" [RFC3551], Table 1, indicates the frame size and default
packetization time for different codecs. The G728 codec has a frame
size of 2.5 ms/frame and a default packetization time of 20 ms/
packet. For G729 codec, the frame size is 10 ms/frame and a default
packetization time of 20 ms/packet.
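The relation between frame size and default packetization time can
be sketched as follows (a minimal illustration; the values are taken
from Table 1 of [RFC3551]):

```python
# Frames per RTP packet for a given packetization time and codec
# frame size (both in milliseconds), per RFC 3551, Table 1.
def frames_per_packet(ptime_ms, frame_ms):
    """Number of whole codec frames carried in one RTP packet."""
    return int(ptime_ms // frame_ms)

# G728: 2.5 ms frames, default ptime 20 ms -> 8 frames per packet.
print(frames_per_packet(20, 2.5))   # 8
# G729: 10 ms frames, default ptime 20 ms -> 2 frames per packet.
print(frames_per_packet(20, 10))    # 2
```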
As more and more audio streaming traffic is carried over IP
networks, the quality as perceived by the end-user should be no
worse than that of classical telephony services.  For VoIP service
providers, it
is very important that endpoints receive audio with the best possible
codec and packetization time. In particular, the packetization time
depends on the selected codec for the audio communication and other
factors, such as the Maximum Transmission Unit (MTU) of the network
and the type of access network technology.
As such, the packetization time is clearly a function of the codec
and the network access technology. During the establishment of a new
session or a modification of an existing session, an endpoint should
be able to express its preferences with respect to the packetization
time for each codec. This would mean that the creator of the SDP
prefers the remote endpoint to use certain packetization time when
sending media with that codec.
SDP [RFC4566] provides the means for expressing a packetization time
that affects all the payload types declared in the media description
line. So, there are no means to indicate the desired packetization
time on a per payload type basis. Implementations have been using
proprietary mechanisms for indicating the packetization time per
payload type, leading to interoperability problems.
One of these mechanisms is the 'maxmptime' attribute, defined in
[ITU.V152], which indicates the supported packetization period for
all codec payload types.
Another one is the 'mptime' attribute, defined by "PacketCable"
[PKT.PKT-SP-EC-MGCP], which indicates a list of packetization period
values the endpoint is capable of using (sending and receiving) for
this connection.
While all have similar semantics, there is obviously no
interoperability between them, creating a nightmare for the
implementer who happens to be defining a common SDP stack for
different applications.
A few RTP payload format descriptions, such as [RFC3267], [RFC3016],
and [RFC3952], indicate that the packetization time for such a
payload should be indicated in the 'ptime' attribute in SDP.
However, since the 'ptime' attribute affects all payload formats
included in the media description line, it would not be possible to
create a media description line that contains all the mentioned
payload formats with different packetization times.  The solutions
range from using a single packetization time for all payload types
to creating a media description line that contains a single payload
type.
However, once more, if several payload formats are offered in the
same media description line in SDP, there is no way to indicate
different packetization times per payload format.
3. Requirements
The main requirement comes from the implementation and media gateway
community, which makes use of hardware-based solutions, e.g. DSP or
FPGA implementations with silicon constraints on the amount of
buffer space.
Some are making use of the ptime/codec information to make certain
QoS budget calculations. When the packetization time is known for a
codec with a certain frame size and frame data rate, the efficiency
of the throughput can be calculated.
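Such a budget calculation can be sketched as follows (the 40-octet
IP/UDP/RTP header size is an assumption for uncompressed IPv4, not a
value from this draft):

```python
# Throughput efficiency of an RTP voice stream: voice payload octets
# versus total packet octets.  Assumes 40 octets of IPv4/UDP/RTP
# headers per packet.
HEADER_OCTETS = 40

def efficiency(ptime_ms, frame_ms, frame_octets):
    """Fraction of each packet that is voice payload."""
    frames = int(ptime_ms // frame_ms)
    payload = frames * frame_octets
    return payload / (payload + HEADER_OCTETS)

# G729 (10 octets per 10 ms frame): ptime 20 ms -> 20/(20+40) = 33 %.
print(round(efficiency(20, 10, 10), 2))   # 0.33
# Doubling the ptime to 40 ms raises the efficiency to 50 %.
print(round(efficiency(40, 10, 10), 2))   # 0.5
```

This is why a larger packetization time improves bandwidth
efficiency at the cost of added packetization delay.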
Currently, the 'ptime' and 'maxptime' are "indication" attributes and
optional. When these parameters are used for resource reservation
and for hardware initializations, a negotiated value between the SDP
offerer and SDP answerer can become a requirement.
There could be different sources for the 'ptime/maxptime', i.e. the
RTP/AVP profile, the end-user device configuration, the network
architecture, or the receiver.
The codec and 'ptime/maxptime' in the upstream and downstream
directions can be different.
4. BCP solution proposal
The basic idea of this proposal is to keep the packetization time
independent from the codec and to consider the main purpose of the
'ptime' as follows.
The 'ptime' is a parameter indicating the packetization time which is
an important parameter for the end-to-end delay of the voice signal
as indicated in the previous sections. It is defined as a media-
attribute in the SDP.
The only requirements for the use of 'ptime' or 'maxptime' are that
the total size of the packet fit within the MTU and that the
packetization time be an integer multiple of the codec frame size.
If the same session requires different kinds of streams, e.g. in a
conference where some users have a narrowband connection and others
a broadband connection, different media can be defined and allocated
to different ports.  In that case, different m-lines can be defined,
each with its own 'ptime' and 'maxptime'.
The IETF RFCs are not clear about what should happen when the
'ptime' or 'maxptime' in the SDP is not an integer multiple of the
frame size.  What should be used in that case?  The default 'ptime'?
The largest 'ptime' that is an integer multiple of the frame size
and lower than the indicated 'ptime'?  In case of an indicated
'maxptime', a value as close as possible to the indicated 'ptime'
but lower than the 'maxptime'?
This proposal observes the IETF architectural principle of being
"strict when sending and tolerant when receiving" [RFC1958].
4.1. Sending party RTP voice payload
The transmitting side of a connection needs to know the packetization
time it can use for the RTP payload data, i.e. how many speech frames
it can include in the RTP packet. A trade-off between the
packetization delay and the transmission efficiency has to be made
and this can be a static or a dynamic process which involves all
elements in the end-to-end chain.
As such, three different sources to determine the packetization time
are considered.
4.1.1. ptime(s) - Static
Statically provisioned values in the end-device: default values or
manually configured values.
An end-device implementation must know:
1. all the codec-specific parameters, such as:
1. Sampling rate (e.g. 8000 Hz).
2. Number of channels (e.g. 1).
3. Frame size in ms (e.g. 20 ms).
4. Number of encoded bits per frame (e.g. 264 bits).
5. Number of required octets per frame (e.g. G.723.1 at its high
rate encodes 189 bits per frame, i.e. a data rate of 189 bits/
30 ms, or 6.3 kbps.  However, the packet data is octet aligned,
so 3 padding bits are added, which results in 24 octets/frame,
or a data rate of 6.4 kbps).
2. system specific parameters such as:
1. MTU supported by the network and by the protocol stack of the
end-device.
2. Packetization time (e.g. 60 ms) and the maximum packetization
time (e.g. 150 ms).
3. Supported codecs.
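The static parameters listed above could be held in a simple
per-codec table; the concrete numbers below are illustrative
assumptions, not normative values from this draft:

```python
# Illustrative static codec parameter table (assumed values).
CODECS = {
    "G711":   {"rate_hz": 8000, "channels": 1, "frame_ms": 20,
               "bits_per_frame": 1280, "octets_per_frame": 160},
    "G723.1": {"rate_hz": 8000, "channels": 1, "frame_ms": 30,
               "bits_per_frame": 189,  "octets_per_frame": 24},
    "G729":   {"rate_hz": 8000, "channels": 1, "frame_ms": 10,
               "bits_per_frame": 80,   "octets_per_frame": 10},
}

# System-specific static parameters (example values).
SYSTEM = {"mtu": 1500, "ptime_ms": 60, "maxptime_ms": 150}

# Octet alignment: G.723.1's 189 encoded bits round up to 24 octets.
print((CODECS["G723.1"]["bits_per_frame"] + 7) // 8)   # 24
```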
4.1.2. ptime(d) - Dynamic
Dynamically provided values defined by the network architecture.
The network can indicate, as part of the device management, its
supported codecs, the 'ptime' and 'maxptime'. These values can also
change based on the dynamic behavior of the network. During heavy
load on the network, the network architecture can decide to use lower
rate codecs (for bandwidth issues) and/or higher packetization times
(for packet processing performance). This dynamic change can be done
before, during or after a session.
4.1.3. ptime(i) - Indicated
Proposed indicated values coming from the receiving side.
The receiving side can indicate in the SDP the 'ptime' and 'maxptime'
value it wants to receive. This is an optional parameter for the
media, codec independent and considered as an indication only. It
should only be considered as a hint to the sending party.
4.1.4. ptime/maxptime algorithm
Instead of indicating a 'ptime/maxptime' on a per-codec basis as done
in many different proposals, this draft proposes to make use of the
'ptime/maxptime' as a common parameter coming from different sources:
ptime(s), ptime(d), ptime(i) and maxptime(s), maxptime(d),
maxptime(i).
Depending on the available information for the 'ptime' and
'maxptime', the packetization time to be used for transmission,
"pt", is determined by the following algorithm.
1. Determine codec to be used, e.g. G723 based on local info or the
optional network info.
2. Determine coding data rate, e.g. 6.4 kbps based on local info or
the optional network info.
3. Based on the codec, the frame size in ms is known: fc = frame
size of the codec.
4. Determine the MTU size which can be used. Based on this value,
the codec frame size and datarate, a 'maxptime' related to the
codec "mc" can be calculated.
5. Check the ptime(s, d, i) and maxptime(s, d, i, mc) values.  Take
the maximum value from the available set ptime(s, d, i) that is
less than or equal to the minimum value in the set maxptime(s, d,
i, mc).
6. Normalize this 'ptime' value to the integer multiple of the frame
size that is less than or equal to this 'ptime' value and less
than or equal to "mc", but not lower than the codec frame size.
Remark:
It is up to the local policy of the device to determine which
'ptime/maxptime' sources it will use in its calculation; e.g., it is
possible to disallow the treatment of the 'ptime' indicated by the
other side.  This can easily be done by including or excluding the
'ptime/maxptime' values from the vectors used in the calculation.
The formula that calculates the packetization time for the
transmission of voice packets in the RTP payload has the following
input parameters.
1. The packetization time made available from different sources.
When no value is known, the frame size of the voice codec is
used.
2. The maximum packetization time values made available from
different sources. When no value is known, the frame size of the
voice codec is used.
3. The frame size of the codec.
4. The packetization time corresponding to the selected codec, frame
size, frame data rate and the network MTU.  This packetization
time must be larger than or equal to the frame size; at least one
frame has to fit in the MTU.
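Deriving this MTU-related maximum, "mc", can be sketched as follows
(the 40-octet IP/UDP/RTP header overhead is an assumption, not a
value taken from this draft):

```python
# Maximum packetization time "mc" (in ms) that fits the network MTU,
# given the codec frame size (ms) and frame payload size (octets).
# Assumes 40 octets of IP/UDP/RTP header overhead per packet.
def max_ptime_for_mtu(mtu_octets, frame_ms, frame_octets,
                      header_octets=40):
    frames = (mtu_octets - header_octets) // frame_octets
    return frames * frame_ms

# G711 (160 octets per 20 ms frame) over a 1500-octet MTU:
# (1500 - 40) // 160 = 9 frames -> mc = 180 ms.
print(max_ptime_for_mtu(1500, 20, 160))   # 180
```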
The function has one output parameter: the packetization time to be
used for transmission, "pt".  It is the frame size of the codec
multiplied by the number of frames to be placed in the RTP payload,
based on the provided 'ptime' and 'maxptime' values.  In the
formula, the maximum packetization time related to the MTU is added
to the vector containing one or more maximum packetization time
values, and the minimum value of this set is determined.  In the
'ptime' set "p", which contains one or more values, every value
higher than the minimum of the 'maxptime' set "mp" is replaced by
that minimum.  Then the maximum value of this set is determined and
used to calculate the number of voice frames that can be included
with that packetization time.
Some examples are provided.  The first example is related to G723
with a frame size of 30 ms.  When the receiver has indicated a
'ptime' of 20 ms in the SDP, the RTP packets will be sent with one
voice frame of 30 ms.
In another example, with a G711 codec, a default 'ptime' of 20 ms
and an indicated 'ptime' of 60 ms, three speech frames of 20 ms can
be transmitted in one RTP packet towards the receiver, which has
indicated its ability to receive RTP packets with a 60 ms
packetization time.
This "pt" is used to allocate the PCM buffer size where the voice
samples from the synchronous network interface are stored before
being passed in RTP packets towards the packet oriented network.
When the 'ptime' and 'maxptime' are lower than the frame size of the
codec, no packetization time for the transmission can be determined.
An invalid value (=0) is returned by the algorithm.  In that case,
the sender has to select another codec with a voice frame size that
is less than or equal to the 'ptime' or 'maxptime'.
4.1.5. Algorithm and examples
4.1.5.1. Codec independent parameters
o p = vector containing all provided packetization time values such
as static, dynamic, indicated values.
o mp = vector containing all provided maximum packetization time
values.
At least one "p" and one "mp" value have to be provided.  When no
static, dynamic or indicated values are known, the frame size of the
codec "fc" can be used.
4.1.5.2. Codec dependent parameters
o fc = frame size of the codec
o mc = max packetization time which corresponds with the selected
codec, frame size, frame datarate and the network MTU (mc > fc).
4.1.5.3. Pseudocode algorithm
pt(p, mp, fc, mc):
    mp <- stack(mp, mc)            # append the MTU-derived maximum
    for i in 0 .. cols(p)-1:       # clip each ptime value to the
        if p(i) > min(mp):         # minimum maxptime
            p(i) <- min(mp)
    nf <- floor(max(p) / fc)       # whole frames that fit
    if nf <= 0 and min(mp) > fc:
        nf <- 1                    # tolerant: send one frame
    return fc * nf                 # 0 when no valid ptime exists
Pseudocode algorithm
4.1.5.4. Pseudocode examples
ptime:=20 maxptime:=60 pt(ptime,maxptime,30,100)=30
ptime:=20 maxptime:=20 pt(ptime,maxptime,30,100)=0
ptime:=30 maxptime:=30 pt(ptime,maxptime,30,100)=30
ptime:=60 maxptime:=80 pt(ptime,maxptime,30,100)=60
ptime:=20 maxptime:=60 pt(ptime,maxptime,20,100)=20
ptime:=60 maxptime:=80 pt(ptime,maxptime,20,100)=60
ptime:=70 maxptime:=200 pt(ptime,maxptime,20,100)=60
ptime:=120 maxptime:=60 pt(ptime,maxptime,20,100)=60
ptime:=120 maxptime:=200 pt(ptime,maxptime,10,100)=100
ptime:=[40,50,20] maxptime:=200 pt(ptime,maxptime,10,100)=50
ptime:=[40,50,20] maxptime:=[40,50,20] pt(ptime,maxptime,10,100)=20
ptime:=[120,40] maxptime:=[150,200,100] pt(ptime,maxptime,10,100)=100
Pseudocode examples
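The pseudocode above can be transcribed into a short, runnable
sketch (Python is used here purely for illustration; "p" and "mp"
may be scalars or lists, as in the examples):

```python
# pt(p, mp, fc, mc): packetization time for transmission, following
# the pseudocode algorithm above.  p and mp may be scalars or lists
# of provided ptime / maxptime values; fc is the codec frame size in
# ms; mc is the MTU-derived maximum.  Returns 0 when no valid
# packetization time exists (the sender must pick another codec).
def pt(p, mp, fc, mc):
    p = list(p) if isinstance(p, (list, tuple)) else [p]
    mp = list(mp) if isinstance(mp, (list, tuple)) else [mp]
    mp.append(mc)                   # add the MTU-derived maximum
    cap = min(mp)
    p = [min(v, cap) for v in p]    # clip ptimes to min(maxptime)
    nf = int(max(p) // fc)          # whole frames that fit
    if nf <= 0 and cap > fc:
        nf = 1                      # tolerant: send one frame
    return fc * nf

# Examples from the list above:
print(pt(20, 60, 30, 100))                       # 30
print(pt(20, 20, 30, 100))                       # 0
print(pt([40, 50, 20], [40, 50, 20], 10, 100))   # 20
```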
4.2. Receiving party RTP voice payload
The receiver has to make use of the information in the RTP to
determine the codec type, the frame rate and the total packetization
time of the voice payload data.
For the receiver, two parts in the data flow can be considered.
First, the packet has to be received from the packet-oriented
network.  On the other side, there is usually a synchronous network
where PCM voice samples are used.
This proposal describes a method by which the receiver can handle
unknown packetization buffer requirements, which also allows in-band
changes of the codec data rate and packetization time.
As indicated, there are different sources for the 'maxptime', and it
has already been described how a 'maxptime' value can be determined
for sending in the SDP indication.  The same 'maxptime' is used to
allocate the PCM buffer space where the voice samples received in
RTP packets are stored, after de-jittering, before being transmitted
towards the synchronous network.  The DSP hardware is given an
indication of the actual packetization length obtained from the
received RTP packet.  When the number of samples corresponding to
the packetization length has been stored in the buffer, an interrupt
is generated and the data is transmitted without waiting for another
RTP packet to fill up the remaining space.
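The buffer dimensioning described above can be sketched as follows
(the 8000 Hz sampling rate and 1 octet per sample are assumptions
for a G.711 stream, not values from this draft):

```python
# PCM receive-buffer size, dimensioned from the negotiated
# 'maxptime'.  Assumes 8000 Hz sampling and 1 octet/sample (G.711).
def pcm_buffer_octets(maxptime_ms, rate_hz=8000, octets_per_sample=1):
    return maxptime_ms * rate_hz // 1000 * octets_per_sample

# A 'maxptime' of 150 ms -> 1200 octets of buffer space.
print(pcm_buffer_octets(150))   # 1200
```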
4.3. Procedures for the SDP offer/answer
This section contains the procedures related to the calculation of
the 'ptime' and 'maxptime' attributes when they are used by protocols
following the SDP offer/answer model specified in [RFC3264].
4.3.1. Procedures for an SDP offerer
An SDP offerer may include a 'ptime' value and a 'maxptime' value in
the SDP. These values are merely an indication of the desired
packetization times. The same formula as for the "pt" is used to
determine the 'ptime' in the SDP. When the media line contains
different codec formats, the 'ptime' value is determined for the
first codec in the format list (i.e. the codec with the highest
priority). For the 'maxptime', the minimum value of the 'maxptime'
value set is used in the SDP and normalized to an integer multiple of
the frame size of the first codec in the list.
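The offerer procedure can be sketched as follows; the helper uses a
simplified version of the "pt" formula, and the codec numbers are
illustrative assumptions:

```python
# Derive the offered 'ptime' and 'maxptime': the 'ptime' follows the
# first (highest-priority) codec in the format list; the 'maxptime'
# is the minimum of the provided maxptime set, normalized down to an
# integer multiple of that codec's frame size.
def offer_ptimes(codec_frame_ms, ptimes, maxptimes):
    cap = min(maxptimes)
    ptime = min(max(ptimes), cap) // codec_frame_ms * codec_frame_ms
    maxptime = cap // codec_frame_ms * codec_frame_ms
    return ptime, maxptime

# First codec G729 (10 ms frames), static ptime 60 ms, maxptime
# sources {150, 95} ms: offered ptime 60, maxptime 95 -> 90.
print(offer_ptimes(10, [60], [150, 95]))   # (60, 90)
```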
It is up to the local policy of the device to determine which
'ptime/maxptime' sources it will use in its calculation; e.g., it is
possible to disallow the treatment of a certain 'ptime'.  This can
easily be done by including or excluding the 'ptime/maxptime' values
from the vectors used in the calculation.
4.3.2. Procedures for an SDP answerer
An SDP answerer that receives an SDP offer may also determine the
'ptime' and 'maxptime' values to be included in the SDP answer.
These parameters are determined in the same way as by the offerer.
However, the answerer can use a different local policy to determine
which 'ptime/maxptime' sources will be used in the calculation.
4.4. Advantages
The newly proposed method has the following advantages:
1. The basic idea of the 'ptime'-related RFCs is kept.  No new
parameters have to be added and no new interpretations or
semantic reordering has to be done.
2. The new method is strict in sending and tolerant in receiving.
It sends with the maximum allowed 'ptime' that is less than or
equal to the minimal 'maxptime'.
3. Different sources for the 'ptime' and 'maxptime' are taken into
account, even more than in the various current proposals that
try to negotiate end-to-end.
4. A local policy in the end-device can easily be adopted and
adapted without requiring changes in the end-to-end protocol.
5. The algorithm makes use of all the provided information about
'ptime', 'maxptime', codec frame size and MTU size, and proposes
the optimal 'ptime'.
6. The same algorithm is used at the sending and receiving sides,
for SDP indications and RTP packets.
7. The algorithm is small and straightforward.  Codec-dependent and
codec-independent parameters are clearly indicated.
5. Conclusion and next steps
This memo advocates the need for a standardized mechanism to
indicate the packetization time on a per-codec basis, allowing the
creator of SDP to include several payload formats in the same media
description line with different packetization times.
This memo encourages discussion on the MMUSIC WG mailing list in the
IETF.  The ultimate goal is to define a standard mechanism that
fulfils the requirements highlighted in this memo.
The goal is to find a solution that does not require changes in
implementations that have followed the existing RFC guidelines and
that are able to receive any packetization time.
6. Security Considerations
This memo discusses a problem statement and requirements. As such,
no protocol that can suffer attacks is defined.
7. IANA Considerations
This document does not request IANA to take any action.
8. References
8.1. Normative References
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
8.2. Informative References
[ITU.V152]
ITU-T, "Procedures for supporting voice-band data over IP
networks", ITU-T Recommendation V.152, January 2005.
[ITU.G114]
ITU-T, "One-way transmission time", ITU-T
Recommendation G.114, May 2005.
[PKT.PKT-SP-EC-MGCP]
PacketCable, "PacketCable Network-Based Call Signaling
Protocol Specification", PacketCable PKT-SP-EC-MGCP-I11-
050812, August 2005.
[PKT.PKT-SP-CODEC-MEDIA]
PacketCable, "Codec and Media Specification",
PacketCable PKT-SP-CODEC-MEDIA-I02-061013, October 2006.
[I-D.ietf-mmusic-sdp-capability-negotiation]
Andreasen, F., "SDP Capability Negotiation",
draft-ietf-mmusic-sdp-capability-negotiation-08 (work in
progress), December 2007.
[RFC3890] Westerlund, M., "A Transport Independent Bandwidth
Modifier for the Session Description Protocol (SDP)",
RFC 3890, September 2004.
[RFC3108] Kumar, R. and M. Mostafa, "Conventions for the use of the
Session Description Protocol (SDP) for ATM Bearer
Connections", RFC 3108, May 2001.
[RFC4504] Sinnreich, H., Lass, S., and C. Stredicke, "SIP Telephony
Device Requirements and Configuration", RFC 4504,
May 2006.
[RFC3441] Kumar, R., "Asynchronous Transfer Mode (ATM) Package for
the Media Gateway Control Protocol (MGCP)", RFC 3441,
January 2003.
[RFC3952] Duric, A. and S. Andersen, "Real-time Transport Protocol
(RTP) Payload Format for internet Low Bit Rate Codec
(iLBC) Speech", RFC 3952, December 2004.
[RFC4060] Xie, Q. and D. Pearce, "RTP Payload Formats for European
Telecommunications Standards Institute (ETSI) European
Standard ES 202 050, ES 202 211, and ES 202 212
Distributed Speech Recognition Encoding", RFC 4060,
May 2005.
[RFC1958] Carpenter, B., "Architectural Principles of the Internet",
RFC 1958, June 1996.
[RFC2327] Handley, M. and V. Jacobson, "SDP: Session Description
Protocol", RFC 2327, April 1998.
[RFC3267] Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q. Xie,
"Real-Time Transport Protocol (RTP) Payload Format and
File Storage Format for the Adaptive Multi-Rate (AMR) and
Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs",
RFC 3267, June 2002.
[RFC3016] Kikuchi, Y., Nomura, T., Fukunaga, S., Matsui, Y., and H.
Kimata, "RTP Payload Format for MPEG-4 Audio/Visual
Streams", RFC 3016, November 2000.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551,
July 2003.
Appendix A. Related RFCs for ptime
Many RFCs make references to the 'ptime/maxptime' attribute to give
some definitions, recommendations, requirements, default values.
[RFC4566] defines the 'ptime' and 'maxptime' as:
a=ptime:<packet time>
"This gives the length of time in milliseconds represented by the
media in a packet. This is probably only meaningful for audio
data, but may be used with other media types if it makes sense.
It should not be necessary to know ptime to decode RTP or vat
audio, and it is intended as a recommendation for the encoding/
packetization of audio. It is a media-level attribute, and it is
not dependent on charset."
a=maxptime:<maximum packet time>
"This gives the maximum amount of media that can be encapsulated
in each packet, expressed as time in milliseconds. The time SHALL
be calculated as the sum of the time the media present in the
packet represents. For frame-based codecs, the time SHOULD be an
integer multiple of the frame size. This attribute is probably
only meaningful for audio data, but may be used with other media
types if it makes sense. It is a media-level attribute, and it is
not dependent on charset."
"Additional encoding parameters MAY be defined in the future, but
codec-specific parameters SHOULD NOT be added. Parameters added
to an "a=rtpmap:" attribute SHOULD only be those required for a
session directory to make the choice of appropriate media to
participate in a session. Codec-specific parameters should be
added in other attributes (for example, "a=fmtp:")."
"Note: RTP audio formats typically do not include information
about the number of samples per packet. If a non-default (as
defined in the RTP Audio/Video Profile) packetization is required,
the 'ptime' attribute is used as given above."
Remark:
'maxptime' was introduced after the release of [RFC2327], and non-
updated implementations will ignore this attribute.
The "SDP Offer/Answer Model" [RFC3264] describes requirements for
the 'ptime' for the SDP offerer and the SDP answerer.
If the 'ptime' attribute is present for a stream, it indicates the
desired packetization interval that the offerer would like to
receive. The 'ptime' attribute MUST be greater than zero.
The answerer MAY include a non-zero 'ptime' attribute for any media
stream. This indicates the packetization interval that the answerer
would like to receive. There is no requirement for the packetization
interval to be the same in each direction for a particular stream.
"SDP Transport independent bandwidth modifier" [RFC3890].
It mentions 'ptime' as a possible input for bandwidth calculations, but
recommends against using it for that purpose; the use of another
parameter is proposed instead.
"SDP Conversions for ATM bearer" [RFC3108].
It is not recommended to use the 'ptime' in ATM applications since
packet period information is provided with other parameters (e.g. the
profile type and number in the 'm' line, and the 'vsel', 'dsel' and
'fsel' attributes). Also, for AAL1 applications, 'ptime' is not
applicable and should be flagged as an error. If used in AAL2 and
AAL5 applications, 'ptime' should be consistent with the rest of the
SDP description.
The 'vsel', 'dsel' and 'fsel' attributes refer generically to codecs.
These can be used for service-specific codec negotiation and
assignment in non-ATM as well as for ATM applications.
The 'vsel' attribute indicates a prioritized list of one or more 3-
tuples for voice service. Each 3-tuple indicates a codec, an
optional packet length and an optional packetization period. This
complements the 'm' line information and should be consistent with
it.
The 'vsel' attribute refers to all directions of a connection. For a
bidirectional connection, these are the forward and backward
directions. For a unidirectional connection, this can be either the
backward or forward direction.
The 'vsel' attribute is not meant to be used with bidirectional
connections that have asymmetric codec configurations described in a
single SDP descriptor. For these, the 'onewaySel' attribute should
be used.
The 'vsel' line is structured with an encodingName, a packetLength
and a packetTime.
The packetLength is a decimal integer representation of the packet
length in octets. The packetTime is a decimal integer representation
of the packetization interval in microseconds. The parameters
packetLength and packetTime can be set to "-" when not needed. Also,
the entire 'vsel' media attribute line can be omitted when not
needed.
"SIP device requirements and configuration" [RFC4504].
In some cases, certain network architectures have constraints
influencing the end devices. The desired subset of codecs supported
by the device SHOULD be configurable along with the order of
preference. Service providers SHOULD have the possibility of
plugging in their own preferred codecs.  The codec settings MAY include the
packet length and other parameters like silence suppression or
comfort noise generation. The set of available codecs will be used
in the codec negotiation according to [RFC3264].
Example: Codecs="speex/8000;ptime=20;cng=on,gsm;ptime=30"
"MGCP ATM package" [RFC3441].
Packet time changed ("ptime(#)"):
If armed via an R:atm/ptime, a media gateway signals a packetization
period change through an O:atm/ptime. The decimal number, in
parentheses, is optional. It is the new packetization period in
milliseconds. In AAL2 applications, the pftrans event can be used to
cover packetization period changes (and codec changes).
Voice codec selection (vsel): This is a prioritized list of one or
more 3-tuples describing voice service. Each vsel 3-tuple indicates
a codec, an optional packet length and an optional packetization
period.
"RTP payload for iLBC" [RFC3952].
The 'maxptime' SHOULD be a multiple of the frame size. This
attribute is probably only meaningful for audio data, but may be used
with other media types if it makes sense. It is a media attribute,
and is not dependent on charset. Note that this attribute was
introduced after [RFC2327], and non updated implementations will
ignore this attribute.
The 'ptime' parameter cannot be used to specify the iLBC operating
mode, due to the fact that for certain values it would be impossible to
distinguish which mode is in use (e.g., with 'ptime=60' it is
impossible to tell whether a packet carries 2 frames of 30 ms or 3
frames of 20 ms, etc.).
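The iLBC ambiguity described above can be checked mechanically.  The
sketch below (an illustration, not from [RFC3952]) enumerates the
possible 20 ms / 30 ms frame combinations for a given 'ptime':

```python
def frame_combinations(ptime_ms, frame_sizes=(20, 30)):
    """All (n_20ms, n_30ms) pairs whose total duration is ptime_ms."""
    a, b = frame_sizes
    return [(x, y)
            for x in range(ptime_ms // a + 1)
            for y in range(ptime_ms // b + 1)
            if x * a + y * b == ptime_ms]

print(frame_combinations(60))  # [(0, 2), (3, 0)] -> mode is ambiguous
print(frame_combinations(30))  # [(0, 1)]         -> unambiguous
```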
"RTP payload for distributed speech recognition" [RFC4060].
If 'maxptime' is not present, it is assumed to be 80 ms.
Note that since the performance of most speech recognizers is extremely
sensitive to consecutive FP losses, if the user of the payload format
expects a high packet loss ratio for the session, it MAY explicitly
choose a 'maxptime' value for the session that is shorter than the
default value.
Appendix B. Ad-hoc solutions for multiple ptime
In recent years, different solutions have been proposed and implemented
with the goal of making 'ptime' a function of the codec instead of the
media description containing a list of codecs.  The list of solutions
given here indicates what kinds of proposals have already been made to
resolve the SDP interworking issues arising from implementation
differences and RFC interpretations, without imposing any
preference for a certain solution.
In all these proposals, a semantic grouping of the codec specific
information is made by giving a new interpretation of the sequence of
the parameters or by providing new additional attributes.
REMARK:
All these methods violate the basic rule stated in the RFCs that
'ptime' and 'maxptime' are media specific, NOT codec specific.  They do
not solve the interworking issues; instead, they make them worse by
introducing many new interpretations and implementations, as the
following examples indicate.
To avoid a further divergence, the implementation community is
strongly asking for a standardized solution.
B.1. Method 1
Write the rtpmap first, followed by the 'ptime' when it is related to
the codec indicated by that rtpmap.
This method tries to correlate a 'ptime' with a specific codec, but
many existing implementations will suffer from such a proposal.  Some
SDP encoders first write the media line, followed by the rtpmap lines,
and then the other value attributes such as 'ptime' and 'fmtp'.  It is
therefore difficult to know which payload type the 'ptime' relates to.
In the following example, it is hard to tell whether 'ptime:20' relates
to payload type 0, payload type 4, or both, and how the remote end will
interpret this information is unknown.  Implementations that are fully
compliant with the existing RFCs will suffer from such new proposals.
m=audio 1234 RTP/AVP 4 0
a=rtpmap:4 G723/8000
a=rtpmap:0 PCMU/8000
a=ptime:20
a=fmtp:4 bitrate=6400
Method 1
B.2. Method 2
Grouping of all codec specific information together.
Most implementers are in favor of this proposal, i.e., writing the
value attributes associated with an rtpmap immediately after it.  But
this is also a new interpretation.  Normally, the 'ptime'
refers to all payload types indicated in the m-line.  Existing
implementations will also suffer from such a method.
m=audio 1234 RTP/AVP 4 0
a=rtpmap:4 G723/8000
a=fmtp:4 bitrate=6400
a=rtpmap:0 PCMU/8000
a=ptime:20
Method 2
B.3. Method 3
Use a 'ptime' for every codec, after its rtpmap definition.  This makes
'ptime' a required parameter for each payload type.  It looks obvious,
but it is not allowed according to the existing RFCs.  And would the
same construct be used for 'maxptime'?
m=audio 1234 RTP/AVP 0 18 4
a=rtpmap:18 G729/8000
a=ptime:30
a=rtpmap:0 PCMU/8000
a=ptime:40
a=rtpmap:4 G723/8000
a=ptime:60
Method 3
B.4. Method 4
Create a new 'mptime' (multiple ptime) attribute that contains
different packetization times, each one mapped to its corresponding
payload type in the preceding 'm=' line.  What happens when the other
side sends an RTP stream with a different packetization time?  Should
the elements in the 'mptime' attribute be interpreted as required
values or preferred values?  With this approach, RFC-compliant
implementations are also affected and have to consider the new 'mptime'
attribute.
m=audio 1234 RTP/AVP 0 18 4
a=mptime 40 30 60
Method 4
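A hedged sketch of how an implementation might interpret the
non-standard Method-4 'mptime' attribute: each value is paired
positionally with the payload types of the preceding 'm=' line.  The
function name and the positional pairing are assumptions drawn from the
example above.

```python
def map_mptime(m_line, mptime_line):
    """Pair each 'mptime' value with its 'm='-line payload type."""
    payload_types = m_line.split()[3:]   # e.g. ['0', '18', '4']
    times = mptime_line.split()[1:]      # e.g. ['40', '30', '60']
    return {int(pt): int(t) for pt, t in zip(payload_types, times)}

print(map_mptime("m=audio 1234 RTP/AVP 0 18 4", "a=mptime 40 30 60"))
# {0: 40, 18: 30, 4: 60}
```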
B.5. Method 5
Use of a new 'x-ptime' attribute.  However, SDP parsers complained
about x- attributes.  It was once suggested to use something without
the x- prefix (e.g. 'xptime').  This is just another encoding of
method 4 and likewise solves nothing.
m=audio 1234 RTP/AVP 0 8
a=x-ptime 20 30
Method 5
B.6. Method 6
Use of different m-lines with one codec per m-line.
However, this is a misuse, because different m-lines mean different
audio streams, not different codec options.  So this is certainly
against the existing SDP concept.
m=audio 1234 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=ptime:40
m=audio 1234 RTP/AVP 18
a=rtpmap:18 G729/8000
a=ptime:30
m=audio 1234 RTP/AVP 4
a=rtpmap:4 G723/8000
a=ptime:60
Method 6
B.7. Method 7
Use of the 'ptime' in the 'fmtp' attribute
m=audio 1234 RTP/AVP 4 18
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=yes;ptime=20
a=maxptime:40
a=rtpmap:4 G723/8000
a=fmtp:4 bitrate=6.3;annexa=yes;ptime=30
a=maxptime:60
Method 7
B.8. Method 8
Use of the vsel parameter as done for ATM bearer connections
The following example indicates G.729 or G.729a (both are
interoperable) as the first-preference voice encoding scheme.  A packet
length of 10 octets and a packetization interval of 10 ms are
associated with this codec.  G726-32 is the second preference stated in
this line, with an associated packet length of 40 octets and a
packetization interval of 10 ms.  When the packet length and
packetization interval are meant to be omitted, the media attribute
line contains '-'.
a=vsel:G729 10 10000 G726-32 40 10000
a=vsel:G729 - - G726-32 - -
Method 8
B.9. Method 9
Use of the [ITU.V152] 'maxmptime' (maximum multiple ptime) attribute,
which contains different packetization times, each one mapped to its
corresponding payload type in the preceding 'm=' line, to indicate the
supported packetization period for each codec payload type.  This
media-level attribute defines a list of maximum packetization time
values, expressed in milliseconds, that the endpoint is capable of
using (sending and receiving) for the connection.  When the 'maxmptime'
attribute is present, the 'ptime' shall be ignored according to the
V.152 specification.  When 'maxmptime' is absent, the value of the
'ptime' attribute, if present, shall be taken as the packetization
period for all codecs present in the 'm=' line.
The specification does not say what has to be done when a 'maxptime' is
also present.  Does 'maxmptime' indicate the absolute maximum
packetization time that can be used for a certain codec, or does it
indicate the preferred packetization time?  It is open to many
different interpretations, certainly in interworking scenarios.
m=audio 3456 RTP/AVP 18 0 13 96 98 99
a=maxmptime:10 10 - - 20 20
Method 9
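Assuming the positional pairing shown in the Method 9 example (an
interpretation of the example, not a statement of what V.152 mandates),
a '-' entry could be read as "no maximum stated" for that payload type:

```python
def map_maxmptime(m_line, maxmptime_line):
    """Pair each 'maxmptime' entry with its payload type; '-' -> None."""
    payload_types = m_line.split()[3:]
    values = maxmptime_line.split(":", 1)[1].split()
    return {int(pt): (None if v == "-" else int(v))
            for pt, v in zip(payload_types, values)}

print(map_maxmptime("m=audio 3456 RTP/AVP 18 0 13 96 98 99",
                    "a=maxmptime:10 10 - - 20 20"))
# {18: 10, 0: 10, 13: None, 96: None, 98: 20, 99: 20}
```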
B.10. Method 10
Use of the PacketCable 'mptime' attribute.  See the "Codec and Media
Specification" [PKT.PKT-SP-CODEC-MEDIA], which gives a note about
'ptime': "[RFC4566] defines the 'maxptime' SDP attribute and V.152
defines the 'maxmptime' SDP attribute.  The precedence of these
attributes with respect to the 'ptime' and 'mptime' attributes is not
defined at this time."
Remark:
This method is the same as method 4.  However, in the
[PKT.PKT-SP-CODEC-MEDIA] version of 9/2006, 'mptime' was removed and
'maxptime' was added.  PacketCable seems to be moving away from the
need for multiple packetization times as a function of the codec,
treating it instead as a maximum end-to-end delay aspect.
B.11. Method 11
Use of the SDP capability negotiation method.  See
[I-D.ietf-mmusic-sdp-capability-negotiation], which describes how
additional capabilities, such as the different supported ptimes, can be
negotiated.  This could be a possible solution in certain cases, but it
requires implementations that followed the basic ptime/maxptime concept
to update themselves to accommodate more restrictive implementations.
It also introduces additional complexity by adding new parameters and
new semantics.
Appendix C. Background info
The "Session Initiation Protocol" (SIP) is used to set up media
sessions.  The SIP INVITE message carries a "Session Description
Protocol" (SDP) body.  In the SDP media description part, the m-line
contains the media type (e.g. audio), a transport port, a transport
protocol (e.g. RTP/AVP) and a media format description that depends on
the transport protocol.  For the transport protocols RTP/AVP and
RTP/SAVP, the media format sub-field can contain a list of RTP payload
type numbers.
Example: m=audio 49232 RTP/AVP 8 0 4
The "8 0 4" is the media format, indicating a list of possible codecs
identified by static or dynamic payload type numbers as defined in
RFC 3551 [RFC3551].
In the above example, a list of static numbers is used:
8 = PCMA - G.711 PCM A-law
0 = PCMU - G.711 PCM u-law
4 = G723 - G.723.1
PCMA and PCMU are "sample-based" codecs, while G723 is a "frame-based"
codec.  All of them use a sampling rate of 8 kHz, or 0.125 ms/sample.
PCMA and PCMU encode each sample in 8 bits using the A-law or u-law
logarithmic companding laws, resulting in a data rate of 64 kbps.
G723, however, does not operate on single samples, but on several
samples combined into a "frame".  As such, higher compression rates can
be achieved.  The G723 codec uses 240 voice samples, corresponding to a
30 ms speech frame duration.  The codec compresses the data in the
frame and encodes it with 192 or 160 bits, resulting in a data rate of
6.4 or 5.3 kbps.  G723 gives the advantage of a lower bit rate at the
cost of increased voice delay: 30 ms instead of 0.125 ms.
The "International Telecommunication Union" (ITU) gives guidelines on
acceptable end-to-end delays in [ITU.G114].  A delay of up to 150 ms is
acceptable.  Between 150 and 400 ms, the perceived voice quality is
impacted but still acceptable.  Above 400 ms it becomes unacceptable.
Echo cancellers are required for delays above 25 ms.
In "time division multiplexing" (TDM) networks, the coding delay is the
biggest contributor to the end-to-end delay.  In packet-oriented
networks, however, packetization delays add to the end-to-end delay and
can become an issue.  Each packet has a header that contributes to the
bandwidth usage, i.e. the total required bit rate.  The more data is
packed together, the smaller the influence of the header on the total
payload and the higher the transmission efficiency.  However, combining
more data in a packet increases the end-to-end delay.  There is thus a
trade-off between bandwidth usage, amount of packet processing, and
end-to-end delay: packing more data into a packet improves transmission
efficiency but reduces quality due to the increased end-to-end delay.
An example is given in the following table, where G.711 (A-law or
u-law) is compared with G.723.1 for different packetization delays.
The headers consist of:
o RTP header: 12 bytes.
o UDP header: 8 bytes.
o IPv4 header: 20 bytes.
o MAC layer: 14 bytes.
o CRC: 4 bytes.
o Start frame + preamble: 20 bytes.
Codec Packet Datarate Voice Headers Tot Payload Throughput
Delay Payload
ms kbps bytes bytes bytes % kbps
-----------------------------------------------------------------
G711 0.125 64 1 78 79 1.3 5056.0
2.5 64 20 78 98 20.4 313.6
5 64 40 78 118 33.9 188.8
10 64 80 78 158 50.6 126.4
20 64 160 78 238 67.2 95.2
30 64 240 78 318 75.5 84.8
90 64 720 78 798 90.2 70.9
200 64 1600 78 1678 95.4 67.1
-----------------------------------------------------------------
G723.1 30 6.4 24 78 102 23.5 27.2
60 6.4 48 78 126 38.1 16.8
90 6.4 72 78 150 48.0 13.3
150 6.4 120 78 198 60.6 10.6
300 6.4 240 78 318 75.5 8.5
-----------------------------------------------------------------
Packet delay & Throughput
For the same packetization delay of 30 ms, the data rate of G.723.1 is
10 times lower than that of G.711, but the payload efficiency drops
from 75.5% to 23.5%.  G.723.1 reaches the same efficiency only when the
packetization delay is 300 ms!  While the packet efficiency is lower,
the required bit rate on the link for G.723.1 is reduced from 84.8 kbps
to 27.2 kbps.  And when several frames are packed together, e.g. 3
frames of 30 ms, the packetization delay becomes 90 ms, resulting in
fewer packets that have to be routed and processed and in an improved
throughput data rate of 13.3 kbps.
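The figures in the table above follow from a simple calculation:
payload bytes from the codec rate and packetization delay, plus a fixed
78-byte overhead (RTP 12 + UDP 8 + IPv4 20 + Ethernet framing 38).  The
sketch below reproduces two of the rows; the function name is
illustrative only.

```python
HEADER_BYTES = 12 + 8 + 20 + 14 + 4 + 20   # = 78, as itemized above

def packet_stats(rate_bps, ptime_ms):
    """Return (payload_bytes, efficiency_percent, throughput_kbps)."""
    payload = rate_bps * ptime_ms // 8000        # media bytes per packet
    total = payload + HEADER_BYTES
    efficiency = round(100.0 * payload / total, 1)
    throughput = round(total * 8 / ptime_ms, 1)  # bytes/ms -> kbps
    return payload, efficiency, throughput

print(packet_stats(64000, 20))   # G.711, 20 ms   -> (160, 67.2, 95.2)
print(packet_stats(6400, 30))    # G.723.1, 30 ms -> (24, 23.5, 27.2)
```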
The frame sizes used by the different codecs are 0.125 ms (G711),
2.5 ms (G728), 10 ms (G729), 20 ms (G726, GSM, GSM-EFR, QCELP, LPC) and
30 ms (G723).  All of them have a default 'ptime' of 20 ms, except
G723, which has a default 'ptime' of 30 ms.
The media description part can contain additional attribute lines
which complement or modify the media description line: 'ptime' and
'maxptime' attributes.
Example:
m=audio 49232 RTP/AVP 8 0 4
a=ptime:20
a=maxptime:60
RFC 3551 [RFC3551] defines the default packetization time for each
codec in its Table 1.  PCMA and PCMU have a default 'ptime' of 20 ms,
and G723 has a default 'ptime' of 30 ms.
When, as in the example above, the 'ptime' value is 20, it is a wrong
value for the G723 codec, which requires a frame size of at least 30 ms
and thus a minimum packetization delay of 30 ms.  This causes many
interworking problems between different systems, due to different
interpretations of the relevant RFCs, resulting in bad voice quality or
call setup failures.
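A hedged sketch of the consistency check implied above: an offered
'ptime' should be at least one frame, and an integer multiple of the
frame size, for every frame-based codec in the 'm=' line.  The
frame-size table below is illustrative, taken from the values mentioned
in this appendix.

```python
FRAME_MS = {"PCMU": 0.125, "PCMA": 0.125, "G729": 10, "G723": 30}

def ptime_valid_for_all(ptime_ms, codecs):
    """True when ptime_ms is a whole number of frames for every codec."""
    return all(ptime_ms >= FRAME_MS[c] and ptime_ms % FRAME_MS[c] == 0
               for c in codecs)

print(ptime_valid_for_all(20, ["PCMU", "PCMA", "G723"]))  # False: < G723 frame
print(ptime_valid_for_all(30, ["PCMU", "PCMA", "G723"]))  # True
```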
In some APIs, the following functions are provided to interface with
the RTP and codec hardware layer for encoding voice samples, based on a
certain codec, into RTP packets.
1. Set the encoding parameters, such as codec type, payload type (for
RTP), and packetization rate.  Mostly these are configuration
parameters of the device.  They are either provided manually, based
on guidelines from the network architecture, or provided dynamically
and automatically.
2. Next, a transmit buffer has to be allocated.  The lower layer
provides a function to calculate the required buffer size as a
function of the encoding parameters.
3. The application layer allocates a transmit buffer with (at least)
the indicated size.
4. The synchronous voice data to be encoded is passed to the hardware
layer, which encodes the data (codec and packetization) into the
provided buffer.
5. The buffer with the RTP data is returned to the application, which
can send it out on the host network interface towards the packet
network.
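The transmit-side steps above can be sketched as follows.  The helper
name, the 12-byte RTP header assumption, and the codec rate table are
illustrative, not a real DSP API:

```python
RTP_HEADER_BYTES = 12
RATE_BPS = {"PCMU": 64000, "PCMA": 64000, "G723_6_4": 6400}  # assumed table

def tx_buffer_size(codec, ptime_ms):
    """Minimum transmit buffer, in bytes, for one RTP packet (step 2)."""
    payload_bytes = RATE_BPS[codec] * ptime_ms // 8000   # bits -> bytes
    return RTP_HEADER_BYTES + payload_bytes

print(tx_buffer_size("PCMU", 20))      # 12 + 160 = 172
print(tx_buffer_size("G723_6_4", 30))  # 12 + 24  = 36
```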
For the receiving part, the required API functions are:
1. Set the required decoding parameters, such as codec type, payload
type, initial latency in frames, and jitter buffer info.  Note that
the packetization time is not required, because every receiver
should be able to handle up to 200 ms, which in effect acts as the
MTU for which the receiver should have the required resources.
2. The required buffer size to be allocated is requested from the
hardware.  This size is calculated based on the size of the RTP
header and the maximum allowed payload of 200 ms.
* The application can, however, decide to allocate smaller buffers if
the worst case for the expected RTP packetization time is known, i.e.
by making use of the 'maxptime' attribute.
Most implementations use a general purpose processor (GPP) as host in
combination with a digital signal processor (DSP) for the
codec/packetization part.  The host processor has the interface with
the packet-oriented world, while the DSP interfaces with a real-time
synchronous network, mostly with a special buffer-handling mechanism to
avoid excessive interrupt handling.
Consider a VoIP call using G711 A-law or u-law.  Most hardware
solutions use a DSP to handle the real-time processing.  Most of these
DSPs have special built-in hardware functionality for PCM samples.  The
DSP can be configured for A-law or u-law and for a specific clock rate.
For every transmitted or received PCM sample, the hardware can generate
an interrupt, but this of course places a big burden on system
performance.  Therefore, DSPs also provide a mechanism, based on an
internal buffer, to avoid this interrupt burden: an interrupt is only
generated when the buffer is empty or full.  The initialization of this
DSP hardware for a specific call is done at SIP INVITE SDP negotiation
time.
m=audio 1234 RTP/AVP 0 8 4
a=ptime:30
Example
So, if this SDP contains PT=0,8,4 (i.e. G711u, G711A, G723) and a
'ptime' of 30, then this 'ptime' can be used to initialize the DSP port
with a buffer size for 30 ms of PCM voice samples.  When the "offerer"
sends an RTP packet for G711u or G711A using the default value of
20 ms, the DSP PCM port waits for 30 ms before sending out the buffer.
Because only 20 ms were received in the RTP packet, it has to wait for
the next RTP packet before being able to transmit the buffer, causing a
serious degradation of the voice quality.
This can be a problem in DSP-based solutions in media gateways between
the IP and PSTN worlds, but also in end-user internet access devices
(IADs) that provide the possibility to attach a normal analog voice
phone via an RJ11 jack (ATA, analog telephone adapter).
For this use case, certain implementers are arguing in the direction of
a complete SDP negotiation mechanism.  But this conflicts with the SDP
paradigm, where 'ptime' is an optional parameter bound not to a
specific codec but to the media itself.  Different proprietary
solutions are now implemented, causing even more interworking issues.
Authors' Addresses
Marc Willekens
Devoteam Telecom & Media
Herentals, Antwerp 2200
Belgium
Email: marc.willekens@devoteam.com
Miguel A. Garcia-Martin
Ericsson
Via de los Poblados 13
Madrid, 28033
Spain
Email: Miguel.A.Garcia@ericsson.com
Peili Xu
Huawei Technologies
Bantian
Longgang, Shenzhen 518129
China
Email: xupeili@huawei.com
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.