Audio/Video Transport (avt)
Internet Draft                                            H. Schulzrinne
Document: draft-ietf-avt-rfc2833bis-05.txt                   Columbia U.
                                                              S. Petrack
                                                                   eDial
                                                               T. Taylor
                                                         Nortel Networks
Expires: April 2005                                         October 2004
RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
Status of this Memo
This document is an Internet-Draft and is subject to all provisions
of section 3 of RFC 3667. By submitting this Internet-Draft, each
author represents that any applicable patent or other IPR claims of
which he or she is aware have been or will be disclosed, and any of
which he or she become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This memo describes how to carry dual-tone multifrequency (DTMF)
signaling, other tone signals and telephony events in RTP packets.
This memo preserves the content standardized by RFC 2833, but
clarifies its use through reorganization of the text, addition of
tutorial content, and addition of normative text describing the
detailed application of the content.
Schulzrinne, Petrack Expires - April 2005 [Page 1]
RTP Events and Tones Payloads October 2004
Table of Contents
1. Introduction................................................5
1.1 Terminology..............................................5
1.2 Overview.................................................5
1.3 Potential Applications...................................6
1.4 Events, States, Tone Patterns, and Voice Encoded Tones...7
2. RTP Payload Format for Named Telephone Events...............8
2.1 Introduction.............................................8
2.2 Use of RTP Header Fields.................................9
2.2.1 Timestamp.............................................9
2.2.2 Marker Bit............................................9
2.3 Payload Format...........................................9
2.3.1 Event Field...........................................9
2.3.2 E ("End") Bit.........................................9
2.3.3 R Bit.................................................9
2.3.4 Volume Field.........................................10
2.3.5 Duration Field.......................................10
2.4 Optional MIME Parameters................................10
2.4.1 Relationship to SDP..................................11
2.5 Procedures..............................................11
2.5.1 Sending Procedures...................................11
2.5.1.1 Negotiation of Payloads...........................11
2.5.1.2 Transmission of Event Packets.....................12
2.5.1.3 Long Duration Events..............................13
2.5.1.4 Retransmission of Final Packet....................13
2.5.1.5 Packing Multiple Events Into One Packet...........13
2.5.1.6 RTP Sequence Number...............................14
2.5.2 Receiving Procedures.................................14
2.5.2.1 Indication of Receiver Capabilities using SDP.....14
2.5.2.2 Playout of Tone Events............................14
2.5.2.3 Long Duration Events..............................16
2.5.2.4 Multiple Events In a Packet.......................16
2.5.2.5 Soft States.......................................17
2.6 Reliability.............................................17
2.6.1 Intra-Event Updates..................................17
2.6.2 Multi-Event Redundancy...............................17
3. Specification of Codepoints For Telephone Events..............18
3.1 DTMF Events.............................................19
3.2 Data Modem and Fax Events...............................20
3.2.1 V.8bis Events........................................21
3.2.2 V.21 Events..........................................26
3.2.3 V.8 Events...........................................27
3.2.4 V.25 Events..........................................29
3.2.5 T.30 Events..........................................31
3.2.6 V.18 Events..........................................34
3.3 Basic Subscriber Line Events............................38
3.4 Extended Subscriber Line Events.........................44
3.5 Trunk Events............................................47
3.5.1 Signalling System No. 5..............................49
3.5.2 North American R1....................................52
3.5.3 MFC R2 signaling.....................................53
3.5.4 ABCD Transitional Signaling For Digital Trunks.......55
3.5.5 Continuity Tones.....................................56
3.5.6 Trunk Unavailable Event..............................57
3.5.7 Metering Pulse Event.................................57
4. RTP Payload Format for Telephony Tones........................57
4.1 Introduction............................................57
4.2 Examples of Common Telephone Tone Signals...............58
4.3 Use of RTP Header Fields................................60
4.3.1 Timestamp............................................60
4.3.2 Marker Bit...........................................60
4.3.3 Payload Format.......................................60
4.3.4 Optional MIME Parameters.............................61
4.4 Procedures..............................................61
4.4.1 Sending Procedures...................................62
4.4.2 Receiving Procedures.................................63
5. Application Considerations....................................63
5.1 Combining Tones and Named Events........................63
5.2 Simultaneous Generation of Audio and Events.............64
5.3 Strategies For Handling FAX and Modem Signals...........64
5.4 Examples................................................66
5.4.1 Use of RFC 2198 Redundancy With Named Events.........66
5.4.2 Combined Tone and Telephone-event Payloads...........68
6. MIME Registration.............................................70
6.1 audio/telephone-event...................................70
6.2 audio/tone..............................................71
7. Security Considerations.......................................72
8. IANA Considerations...........................................72
9. Changes Since RFC 2833........................................72
10. Acknowledgements...........................................74
11. Authors ..................................................74
12. References.................................................75
12.1 Normative References....................................75
12.2 Informative References..................................77
Appendix A: Detailed Delta Between This Document And RFC 2833....81
List of Tables
Table 1: DTMF named events.......................................20
Table 2: Events for V.8bis signals...............................24
Table 3: Events for V.21 signals.................................27
Table 4: Events for V.8 signals..................................29
Table 5: Events for V.25 signals.................................31
Table 6: Events for T.30 signals.................................34
Table 7: Events for V.18 interworking............................36
Table 8: Basic subscriber line events............................44
Table 9: Extended subscriber line events.........................47
Table 10: Trunk signalling events................................49
Table 11: SS No. 5 Register Signals..............................51
Table 12: North American R1 and MF Register Signals..............53
Table 13: R2 Forward Register Signals............................55
Table 14: R2 Backward Register Signals...........................55
Table 15: Examples of telephony tones............................59
Table 16: RTP packets for example................................68
Table A-1: Source of text in the present document................81
Table A-2: Text in RFC 2833 dropped from the present document....89
1. Introduction
1.1 Terminology
In this document, the key words "MUST", "MUST NOT", "REQUIRED",
"SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
and "OPTIONAL" are to be interpreted as described in RFC 2119 [N-1]
and indicate requirement levels for compliant implementations.
Normative references appear as [N-n], while informative references
appear as [I-n]. All references are at the end of this memo.
This document uses the following abbreviations:
DTMF Dual Tone Multifrequency
IVR Interactive Voice Response unit
PSTN Public Switched (circuit) Telephone Network
1.2 Overview
This memo defines two RTP [N-4] payload formats, one for carrying
dual-tone multifrequency (DTMF) digits and other line and trunk
signals as events (section 2), and a second one to describe general
multi-frequency tones in terms only of their frequency and cadence
(section 4). Separate RTP payload formats for telephony tone signals
are desirable since low-rate voice codecs cannot be guaranteed to
reproduce these tone signals accurately enough for automatic
recognition. In addition, tone properties such as the phase reversals
in the ANSam tone will not survive speech coding. Defining separate
payload formats also permits higher redundancy while maintaining a
low bit rate. Finally, some telephony events such as "on-hook" occur
out-of-band and cannot be transmitted as tones.
The remainder of this section provides the motivation for defining
the payload types described in this document. Section 2 defines the
payload format and associated procedures for use of named events.
Section 3 describes the events for which codepoints are defined in
this document. Section 4 describes the payload format and associated
procedures for tone representations. Section 5 deals with
achievement of reliable delivery through redundancy and the use of
combined payloads. Section 6 provides the MIME media type
registrations for the two payload formats. Section 7 deals with
security considerations. Section 8 defines the IANA requirements for
registration of codepoints for named telephone events.
1.3 Potential Applications
The payload formats described here may be useful in a number of
different scenarios.
On the sending side, there are two basic possibilities: either the
sending side is an end system which originates the signals itself, or
it is a gateway with the task of propagating incoming telephone
signals into the Internet.
On the receiving side there are more possibilities. The first is
that the receiver must propagate tone signalling accurately into the
PSTN for machine consumption. One example of this is a gateway
passing DTMF tones to an IVR. In this scenario, frequencies,
amplitudes, tone durations, and the durations of pauses between tones
are all significant, and individual tone signals must be delivered
reliably and in order.
In the second scenario, the receiver must play out tones for human
consumption. Typically, rather than a series of tone signals each
with its own meaning, the content will consist of a single sequence
of tones and possibly silence, played out continuously or repeated
cyclically for some period of time. Often the end of the tone
playout will be triggered by an event fed back in the other
direction, using either in- or out-of-band means. Examples of this
are dial tone or busy tone.
The relationship between locality and the tones to be played out is a
complicating factor in this scenario. In the phone network, tones
are generated at different places, depending on the switching
technology and the nature of the tone. This determines, for example,
whether a person making a call to a foreign country hears the local
tones she is familiar with or the tones used in the country
called.
For analog lines, dial tone is always generated by the local switch.
ISDN terminals may generate dial tone locally and then send a Q.931
[I-7] SETUP message containing the dialed digits. If the terminal
just sends a SETUP message without any Called Party digits, then the
switch does digit collection, provided by the terminal as KEYPAD
messages, and provides dial tone over the B-channel. The terminal can
either use the audio signal on the B-channel or can use the Q.931
messages to trigger locally generated dial tone.
Ringing tone (also called ringback tone) is generated by the local
switch at the callee, with a one-way voice path opened up as soon as
the callee's phone rings. (This reduces the chance of clipping the
called party's response just after answer. It also permits pre-answer
announcements or in-band call-progress indications to reach the
caller before or in lieu of a ringing tone.) Congestion tone and
special information tones can be generated by any of the switches
along the way, and may be generated by the caller's switch based on
ISUP messages received. Busy tone is generated by the caller's
switch, triggered by the appropriate ISUP message, for analog
instruments, or the ISDN terminal.
In the third scenario, an end system is directly connected to the
Internet and does not need to generate tone signals again, so that
time alignment and power levels are not relevant. These systems rely
on PSTN gateways or Internet end systems to generate DTMF events and
do not perform their own audio waveform analysis. An example of such
a system is an Internet interactive voice-response (IVR) system.
In circumstances where exact timing alignment between the audio
stream and the DTMF digits or other events is not important and data
is sent unicast, such as the IVR example mentioned earlier, it may be
preferable to use a reliable control protocol rather than RTP
packets. In those circumstances, this payload format would not be
used.
Note that in a number of these cases it is possible that the gateway
or end system will be both a sender and receiver of telephone
signals. Sometimes the same class of signals will be sent as
received -- in the case of "RTP trunking" or voiceband data, for
instance. In other cases, such as that of an end system serving
analogue lines, the signals sent will be in a different class from
those received.
1.4 Events, States, Tone Patterns, and Voice Encoded Tones
This document provides the means for in-band transport over the
Internet of two broad classes of signalling information: in-band
tones or tone sequences, and signals sent out-of-band in the PSTN.
Three methods, two of which are defined by this document, are
available for carrying tone signals; only one of the three can be
used to carry out-of-band PSTN signals. Depending on the
application, it may be desirable to carry the signalling information
in more than one form at once. Section 5 discusses when and how this
should be done.
1) The gateway or end system can upspeed to a higher-bandwidth codec
such as G.711 [I-3] when tone signals are to be conveyed.
Alternatively, for FAX or modem signals respectively, a
specialized transport such as T.38 [I-8], RFC 2793 [I-1], or
V.150.1 modem relay [I-19] may be used.
2) The sending gateway can simply measure the frequency components of
the voice band signals and transmit this information to the RTP
receiver using the tone representation defined in this document
(section 4). In this mode, the gateway makes no attempt to discern
the meaning of the tones, but simply distinguishes tones from
speech signals. An end system may use the same approach using
configured rather than measured frequencies.
All tone signals in use in the PSTN and meant for human
consumption are sequences of simple combinations of sine waves,
either added or modulated. (There is at least one tone, however,
the ANSam tone [N-20] used for indicating data transmission over
voice lines, that makes use of periodic phase reversals.)
3) As a third option, a gateway can recognize the tones and translate
them into a name, such as ringing or busy tone or DTMF digit '0'
(section 2). The receiver then produces a tone signal or other
indication appropriate to the signal. Generally, since the
recognition of signals at the sender often depends on their on/off
pattern or the sequence of several tones, this recognition can
take several seconds. On the other hand, the gateway may have
access to the actual signaling information that generates the
tones and thus can generate the RTP packet immediately, without
the detour through acoustic signals.
The use of named events is the only feasible method for
transmitting out-of-band PSTN signals as content within RTP
sessions.
2. RTP Payload Format for Named Telephone Events
2.1 Introduction
The RTP payload format for named telephone events is designated as
"telephone-event", the MIME type as "audio/telephone-event". In
accordance with current practice, this payload format does not have a
static payload type number, but uses an RTP payload type number
established dynamically and out-of-band. The default clock frequency
is 8000 Hz, but the clock frequency can be redefined when assigning
the dynamic payload type.
Named telephone events are carried as part of the audio stream, and
MUST use the same sequence number and time-stamp base as the regular
audio channel to simplify the generation of audio waveforms at a
gateway. The named telephone events payload type can be considered to
be a very highly-compressed audio codec, and is treated the same as
other codecs.
2.2 Use of RTP Header Fields
2.2.1 Timestamp
The RTP timestamp reflects the measurement point for the current
packet. The event duration described in section 2.3.5 extends forwards
from that time. For events that span multiple RTP packets, the RTP
timestamp identifies the beginning of the event, i.e., several RTP
packets may carry the same timestamp. For long-lasting events that
have to be split into subevents (see below, section 2.5.1.3), the
timestamp indicates the beginning of the subevent.
2.2.2 Marker Bit
The RTP marker bit indicates the beginning of a new event. For long-
lasting events that have to be split into subevents (see below,
section 2.5.1.3), only the first subevent will have the marker bit
set.
2.3 Payload Format
The payload format for named telephone events is shown in Figure 1.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| event |E|R| volume | duration |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: Payload Format for Named Events
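As an informal illustration, which is not part of this specification,
the layout in Figure 1 could be encoded and decoded along the lines
of the following Python sketch. The function names pack_event and
unpack_event are hypothetical. The sketch assumes the network byte
order used throughout RTP and the field widths shown in the figure
(8-bit event, 1-bit E, 1-bit R, 6-bit volume, 16-bit duration).

```python
import struct


def pack_event(event, end, volume, duration):
    """Build the 4-octet named-event payload laid out in Figure 1."""
    if not 0 <= event <= 255:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second octet: E bit (0x80), R bit (0x40, always sent as zero), volume.
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", event, flags, duration)


def unpack_event(payload):
    """Return (event, end, volume, duration); the R bit is ignored."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return event, bool(flags & 0x80), flags & 0x3F, duration
```

For instance, pack_event(1, True, 10, 800) describes the end of event
1 at -10 dBm0 after 800 timestamp units.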
2.3.1 Event Field
The event field is a number between 0 and 255 identifying a specific
telephony event. An IANA registry of codepoints for this field has
been established (see IANA Considerations, section 8). The initial
content of this registry consists of the events defined in section 3.
2.3.2 E ("End") Bit
If set to a value of one, the "end" bit indicates that this packet
contains the end of the event. For long-lasting events that have to
be split into subevents (see below, section 2.5.1.3), only the final
packet for the final subevent will have the "E" bit set.
2.3.3 R Bit
This field is reserved for future use. The sender MUST set it to
zero, the receiver MUST ignore it.
2.3.4 Volume Field
For DTMF digits and other events representable as tones, this field
describes the power level of the tone, expressed in dBm0 after
dropping the sign. Power levels range from 0 to -63 dBm0. Thus,
larger values denote lower volume. This value is defined only for
events for which the documentation indicates that volume is
applicable. For other events, the sender MUST set volume to zero and
the receiver MUST ignore the value.
2.3.5 Duration Field
The duration field indicates the duration of the event or subevent
being reported, in timestamp units, expressed as an unsigned integer.
For a non-zero value, the event or subevent began at the instant
identified by the RTP timestamp and has so far lasted as long as
indicated by this parameter. The event may or may not have ended. If
the event duration exceeds the maximum representable by the duration
field, the event is split into several contiguous subevents as
described below (section 2.5.1.3).
The special duration value of zero is reserved to indicate that the
event lasts "forever", i.e., is a state and is considered to be
effective until updated. A sender MUST NOT transmit a zero duration
for events other than those defined as states. The receiver SHOULD
ignore an event report with zero duration if the event is not a
state.
Events defined as states MAY contain a non-zero duration, indicating
that the sender intends to refresh the state before the time duration
has elapsed ("soft state").
For a sampling rate of 8000 Hz, the duration field is sufficient
to express event durations of up to approximately 8 seconds.
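The arithmetic behind this limit can be checked with a short sketch.
The helper names are illustrative only; the sketch assumes the 16-bit
duration field of Figure 1, whose largest value is 0xFFFF.

```python
MAX_DURATION = 0xFFFF  # largest value of the 16-bit duration field


def max_event_seconds(rate_hz=8000):
    """Longest duration, in seconds, that one event report can express."""
    return MAX_DURATION / rate_hz


def ms_to_units(milliseconds, rate_hz=8000):
    """Convert milliseconds to timestamp units at the given clock rate."""
    return milliseconds * rate_hz // 1000
```

At 8000 Hz, max_event_seconds() evaluates to 65535 / 8000, i.e., a
little under 8.2 seconds.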
2.4 Optional MIME Parameters
As indicated in the MIME registration for named events in section
6.1, the telephone-event MIME type supports two optional parameters:
the "events" parameter, and the "rate" parameter.
The "events" parameter lists the events supported by the
implementation. Events are listed as one or more comma-separated
elements. Each element can either be a single integer or an integer
followed by a hyphen and a larger integer, representing a range of
consecutive event codepoints. No white space is allowed in the
argument. The integers designate the event numbers supported by the
implementation.
The "rate" parameter describes the sampling rate, in Hertz, and hence
the units for the RTP timestamp and event duration fields. The number
is written as a floating point number or as an integer. If omitted,
the default value is 8000 Hz.
2.4.1 Relationship to SDP
The recommended mapping of MIME optional parameters to SDP is given
in section 3 of RFC 3555 [N-6]. The "rate" MIME parameter for the
named event payload type follows this convention: it is expressed as
usual as the <clock rate> component of the a=rtpmap: attribute line.
The "events" MIME parameter deviates from the convention suggested in
RFC 3555 because it omits the string "events=" before the list of
supported events.
a=fmtp:<format> <list of values>
The list of values has the format described above for the MIME
parameter. The list does not have to be sorted.
For example, if the payload format uses the payload type number 100,
and the implementation can handle the DTMF tones (events 0 through
15) and the dial and ringing tones, it would include the following
description in its SDP message:
m=audio 12345 RTP/AVP 100
a=rtpmap:100 telephone-event/8000
a=fmtp:100 0-15,66,70
The following sample media type definition corresponds to the SDP
example above:
audio/telephone-event;events="0-15,66,70";rate="8000"
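The list syntax used in the a=fmtp line can be parsed along the
following lines. This is a minimal sketch rather than a complete SDP
parser; the function name parse_events is hypothetical, and rejecting
codepoints above 255 is an assumption based on the 8-bit event field.

```python
def parse_events(value):
    """Parse an event list such as "0-15,66,70" into a set of codepoints."""
    if any(ch.isspace() for ch in value):
        raise ValueError("no white space is allowed in the list")
    events = set()
    for element in value.split(","):
        if "-" in element:
            low, high = element.split("-")
            low, high = int(low), int(high)
            if high <= low:
                raise ValueError("a range must end in a larger integer")
            events.update(range(low, high + 1))
        else:
            events.add(int(element))
    if any(not 0 <= e <= 255 for e in events):
        raise ValueError("event codepoints must lie between 0 and 255")
    return events
```

Because the result is a set, the order of the elements in the list
does not matter.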
2.5 Procedures
This section defines the procedures associated with the named event
payload type. Additional procedures may be specified in the
documentation associated with specific event codepoints.
2.5.1 Sending Procedures
2.5.1.1 Negotiation of Payloads
Negotiation of payloads between sender and receiver is achieved by
out-of-band means, using SDP, for example.
The sender SHOULD indicate what events it supports, using the
optional "events" parameter associated with the telephone-event MIME
type. If the sender receives an "events" parameter from the
receiver, it MUST restrict the set of events it sends to those listed
in the received "events" parameter. For backward compatibility, if
no "events" parameter is received, the sender SHOULD assume support
for the DTMF events 0-15 but for no other events.
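The restriction described above might be applied as in the following
sketch. The names DTMF_EVENTS and sendable_events are hypothetical,
and the event sets are assumed to have been parsed already.

```python
DTMF_EVENTS = frozenset(range(16))  # events 0-15


def sendable_events(supported, received=None):
    """Return the events a sender may transmit to a given receiver.

    supported: events the sender itself can generate.
    received:  events listed in the receiver's "events" parameter, or
               None if the receiver supplied no such parameter.
    """
    permitted = DTMF_EVENTS if received is None else received
    return set(supported) & set(permitted)
```

With no "events" parameter received, a sender supporting events 0, 1
and 66 would thus limit itself to events 0 and 1.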
2.5.1.2 Transmission of Event Packets
DTMF digits and named telephone events are carried as part of the
audio stream, and MUST use the same sequence number and time-stamp
base as the regular audio channel to simplify the generation of audio
waveforms at a gateway.
An audio source SHOULD start transmitting event packets as soon as it
recognizes an event, and continue to send updates until the event has
ended. The update packet MUST have the same RTP timestamp value as
the initial packet for the event, but the duration MUST be increased
to reflect the total cumulative duration since the beginning of the
event.
The first packet for an event MUST have the "M" bit set. The final
packet for an event MUST have the "E" bit set, but setting of the "E"
bit MAY be deferred until the final packet is retransmitted (see
section 2.5.1.4). Intermediate packets for an event MUST NOT have
either the "M" bit or the "E" bit set.
Sending of a packet with the "E" bit set is OPTIONAL if the packet
reports two events which are defined as mutually exclusive states, or
if the final packet for one state is immediately followed by a packet
reporting a mutually exclusive state. (For events defined as states,
the appearance of a mutually exclusive state implies the end of the
previous state.) [Is this exception really worth the bother?]
A source has wide latitude as to how often it sends event updates. A
natural interval is the spacing between non-event audio packets.
(Recall that a single RTP packet can contain multiple audio frames
for frame-based codecs and that the packet interval can vary during a
session.) Alternatively, a source MAY decide to use a different
spacing for event updates, called an event period, with a value of
50 ms RECOMMENDED.
DTMF digits and events are sent incrementally to avoid having the
receiver wait for the completion of the event. Since some tones are
two seconds long, this would incur a substantial delay. The
transmitter does not know if event length is important and thus needs
to transmit immediately and incrementally. If the receiver
application does not care about event length, the incremental
transmission mechanism avoids delay. Some applications, such as
gateways into the PSTN, care about both delays and event duration.
For robustness, the sender SHOULD retransmit "state" events
periodically.
Timing information is contained in the RTP timestamp, allowing
precise recovery of inter-event times. Thus, the sender does not
need to maintain precise or consistent time intervals between event
packets.
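The incremental reporting described in this section can be sketched
as follows. For simplicity the sketch assumes that the total duration
is known in advance and fits in a single duration field, whereas a
real sender learns the duration only as the event proceeds. The name
event_reports is hypothetical, and the default period of 400
timestamp units corresponds to 50 ms at 8000 Hz.

```python
def event_reports(start_ts, total_units, period_units=400):
    """Yield (timestamp, marker, end, duration) for each event report."""
    if not 0 < total_units <= 0xFFFF:
        raise ValueError("sketch covers a single event or subevent only")
    if period_units <= 0:
        raise ValueError("the event period must be positive")
    duration = 0
    while duration < total_units:
        first = duration == 0
        # Duration is cumulative; the timestamp never changes.
        duration = min(duration + period_units, total_units)
        yield (start_ts, first, duration == total_units, duration)
```

Every report carries the timestamp of the start of the event; only
the first has the "M" bit set and only the last has the "E" bit set.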
2.5.1.3 Long Duration Events
If an event persists beyond the maximum duration expressible in the
duration field (0xFFFF), the sender MUST send a packet reporting this
maximum duration but MUST NOT set the "E" bit in this packet. The
sender MUST then begin reporting a new "subevent" with the RTP
timestamp set to the time at which the previous subevent ended and
the duration set to the cumulative duration of the new subevent. The
"M" bit of the first packet reporting the new subevent MUST NOT be
set. The sender MUST repeat this procedure as required until the end
of the complete event has been reached. The final packet for the
complete event MUST have the "E" bit set (either on initial
transmission or on retransmission as described below).
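A sketch of the splitting of a long event into subevents follows. The
function name split_into_subevents is hypothetical; the sketch
assumes a 32-bit RTP timestamp that wraps around, and reports only
the final duration of each subevent together with the "M" and "E"
settings that apply to its first and last packets respectively.

```python
MAX_DURATION = 0xFFFF  # maximum value of the duration field


def split_into_subevents(start_ts, total_units):
    """Return (timestamp, duration, marker, end) for each subevent."""
    if total_units <= 0:
        raise ValueError("total duration must be positive")
    spans = []
    ts, remaining = start_ts, total_units
    while remaining > MAX_DURATION:
        spans.append((ts, MAX_DURATION))
        # The next subevent starts where the previous one ended.
        ts = (ts + MAX_DURATION) % 2**32
        remaining -= MAX_DURATION
    spans.append((ts, remaining))
    last = len(spans) - 1
    # "M" only on the first subevent, "E" only on the final one.
    return [(t, d, i == 0, i == last) for i, (t, d) in enumerate(spans)]
```

An event lasting 100000 timestamp units is thus reported as one
subevent of 65535 units followed by one of 34465 units.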
2.5.1.4 Retransmission of Final Packet
The final packet for each event and for each subevent SHOULD be sent
a total of three times at the interval used by the source for
updates. (If a new event is recognized during the retransmissions and
RFC 2198 [N-2] is in use, the old event will be part of the
redundancy in the RFC 2198 payloads.) This ensures that the duration
of the event or subevent can be recognized correctly even if an
instance of the last packet is lost.
A sender MAY delay setting the "E" bit until retransmitting the last
packet for a tone, rather than setting the bit on its first
transmission. This avoids having to wait to detect whether the tone
has indeed ended. Once the sender has set the "E" bit for a packet,
it MUST continue to set the "E" bit for any further retransmissions
of that packet.
2.5.1.5 Packing Multiple Events Into One Packet
Multiple named events can be packed into a single RTP packet if and
only if the events are consecutive and contiguous, i.e., occur
without overlap and without pause between them, and if the last event
packed into a packet occurs quickly enough to avoid excessive delays
at the receiver.
This approach is similar to having multiple frames of frame-based
audio in one RTP packet.
The constraint that packed events not overlap implies that events
designated as states can be followed in a packet only by other state
events which are mutually exclusive to them. The constraint itself
is needed so that the beginning time of each event can be calculated
at the receiver.
In a packet containing events packed in this way, the RTP timestamp
MUST identify the beginning of the first event or subevent in the
packet. The "M" bit MUST be set (since the packet records the
beginning of at least one event). The "E" bit and duration for each
event in the packet MUST be set using the same rules as if that event
were the only event contained in the packet.
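Because packed events are contiguous, the start of each event follows
from the packet timestamp and the durations of the events before it.
A minimal sketch of that calculation follows; the function name
event_start_times is hypothetical, and a 32-bit wrapping RTP
timestamp is assumed.

```python
def event_start_times(rtp_timestamp, durations):
    """Return the start timestamp of each event packed into one packet.

    durations lists the duration field of each event, in packet order.
    """
    starts = []
    ts = rtp_timestamp
    for duration in durations:
        starts.append(ts)
        # The next event begins exactly where this one ends.
        ts = (ts + duration) % 2**32
    return starts
```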
For events with a duration shorter than a typical packet interval,
for example, V.21 bits (section 3.2.2), it is RECOMMENDED that
multiple events be represented by a single RFC 2198 [N-2] packet, as
described in section 5.
2.5.1.6 RTP Sequence Number
The RTP sequence number MUST be incremented by one in each successive
RTP packet sent. Incrementing applies to retransmitted as well as
initial instances of event reports, to permit the receiver to detect
lost packets for RTCP receiver reports.
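The interaction between this rule and the retransmission of the final
packet (section 2.5.1.4) can be sketched as follows. The function
name final_packet_copies is hypothetical, and the 16-bit wrapping
sequence number is assumed from the RTP header definition.

```python
def final_packet_copies(next_seq, timestamp, duration,
                        defer_end=False, copies=3):
    """Return (sequence, timestamp, end, duration) for each copy."""
    packets = []
    for i in range(copies):
        # With a deferred "E" bit, only the first copy goes out without it.
        end = not (defer_end and i == 0)
        # Each copy consumes a fresh sequence number.
        packets.append(((next_seq + i) % 65536, timestamp, end, duration))
    return packets
```

The copies differ in sequence number, and possibly in the "E" bit of
the first copy, but share the same timestamp and duration.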
2.5.2 Receiving Procedures
2.5.2.1 Indication of Receiver Capabilities using SDP
Receivers can indicate which named events they can handle, for
example, by using the Session Description Protocol (RFC 2327 [N-3]).
SDP descriptions using the event payload MUST contain an fmtp format
attribute that lists the event values that the receiver can process.
2.5.2.2 Playout of Tone Events
In the gateway scenario, an Internet telephony gateway connecting a
packet voice network to the PSTN recreates the DTMF or other tones
and injects them into the PSTN. Since, for example, DTMF digit
recognition takes several tens of milliseconds, the first few
milliseconds of a digit will arrive as regular audio packets. Thus,
careful time and power (volume) alignment between the audio samples
and the events is needed to avoid generating spurious digits at the
receiver. Playout when audio packets continue to arrive as the event
proceeds is discussed further in section 5.2 below.
Receiver implementations MAY use different algorithms to create
tones, including the two described here. Note that not all
implementations have the need to recreate a tone; some may only care
about recognizing the events.
In the first algorithm, the receiver simply places a tone of the
given duration in the audio playout buffer at the location indicated
by the timestamp. As additional packets are received that extend the
same tone, the waveform in the playout buffer is extended
accordingly. (Care has to be taken if audio is mixed, i.e., summed,
in the playout buffer rather than simply copied.) Thus, if a packet
in a tone lasting longer than the packet interarrival time gets lost
and the playout delay is short, a gap in the tone may occur.
Alternatively, the receiver can start a tone and play it until one of
the following occurs: it receives a packet with the "E" bit set; it
receives the next tone, distinguished by a different timestamp value;
or a given time period elapses. This is more robust against packet
loss, but may extend the tone beyond its original duration if all
retransmissions of the last packet in an event are lost. Limiting the
time period of extending the tone is necessary to prevent a tone from
"getting stuck". This algorithm is not a
license for senders to set the duration field to zero; it MUST be set
to the current duration as described, since this is needed to create
accurate events if the first event packet is lost, among other
reasons.
Regardless of the algorithm used, the tone SHOULD NOT be extended by
more than three packet interarrival times. A slight extension of tone
durations and shortening of pauses is generally harmless.
If a receiver has extended a tone by the maximum extension duration
and started playing silence, it MUST NOT resume playing the tone when
later packets for that event arrive, as this would cause spurious
events to be detected downstream.
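The second playout algorithm and the extension limit described above can be sketched as follows. This is only an illustration under stated assumptions: the class name, field names, and the choice to express all times in RTP timestamp units are hypothetical, and timestamp wraparound is ignored for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# "SHOULD NOT be extended by more than three packet interarrival times"
MAX_EXTENSION_PACKETS = 3


@dataclass
class TonePlayout:
    """Hypothetical receiver state for one tone (times in RTP timestamp units)."""
    packet_interval: int           # expected spacing between event packets
    event: Optional[int] = None    # event code currently tracked, if any
    start_ts: int = 0              # RTP timestamp of the tracked event
    reported_end: int = 0          # start_ts + latest reported duration
    ended: bool = False            # True once "E" was seen or the limit was hit

    def on_packet(self, event: int, timestamp: int, duration: int,
                  e_bit: bool) -> None:
        same_event = (self.event == event and self.start_ts == timestamp)
        if same_event:
            if self.ended:
                # The tone already ended or lapsed: it must not be resumed.
                return
            self.reported_end = max(self.reported_end, timestamp + duration)
        else:
            # A different timestamp value marks the next tone.
            self.event = event
            self.start_ts = timestamp
            self.reported_end = timestamp + duration
            self.ended = False
        if e_bit:
            self.ended = True

    def is_playing(self, now: int) -> bool:
        if self.event is None or now < self.start_ts:
            return False
        if self.ended:
            return now < self.reported_end
        limit = self.reported_end + MAX_EXTENSION_PACKETS * self.packet_interval
        if now >= limit:
            # Give up extending; later packets for this event are ignored.
            self.ended = True
            return False
        return True
```

With a 50 ms packet interval at 8000 Hz (400 timestamp units), a tone whose last received report ends at timestamp 1400 would be played no later than timestamp 2600 in this sketch.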
If a receiver receives an event packet for an event which it is not
currently playing out and the packet does not have the "M" bit set,
earlier packets for that event have evidently been lost. The
receiver MAY determine on the basis of retained history and the
timestamp and event code of the current packet that it corresponds to
an event already played out and lapsed. In that case further reports
for the event MUST be ignored, as indicated in the previous
paragraph. If this is not so, the receiver MAY attempt to play the
event out to the complete duration indicated in the event report.
The appropriate behaviour will depend on the event type concerned,
and requires consideration of the relationship of the event to audio
media flows and whether correct event duration is essential to the
correct operation of the media session.
A receiver SHOULD NOT rely on a particular event packet spacing, but
instead MUST use the event timestamps and durations to determine
timing and duration of playout.
The receiver MUST calculate jitter for RTCP receiver reports based on
all packets with a given timestamp. Note: The jitter value should
primarily be used as a means for comparing the reception quality
between two users or two time-periods, not as an absolute measure.
If a zero volume is indicated for an event for which the volume field
is defined, then the receiver MAY reconstruct the volume from the
volume of non-event audio or MAY use the nominal value specified by
the ITU Recommendation or other document defining the tone. This
ensures backwards compatibility with RFC 2833, where the volume field
was defined only for DTMF events.
2.5.2.3 Long Duration Events
If an event report is received with duration equal to the maximum
duration expressible in the duration field (0xFFFF) and the "E" bit
for the report is not set, the event report may mark the end of a
subevent generated according to the procedures of section 2.5.1.3.
If another report for the same event type is received, the receiver
MUST compare the RTP timestamp for the new event with the sum of the
RTP timestamp of the previous report plus the duration (0xFFFF). The
receiver uses the absence of a gap between the events to detect that
it is receiving a single long-duration event.
The total duration of a long duration event is the sum of the
durations of the subevents used to report it. This is equal to
the duration of the final subevent (as indicated in the final packet
for that subevent), plus 0xFFFF multiplied by the number of subevents
preceding the final subevent.
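The continuity check and the duration sum described in this section can be illustrated with a short sketch. The function names are hypothetical; the only assumption beyond the text is that RTP timestamps are 32-bit values that wrap around, so the addition is taken modulo 2^32.

```python
MAX_DURATION = 0xFFFF          # largest value of the 16-bit duration field
TS_MASK = 0xFFFFFFFF           # RTP timestamps are 32 bits wide


def is_continuation(prev_timestamp: int, prev_duration: int,
                    prev_e_bit: bool, new_timestamp: int) -> bool:
    """True if the new report continues a long event with no gap."""
    if prev_duration != MAX_DURATION or prev_e_bit:
        return False
    return new_timestamp == (prev_timestamp + MAX_DURATION) & TS_MASK


def total_duration(subevents_before_final: int, final_duration: int) -> int:
    """Total duration of a long event, in timestamp units."""
    return subevents_before_final * MAX_DURATION + final_duration
```

For example, two full subevents followed by a final subevent of 8000 units give a total of 2 x 65535 + 8000 = 139070 timestamp units.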
2.5.2.4 Multiple Events In a Packet
The procedures of section 2.5.1.5 require that if multiple events are
reported in the same packet, they are contiguous and non-overlapping.
As a result, it is not strictly necessary for the receiver to know
the start times of the events following the first one in order to
play them out -- it needs only to respect the duration reported for
each event. Nevertheless, if knowledge of the start time for a given
event after the first one is required, it is equal to the sum of the
start time of the preceding event plus the duration of the preceding
event.
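The start-time rule above amounts to a running sum over the reported durations. The following sketch uses a hypothetical helper name and assumes 32-bit RTP timestamp arithmetic; it is not part of the payload format itself.

```python
from typing import List

TS_MASK = 0xFFFFFFFF  # RTP timestamps wrap at 32 bits


def event_start_times(first_start: int, durations: List[int]) -> List[int]:
    """Start time of each event in a packet of contiguous events.

    first_start is the RTP timestamp of the first event; durations lists
    the duration reported for each event, in packet order.
    """
    starts: List[int] = []
    current = first_start
    for duration in durations:
        starts.append(current)
        # The next event begins where the preceding one ends.
        current = (current + duration) & TS_MASK
    return starts
```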
2.5.2.5 Soft States
If the duration of a soft state event expires, the receiver SHOULD
consider the value of the state to be "unknown" unless otherwise
indicated in the event documentation (e.g., in section 3).
2.6 Reliability
The named event mechanism uses three complementary redundancy
mechanisms to deal with lost packets:
Intra-event updates:
Events that last longer than one event period (e.g., 50 ms) are
updated periodically, so that the receiver can reconstruct the
event and its duration if it receives any of the update packets,
albeit with delay. This mechanism is described in section 2.6.1
and is most helpful for longer events.
Repeat last event packet:
As described in section 2.5.1.4, the last event packet is
transmitted a total of three times if there is no subsequent
event. This mechanism is applicable for widely-spaced events.
Multi-event redundancy:
Section 2.6.2 describes how a summary of earlier events MAY be
carried in RFC 2198 redundancy payloads. This is particularly
useful for sequences of short events, e.g., digits dialed by a
modem or autodialer or in-band tone signaling sequences (section
3.2 or 3.5).
2.6.1 Intra-Event Updates
During an event, the RTP event payload format provides incremental
updates on the event. The error resiliency afforded by this mechanism
depends on whether the first or second algorithm in section 2.5.2.2
is used and on the playout delay at the receiver. For example, if the
receiver uses the first algorithm and only places the current
duration of tone signal in the playout buffer, for a playout delay of
120 ms and a packet gap of 50 ms, two packets in a row can get lost
without causing a premature end of the tone generated.
2.6.2 Multi-Event Redundancy
The audio redundancy mechanism described in RFC 2198 [N-2] MAY be
used to recover from packet loss across events. For the suggested
packet gap of 50 ms, the effective data rate is r times 64 bits (32
bits for the redundancy header and 32 bits for the telephone-event
payload) plus 8 bits for the primary encoding every 50 ms or (r times
1280 + 160) bits/second, where r is the number of redundant events
carried in each packet. The value of r is an implementation trade-
off, with a value of 5 suggested.
The timestamp offset in this redundancy scheme has 14 bits, so that
it allows a single packet to "cover" 2.048 seconds of telephone
events at a sampling rate of 8000 Hz. Including the starting time of
previous events allows precise reconstruction of the tone sequence at
a gateway. The scheme is resilient to consecutive packet losses
spanning this interval of 2.048 seconds or r digits, whichever is
less. Note that for previous digits, only an average loudness can be
represented.
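The data rate and coverage figures quoted in this section can be reproduced with a few lines of arithmetic. The function names and default arguments below are illustrative assumptions; the constants come from the text (64 bits per redundant event, 8 bits for the primary encoding, a 14-bit timestamp offset).

```python
def redundancy_bit_rate(r: int, packet_interval_ms: int = 50) -> float:
    """Effective data rate in bits/second for r redundant events per packet."""
    bits_per_packet = r * 64 + 8   # 32-bit header + 32-bit payload each, plus primary
    packets_per_second = 1000 / packet_interval_ms
    return bits_per_packet * packets_per_second


def offset_coverage_seconds(offset_bits: int = 14,
                            clock_rate_hz: int = 8000) -> float:
    """Time span addressable by the redundancy timestamp offset."""
    return (1 << offset_bits) / clock_rate_hz
```

With the suggested value r = 5 and a 50 ms packet gap, the first function gives 5 x 1280 + 160 = 6560 bits/second; the second gives 16384 / 8000 = 2.048 seconds.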
An encoder MAY treat the event payload as a highly-compressed version
of the current audio frame. In that mode, each RTP packet during an
event would contain the current audio codec rendition (say, G.723.1
[I-4] or G.729 [I-5]) of this digit as well as the representation
described in section 2, plus any previous events seen earlier.
This approach allows dumb gateways that do not understand this
format to function. See also the discussion in section 1.
The payload format described here achieves a higher redundancy even
in the case of sustained packet loss than the method proposed for the
Voice over Frame Relay Implementation Agreement [I-20]. In short,
senders generate updates at regular intervals, thus ensuring that
each event is transmitted multiple times. RFC 2198 [N-2] is used to
recover events where all packets sent during the event have been
lost.
3. Specification of Codepoints For Telephone Events
This document defines five classes of named events:
1) DTMF tones (section 3.1);
2) data and fax-related tones (section 3.2);
3) standard subscriber line tones and events (section 3.3);
4) additional subscriber line tones and events (section 3.4); and
5) trunk signalling events (section 3.5).
The tables listing the event codepoints for each class indicate
whether the respective events are states, tones, or other. For tone
events, the tables indicate whether the volume field is applicable or
must be set to 0. Notes to the tables indicate which states are
mutually exclusive.
3.1 DTMF Events
DTMF signalling [N-13] is typically generated by a telephone set or
possibly by a PBX. DTMF digits may be consumed by entities such as
gateways or application servers in the IP network, or by entities
such as telephone switches or IVRs in the circuit switched network.
The DTMF events support two possible applications at the sending end,
and two at the receiving end. In the first application at the
sending end, the Internet telephony gateway detects DTMF on the
incoming circuits and sends the RTP payload described here instead of
regular audio packets. The gateway likely has the necessary digital
signal processors and algorithms, as it often needs to detect DTMF,
e.g., for two-stage dialing. Having the gateway detect tones relieves
the receiving Internet end system from having to do this work and
also avoids having low bit-rate codecs like G.723.1 [I-4] render DTMF
tones unintelligible. In the second application, an Internet end
system such as an "Internet phone" can emulate DTMF functionality
without concerning itself with generating precise tone pairs and
without imposing the burden of tone recognition on the receiver.
A similar distinction occurs at the receiving end. In the gateway
scenario, an Internet telephony gateway connecting a packet voice
network to the PSTN recreates the DTMF tones or other telephony
events and injects them into the PSTN. In the end system scenario,
the DTMF events are consumed by the receiving entity itself.
Table 1 shows the DTMF-related named event codepoints within the
telephone-event payload format. The DTMF digits 0-9 and * and # are
commonly supported. DTMF digits A through D are less frequently
encountered, typically in special applications such as military
networks.
ITU-T Recommendation Q.24 [N-14], Table A-1, indicates that the
legacy switching equipment in the countries surveyed expects a
minimum recognizable signal duration of 40 ms, a minimum pause
between signals of 40 ms, and a maximum signalling rate of 8 to 10
digits per second depending on the country.
Event    Encoding     Type    Volume?
         (decimal)
_____________________________________
0--9     0--9         tone    yes
*        10           tone    yes
#        11           tone    yes
A--D     12--15       tone    yes
Table 1: DTMF named events
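The codepoints in Table 1 can be expressed as a simple lookup. The sketch below is illustrative only: the names are hypothetical, and it assumes that A through D map to 12 through 15 in order, which is the natural reading of the table.

```python
from typing import Dict, List

# Event codes from Table 1: digits 0-9 map to themselves.
DTMF_EVENT_CODES: Dict[str, int] = {str(d): d for d in range(10)}
DTMF_EVENT_CODES.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})


def dtmf_to_events(digits: str) -> List[int]:
    """Translate a dialed string into telephone-event codes.

    Raises KeyError for characters that are not DTMF digits.
    """
    return [DTMF_EVENT_CODES[ch] for ch in digits.upper()]
```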
3.2 Data Modem and Fax Events
This section summarizes the control events and tones that can appear
on a subscriber line serving a fax machine or modem. Their purpose
is to support negotiation, start-up and takedown of FAX and modem
sessions and transitions between operating modes. The actual FAX and
modem content are carried by other payload types (e.g., G.711 [I-3],
T.38 [I-8], or, in specific circumstances, V.150.1 [I-19] modem
relay, RFC 2793 [I-1], or CLEARMODE [I-2]). The events are organized
into several groups, corresponding to the ITU-T Recommendation in
which they are defined.
NOTE: implementors SHOULD NOT rely on the descriptions of the various
modem protocols described below without consulting the original
references (generally ITU-T Recommendations). The descriptions are
provided in this document to give a context for the use of the events
defined here. They frequently omit important details needed for
implementation.
The typical application of these events is to allow the Internet to
serve as a bridge between terminals operating on the PSTN. This
application is characterized as follows:
- each gateway will act both as sender and as receiver;
- time constraints apply to the exchange of signals, making the
early identification and reporting of events desirable so that
receiver playout can proceed in timely fashion;
- transfer of the events must be reliable.
In some cases, an implementation may simply ignore certain events,
such as fax tones, that do not make sense in a particular
environment. Section 2.4.1 specifies how an implementation can use
the SDP "fmtp" parameter within an SDP description to indicate its
inability to understand a particular event or range of events.
Regardless of which events they support, implementations MUST be
prepared to send and receive data signals using payload types other
than telephone-event, simultaneously with the use of the latter.
This is discussed further in section 5.3.
A further word on time constraints is in order. Time constraints
governing the duration of tones do not pose a problem when using the
telephone-events payload type: the payload specifies the duration and
the receiving gateway can play out the tones accordingly. Problems
come when time constraints are specified for the duration of silence
between tones. A silent period of "at least x ms" is not a problem --
event notifications can be received late, but they can still be
played out at their specified durations.
The problem comes with requirements of silence for "exactly" some
period or for "at most" some period. The most general constraint of
the latter type has to do with the operation of echo suppressors
(ITU-T Rec. G.164 [N-10]) and echo cancellers (ITU-T Rec. G.165
[N-11]). These devices may re-activate after as little as 100 ms of
no signal on the line. As a result, in any situation where echo
suppressors or cancellers must be disabled for signalling to work,
tone events must be reported quickly enough to ensure that these
devices do not become re-enabled. This principle is reflected in the
succeeding sections.
3.2.1 V.8bis Events
Recommendation V.8bis [N-21] is a general procedure for two endpoints
to establish each other's capabilities and to transition between
different operating modes, both at call startup and after the call
has been established. It supports many of the same terminals as V.8
[N-20] (see below), but allows more detailed parameter negotiation.
It lacks support for some of the older V-series modems defined in
V.8, but adds capabilities for simultaneous voice and data, H.324
[I-6] multilink, and T.120 [I-10] conferencing. The ability to change
operating modes in mid-call (e.g., to provide alternating voice and
data) is unavailable in V.8.
V.8bis distinguishes between signals and messages. The V.8bis
signals (ESi/ESr, MRe/MRd, and CRe/CRd) consist of tones, as
described in the next paragraph. The V.8bis messages (MS, CL, CLR,
ACK(1), ACK(2), NAK(1), NAK(2), NAK(3), and NAK(4)) consist of
sequences of bits transported over V.21 [N-23] modulation.
Signals are intended to be comprehensible at the receiver even in the
presence of voice content. They consist of two tone segments. The
first segment consists of a dual frequency tone held for 400 ms, and
has the function of preparing the receiver and any in-line echo
suppressor or canceller for what follows. The specific frequencies
depend only on whether the signal is from the initiator or the
responder in a transaction. The second segment follows immediately
after the first, and is a single tone held for 100 ms. The frequency
used indicates the specific signal of the six signals defined. The
complete V.8bis strategy for dealing with echo suppressors or
cancellers is described in Rec. V.8bis Appendix III. The only silent
period constraints imposed are of the "at least" type, posing no
difficulties for the use of the telephone-events payload.
V.8bis messages can be transmitted only when voice content is absent.
The V.8bis protocol uses signals to ensure that the connection is
operating in non-voice mode before passing messages. At the physical
level, V.8bis messages use V.21 [N-23] frequency-shift signalling to
transfer message content. V.21 is described in the next section.
V.8bis uses V.21 in half-duplex mode, assigning the lower channel to
the initiator and the upper channel to the responder.
The V.21 signals are preceded by a 100 ms preamble of 1650 Hz tone
(the V.21 upper channel mark tone), which must be omitted if the
preceding signal was ESi or ESr. (The second segment of ESr is also
1650 Hz.) The sender MAY report this preamble tone either as a
single extended V.21 upper channel "1" event, or as a series of "1"
events of normal duration. It is not necessary to provide an event
report before the preamble has completed, since the receiver will
still be playing out the preceding V.8bis signal when this happens
(see below).
The events associated with V.8bis signals are described further
below. No events are defined for V.8bis messages, only for the
individual bits transmitted using V.21, so only a brief description
of the messages follows:
- the V.8bis CL message describes the sending terminal's
capabilities;
- the CLR message also describes capabilities, but indicates that
the sender wants to receive a CL in return;
- the MS message establishes a particular operating mode;
- the ACK and NAK messages are used to terminate the message
transactions.
The V.8bis messages are organized as a sequence of octets. The first
two to five octets are HDLC flags. Then comes a message type
identifier (four bits), a V.8bis version identifier (four bits), zero
to two more octets of identifying information, followed by zero or
more information field parameters in the form of bit maps. An
individual bit map is one to five octets in length. Up to 64 octets
of non-standard information may also be present. The information
fields are followed by a checksum and one to three HDLC flags.
Applications supporting V.8bis signalling using the telephone-events
payload MUST transfer V.8bis messages in the form of sequences of
bits, using the V.21 bit events defined in the next section. The
transmitted information MUST include the complete contents of the
message: the initial HDLC flags, the information field, the checksum,
and the terminating HDLC flags.
Transmission MUST also include the extra "0" bits added according to
the procedures of Rec. V.8bis clause 7.2.8 to prevent false
recognition of HDLC flags at the receiver. Implementors should note
that these extra "0" bits mean that in general V.8bis messages as
transmitted on the wire will not come out to an even multiple of
octets. Sending implementations MAY choose to vary the packetization
interval to include exactly one octet of information plus any extra
"0" bits inserted into that octet; the resulting variation will be
insignificant compared with the amount of buffering caused by the
preceding V.8bis signal (see below).
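The effect of the extra "0" bits on message length can be illustrated with a sketch of zero-bit insertion. This example assumes the conventional HDLC rule (a "0" is inserted after every run of five consecutive "1" bits in the content between flags); implementors should confirm the exact procedure against Rec. V.8bis clause 7.2.8. The function name is hypothetical.

```python
from typing import List


def insert_zero_bits(bits: List[int]) -> List[int]:
    """Insert a 0 after each run of five consecutive 1 bits.

    Assumes the conventional HDLC rule; applies to the content between
    flags, not to the flags themselves.
    """
    out: List[int] = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == 1:
            ones += 1
            if ones == 5:
                out.append(0)   # inserted bit breaks up a would-be flag
                ones = 0
        else:
            ones = 0
    return out
```

An octet of eight "1" bits becomes nine bits on the wire under this rule, which is why the transmitted message is in general not a whole number of octets.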
The power levels of the V.8bis and V.21 signals are subject to
national regulation. Thus it seems suitable to model V.8bis events
as tones for which the volumes SHOULD be specified by the sender. If
the receiver is rendering the V.8bis tones as audio content for
onward transmission, the receiver MAY use the volumes contained in
the event reports, or MAY modify the volumes to match downstream
national requirements.
Table 2 summarizes the V.8bis signal codepoints defined in this
document. The individual signal events are described following the
table. The sender SHALL set the RTP timestamp for these events to
indicate the time at which the beginning of segment 1 was detected.
The sender SHOULD send an interim report for the event as soon as it
has been identified. The end of the event SHALL be indicated when
the end of segment 2 has been detected. Note: since the sender
cannot identify the specific event until segment 2 has been detected,
the receiver will receive the first report of the event more than 400
ms after it has begun. This has the implication that the receiver
MUST be able to buffer more than 400 ms of the V.21 events which
follow (i.e., more than 120 events at the nominal V.21 rate of 300
bits/s).
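The buffering figure in the note above follows directly from the V.21 bit rate. The helper below is a hypothetical illustration of that arithmetic, not a sizing recommendation.

```python
def v21_events_for_delay(delay_ms: int, bit_rate: int = 300) -> int:
    """Number of V.21 bit events generated during delay_ms (rounded up)."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-delay_ms * bit_rate // 1000)
```

A 400 ms delay corresponds to 120 events at 300 bits/s. If the first report were delayed until the end of the 100 ms second segment, 500 ms in total, the same calculation gives 150 events.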
Event    Frequencies (Hz)           Encoding     Type    Volume?
         Segment 1      Segment 2   (decimal)
________________________________________________________________
CRdi     1375 + 2002    1900        41           tone    yes
CRdr     1529 + 2225    1900        42           tone    yes
CRe      1375 + 2002    400         43           tone    yes
ESi      1375 + 2002    980         44           tone    yes
ESr      1529 + 2225    1650        45           tone    yes
MRdi     1375 + 2002    1150        46           tone    yes
MRdr     1529 + 2225    1150        47           tone    yes
MRe      1375 + 2002    650         48           tone    yes
Table 2: Events for V.8bis signals
CRdi:
V.8bis [N-21] Capabilities Request (CRd) signal when used to
initiate a transaction (Rec. V.8bis Table 7, transactions 2,3,8,
and 9). This signal requests that the remote station transition
from telephony mode to an information transfer mode and requests
the transmission of a capabilities list message by the remote
station. CRdi is sent by the calling station at call startup
(transactions 2 and 3), the initiating station subsequent to call
startup (also transactions 2 and 3), or by the answering station
in response to MRd at call startup if the answering station
originally issued an MRe and now wants to know the calling
station's capabilities (transactions 8 and 9).
CRe:
V.8bis [N-21] Capabilities Request (CRe) signal, used specifically
by an automatic answering station to initiate V.8bis signalling
(Rec. V.8bis Table 7, transactions 2, 3, 12, and 13). Like CRdi,
this signal requests that the remote station transition from
telephony mode to an information transfer mode and requests the
transmission of a capabilities list message by the remote station.
CRdr:
V.8bis [N-21] Capabilities Request (CRd) signal when used by the
calling station as a response to MRe or CRe during call startup to
allow the calling station to control the outcome of the message
transaction (Rec. V.8bis Table 7, transactions 10-13). Like CRdi
and CRe, this signal requests that the remote station transition
from telephony mode to an information transfer mode and requests
the transmission of a capabilities list message by the remote
station.
ESi:
V.8bis [N-21] Escape Signal (ESi). This signal requests that the
remote station transition from telephony mode to an information
transfer mode. ESi is used to precede a message which initiates a
V.8bis transaction if the transaction is not initiated by MRx or
CRx (Rec. V.8bis Table 7, transactions 4, 5, and 6). It is
intended to allow the responding station to detect the arrival of
an initiating signal in the presence of local voice or other
audio. PSTN connections with network echo suppressors may be
accommodated by inserting a 1.5 s silent interval between the ESi
signal and the transmission of the MS, CL or CLR message.
ESr:
V.8bis [N-21] Escape Signal (ESr) has the same meaning as ESi, but
is used as a response to MRe or CRe to prepare the way for an MS,
CL, or CLR message (Rec. V.8bis Table 7, transactions 1, 2, and
3). Used in this way, it turns off any announcement being
generated by the automatic answering station during message
transmission.
MRdi:
V.8bis [N-21] Mode Request (MRd) signal when used to initiate a
transaction (Rec. V.8bis Table 7, transaction 1). This signal
requests that the remote station transition from telephony mode to
an information transfer mode and requests the transmission of a
mode select message by the remote station. In particular, signal
MRdi is sent by the initiating station during the course of a
call, or by the calling station at call establishment.
MRe:
V.8bis [N-21] Mode Request (MRe) signal, sent by an automatic
answering station during call setup. Like MRdi, this signal
requests that the remote station transition from telephony mode to
an information transfer mode and requests the transmission of a
mode select message by the remote station.
MRdr:
V.8bis [N-21] Mode Request (MRd) signal when used to respond to an
MRe in order to give the calling station control over the outcome
of the message transaction (Rec. V.8bis Table 7, transactions 7,
8, and 9). It has the same meaning as MRdi and MRe.
3.2.2 V.21 Events
V.21 [N-23] is a modem protocol offering data transmission at a
maximum rate of 300 bits/s. Two channels are defined, supporting
full duplex data transmission if required. One channel uses
frequencies 980 Hz for "1" and 1180 Hz for "0"; the other channel
uses frequencies 1650 Hz for "1" and 1850 Hz for "0". The modem can
operate synchronously or asynchronously.
V.21 is used by other protocols (e.g., V.8bis, V.18, T.30) for
transmission of control data, and is also used in its own right
between text terminals. The telephone-events payload type SHOULD NOT
be used to carry user data as opposed to control data -- other
payload types such as G.711 [I-3], RFC 2793 [I-1], or V.150.1 [I-19]
modem relay are more suitable for that purpose. The V.21 events are
summarized in Table 3.
Sending implementations MUST report a completed event for every bit
transmitted (i.e., rather than at transitions between "0" and "1").
Implementations SHOULD pack multiple events into one packet, using
the procedures of section 2.5.1.5. Eight to ten bits is a reasonable
packetization interval.
Reliable transmission of V.21 events is important, to prevent data
corruption. Reporting an event per bit rather than per transition
increases reporting redundancy and thus reporting reliability, since
each event completion is retransmitted three times as described in
section 2.5.1.4. To reduce the number of packets required for
reporting, implementations SHOULD carry the retransmitted events
using RFC 2198 [N-2] redundancy encoding.
Event                      Frequency    Encoding     Type    Volume?
                           (Hz)         (decimal)
____________________________________________________________________
V.21 channel 1, "0" bit    1180         37           tone    yes
V.21 channel 1, "1" bit    980          38           tone    yes
V.21 channel 2, "0" bit    1850         39           tone    yes
V.21 channel 2, "1" bit    1650         40           tone    yes
Table 3: Events for V.21 signals
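The per-bit reporting and packing described in this section can be sketched as follows. The function name and the grouping into lists are illustrative assumptions; only the event codes are taken from Table 3. Timestamps and durations are deliberately left out of the sketch.

```python
from typing import Dict, List, Tuple

# (channel, bit) -> event code, from Table 3
V21_EVENT: Dict[Tuple[int, int], int] = {
    (1, 0): 37, (1, 1): 38,
    (2, 0): 39, (2, 1): 40,
}


def v21_bits_to_packets(channel: int, bits: List[int],
                        events_per_packet: int = 10) -> List[List[int]]:
    """Map each bit to one event code and group the codes into packets.

    One event is produced per bit, not per transition between bit values.
    """
    codes = [V21_EVENT[(channel, bit)] for bit in bits]
    return [codes[i:i + events_per_packet]
            for i in range(0, len(codes), events_per_packet)]
```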
3.2.3 V.8 Events
V.8 [N-20] is an older general negotiation and control protocol,
supporting startup for the following terminals: H.324 [I-6]
multimedia, V.18 [N-22] text, T.101 [I-9] videotext, T.30 [N-19] send
or receive FAX, and a long list of V-series modems including V.34
[I-15], V.90 [I-16], V.91 [I-17], and V.92 [I-18]. In contrast to
V.8bis [N-21], in V.8 only the calling terminal can determine the
operating mode.
V.8 does not use the same terminology as V.8bis. Rather, it defines
four signals which consist of bits transferred by V.21 [N-23] at 300
bits/s: the call indicator signal (CI), the call menu signal (CM),
the CM terminator (CJ), and the joint menu signal (JM). In addition,
it uses tones defined in V.25 [N-25] and T.30 [N-19] (described
below), and one tone (ANSam) defined in V.8 itself. The calling
terminal sends using the V.21 low channel; the answering terminal
uses the high channel.
The basic protocol sequence is subject to a number of variations to
accommodate different terminal types. A pure V.8 sequence is as
follows:
1) After an initial period of silence, the calling terminal transmits
the V.8 CI signal. It repeats CI at least three times, continuing
with occasional pauses until it detects ANSam tone. The CI
indicates whether the calling terminal wants to function as H.324,
V.18, T.30 send, T.30 receive, or a V-series modem.
2) The answering terminal transmits ANSam after detecting CI. ANSam
will disable any G.164 [N-10] echo suppressors on the circuit
after 400 ms and any G.165 [N-11] echo cancellers after one second
of ANSam playout.
3) On detecting ANSam, the calling terminal pauses at least half a
second, then begins transmitting CM to indicate detailed
capabilities within the chosen mode.
4) After detecting at least two identical sequences of CM, the
answering terminal begins to transmit JM, indicating its own
capabilities (or offering an alternative terminal type if it
cannot support the one requested).
5) After detecting at least two identical sequences of JM, the
calling terminal completes the current octet of CM, then transmits
CJ to acknowledge the JM signal. It pauses exactly 75 ms, then
starts operating in the selected mode.
6) The answering terminal transmits JM until it has detected CJ. At
that point it stops transmitting JM immediately, pauses exactly 75
ms, then starts operating in the selected mode.
The CI, CM, and JM signals all consist of a fixed sequence of ten "1"
bits followed by a signal-dependent pattern of ten synchronization
bits, followed by one or more octets of variable information. Each
octet is preceded by a "0" start bit and followed by a "1" stop bit.
The combination of the synchronization pattern and V.21 channel
uniquely identifies the message type. The CJ signal consists of
three successive octets of all zeros with stop and start bits but
without the preceding "1"s and synchronizing pattern of the other
signals.
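The framing described above can be illustrated by building the bit sequences for CJ and for the fixed prefix of CI. This is a sketch under stated assumptions: the function names are hypothetical, the CI synchronization pattern is the one quoted in this section ('00000 00001'), and the least-significant-bit-first ordering inside an octet is an assumption that should be checked against Rec. V.8 (it does not affect the all-zero CJ octets).

```python
from typing import List

CI_SYNC: List[int] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]   # '00000 00001'


def frame_octet(octet: int) -> List[int]:
    """One octet with its "0" start bit and "1" stop bit (10 bits total).

    Bit order within the octet is assumed to be least significant first.
    """
    data = [(octet >> i) & 1 for i in range(8)]
    return [0] + data + [1]


def cj_signal_bits() -> List[int]:
    """CJ: three all-zero octets with start and stop bits, no preamble."""
    return [bit for _ in range(3) for bit in frame_octet(0x00)]


def ci_prefix_bits() -> List[int]:
    """Fixed part of CI: ten "1" bits followed by the CI sync pattern."""
    return [1] * 10 + CI_SYNC
```

Each framed octet is ten bits long, which matches the grouping of ten V.21 bit events per packet suggested for V.8 signals.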
If both gateways support V.21 bit events (section 3.2.2), the sending
gateway for a given message MUST report each instance of a CM, JM,
and CJ signal respectively as a series of V.21 bit events. A
packetization interval of 10 events per packet is suggested, since
V.8 signals are organized in this way. If either gateway does not
support the CI event in Table 4, the complete CI message MUST also be
signalled as a series of V.21 bit events.
If both gateways support the CI event in Table 4, the sender MUST
send a first report of this event no later than when the last bit of
the synchronization pattern for CI (ten '1's followed by
'00000 00001') has been recognized. The beginning of the event
according to the RTP timestamp MUST be the time at which the first of
the ten '1's was detected. The event completion MUST be indicated
only after the end of the complete CI message has been reached. In
addition to this indication of the CI event, the sender MUST report
the content of the call function octet which follows the
synchronization pattern, including stop and start bits, as a series
of V.21 bit events.
The overlapping nature of V.8 signalling means that there is no risk
of silence exceeding 100 ms once ANSam has disabled any echo control
circuitry. However, the 75 ms pause before entering operation in the
selected data mode will require both the calling and the answering
gateways to recognize the completion of CJ, so they can change from
playout of telephone-events to playout of the data-bearing payload
after the 75 ms period.
Event     Frequency               Encoding     Type    Volume?
          (Hz)                    (decimal)
______________________________________________________________
ANSam     2100 x 15               34           tone    yes
/ANSam    2100 x 15 phase rev.    35           tone    yes
CI        (V.21 bits)             53           tone    yes
Table 4: Events for V.8 signals
Modified answer tone ANSam consists of a sinewave signal at 2100 Hz
with phase reversals at an interval of 450 ms, amplitude-modulated by
a sine wave at 15 Hz. The modulated envelope ranges in amplitude
between 0.8 and 1.2 times its average amplitude. The average
transmitted power is governed by national regulations. Thus it makes
sense to indicate the volume of the signal. The ANSam phase
reversals are allowed only if echo canceller disabling is required.
The sender MUST report ANSam as soon as it is recognized, providing
updates at reasonable intervals as it continues. However, an ANSam
event packet SHOULD NOT be sent until it is possible to discriminate
between an ANSam event and an ANS event (see V.25 events, below). If
a phase reversal is detected, the sender MUST report completion of
the ANSam event and beginning of the /ANSam event at the time that
the reversal was detected. If another phase reversal is detected,
the sender MUST report the end of the /ANSam event and the beginning
of an ANSam event, continuing in this way until the tone is removed.
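As a sketch of the alternation rule above, the following Python function splits one continuous tone into alternating events at each detected phase reversal. The function name, the tuple layout, and the use of RTP timestamp units for all times are assumptions made for illustration; the event codes are those of Table 4.

```python
ANSAM = 34       # ANSam, Table 4
ANSAM_REV = 35   # /ANSam, Table 4


def segment_events(tone_start, reversals, tone_end,
                   first=ANSAM, second=ANSAM_REV):
    """Split one continuous tone into alternating events.

    Each detected phase reversal ends the current event and starts
    the other one at the same instant. Returns a list of
    (event_code, start, end) tuples.
    """
    boundaries = [tone_start, *sorted(reversals), tone_end]
    codes = (first, second)
    return [
        (codes[i % 2], boundaries[i], boundaries[i + 1])
        for i in range(len(boundaries) - 1)
    ]
```

With an 8000 Hz clock, reversals every 450 ms fall 3600 timestamp units apart. Passing other code pairs as first and second applies the same logic to other tones that alternate on phase reversal.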
3.2.4 V.25 Events
V.25 [N-25] is a start-up protocol antedating V.8 [N-20] and V.8bis
[N-21]. It specifies the exchange of two tone signals:
CT:
"The calling tone consists of a series of interrupted bursts of
1300 hz tone, on for a duration of not less than 0.5 s and not
more than 0.7 s and off for a duration of not less than 1.5 s and
not more than 2.0 s." [N-25]. Modems not starting with the V.8
call initiation signal often use this tone.
ANS:
Answering tone. This 2100 Hz tone is used to disable echo
suppression for data transmission [N-25], [N-19]. For fax
machines, Recommendation T.30 [N-19] refers to this tone as called
terminal identification (CED) answer tone. ANS differs from V.8
ANSam in that the latter varies in amplitude due to modulation by
a 15 Hz signal.
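The CT cadence quoted above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and a real detector would also verify the 1300 Hz frequency and allow for measurement error, which this example does not attempt.

```python
# Limits quoted from V.25 in the CT definition above, in seconds.
CT_ON_RANGE_S = (0.5, 0.7)
CT_OFF_RANGE_S = (1.5, 2.0)


def is_ct_cadence(on_s, off_s):
    """True if a measured burst and gap fit the CT on/off limits."""
    on_ok = CT_ON_RANGE_S[0] <= on_s <= CT_ON_RANGE_S[1]
    off_ok = CT_OFF_RANGE_S[0] <= off_s <= CT_OFF_RANGE_S[1]
    return on_ok and off_ok
```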
V.25 specifically includes procedures for disabling echo suppressors
as defined by ITU-T Rec. G.164 [N-10]. However, G.164 echo
suppressors have now for the most part been replaced by G.165 [N-11]
echo cancellers, which require phase reversals in the disabling tone
(see ANSam above). As a result, V.25 was modified in July, 2001 to
say that phase reversal in the ANS tone is required if echo
cancellers are to be disabled.
One possible V.25 sequence is as follows:
1) The calling terminal starts generating CT as soon as the call is
connected.
2) The called terminal waits in silence for 1.8 to 2.5 s after
answer, then begins to transmit ANS continuously. If echo
cancellers are on the line the phase of the ANS signal is reversed
every 450 ms. ANS will not reach the calling terminal until the
echo control equipment has been disabled. Since this takes about
a second it can only happen in the gap between one burst of CT and
the next.
3) Following detection of ANS, the calling terminal may stop
generating CT immediately or wait until the end of the current
burst to stop. In any event, it must wait at least 400 ms (at
least 1 s if phase reversal of ANS is being used to disable echo
cancellers) after stopping CT before it can generate the calling
station response tone. This tone is modem-specific, not specified
in V.25.
4) The called terminal plays out ANS for 2.6 to 4.0 seconds or until
it has detected calling station response for 100 ms. It waits 55-
95 ms (nominal 75 ms) in silence. (Note that the upper limit of
95 ms is rather close to the point at which echo control may
reestablish itself.) If the reason for ANS termination was
timeout rather than detection of calling station response, the
called terminal begins to play out ANS again to maintain disabling
of echo control until the calling station responds.
The events defined for V.25 signalling are shown in Table 5. The
gateway at the calling end SHOULD use a packetization interval
smaller than the nominal duration of a CT burst, to ensure that CT
playout at the called end precedes the sending of ANS from that end.
The gateway at the called end MUST report ANS as soon as it is
recognized, providing updates at reasonable intervals as it
continues. However, an ANS event packet SHOULD NOT be sent until it
is possible to discriminate between an ANS event and an ANSam event
(see V.8 events, above). If a phase reversal is detected, the sender
MUST report completion of the ANS event and beginning of the /ANS
event at the time that the reversal was detected. If another phase
reversal is detected, the sender MUST report the end of the /ANS
event and the beginning of an ANS event, continuing in this way until
the tone is removed.
Event               Frequency   Encoding    Type   Volume?
                    (Hz)        (decimal)
-----------------   ---------   ---------   ----   -------
Answer tone (ANS)   2100        32          tone   yes
/ANS                2100 rev    33          tone   yes
CT                  1300        49          tone   yes
Table 5: Events for V.25 signals
3.2.5 T.30 Events
ITU-T Recommendation T.30 [N-19] defines the procedures used by Group
III FAX terminals. The pre-message procedures for which the events
of this section are defined are used to identify terminal
capabilities at each end and negotiate operating mode. Post-message
procedures are also included, to handle cases such as multiple
document transmission. FAX terminals support a wide variety of
protocol stacks, so T.30 has a number of options for control
protocols and sequences.
T.30 defines two tone signals used at the beginning of a call. The
CNG signal is sent by the calling terminal. It is a pure 1100 Hz
tone played in bursts: 0.5 s on, 3 s off. It continues until timeout
or until the calling terminal detects a response.
The called terminal waits in silence for at least 200 ms. It then
may return CED tone, which is identical to V.25 ANS, or else V.8
ANSam if it has V.8 capability. If ANSam is returned and the calling
terminal has V.8 capability, it transmits CI to begin a V.8
negotiation. Otherwise, the calling and called terminals enter the
T.30 negotiation phase.
In the negotiation phase the terminals exchange binary messages using
V.21 signals, high channel frequencies only. Each message is
preceded by a one-second (nominal) preamble consisting entirely of
HDLC flag octets (0x7E). This flag has the function of preparing
echo control equipment for the message which follows.
The pre-transfer messages exchanged using the V.21 coding are:
Digital Identification Signal (DIS):
Characterizes the standard ITU-T capabilities of the called
terminal.
Digital Transmit Command (DTC):
The digital command response to the standard capabilities
identified by the DIS signal.
Digital Command Signal (DCS):
The digital set-up command responding to the standard capabilities
identified by the DIS signal.
Confirmation To Receive (CFR):
A digital response confirming that the entire pre-message
procedure has been completed and the message transmissions may
commence.
If the calling terminal wishes to transmit a document, the three
messages exchanged are DIS (from the called terminal), DCS, and CFR.
If it wishes to receive, the sequence changes to DIS, DTC, DCS, and
CFR. Each message may consist of multiple frames, each bounded by
HDLC flags. The messages are organized as a series of octets, but
like V.8bis, T.30 calls for the insertion of extra "0" bits to
prevent spurious recognition of HDLC flags.
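The insertion of extra "0" bits mentioned above is the usual HDLC zero-bit insertion: within the content between flags, a "0" is inserted after every run of five consecutive "1" bits. The Python sketch below shows the effect on a bit string; the string representation and the function name are assumptions for illustration, and the order in which the bits of an octet are placed on the line is not addressed.

```python
def stuff_bits(bits):
    """Insert a '0' after every run of five consecutive '1's.

    bits is a string of '0'/'1' characters in transmission order.
    Flags themselves are not passed through this function.
    """
    out = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == "1":
            ones += 1
            if ones == 5:
                out.append("0")  # the inserted bit breaks the run
                ones = 0
        else:
            ones = 0
    return "".join(out)
```

Because each inserted bit lengthens the stream by one, a stuffed message is generally no longer a whole number of octets, as the 8-bit input '11111111' shows by growing to 9 bits.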
T.30 also provides for the transmission of control messages after
document transmission has completed (e.g., to support transmission of
multiple documents). The transition back from the modem used for
document transmission (V.17 [I-11], V.27ter [I-13], V.29 [I-14], V.34
[I-15]) to V.21 signalling is preceded by 75 ms (nominal) of
silence. Control message transmission is preceded by the preamble
described above.
Before CFR the transmitting terminal sends a training signal
consisting of a steady string of V.21 high channel zeros (1850 Hz
tones) for 1.5 s. The sender MAY report this training signal either
as a single extended V.21 upper channel "0" event, or as a series of
"0" events of normal duration. The event(s) MUST be reported as soon
as the training signal is recognized, with updates at reasonable
intervals thereafter.
Applications supporting T.30 signalling using the telephone-events
payload MUST transfer T.30 messages in the form of sequences of bits,
using the V.21 bit events defined in section 3.2.2. The transmitted
information MUST include the complete contents of the message: the
initial HDLC flags, the information field, the checksum, and the
terminating HDLC flags.
Transmission MUST also include the extra "0" bits added to prevent
false recognition of HDLC flags at the receiver. Implementors should
note that these extra "0" bits mean that in general T.30 messages as
transmitted on the wire will not come out to an even multiple of
octets. Sending implementations MAY choose to vary the packetization
interval to include exactly one octet of information plus any extra
"0" bits inserted into that octet.
The events defined for T.30 signalling are shown in Table 6. The CED
and /CED events represent exactly the same tone signals as V.25 ANS
and /ANS, and are given the same codepoints; they are reproduced here
only for convenience. For reporting of CNG, the gateway at the
calling end SHOULD use a packetization interval smaller than the
nominal duration of a CNG burst, to ensure that CED has time to
disable echo control before it times out.
The gateway at the called end MUST report CED as soon as it is
recognized, providing updates at reasonable intervals as it
continues. However, a CED event packet SHOULD NOT be sent until it is
possible to discriminate between a CED event and an ANSam event (see
V.8 events, above). If a phase reversal is detected, the sender MUST
report completion of the CED event and beginning of the /CED event at
the time that the reversal was detected. If another phase reversal
is detected, the sender MUST report the end of the /CED event and the
beginning of a CED event, continuing in this way until the tone is
removed.
The sending gateway SHOULD report the V.21 preamble flag event as
soon as it is identified, with updates at intervals which are
multiples of one octet of transmission time (nominally 26.4 ms) until
it completes. The preamble is reported as a single event; reports of
the individual bits making it up MUST NOT be sent. The end of the
event is reported when a pattern of V.21 bits other than an HDLC flag
is observed. This means that the V.21 preamble event absorbs the
initial HDLC flags of the following message.
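The way the preamble event absorbs leading flags can be sketched as follows. This Python example assumes that the bit stream is held as a string of '0' and '1' characters, that it begins on a flag boundary, and that successive flags do not share bits; these are simplifying assumptions of the sketch, and the function name is hypothetical.

```python
HDLC_FLAG = "01111110"  # 0x7E


def preamble_extent(bits):
    """Number of leading bits covered by the V.21 preamble flag event.

    Counts whole, back-to-back HDLC flags from the start of bits and
    stops at the first octet position that is not a flag.
    """
    position = 0
    while bits.startswith(HDLC_FLAG, position):
        position += len(HDLC_FLAG)
    return position
```

Everything before the returned position would be covered by the single preamble event; bit-by-bit reporting would start at that position.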
Event                Frequency       Encoding    Type   Volume?
                     (Hz)            (decimal)
------------------   -------------   ---------   ----   -------
CNG (Calling tone)   1100            36          tone   yes
CED (Called tone)    2100            32          tone   yes
/CED                 2100 ph. rev.   33          tone   yes
V.21 preamble flag   (V.21 bits)     54          tone   yes
Table 6: Events for T.30 signals
3.2.6 V.18 Events
ITU-T Recommendation V.18 [N-22] defines a terminal for text
conversation, possibly in combination with voice. What follows is a
description of the use of telephone events for V.18 startup. In all
cases, once the startup procedures have been completed, the gateways
SHOULD use another payload type to transfer the content of the text
conversation.
V.18 is intended to interoperate with a variety of legacy text
terminals, so its start-up sequence can consist of a series of
stimuli designed to determine what is at the other end. Two V.18
terminals talking to each other will use V.8bis to negotiate startup,
and continue at the physical level with V.21 at 300 bits/s carrying
7-bit characters bounded by start and stop bits. The V.18 terminal
is also designed to interoperate with:
- Baudot [I-21], a five bit character encoding nominally operating
at 45.45 or 50 bits/s with frequencies 1800 Hz = "0", 1400 Hz =
"1";
- Q.23 [N-13] (DTMF), which uses combinations of "*" and "#" as
escapes to achieve a full repertoire of characters; these
combinations are documented in V.18 Annex B;
- EDT, which is V.21 [N-23] operating at 110 bits/s in half-duplex
mode (lower channel only); characters are 7 bit IA5 plus initial
start bit, trailing parity bit, and two stop bits;
- Bell 103 mode (documented in Recommendation V.18 Annex D), which
is structurally similar to V.21, but uses different frequencies:
lower channel, 1070 Hz = "0", 1270 Hz = "1"; upper channel, 2025
Hz = "0", 2225 Hz = "1"; characters are US ASCII framed by one
start bit, one trailing parity bit, and one stop bit;
- V.23 [I-12] based videotex, in Minitel and Prestel versions. V.23
offers a forward channel operating at 1200 bits/s if possible
(2100 Hz = "0", 1300 Hz = "1") or otherwise at 600 bits/s (1700 Hz
= "0", 1300 Hz = "1"), and a 75 bits/s backward channel which is
transmitting 390 Hz (continuous "1"s) except when "0" is to be
transmitted (450 Hz);
- a non-V.18 text terminal using V.21 [N-23] at 300 bits/s.
Characters are 7 bit national (e.g., US ASCII) with a start bit,
parity, and one stop bit.
The startup sequences for all these different terminal types are
naturally quite different. The V.18 initial startup sequence
addresses itself to V.8-capable terminals and V.21 terminals and, by
the combination of signals, to V.23 videotex terminals. During the
initial startup sequence the V.18 terminal listens for frequency
responses characterizing the other terminal types. If it does not
make contact in the preliminary step it probes for each type
specifically. By the nature of the application, V.18 has been
designed to provide an extremely robust startup capability.
Details of V.18 startup are given below. The point to make here is
that gateways intending to serve V.18 MUST be prepared to transfer
information using payload types other than telephone-events from the
start of the session. Events have been defined as shown in Table 7
to allow the sending gateway to indicate the nature of the modulated
content it is receiving. However, the alternative payload type used
to transfer the content may (for example, in the case of RFC 2793) be
independent of the type of modulation received at the sending
gateway. A receiving gateway MUST NOT rely on the receipt of a V.18-
related event to control playout at its end if content is available
in another payload type.
Note that none of the codepoints in Table 7 were defined in RFC 2833.
ANS2225 was defined in earlier versions of this document, but the
other events are new in version-04.
Event      Bit Rate   Frequency        Encoding    Type    Volume?
           (bits/s)   (Hz)             (decimal)
--------   --------   --------------   ---------   -----   -------
ANS2225    N/A        2225             52          tone    yes
V21L110    110        980/1180         55          other   no
B103L300   300        1070/1270        56          other   no
V23Main    600/1200   1700-2100/1300   57          other   no
V23Back    75         450/390          58          other   no
Baud4545   45.45      1800/1400        59          other   no
Baud50     50         1800/1400        60          other   no
Table 7: Events for V.18 interworking
ANS2225:
This 2225 Hz answer tone is described in ITU-T Recommendation
V.18, Annex D [N-22] for Bell 103 class modems operating in the
text telephone mode. It is also referred to in ITU-T
Recommendation V.22 [N-24]. This is a pure tone with no amplitude
modulation and no semantics attached to phase reversals, if there
are any. It is necessary to accommodate it for completeness, and
for compliance with various legal ordinances. A distinct codepoint
must be allocated to this event since it must be differentiated
from the normal, 2100 Hz answer tone when reproduced at the far-
end gateway.
V21L110:
indicates that the sending device has detected V.21 modulation
operating in the lower channel at 110 bits/s.
B103L300:
indicates that the sending device has detected Bell 103 class
modulation operating in the low channel at 300 bits/s.
V23Main:
indicates that the sending device has detected V.23 modulation
operating in the high speed channel.
V23Back:
indicates that the sending device has detected V.23 modulation
operating in the 75 bit/s back-channel.
Baud4545:
indicates that the sending device has detected Baudot modulation
operating at 45.45 bits/s.
Baud50:
indicates that the sending device has detected Baudot modulation
operating at 50 bits/s.
The V.18 startup procedure for the calling terminal requires it to
transmit a V.18 sequence in the following cycle:
1) Silence for one second.
2) Repeat the following steps three times:
i) Four repetitions of V.8 CI on the V.21 low channel, without
preamble. If using telephone-events, the sending gateway
SHOULD report each CI as the combination of a CI event as
defined in Table 4, overlapped by a series of V.21 low channel
bit events expressing the final octet with its start and stop
bits. The final octet for a V.18 text terminal is defined in
V.8 to be '01000 00101'.
ii) Silence for two seconds.
3) Play out the XCI signal, a three second string of V.23 bit
patterns defined in clause 3.13 of Recommendation V.18 and using
the V.23 1200 bits/s upper channel. The sending gateway MUST
provide the pattern using an alternate payload type, but MAY also
send the V23Main event defined in Table 7 for the duration of XCI
playout. The receiving gateway MUST be prepared to play out the
pattern from that alternate payload type without relying on
receipt of the V23Main event.
The second and third steps are repeated until a response is detected.
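The calling cycle just described can be written out as a simple schedule. The Python sketch below is illustrative only: the generator name and the tuple layout are assumptions, and the duration of each CI repetition is left as None because the text above does not state it.

```python
def v18_calling_cycle(passes=1):
    """Yield (step, duration_s) tuples for the V.18 calling sequence.

    passes is the number of times steps 2 and 3 are performed; in
    practice they repeat until a response is detected.
    """
    yield ("silence", 1.0)              # step 1
    for _ in range(passes):
        for _ in range(3):              # step 2: three repetitions
            for _ in range(4):          # step 2 i): four CI signals
                yield ("CI", None)
            yield ("silence", 2.0)      # step 2 ii)
        yield ("XCI", 3.0)              # step 3
```

One pass therefore consists of 17 steps: the initial silence, twelve CI signals, three two-second silences, and one XCI.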
The following responses are possible:
- 2100 Hz modulated (ANSam) as defined in ITU-T Recommendation V.8;
this would indicate a V.8-capable terminal. The V.18 terminal
completes a V.8 negotiation to start up. The gateways MUST use
the events as defined for V.8 to sustain this negotiation.
- 2100 Hz (ANS) as defined in ITU-T V.25; this could indicate a
V.18, V.21 (300 bits/s), or V.23 terminal. The calling V.18
terminal transmits a 40-bit pattern (TXP) using the V.21 low
channel and monitors the frequencies returned. The calling end
gateway SHOULD send the TXP pattern as a sequence of V.21 low
channel bit events. An answering V.18 terminal will return TXP,
so the calling end gateway MUST be prepared to play the
corresponding V.21 sequence back to the calling terminal.
- 2225 Hz; this indicates a Bell 103 class terminal in answer mode.
The gateway at the answering end MUST report this as the ANS2225
defined in this section. The event begins when the 2225 Hz tone
is detected. Event updates should be provided at reasonable
intervals until the tone is taken away.
- 1300 Hz; provided this persists for at least 1.7 s, it indicates a
V.23-based terminal operating at 600 or 1200 bits/s. The calling
terminal will enter V.23 mode, transmitting on the 75 bits/s V.23
back-channel. The gateway at the answering end, on detecting the
1300 Hz tone, MAY also report the V23Main event. When the calling
V.18 terminal
responds, the gateway at the calling end MAY also report the
V23Back event.
- 1650 Hz; if this persists at least 500 ms, it indicates a V.21
(300 bits/s) terminal. The calling V.18 terminal will enter into
that mode of operation.
- 1400 or 1800 Hz; this indicates a Baudot terminal. The calling
terminal will determine the line rate and enter into Baudot mode.
Either gateway MAY send the Baud4545 or Baud50 event as applicable
if and when it identifies the nature of the signals being passed.
- DTMF tones; these indicate a DTMF terminal. The calling terminal
will enter DTMF mode.
- 980 or 1180 Hz; these indicate a V.21-based terminal running at
either 110 or 300 bits/s, and using the low channel. The calling
terminal does timing to make the distinction. Note that this is
very difficult, and in practice the sending gateway is often
informed in advance (e.g. through provisioning) what line speed is
being used. If it observes continuous 980 Hz for at least 1.5 s,
the calling terminal enters V.21 (300 bit/s) mode using the high
channel for transmission. The gateway at the answering end SHOULD
NOT use V.21 events to report the initial signals from the
answering terminal. The tones payload type defined in this
document MAY be used instead. A gateway receiving V.21 signals at
110 bits/s MAY report the V21L110 event once it has made a
definitive determination of the line speed.
- 1270 Hz; this indicates a Bell 103 terminal operating in calling
mode (lower channel). The V.18 terminal enters Bell 103 mode
using the higher channel. The gateways MUST transmit the Bell 103
modem content using an alternative payload type, and MAY report
the B103L300 or B103H300 event as applicable to the modulation
received from the terminal at their end.
- 390 Hz (only when sending XCI); this indicates a V.23 terminal
using the 75 bits/s channel. The V.18 terminal enters V.23 mode
using the high-speed (1200 bits/s) channel. The gateway at the
answering end MAY report the V23Back event. The gateway at the
calling end MAY report the V23Main event.
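The frequency-based responses listed above lend themselves to a small lookup. The following Python sketch is a simplification under stated assumptions: it matches nominal frequencies exactly (a real detector would apply tolerances), it omits the 2100 Hz and DTMF cases, which need more than a frequency to classify, and the function name and returned strings are hypothetical.

```python
def classify_response(freq_hz, duration_s=0.0, sending_xci=False):
    """Map a detected response tone to the terminal type it indicates.

    Returns a descriptive string, or None if the tone does not (yet)
    identify a terminal type under the rules listed in the text.
    """
    if freq_hz == 2225:
        return "Bell 103, answer mode"
    if freq_hz == 1300:
        # Must persist for at least 1.7 s.
        return "V.23" if duration_s >= 1.7 else None
    if freq_hz == 1650:
        # Must persist for at least 500 ms.
        return "V.21, 300 bits/s" if duration_s >= 0.5 else None
    if freq_hz in (1400, 1800):
        return "Baudot"
    if freq_hz in (980, 1180):
        return "V.21 low channel, 110 or 300 bits/s"
    if freq_hz == 1270:
        return "Bell 103, calling mode"
    if freq_hz == 390 and sending_xci:
        # Only meaningful while XCI is being sent.
        return "V.23, 75 bits/s channel"
    return None
```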
Similar logic governs the actions taken by a V.18 terminal operating
in answer mode.
3.3 Basic Subscriber Line Events
Table 8 summarizes the basic events applicable to a subscriber line.
All of them except for the two line states "On Hook" and "Off Hook"
and the "Flash" event are propagated from the network toward the
line. There are two typical applications for these events:
1) A gateway to which the line is attached reports line states and
"Flash" to a call controller (possibly indirectly through another
device) and propagates tones and ringing in the other direction.
In this application, the gateway is being controlled in-band
through the use of telephone-events rather than through a separate
media gateway control protocol such as Megaco/H.248.
2) Tones and media are being passed between two gateways in the
middle of the media path, where both ends of the call are in the
PSTN. In this application, only a limited subset of the events in
Table 8 are applicable. These are indicated by a "Yes" in the
"Mid-path?" column of Table 8.
It is rather difficult to define the "On Hook" state, since it is
still possible to transmit information (ringing, information for
displays) when the line is on hook. Moreover, an ISDN set can still
signal while it is on hook. A working definition is that "On Hook"
is a state where the terminal will not originate media, will not
present media other than display information to the user, and will
accept only a limited set of signals. "Off Hook" is a state where
these restrictions are lifted. The line states "On Hook" and "Off
Hook" are mutually exclusive.
The "Flash" event indicates a brief transition from off hook to on
hook and back to off hook. By definition, the transition is too
brief to end the current call. Depending on what services are
subscribed to on the line, the "Flash" event may be interpreted as a
service invocation. The "Flash" event MUST NOT be sent when the
state is "On Hook". Its duration is from the point at which "On
Hook" was first observed until the line returns to "Off Hook". The
gateway MUST NOT report both "Flash" and the "On Hook" - "Off Hook"
pair, and MUST NOT report "Flash" until the event is complete.
The time threshold for distinguishing true "On Hook" - "Off Hook"
from "Flash" is a matter of national standards.
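The reporting rule for "Flash" can be sketched in Python as follows. The threshold is a parameter because it is a matter of national standards; treating a gap strictly shorter than the threshold as a flash, the millisecond time unit, and the function name are assumptions of this sketch. The event codes are those of Table 8.

```python
OFF_HOOK, ON_HOOK, FLASH = 64, 65, 16  # encodings from Table 8


def report_hook_transition(on_hook_at, off_hook_at, flash_threshold):
    """Decide what to report once the line has returned to off hook.

    Returns a list of (event_code, start, duration) tuples; duration
    is None for the state events. Either "Flash" is reported, or the
    "On Hook" / "Off Hook" pair, never both.
    """
    gap = off_hook_at - on_hook_at
    if gap < flash_threshold:
        # Flash runs from the first on-hook observation to off hook.
        return [(FLASH, on_hook_at, gap)]
    return [(ON_HOOK, on_hook_at, None), (OFF_HOOK, off_hook_at, None)]
```

Because the decision needs the time of the return to off hook, this function can only be called after the transition is complete, which matches the rule that "Flash" is not reported until the event is complete.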
ITU Recommendation E.182 [N-9] defines when certain tones should be
used. The specification of the actual tones varies from country to
country. A useful source for this purpose is Supplement 2 to ITU-T
Recommendation E.180 [N-8] and its successor documents. E.182 [N-9]
defines the following standard tones that are heard by the caller:
Dial tone:
The exchange is ready to receive address information.
PABX internal dial tone:
The PABX is ready to receive address information.
Special dial tone:
Same as dial tone, but the caller's line is subject to a specific
condition, such as call diversion or waiting voice mail
(e.g., "stutter dial tone").
Second dial tone:
The network has accepted the address information, but additional
information is required.
Ring:
This named signal event causes the recipient to generate
an alerting signal ("ring"). The actual tone or other
indication used to render this named event is left up to
the receiver. (This differs from the ringing tone, below,
heard by the caller.)
Ringing tone:
The call has been placed to the callee and a calling signal
(ringing) is being transmitted to the callee. This tone is also
called "ringback" and is heard by the caller to confirm call
progress.
Special ringing tone:
A special service, such as call forwarding or call waiting, is
active at the called number.
Busy tone:
The called telephone number is busy.
Congestion tone:
Facilities necessary for the call are temporarily unavailable.
Calling card service tone:
The calling card service tone consists of 60 ms of the sum of 941
Hz and 1477 Hz tones (DTMF '#'), followed by 940 ms of 350 Hz and
440 Hz (U.S. dial tone), decaying exponentially with a time
constant of 200 ms.
Special information tone:
The callee cannot be reached, but the reason is neither "busy" nor
"congestion". This tone should be used before all call failure
announcements, for the benefit of automatic equipment.
Comfort tone:
The call is being processed. This tone may be used during long
post-dial delays, e.g., in international connections.
Hold tone:
The caller has been placed on hold.
Record tone:
The caller has been connected to an automatic answering device and
is requested to begin speaking.
Caller waiting tone:
The called station is busy, but has call waiting service.
Pay tone:
The caller, at a payphone, is reminded to deposit additional
coins.
Positive indication tone:
The supplementary service has been activated.
Negative indication tone:
The supplementary service could not be activated.
Off-hook warning tone:
The caller has left the instrument off-hook for an extended period
of time and is not engaged in a call.
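Of the tones listed above, the calling card service tone is the only one given with a full waveform description, so it can be sketched directly. The Python below assumes that the exponential decay starts at the beginning of the 940 ms segment, that both frequency components have equal amplitude, and that the output is not normalized; none of these details is stated above, and the function names are hypothetical.

```python
import math

DTMF_POUND_HZ = (941, 1477)   # first 60 ms
DIAL_TONE_HZ = (350, 440)     # following 940 ms
BURST_MS = 60
DECAY_MS = 940
TIME_CONSTANT_MS = 200


def envelope(t_ms):
    """Amplitude envelope at t_ms after the start of the tone."""
    if 0 <= t_ms < BURST_MS:
        return 1.0
    if BURST_MS <= t_ms < BURST_MS + DECAY_MS:
        # Assumed: decay is measured from the start of this segment.
        return math.exp(-(t_ms - BURST_MS) / TIME_CONSTANT_MS)
    return 0.0


def sample(t_ms):
    """Unnormalized signal value at t_ms after the start of the tone."""
    freqs = DTMF_POUND_HZ if t_ms < BURST_MS else DIAL_TONE_HZ
    t_s = t_ms / 1000.0
    return envelope(t_ms) * sum(math.sin(2 * math.pi * f * t_s) for f in freqs)
```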
The following tones can be heard by either calling or called party
during a conversation:
Call waiting tone:
Another party wants to reach the subscriber.
Warning tone:
The call is being recorded. This tone is not required in all
jurisdictions.
Intrusion tone:
The call is being monitored, e.g., by an operator.
CPE alerting signal (CAS):
A tone used to alert a device to an arriving in-band FSK data
transmission. A CPE alerting signal is a combined 2130 and 2750 Hz
tone, both with tolerances of 0.5% and a duration of 80 to 85 ms.
The CPE alerting signal is used with ADSI services and Call
Waiting ID services [N-26].
The following tone is heard by operators:
Payphone recognition tone:
The person making the call or being called is using a payphone
(and thus it is ill-advised to allow collect calls to such a
person).
Event                       Mid-path?   Encoding    Type    Volume?
                                        (decimal)
-------------------------   ---------   ---------   -----   -------
Off-Hook                    no          64          state   no
On-Hook                     no          65          state   no
Flash                       no          16          other   no
Dial tone                   no          66          tone    yes
PABX internal dial tone     no          67          tone    yes
Special dial tone           no          68          tone    yes
Second dial tone            yes         69          tone    yes
Ringing tone                yes         70          tone    yes
Special ringing tone        yes         71          tone    yes
Busy tone                   yes         72          tone    yes
Congestion tone             yes         73          tone    yes
Special information tone    yes         74          tone    yes
Comfort tone                yes         75          tone    yes
Hold tone                   yes         76          tone    yes
Record tone                 yes         77          tone    yes
Caller waiting tone         yes         78          tone    yes
Call waiting tone           no          79          tone    yes
Pay tone                    yes         80          tone    yes
Positive indication tone    yes         81          tone    yes
Negative indication tone    yes         82          tone    yes
Warning tone                yes         83          tone    yes
Intrusion tone              yes         84          tone    yes
Calling card service tone   yes         85          tone    yes
Payphone recognition tone   yes         86          tone    yes
CPE alerting signal (CAS)   no          87          tone    yes
Off-hook warning tone       no          88          tone    yes
Ring                        no          89          other   no
Table 8: Basic subscriber line events
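A gateway in the middle of the media path might hold the "Mid-path?" column as a lookup, as in the Python sketch below. Only a subset of Table 8 is reproduced, and the dictionary layout and function name are hypothetical; unknown codes are treated as not applicable, which is a choice of this sketch rather than a rule of this document.

```python
# encoding -> (event name, mid-path?); a subset of Table 8.
LINE_EVENTS = {
    64: ("Off-Hook", False),
    65: ("On-Hook", False),
    16: ("Flash", False),
    66: ("Dial tone", False),
    70: ("Ringing tone", True),
    72: ("Busy tone", True),
    73: ("Congestion tone", True),
    79: ("Call waiting tone", False),
    89: ("Ring", False),
}


def applicable_mid_path(code):
    """True if the event is marked "yes" in the Mid-path? column."""
    _name, mid_path = LINE_EVENTS.get(code, ("unknown", False))
    return mid_path
```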
3.4 Extended Subscriber Line Events
Table 9 summarizes a number of additional tones that can appear on a
subscriber line. All of these are directed toward the line. As in
the previous section, some of these may be initiated only by the call
controller controlling the line concerned, while others may be
initiated elsewhere along the call path. Unfortunately, it has been
difficult to locate documentation for the usage of some of these
events, even though reasonable guesses can be made based on their
names -- most do not appear in ITU-T standards.
Depending on the available user interfaces, an implementation MAY
render all tones in Table 8 the same or, preferably, use the tones
conveyed by a concurrent "tone" payload or other RTP audio payload.
Alternatively, it MAY provide a textual representation.
Acceptance tone:
No description available.
Confirmation tone:
Used to indicate that an exchange has received information from an
access line or has processed a request received from an access
line, such as the activation/deactivation of line services. In
North America, this is implemented as a dual-frequency tone, 350
and 440 Hz, played for 100 ms with a pause of 100 ms followed by
the tone for another 300 ms.
Recall dial tone:
Sometimes referred to as "stutter dial tone". Recall dial tone is
used to indicate that an exchange is ready to accept address
information or other information from an access line. In North
America, this is implemented as a dual-frequency tone, 350 and 440
Hz, played as three 100 ms bursts separated by pauses of 100 ms.
End of three party service tone:
No description available.
Facilities tone:
No description available.
Line lockout tone:
A tone or silence played out to a line after an extended period of
off-hook state where the line is not involved in a call.
Typically line lockout follows a period of playout of off-hook
warning tone (see previous section).
Number unobtainable tone:
A tone played out to indicate that the dialled number is not in
service. The tone may precede an announcement. In North America,
the tone is implemented as a dual-frequency tone, 480 and 620 Hz,
played as two 500 ms bursts separated by pauses of 500 ms.
Offering tone:
No description available.
Permanent signal tone:
A tone played out to a line if the phone has gone off-hook, has
received dial tone, but no digits have been entered within a
timeout period (of the order of ten to twenty seconds). If the
phone remains off-hook, the permanent signal tone will typically
give way after a further timeout to off-hook warning tone (see
previous section).
Preemption tone:
No description available.
Queue tone:
No description available.
Refusal tone:
No description available.
Route tone:
No description available.
Valid tone:
No description available.
Waiting tone:
No description available.
Warning tone (end of period):
No description available.
Warning Tone (PIP tone):
No description available.
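Three of the tones described above are given with North American cadences, which a receiver rendering these events locally could hold as data. The Python sketch below encodes exactly the on and off periods stated in the descriptions; the dictionary keys, the tuple layout, and the function names are assumptions for illustration, and nothing is implied about what follows the last burst.

```python
# name -> (frequencies in Hz, [(on_ms, off_ms), ...])
CADENCES = {
    "confirmation": ((350, 440), [(100, 100), (300, 0)]),
    "recall_dial": ((350, 440), [(100, 100), (100, 100), (100, 0)]),
    "number_unobtainable": ((480, 620), [(500, 500), (500, 0)]),
}


def total_ms(name):
    """Length of the described cadence, from first burst to last."""
    return sum(on + off for on, off in CADENCES[name][1])


def is_on(name, t_ms):
    """True if the tone is sounding t_ms after the cadence starts."""
    elapsed = 0
    for on, off in CADENCES[name][1]:
        if elapsed <= t_ms < elapsed + on:
            return True
        elapsed += on + off
    return False
```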
Event                             Mid-path?   Encoding    Type   Volume?
                                              (decimal)
-------------------------------   ---------   ---------   ----   -------
Acceptance tone                   yes         96          tone   yes
Confirmation tone                 yes         97          tone   yes
Dial tone, recall                 ??          98          tone   yes
End of three party service tone   yes         99          tone   yes
Facilities tone                   yes         100         tone   yes
Line lockout tone                 no          101         tone   yes
Number unobtainable tone          yes         102         tone   yes
Offering tone                     yes         103         tone   yes
Permanent signal tone             no          104         tone   yes
Preemption tone                   yes         105         tone   yes
Queue tone                        yes         106         tone   yes
Refusal tone                      yes         107         tone   yes
Route tone                        yes         108         tone   yes
Valid tone                        yes         109         tone   yes
Waiting tone                      yes         110         tone   yes
Warning tone (end of period)      yes         111         tone   yes
Warning Tone (PIP tone)           yes         112         tone   yes
Table 9: Extended subscriber line events
3.5 Trunk Events
Trunks (or circuits) in the PSTN are the media paths between
telephone switches. They may carry media corresponding to any of the
events described in the previous sections except the non-mid-path
line events. They may also carry signals corresponding to the events
defined in this section. These events support an application where
PSTN signalling is carried between two gateways without being
interworked to signalling in the IP network: the "RTP trunk"
application.
In the "RTP trunk" application, RTP is used to replace a normal
circuit-switched trunk between two nodes. This is particularly of
interest in a telephone network that is still mostly circuit-
switched. In this case, each end of the RTP trunk encodes audio
channels into the appropriate encoding, such as G.723.1 [I-4] or
G.729 [I-5]. However, this encoding process destroys in-band
signaling information which is carried using the least-significant
bit ("robbed bit signaling") and may also interfere with in-band
signaling tones, such as the MF (multi-frequency) digit tones.
This section defines events related to four different signalling
systems. Three of these are based on the exchange of multi-frequency
tones. The fourth operates on digital trunks only, and makes use of
low-order bits stolen from the encoded media. In addition, this
section defines tone events for continuity testing of the media path.
Note: implementors are warned that the descriptions of signalling
systems given below are incomplete. They are provided to give
context to the related event definitions, but omit many details
important to implementation.
Table 10 lists all of the events defined for trunk signalling. This
table was updated considerably in the present document compared with
RFC 2833. Sending implementations conforming to this document MUST
NOT send any of the event codes marked "Reserved" or "Unassigned" in
the table. Receiving implementations MUST ignore event codes marked
"Reserved" or "Unassigned". In a typical application, the gateways
may exchange roles from one call to the next: they must be capable of
either sending or receiving each signal in the table.
Event                       Frequency   Encoding   Type   Volume?
                            (Hz)        (decimal)
--------------------------  ----------  ---------  -----  -------
MF 0...9                    (Table 11)  128...137  tone   yes
MF Code 11 / KP3P / ST3P    700+1700    138        tone   yes
MF KP/KP1                   1100+1700   139        tone   yes
MF KP2/ST2P                 1300+1700   140        tone   yes
MF ST                       1500+1700   141        tone   yes
MF Code 12/STP              900+1700    142        tone   yes
Reserved                                143
ABCD signaling (see below)              144...159  state  no
Reserved                                160...166
Continuity check-tone       2000        167        tone   yes
Continuity verify-tone      1780        168        tone   yes
Reserved                                169...173
Metering pulse                          174        other  no
Trunk unavailable                       175        other  no
MFC Forward 1...15          (Table 13)  176...190  tone   yes
MFC Backward 1...15         (Table 14)  191...205  tone   yes
Table 10: Trunk signalling events
3.5.1 Signalling System No. 5
Signalling System No. 5 (SS No. 5) is defined in ITU-T
Recommendations Q.140 through Q.180 [N-15]. It has two systems of
signals: "line signalling", to acquire and release the trunk, and
"register signalling", to pass digits forward from one switch to the
next.
No. 5 line signalling uses tones at two frequencies: 2400 and 2600
Hz. The tones are used singly for most signals, but together for the
Clear-forward and Release-guard. (This reduces the chance of an
accidental call release due to carried media content looking like one
of the frequencies.) The specific signal indicated by a tone depends
on the stage of call set-up at which it is applied.
No events are defined in support of No. 5 line signalling. However,
implementations MAY use the ABCD events described in section 3.5.4
and shown in Table 10 to propagate SS No. 5 line signals. If they do
so, they MUST use the following mappings. These mappings are based
on an underlying mapping equating A=1 to presence of 2400 Hz signal
and B=1 to presence of 2600 Hz signal in the indicated direction.
- neither signal present: event code 144;
- 2400 Hz present: event code 145;
- 2600 Hz present: event code 146;
- both 2400 and 2600 Hz present: event code 147.
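As an illustration only, the mapping above can be written as a small lookup in both directions. This is a sketch, not part of the specification: the names SS5_LINE_EVENTS, ss5_line_event and ss5_line_frequencies are hypothetical, and the tone detection that produces the two boolean inputs is assumed to exist elsewhere.

```python
# Event codes from the list above, keyed by
# (2400 Hz present, 2600 Hz present), i.e. (A bit, B bit).
SS5_LINE_EVENTS = {
    (False, False): 144,
    (True, False): 145,
    (False, True): 146,
    (True, True): 147,
}


def ss5_line_event(tone_2400: bool, tone_2600: bool) -> int:
    """Sender side: event code for the line tones currently detected."""
    return SS5_LINE_EVENTS[(tone_2400, tone_2600)]


def ss5_line_frequencies(event_code: int) -> tuple:
    """Receiver side: frequencies (Hz) to regenerate for an event code."""
    for (tone_2400, tone_2600), code in SS5_LINE_EVENTS.items():
        if code == event_code:
            pairs = ((2400, tone_2400), (2600, tone_2600))
            return tuple(freq for freq, present in pairs if present)
    raise ValueError("not an SS No. 5 line-signal event code: %d" % event_code)
```

The receiver-side helper returns an empty tuple for event code 144, which corresponds to neither tone being present.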
The initial event report for each signal SHOULD be generated at the
time of recognition as indicated in ITU-T Recommendation Q.141, Table
1 (i.e. 40 ms for "seizing" and "proceed-to-send", 125 ms for all
other signals). The packetization interval following the initial
report SHOULD be chosen with considerations of reliable transmission
given first priority. Note that the receiver must supply its own
volume values for converting these events back to tones. Moreover,
the receiver MAY extend the playout of "seizing" until it has
received the first report of a KP event (see below), so that it has
better control of the interval between ending of the seizing signal
and start of KP playout.
The KP has to be sent beginning 80 +/- 20 ms after the SS No. 5
"seizing" signal has stopped.
No. 5 register signalling uses pairs of tones to convey digits and
signals framing them. The tone combinations and corresponding
signals are shown in Table 11. All signals except KP1 and KP2
are sent for a duration of 55 ms. KP1 and KP2 are sent for a
duration of 100 ms. Inter-signal pauses are always 55 ms.
Lower      Upper Frequency (Hz)
Frequency
(Hz)       900      1100     1300     1500     1700
700        Digit 1  Digit 2  Digit 4  Digit 7  Code 11
900                 Digit 3  Digit 5  Digit 8  Code 12
1100                         Digit 6  Digit 9  KP1
1300                                  Digit 0  KP2
1500                                           ST
Table 11: SS No. 5 Register Signals
The KP signals are used to indicate start of digit signalling. KP1
indicates a call expected to terminate in a national network served
by the switch to which the signalling is being sent. KP2 indicates a
call that is expected to transit through the switch to which the
signalling is being sent, to another international exchange. The end
of digit signalling is indicated by the ST signal. Code 11 or Code
12 following a country code (and possibly another digit) indicates a
call to be directed to an operator position in the destination
country. A Code 12 may be followed by other digits indicating a
particular operator to whom the call is to be directed.
Implementations using the telephone-events payload to carry SS No. 5
register signalling MUST use the following events from Table 10 to
convey the register signals shown in Table 11:
- event code 128 to convey Digit 0
- event codes 129-137 to convey Digits 1 through 9 respectively
- event code 139 to convey KP1
- event code 140 to convey KP2
- event code 141 to convey ST
- event code 138 to convey Code 11
- event code 142 to convey Code 12.
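The following sketch shows one possible way to hold this mapping, together with the frequency pairs of Table 11, in lookup tables. It is illustrative only: the dictionary and function names are hypothetical, and the signal labels used as keys ("KP1", "Code 11" and so on) are simply the labels from Table 11.

```python
# Event codes from Table 10 for the SS No. 5 register signals.
SS5_REGISTER_EVENTS = {str(digit): 128 + digit for digit in range(10)}
SS5_REGISTER_EVENTS.update(
    {"Code 11": 138, "KP1": 139, "KP2": 140, "ST": 141, "Code 12": 142}
)

# (lower, upper) frequency pair in Hz for each signal, read from Table 11.
SS5_REGISTER_FREQUENCIES = {
    "1": (700, 900), "2": (700, 1100), "4": (700, 1300),
    "7": (700, 1500), "Code 11": (700, 1700),
    "3": (900, 1100), "5": (900, 1300), "8": (900, 1500),
    "Code 12": (900, 1700),
    "6": (1100, 1300), "9": (1100, 1500), "KP1": (1100, 1700),
    "0": (1300, 1500), "KP2": (1300, 1700),
    "ST": (1500, 1700),
}


def ss5_register_events(signals):
    """Translate a sequence of register-signal labels to event codes."""
    return [SS5_REGISTER_EVENTS[label] for label in signals]
```

For example, the sequence KP1, 4, 0, ST translates to event codes 139, 132, 128, 141.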
The sending implementation SHOULD send an initial event report for
the KP signals as soon as they are recognized, and MUST send an event
report for all of these signals as soon as they have completed.
These events support an application where the receiving gateway is
intended to capture the received digits for processing. To meet
timing requirements in the case where signalling is to be propagated
from one PSTN segment to another, implementations SHOULD use another
payload type, such as the tones payload type also defined in this
document, to pass both line and register signals. The alternative is
to use the ABCD events for line signals as described earlier.
3.5.2 North American R1
The MF signaling system R1 is mainly used in North America. R1 is
defined in ITU-T Recommendations Q.310-Q.332 [N-16]. Like SS No. 5,
R1 has both line and register signals. The line signals (not
counting Busy and Reorder) are implemented on analogue trunks through
the application of a 2600 Hz tone, and on digital trunks by using
digital channels obtained by bit stealing of the eighth bit of each
channel every sixth frame. Interpretation of the line signals is
state-dependent (as with SS No. 5).
Implementations MAY use the "off-hook" state, event code 64 from
Table 8 to indicate that 2600 Hz tone is playing (binary "1" is
indicated), and "on-hook" state, event code 65, to indicate that no
2600 Hz tone is playing (binary "0" is indicated). If this system is
used, idle state MUST be indicated by a state of "on-hook" at both
ends.
R1 has a signal capacity of 15 codes for forward inter-register
signals but no backward inter-register signals. Each code or digit
is transmitted by a tone pair from a set of 6 frequencies. The R1
register signals consist of KP, ST, and the digits "0" through "9".
The frequencies allotted to the signals are shown in Table 12. Note
that these frequencies are the same as those allotted to the
similarly-named SS No. 5 register signals, except that KP uses the
frequency combination corresponding to KP1 in SS No. 5. Table 12
also shows additional signals used in North American practice: KP',
KP2P, KP3P, STP or ST', ST2P, and ST3P. [Reference to be added when
verified -- probably Telcordia GR-506.]
Lower      Upper Frequency (Hz)
Frequency
(Hz)       900      1100     1300     1500     1700
700        Digit 1  Digit 2  Digit 4  Digit 7  KP3P or ST3P
900                 Digit 3  Digit 5  Digit 8  KP' or STP
1100                         Digit 6  Digit 9  KP
1300                                  Digit 0  KP2P or ST2P
1500                                           ST
Table 12: North American R1 and MF Register Signals
Implementations using the telephone-events payload to carry North
American R1 register signalling MUST use the following events from
Table 10 to convey the register signals shown in Table 12:
- event code 128 to convey Digit 0
- event codes 129-137 to convey Digits 1 through 9 respectively
- event code 139 to convey KP
- event code 141 to convey ST
- event code 142 to convey KP' or STP
- event code 140 to convey KP2P or ST2P
- event code 138 to convey KP3P or ST3P.
Unlike SS No. 5, R1 allows a large tolerance for the time of onset of
register signalling following the recognition of start-dialling line
signal. This means that sending implementations MAY wait to send a
KP event report until the KP has completed.
3.5.3 MFC R2 signaling
International MFC R2 is described in ITU-T Recommendations Q.400-
Q.490 [N-17], but there are many national variants. R2 line signals
are continuous, out-of-band, link by link, and channel associated.
R2 (inter)register signals are multifrequency, compelled, in-band,
end to end, and also channel associated. R2 line signals may be
analog, one-bit digital using the A bit in the 16th channel, or
digital using both A and B bits.
No events are defined in support of R2 line signalling. However,
implementations MAY use the ABCD events described in section 3.5.4
and shown in Table 10 to propagate these signals. If they do so,
they MUST use the following mappings.
For the analog R2 line signals shown in Table 1 of ITU-T
Recommendation Q.411, implementations MUST map the R2 signals as
follows. This mapping is based on an underlying mapping of A bit = 1
when tone is present.
- event code 144 (Table 10) is used to indicate the Q.411 "tone-off"
condition
- event code 145 (Table 10), is used to indicate the Q.411 "tone-on"
condition.
The digital R2 line signals as described by ITU-T Recommendation
Q.421 are carried in two bits, A and B. The mapping between A and B
bit values and event codes SHALL be the same in both directions and
SHALL follow the principles for A and B bit mapping specified in
section 3.5.4.
In R2 signaling, the signaling sequence is initiated from the
outgoing exchange by sending a line "seizing" signal. After the line
"seizing" signal (and the "seizing acknowledgment" signal in R2D), the
signaling sequence continues using MF register signals. ITU-T
Recommendation Q.441 classifies the forward MF register signals into
Groups I and II, the backward MF register signals into Groups A and
B. These groups are significant with respect both to what sort of
information they convey and where they can occur in the signalling
sequence.
R2 is a compelled tone signaling protocol, meaning that one tone is
played until an "acknowledgment or directive for the next tone" is
received which indicates that the original tone should cease. In R2
register signaling, the sequence is initiated from the outgoing
exchange by sending a forward Group I signal. The first forward
signal is typically the first digit of the called number. The
incoming exchange typically replies with a backward Group A-1
indicating to the outgoing exchange to send the next digit of the
called number.
The tones have meaning; however, the meaning varies depending on
where the tone occurs in the signaling. The meaning may also depend
on the country. Thus, to avoid an unmanageable number of events, this
document simply provides means to indicate the 15 forward and 15
backward MF R2 tones (i.e., using event codes 176-190 and 191-205
respectively as shown in Table 10). The frequency pairs for these
tones are shown in Tables 13 and 14.
Lower      Upper Frequency (Hz)
Frequency
(Hz)       1500    1620    1740    1860    1980
1380       Fwd 1   Fwd 2   Fwd 4   Fwd 7   Fwd 11
1500               Fwd 3   Fwd 5   Fwd 8   Fwd 12
1620                       Fwd 6   Fwd 9   Fwd 13
1740                               Fwd 10  Fwd 14
1860                                       Fwd 15
Table 13: R2 Forward Register Signals
Lower      Upper Frequency (Hz)
Frequency
(Hz)       1140     1020     900      780      660
1020       Bkwd 1
900        Bkwd 2   Bkwd 3
780        Bkwd 4   Bkwd 5   Bkwd 6
660        Bkwd 7   Bkwd 8   Bkwd 9   Bkwd 10
540        Bkwd 11  Bkwd 12  Bkwd 13  Bkwd 14  Bkwd 15
Table 14: R2 Backward Register Signals
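Tables 13 and 14 follow a regular two-out-of-six pattern, so the frequency pair for any MFC event code can be computed rather than tabulated. The sketch below is one way to do this. The function names are hypothetical, and the enumeration order is inferred from the tables themselves, not from a normative statement in this document.

```python
FORWARD_FREQS = (1380, 1500, 1620, 1740, 1860, 1980)  # Hz, Table 13
BACKWARD_FREQS = (1140, 1020, 900, 780, 660, 540)     # Hz, Table 14
FORWARD_BASE = 176   # event code of forward signal 1 (Table 10)
BACKWARD_BASE = 191  # event code of backward signal 1 (Table 10)


def _signal_pairs(freqs):
    """List the 15 frequency pairs in signal-number order (1..15).

    The index of the second frequency advances slowest:
    (0,1), (0,2), (1,2), (0,3), (1,3), (2,3), ...
    """
    return [(freqs[i], freqs[j]) for j in range(1, 6) for i in range(j)]


def r2_event_info(event_code):
    """Return (direction, signal number, frequency pair) for an MFC event."""
    if FORWARD_BASE <= event_code < FORWARD_BASE + 15:
        number = event_code - FORWARD_BASE + 1
        return "forward", number, _signal_pairs(FORWARD_FREQS)[number - 1]
    if BACKWARD_BASE <= event_code < BACKWARD_BASE + 15:
        number = event_code - BACKWARD_BASE + 1
        return "backward", number, _signal_pairs(BACKWARD_FREQS)[number - 1]
    raise ValueError("not an MFC R2 register event code: %d" % event_code)
```

For instance, event code 185 is forward signal 10, which Table 13 places at 1740 Hz and 1860 Hz.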
3.5.4 ABCD Transitional Signaling For Digital Trunks
ABCD is a 4-bit signaling system used by digital trunks. For N-state
(N<=16) signaling, the first N values are used. ABCD signaling events
are all mutually exclusive states. The most recent state transition
determines the current state.
The T1 ESF (extended super frame format) allows 2, 4, and 16 state
signalling bit options. These signalling bits are named A, B, C, and
D. Signalling information is sent as robbed bits in frames 6, 12, 18,
and 24 when using ESF T1 framing. A D4 superframe only transmits 4-
state signalling with A and B bits. On the CEPT E1 frame, all
signalling is carried in timeslot 16, and two channels of 16-state
(ABCD) signalling are sent per frame.
Since this information is a state rather than a changing signal,
implementations SHOULD use the following triple-redundancy mechanism,
similar to the one specified in ITU-T Rec. I.366.2 [N-12], Annex L.
At the time of a transition, the same ABCD information is sent 3
times at an interval of 5 ms. If another transition occurs during
this time, then this continues. After a period of no change, the ABCD
information is sent every 5 seconds.
As shown in Table 10, the 16 possible states are represented by event
codes 144 to 159 respectively. Implementations using these event
codes MUST map them to and from the ABCD information based on the
following principles:
1) State numbers are derived from the used subset of ABCD bits by
treating them as a single binary number, where the A bit is the
low-order bit. As examples: if only A and B bits are used, A=0,
B=1, the state number is 2 (binary 10); if all four bits are used,
A=0, B=1, C=0, D=1, then the state number is 10 (binary 1010).
2) State numbers map to event codes by order of increasing value
(i.e., state number 0 maps to event code 144, ..., state number 15
maps to event code 159).
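The two principles above amount to a small bit-packing computation, sketched here for illustration. The function names are hypothetical. Bits are passed in the order A, B, C, D, and only the bits actually in use are supplied.

```python
ABCD_BASE_EVENT = 144  # event code for state number 0 (Table 10)


def abcd_event_code(bits):
    """Map the used ABCD bits (in order A, B, C, D) to an event code.

    The A bit is the low-order bit of the state number.
    """
    if not 1 <= len(bits) <= 4:
        raise ValueError("between one and four bits are expected")
    state = sum((bit & 1) << position for position, bit in enumerate(bits))
    return ABCD_BASE_EVENT + state


def event_code_to_abcd(event_code, used_bits=4):
    """Inverse mapping: return the used bits as a tuple in order A, B, C, D."""
    state = event_code - ABCD_BASE_EVENT
    if not 0 <= state < (1 << used_bits):
        raise ValueError("event code out of range for %d bits" % used_bits)
    return tuple((state >> position) & 1 for position in range(used_bits))
```

With the examples from the text, A=0, B=1 gives state 2 and therefore event code 146, and A=0, B=1, C=0, D=1 gives state 10 and therefore event code 154.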
3.5.5 Continuity Tones
Continuity tones are used for testing circuit continuity during call
setup. Two basic procedures are used. In international practice,
clause 7 of ITU-T Recommendation Q.724 [N-18] describes a procedure
applicable to four-wire trunk circuits, where a single 2000 +/- 20 Hz
check-tone is transmitted from the initiating telephone switch. The
remote switch sets up a loopback, and continuity check passes if the
sending switch can detect the tone on the return path. Q.724 clause
8 describes the procedure for two-wire trunk circuits. The two-wire
procedure involves two tones: a 2000 Hz tone sent in the forward
direction, and a 1780 +/- 20 Hz tone sent in response.
If implementations use the telephone-events payload type to propagate
continuity check-tones, they MUST map these tones to event codes as
follows:
- For four-wire continuity testing, the 2000 Hz check-tone is mapped
to event code 167.
- For two-wire continuity testing, the initial 2000 Hz check-tone is
mapped to event code 167. The 1780 Hz continuity verify-tone is
mapped to event code 168.
3.5.6 Trunk Unavailable Event
This event indicates that the trunk is unavailable for service. The
length of the downtime is indicated in the duration field. The
duration field is set to a value that allows adequate granularity in
describing downtime. A value of 1 second is RECOMMENDED. When the
trunk becomes unavailable, this event is sent with the same timestamp
three times at an interval of 20 ms. If the trunk persists in the
unavailable state at the end of the indicated duration, then it is
retransmitted, preferably with the same redundancy scheme.
Unavailability of the trunk might result from a failure or an
administrative action. This event is used in a stateless manner to
synchronize trunk unavailability between equipment connected through
provisioned RTP trunks. It avoids the unnecessary consumption of
bandwidth in sending a continuous stream of RTP packets with a fixed
payload for the duration of the downtime, as would be required in
certain E1-based applications. In T1-based applications, trunk
conditioning via the ABCD transitional events can be used instead.
3.5.7 Metering Pulse Event
The metering pulse event may be used to transmit meter pulsing for
billing purposes. Since the metering pulse is a discrete event, each
metering pulse event report MUST have both the 'M' and 'E' bits set.
Meter pulsing is normally transmitted by out-of-band means while
conversation is in progress. Senders MUST therefore be prepared to
transmit both the telephone-event and audio payload types
simultaneously. Metering pulse events MUST be retransmitted as
recommended in section 2.5.1.4. It is RECOMMENDED that the
retransmission interval be the lesser of 50 ms and the pulsing rate,
but no less than audio packetization rate.
4. RTP Payload Format for Telephony Tones
4.1 Introduction
As an alternative to describing tones and events by name, as
described in section 2, it is sometimes preferable to describe them
by their waveform properties. In particular, recognition is faster
than for naming signals since it does not depend on recognizing
durations or pauses.
There is no single international standard for telephone tones such as
dial tone, ringing (ringback), busy, congestion ("fast-busy"),
special announcement tones or some of the other special tones, such
as payphone recognition, call waiting or record tone. However, ITU-T
Recommendation E.180 [N-7] notes that across all countries, these
tones share a number of characteristics:
- Telephony tones consist of either a single tone, the addition of
two or three tones or the modulation of two tones. (Almost all
tones use two frequencies; only the Hungarian "special dial tone"
has three.) Tones that are mixed have the same amplitude and do
not decay.
- In-band tones for telephony events are in the range of 25 Hz
(ringing tone in Angola) to 2600 Hz (the tone used for line
signalling in SS No. 5 and R1). The in-band telephone frequency
range is limited to 3400 Hz. R2 defines a 3825 Hz out-of-band
tone for line signalling on analogue trunks. (The piano has a
range from 27.5 to 4186 Hz.)
- Modulation frequencies range from 15 Hz (ANSam tone) to 480 Hz
(Jamaica). Non-integer frequencies are used only for frequencies
of 16 2/3 and 33 1/3 Hz. (These fractional frequencies appear to
be derived from AC power grid frequencies.)
- Tones that are not continuous have durations of less than four
seconds.
- ITU Recommendation E.180 [N-7] notes that different telephone
companies require a tone accuracy of between 0.5 and 1.5%. The
Recommendation suggests a frequency tolerance of 1%.
4.2 Examples of Common Telephone Tone Signals
As an aid to the implementor, Table 15 summarizes some common tones.
The rows labeled "ITU ..." refer to ITU-T Recommendation E.180 [N-7].
Note that there are no specific guidelines for these tones. In the
table, the symbol "+" indicates addition of the tones, without
modulation, while "*" indicates amplitude modulation. The meaning of
these tones is described in section 3.3.
Tone Name                 Frequency  On Period  Off Period
                          (Hz)       (s)        (s)
------------------------  ---------  ---------  ----------
CNG                       1100       0.5        3.0
V.25 CT                   1300       0.5        2.0
CED                       2100       3.3        --
ANS                       2100       3.3        --
ANSam                     2100*15    3.3        --
V.21 "0" bit, channel 1   1180       0.00333    --
V.21 "1" bit, channel 1   980        0.00333    --
V.21 "0" bit, channel 2   1850       0.00333    --
V.21 "1" bit, channel 2   1650       0.00333    --
------------------------  ---------  ---------  ----------
ITU dial tone             425        --         --
U.S. dial tone            350+440    --         --
ITU ringing tone          425        0.67-1.5   3-5
U.S. ringing tone         440+480    2.0        4.0
ITU busy tone             425
U.S. busy tone            480+620    0.5        0.5
ITU congestion tone       425
U.S. congestion tone      480+620    0.25       0.25
Table 15: Examples of telephony tones
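To make the "+" and "*" notation of Table 15 concrete, the following sketch generates samples for a tone made of added frequencies, optionally amplitude-modulated. It is illustrative only. The function name and parameters are hypothetical, and the modulation depth (the depth parameter, defaulting to 0.2) is an assumption, since this document does not specify one. The component tones are given equal amplitude, as noted in section 4.1.

```python
import math


def tone_samples(frequencies, modulation_hz=0.0, duration_s=0.02,
                 sample_rate=8000, depth=0.2):
    """Return floating-point samples in the range -1.0 to 1.0.

    frequencies   -- tones to be added ("+" in Table 15), in Hz
    modulation_hz -- amplitude modulation frequency ("*"), 0 for none
    depth         -- modulation depth; an assumed value, not from this memo
    """
    count = int(round(duration_s * sample_rate))
    samples = []
    for k in range(count):
        t = k / sample_rate
        # "+": add the component sinusoids with equal amplitude.
        value = sum(math.sin(2 * math.pi * f * t) for f in frequencies)
        value /= max(len(frequencies), 1)
        # "*": scale the sum by a slowly varying envelope, normalized so
        # that the peak never exceeds 1.0.
        if modulation_hz:
            envelope = 1.0 + depth * math.sin(2 * math.pi * modulation_hz * t)
            value *= envelope / (1.0 + depth)
        samples.append(value)
    return samples
```

Under these assumptions, tone_samples([350, 440]) approximates 20 ms of U.S. dial tone and tone_samples([2100], modulation_hz=15) approximates 20 ms of ANSam.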
4.3 Use of RTP Header Fields
4.3.1 Timestamp
The RTP timestamp reflects the measurement point for the current
packet. The event duration described in section 4.3.3 extends
forwards from that time.
4.3.2 Marker Bit
The tones payload type uses the marker bit to distinguish the first
RTP packet reporting a given instance of a tone from succeeding
packets for that tone. The marker bit SHOULD be set to 1 for the
first packet, and to 0 for all succeeding packets relating to the
same tone.
4.3.3 Payload Format
Based on the characteristics described above, this document defines
an RTP payload format called "tone" that can represent tones
consisting of one or more frequencies. (The corresponding MIME type
is "audio/tone".) The default timestamp rate is 8000 Hz, but other
rates may be defined. Note that the timestamp rate does not affect
the interpretation of the frequency, just the durations.
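Because durations are counted in timestamp units, the same wall-clock interval maps to a different field value at each timestamp rate. The conversion is sketched below. The function names are hypothetical, and the rounding to the nearest unit is an assumption made for this illustration.

```python
def ms_to_timestamp_units(milliseconds, rate_hz=8000):
    """Convert a duration in milliseconds to timestamp units."""
    return round(milliseconds * rate_hz / 1000)


def timestamp_units_to_ms(units, rate_hz=8000):
    """Convert a duration in timestamp units back to milliseconds."""
    return units * 1000 / rate_hz
```

At the default rate of 8000 Hz, 50 ms corresponds to 400 timestamp units, and the largest value of the 16-bit duration field (65535) corresponds to slightly less than 8.2 seconds.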
In accordance with current practice, this payload format does not
have a static payload type number, but uses an RTP payload type number
established dynamically and out-of-band.
The payload format is shown in Figure 2.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| modulation |T| volume | duration |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
......
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: Payload Format for Tones
The payload contains the following fields:
modulation:
The modulation frequency, in Hz. The field is a 9-bit unsigned
integer, allowing modulation frequencies up to 511 Hz. If there is
no modulation, this field has a value of zero.
T:
If the "T" bit is set (one), the modulation frequency is to be
divided by three. Otherwise, the modulation frequency is taken as
is.
This bit allows frequencies accurate to 1/3 Hz, since modulation
frequencies such as 16 2/3 Hz are in practical use.
volume:
The power level of the tone, expressed in dBm0 after dropping the
sign, with range from 0 to -63 dBm0. (Note: A preferred level
range for digital tone generators is -8 dBm0 to -3 dBm0.)
duration:
The duration of the tone, measured in timestamp units. The tone
begins at the instant identified by the RTP timestamp and lasts
for the duration value. The value of zero is not permitted and
tones with such a duration SHOULD be ignored.
The definition of duration corresponds to that for sample-based
codecs, where the timestamp represents the sampling point for the
first sample.
frequency:
The frequencies of the tones to be added, measured in Hz and
represented as a 12-bit unsigned integer. The field size is
sufficient to represent frequencies up to 4095 Hz, which exceeds
the range of telephone systems. A value of zero indicates silence.
A single tone can contain any number of frequencies. If the
number of frequencies it contains is odd, padding SHALL be added
to bring the packet to a 32-bit boundary. (RFC 3550 [N-4]
requires that padding be set to all zeroes.)
R:
This field is reserved for future use. The sender MUST set it to
zero, the receiver MUST ignore it.
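The field layout described above can be exercised with the following sketch, which packs and unpacks the tone payload only (the RTP header is out of scope). The function names and the dictionary keys returned by unpack_tone are hypothetical. One limitation is left as an assumption: on reception, a padding halfword cannot be distinguished from a frequency value of zero, so the sketch returns every 16-bit slot and leaves the treatment of trailing zeroes to the caller.

```python
import struct
from fractions import Fraction


def pack_tone(frequencies, duration, volume, modulation=0,
              divide_by_three=False):
    """Build a tone payload laid out as in Figure 2."""
    if not 0 <= modulation <= 511:
        raise ValueError("modulation must fit in 9 bits")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 1 <= duration <= 0xFFFF:
        raise ValueError("duration must be 1..65535 timestamp units")
    if any(not 0 <= f <= 4095 for f in frequencies):
        raise ValueError("each frequency must fit in 12 bits")
    # First 16 bits: modulation (9 bits), T (1 bit), volume (6 bits).
    first = (modulation << 7) | (int(divide_by_three) << 6) | volume
    payload = struct.pack("!HH", first, duration)
    # Each frequency takes 16 bits: four reserved zero bits, then 12 bits.
    payload += b"".join(struct.pack("!H", f) for f in frequencies)
    if len(frequencies) % 2:
        payload += b"\x00\x00"  # pad to a 32-bit boundary with zeroes
    return payload


def unpack_tone(payload):
    """Parse a tone payload into a dictionary of field values."""
    if len(payload) < 4 or len(payload) % 4:
        raise ValueError("payload must be a non-empty multiple of 32 bits")
    first, duration = struct.unpack("!HH", payload[:4])
    modulation = first >> 7
    t_bit = (first >> 6) & 1
    slots = struct.unpack("!%dH" % ((len(payload) - 4) // 2), payload[4:])
    return {
        # The T bit divides the modulation frequency by three.
        "modulation_hz": Fraction(modulation, 3 if t_bit else 1),
        "volume_dbm0": -(first & 0x3F),
        "duration": duration,
        # Reserved bits are ignored on reception.
        "frequencies": [slot & 0x0FFF for slot in slots],
    }
```

As a usage example, pack_tone([2100], duration=400, volume=10, modulation=15) describes 50 ms (at 8000 Hz) of a 2100 Hz tone modulated at 15 Hz; the volume value of 10 is arbitrary. A modulation frequency of 16 2/3 Hz is expressed as modulation=50 with divide_by_three=True.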
4.3.4 Optional MIME Parameters
The "rate" parameter describes the sampling rate, in Hertz. The
number is written as a floating point number or as an integer. If
omitted, the default value is 8000 Hz.
4.4 Procedures
This section defines the procedures associated with the tones payload
type.
4.4.1 Sending Procedures
As indicated by the examples in Table 15, the duration of an
individual tone may range from a few milliseconds to a number of
seconds. Timing considerations dictate some general guidelines for
how these two extremes should be handled by the sender. For tones
directed to human listeners, timing is not critical, within a
tolerance of 100 ms or so at either beginning or end. For tones
directed to remote equipment, the most critical aspect of timing is
intra-stream time relationships -- that is, the individual tone
durations and the interval between tones for a related sequence of
them. The timing of the start of playout of a related sequence is
less critical within limits.
In the case of longer-duration tones, implementations SHOULD expect
to generate multiple RTP packets for the same tone instance. The
considerations just enumerated suggest that a packetization interval
in the order of 50 ms may be acceptable, in terms of the initial
delay it imposes on remote playback. Implementations MAY adjust the
packetization interval to suit the nature of the tones being played
out. The packetization interval SHOULD remain constant until the
tone ends in order not to distort playout times through buffer under-
runs.
The RTP timestamp MUST be updated for each packet generated (in
contrast, for instance, to the timestamp for packets carrying
telephone-events). The first RTP packet for a tone SHOULD have the
marker bit set to 1. Subsequent packets for the same tone SHOULD
have the marker bit set to 0, and the RTP timestamp in each
subsequent packet MUST equal the sum of the timestamp and the
duration in the preceding packet. A final RTP packet SHOULD be
generated as soon as the end of the tone is detected, without waiting
for the latest packetization interval to elapse.
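The marker and timestamp rules in the preceding paragraph can be illustrated by computing the header values for a tone whose total length is already known. This is a simplification made for the sketch: a real sender emits reports as the tone progresses and learns its length only when the tone ends. The function name and the tuple layout are hypothetical. The default interval of 400 timestamp units corresponds to 50 ms at 8000 Hz.

```python
def tone_reports(start_timestamp, total_duration, interval=400):
    """Return (marker, timestamp, duration) for each report of one tone.

    All values are in timestamp units. The final report carries whatever
    remains of the tone, so it may be shorter than the interval.
    """
    if not 1 <= interval <= 0xFFFF:
        raise ValueError("interval must fit the 16-bit duration field")
    reports = []
    timestamp = start_timestamp
    remaining = total_duration
    first = True
    while remaining > 0:
        duration = min(interval, remaining)
        reports.append((1 if first else 0, timestamp, duration))
        # Next timestamp = this timestamp + this duration, modulo 2^32
        # because the RTP timestamp is a 32-bit field.
        timestamp = (timestamp + duration) % (1 << 32)
        remaining -= duration
        first = False
    return reports
```

For a 125 ms tone (1000 units) starting at timestamp 1000, this yields reports at timestamps 1000, 1400 and 1800 with durations 400, 400 and 200, and only the first has the marker bit set.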
If the tones are meant for machine consumption, the intervals between
them are potentially critical. Implementations may be aware of this
situation, or may infer it from a heuristic such as that the tones
are less than a second in duration. In this situation, it is
RECOMMENDED that if a tone follows another tone within a period of
100 ms or less, the new tone should be reported as soon as it has
been identified. The suggested 50 ms packetization interval should
be applied to subsequent reports for the same tone.
The above advice applies to tones lasting in the order of 25 ms or
more. Shorter tones, which are likely to be from modems, SHOULD be
reported in batches. The tones payload format requires that each
tone be reported in a separate RTP packet, but it is RECOMMENDED that
multiple RTP packets be reported in the same UDP packet. Individual
tones should be given their actual durations (i.e., from transition
point to transition point) rather than reporting a new tone at each
bit boundary.
4.4.2 Receiving Procedures
Receiving implementations play out the tones as received. When
playing out successive tone reports for the same tone (marker bit is
zero, the RTP timestamp is contiguous with that of the previous RTP
packet, and payload content is identical), the receiving
implementation SHOULD continue the tone without change or a break.
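The continuation test in the preceding paragraph might be sketched as follows. The ToneReport type and the function name are hypothetical. The sketch also makes one interpretive assumption: "payload content is identical" is taken to mean identical modulation, T bit, volume and frequencies, with the duration field excluded, because a final report may legitimately be shorter than the ones before it.

```python
from typing import NamedTuple, Tuple


class ToneReport(NamedTuple):
    """Fields of one received tone report (illustrative structure)."""
    marker: int
    timestamp: int
    duration: int
    modulation: int
    t_bit: int
    volume: int
    frequencies: Tuple[int, ...]


def continues_previous(previous: ToneReport, current: ToneReport) -> bool:
    """True if 'current' extends the tone reported by 'previous'."""
    expected = (previous.timestamp + previous.duration) % (1 << 32)
    same_tone = (
        (previous.modulation, previous.t_bit,
         previous.volume, previous.frequencies)
        == (current.modulation, current.t_bit,
            current.volume, current.frequencies)
    )
    return current.marker == 0 and current.timestamp == expected and same_tone
```

When the function returns True, playout simply continues; otherwise the report is treated as the start of a different tone.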
5. Application Considerations
5.1 Combining Tones and Named Events
Gateways which send signalling events via RTP MAY send both named
signals (section 2) and the tone representation (section 4)
as a single RTP session, using the redundancy mechanism defined in
RFC 2198 [N-2] to interleave the two representations. It is generally
a good idea to send both, since it allows the receiver to choose the
appropriate rendering.
If a gateway cannot present a tone representation, it SHOULD also
send the audio tones as regular RTP audio packets using either the
codec used for regular speech signals or a codec that is known to
carry such signals successfully (e.g., PCMU).
Some low-rate codecs cannot accurately represent certain
tones, such as DTMF.
5.2 Simultaneous Generation of Audio and Events
A source can choose between four approaches:
Events and audio:
The source sends events and encoded audio packets (e.g., PCMU or
the codec used for speech signals) for the same time instant. In
that mode, events are treated as redundant encodings for the
encoded audio stream.
Events only:
The source does not send encoded audio while event tones are
active and only sends named events, without any redundancy beyond
the periodic updates of longer-lasting events.
Events only, with redundancy:
The source does not send encoded audio while event tones are
active. It only sends named events, but uses RFC 2198 [N-2]
redundancy, with named events as both primary and redundant
encodings.
Events and audio, with redundancy:
During an event, the source sends both named events and audio,
using RFC 2198 [N-2] to interleave audio data, current and
redundant named events.
The choices above do not affect the event redundancy mechanism
described in section 2.6.
Note that a period covered by a named event may overlap in time with
a period of audio encoded by other means. This is likely to occur at
the onset of a tone and is necessary to avoid possible errors in the
interpretation of the reproduced tone at the remote end.
Implementations supporting this payload format MUST be prepared to
handle the overlap. It is RECOMMENDED that gateways only render the
encoded tone since the audio may contain spurious tones introduced by
the audio compression algorithm. However, it is anticipated that
these extra tones in general should not interfere with recognition at
the far end.
5.3 Strategies For Handling FAX and Modem Signals
As described in section 3.2, the typical data application involves a
pair of gateways interposed between two terminals, where the
terminals are in the PSTN. The gateways are likely to be serving a
mixture of voice and data traffic, and need to adopt payload types
appropriate to the media flows as they occur. If voice compression
is in use for voice calls, this means that the gateways need the
flexibility to switch to other payload types when data streams are
recognized.
Within the established IETF framework, this implies that the gateways
must negotiate the potential payloads (voice, telephone-events,
tones, voice-band data, T.38 FAX [I-8], and possibly RFC 2793 [I-1]
text and CLEARMODE [I-2] octet streams) as separate payload types.
From a timing point of view, this is most easily done at the
beginning of a call, but results in an over-allocation of resources
at the gateways and in the intervening network.
One alternative is to use named events to buy time while out-of-band
signals are exchanged to update to the new payload type applicable to
the session. Thanks to the events defined in section 3.2, this is a
viable approach for sessions beginning with V.8, V.8bis, T.30, or
V.25 control sequences.
Named data-related events also allow gateways to optimize their
operation when data signals are received in a relatively general
form. One example is the use of V.8-related events to deduce that
the voice-band data being sent in a G.711 payload comes from a
higher-speed modem and therefore requires disabling of echo
cancellers.
All of the control procedures described in section 3.2 eventually
give way to data content. As mentioned above, this content will be
carried by other payload types. Receiving gateways MUST be prepared
to switch to the other payload type within the time constraints
associated with the respective applications. (For several of the
procedures documented below, the sender provides 75 ms of silence
between the initial control signalling and the sending of data
content.) In some cases (V.8bis [N-21], T.30 [N-19]), further
control signalling may happen after the call has been established.
A possible strategy is to send both telephone-events and the data
payload in an RFC 2198 redundancy arrangement. The receiving gateway
then propagates the data payload whenever no event is in progress.
For this to work, the data payload and events (when present) MUST
cover exactly the same time period; otherwise spurious events will be
detected downstream.
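The selection rule in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the Block record, its "event"/"data" labels, and the decision to raise an error on mismatched periods are hypothetical and are not defined by this memo.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """One decoded block of an RFC 2198 packet (hypothetical representation)."""
    kind: str        # "event" or "data"; labels assumed for this sketch
    start: int       # RTP timestamp of the first sample covered
    duration: int    # period covered, in timestamp units
    payload: bytes


def same_period(a: Block, b: Block) -> bool:
    """True if both blocks cover exactly the same time period."""
    return a.start == b.start and a.duration == b.duration


def choose_block(blocks: List[Block]) -> Optional[Block]:
    """Propagate the data block unless an event is in progress."""
    event = next((b for b in blocks if b.kind == "event"), None)
    data = next((b for b in blocks if b.kind == "data"), None)
    if event is None:
        return data
    # Assumed policy: reject packets whose event and data periods differ,
    # since rendering them could produce spurious events downstream.
    if data is not None and not same_period(event, data):
        raise ValueError("event and data blocks cover different periods")
    return event
```

A receiver would call choose_block() once per received packet and render whichever block is returned.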
Note that there are a number of cases where no control sequence will
precede the data content. This is true, for example, for a number of
legacy text terminal types. In such instances, the events defined in
section 3.2.6 in particular MAY be sent to help the remote gateway
optimize its handling of the alternative payload.
5.4 Examples
5.4.1 Use of RFC 2198 Redundancy With Named Events
A typical RTP packet, where the user is just dialing the last digit
of the DTMF sequence "911", is shown in Figure 3. The first digit was
200 ms long (1600 timestamp units) and started at time 0, the second
digit lasted 250 ms (2000 timestamp units) and started at time 800 ms
(6400 timestamp units), the third digit was pressed at time 1.4 s
(11,200 timestamp units) and the packet shown was sent at 1.45 s
(11,600 timestamp units). The frame duration is 50 ms. To make the
parts recognizable, Figure 3 ignores byte alignment. Timestamp and
sequence number are assumed to have been zero at the beginning of the
first digit. In this example, the dynamic payload types 96 and 97
have been assigned for the redundancy mechanism and the telephone
event payload, respectively.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
| 2 |0|0|   0   |0|     96      |              13               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
|                             11200                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
|                           0x5234a8                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   block PT  |     timestamp offset      |   block length    |
|1|     97      |           11200           |         4         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   block PT  |     timestamp offset      |   block length    |
|1|     97      |    11200 - 6400 = 4800    |         4         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   Block PT  |
|0|     97      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     digit     |E R|  volume   |           duration            |
|       9       |1 0|     7     |             1600              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     digit     |E R|  volume   |           duration            |
|       1       |1 0|    10     |             2000              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     digit     |E R|  volume   |           duration            |
|       1       |0 0|    20     |              400              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Example RTP packet after dialing "911"
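As a rough illustration, the octets of the packet in Figure 3 could be assembled as in the following sketch. The helper names are hypothetical and exist only for this example; the field widths follow the figure and RFC 2198. Because the final redundancy header is a single octet, the event payloads start on an odd octet boundary, which is the byte alignment that the figure ignores.

```python
import struct

RED_PT = 96     # payload type assigned to the redundancy format in this example
EVENT_PT = 97   # payload type assigned to telephone-event in this example


def rtp_header(pt, seq, timestamp, ssrc, marker=0):
    """Fixed 12-octet RTP header with V=2, P=0, X=0, CC=0."""
    return struct.pack("!BBHII", 2 << 6, (marker << 7) | pt, seq, timestamp, ssrc)


def red_block_header(pt, ts_offset, length):
    """RFC 2198 header of a redundant block: F=1, PT(7), offset(14), length(10)."""
    if not 0 <= ts_offset < (1 << 14):
        raise ValueError("timestamp offset does not fit in 14 bits")
    if not 0 <= length < (1 << 10):
        raise ValueError("block length does not fit in 10 bits")
    return struct.pack("!I", (1 << 31) | (pt << 24) | (ts_offset << 10) | length)


def red_primary_header(pt):
    """RFC 2198 header of the primary block: F=0 followed by the 7-bit PT."""
    return struct.pack("!B", pt & 0x7F)


def event_payload(event, end, volume, duration):
    """Named event: event(8), E(1), R(1)=0, volume(6), duration(16)."""
    return struct.pack("!BBH", event, (end << 7) | (volume & 0x3F), duration)


packet = (
    rtp_header(RED_PT, seq=13, timestamp=11200, ssrc=0x5234A8)
    + red_block_header(EVENT_PT, 11200 - 0, 4)       # "9", started at 0
    + red_block_header(EVENT_PT, 11200 - 6400, 4)    # first "1", started at 6400
    + red_primary_header(EVENT_PT)                   # second "1", still in progress
    + event_payload(9, end=1, volume=7, duration=1600)
    + event_payload(1, end=1, volume=10, duration=2000)
    + event_payload(1, end=0, volume=20, duration=400)
)
```

The resulting packet is 33 octets long: 12 for the RTP header, 9 for the redundancy headers, and 4 for each of the three event payloads.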
Table 16 shows all packets up to and including the packet shown in
the figure. The last three columns describe the duration fields in
the event payloads. The timestamp offset is not shown. We assume here
that the digits happen to start on a 50 ms multiple, which is
somewhat unlikely.
Time (s)  Event       RTP Seq  Timestamp  Duration
                                          "9"    "1"    "1"
0.00      "9" starts  -        -          -      -      -
0.05                  0        0          400    -      -
0.10                  1        0          800    -      -
0.15                  2        0          1,200  -      -
0.20      "9" ends    3        0          1,600  -      -
0.25                  4        0          1,600  -      -
0.30                  5        0          1,600  -      -
0.80      "1" starts  -        -          -      -      -
0.85                  6        6,400      1,600  400    -
0.90                  7        6,400      1,600  800    -
0.95                  8        6,400      1,600  1,200  -
1.00                  9        6,400      1,600  1,600  -
1.05      "1" ends    10       6,400      1,600  2,000  -
1.10                  11       6,400      1,600  2,000  -
1.15                  12       6,400      1,600  2,000  -
1.40      "1" starts  -        -          -      -      -
1.45                  13       11,200     1,600  2,000  400
Table 16: RTP packets for example
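The packet rows of Table 16 can be reproduced with a short calculation. The sketch below assumes an 8000 Hz clock, the 50 ms frame duration of the example, and two retransmissions of each digit's final packet, which is what the table shows. The function name and the 200 ms length given for the third digit are hypothetical; the third digit only needs to be still in progress at the cut-off time.

```python
CLOCK_HZ = 8000
FRAME_MS = 50
RETRANSMITS = 2   # the table shows each final packet sent three times in total


def to_units(ms):
    """Convert milliseconds to RTP timestamp units."""
    return ms * CLOCK_HZ // 1000


def packet_rows(digits, until_ms):
    """Return (time_ms, seq, timestamp, durations) for each packet sent.

    digits is a list of (start_ms, length_ms) pairs in dialing order.
    durations lists the duration fields of all digits carried in the
    packet, oldest first, as in the last three columns of the table.
    """
    rows = []
    seq = 0
    finished = []   # final durations of digits that have already ended
    for start, length in digits:
        updates = length // FRAME_MS
        for k in range(1, updates + 1 + RETRANSMITS):
            t = start + k * FRAME_MS
            if t > until_ms:
                return rows
            elapsed = min(k * FRAME_MS, length)
            rows.append((t, seq, to_units(start), finished + [to_units(elapsed)]))
            seq += 1
        finished.append(to_units(length))
    return rows


rows = packet_rows([(0, 200), (800, 250), (1400, 200)], until_ms=1450)
```

The last element of rows corresponds to the packet shown in Figure 3: sequence number 13, timestamp 11200, and durations 1600, 2000 and 400.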
5.4.2 Combined Tone and Telephone-event Payloads
The payload formats in sections 2 and 4 can be combined into a single
payload using the method specified in RFC 2198 [N-2]. Figure 4 shows
an example. In that example, the RTP packet combines two "tone" and
one "telephone-event" payloads. The payload types are chosen
arbitrarily as 97 and 98, respectively, with a sample rate of 8000
Hz. Here, the redundancy format has the dynamic payload type 96.
The packet represents a snapshot of U.S. ringing tone, 1.5 seconds
(12,000 timestamp units) into the second "on" part of the 2.0/4.0
second cadence, i.e., a total of 7.5 seconds (60,000 timestamp units)
into the ring cycle. The 440 + 480 Hz tone of this second cadence
started at RTP timestamp 48,000. Four seconds of silence preceded it,
but since RFC 2198 only has a fourteen-bit offset, only 2.05 seconds
(16383 timestamp units) can be represented. Even though the tone
sequence is not complete, the sender was able to determine that this
is indeed ringback, and thus includes the corresponding named event.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |P|X|  CC   |M|     PT      |       sequence number         |
| 2 |0|0|   0   |0|     96      |              31               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
|                             48000                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
|                           0x5234a8                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   block PT  |     timestamp offset      |   block length    |
|1|     98      |           16383           |         4         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   block PT  |     timestamp offset      |   block length    |
|1|     97      |           16383           |         8         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|   Block PT  |
|0|     97      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  event=ring   |0|0| volume=0  |        duration=28383         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  modulation=0   |0| volume=63 |        duration=16383         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0|      frequency=0      |0 0 0 0|      frequency=0      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  modulation=0   |0| volume=5  |        duration=12000         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0|     frequency=440     |0 0 0 0|     frequency=480     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Combining tones and events in a single RTP packet
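The numbers in Figure 4 follow from simple timestamp arithmetic, which the sketch below works through. The tone_payload() helper is hypothetical and simply packs the fields in the order drawn in the figure; the 8000 Hz clock and the field values are those of the example.

```python
import struct

CLOCK_HZ = 8000
MAX_RED_OFFSET = (1 << 14) - 1    # largest RFC 2198 timestamp offset: 16383


def tone_payload(modulation, t_bit, volume, duration, frequencies):
    """Tone payload as drawn in Figure 4: modulation(9), T(1), volume(6),
    duration(16), then one 16-bit word per frequency (4 zero bits + 12 bits)."""
    first = (modulation << 7) | (t_bit << 6) | (volume & 0x3F)
    body = struct.pack("!HH", first, duration)
    for hz in frequencies:
        body += struct.pack("!H", hz & 0x0FFF)
    return body


primary_ts = 6 * CLOCK_HZ          # second "on" period starts at 48000
elapsed = 12000                    # 1.5 s into that period
silence_units = 4 * CLOCK_HZ       # 32000 units of silence preceded it

# Only the last 16383 units of the silence can be addressed by the offset.
offset = min(silence_units, MAX_RED_OFFSET)

silence_block = tone_payload(0, 0, 63, offset, [0, 0])
ring_tone_block = tone_payload(0, 0, 5, elapsed, [440, 480])

# The ring event is given the same clamped offset, so its duration spans
# the representable silence plus the tone heard so far.
event_duration = offset + elapsed
```

Each tone block is 8 octets long, matching the block length of 8 in the figure, and event_duration evaluates to 28383, the duration shown for the ring event.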
6. MIME Registration
6.1 audio/telephone-event
MIME media type name: audio
MIME subtype name: telephone-event
Required parameters: none.
Optional parameters:
The "events" parameter lists the events supported by the
implementation. Events are listed as one or more comma-separated
elements. Each element can either be a single integer or two
integers separated by a hyphen. No white space is allowed in the
argument. The integers designate the event numbers supported by
the implementation.
The "rate" parameter describes the sampling rate, in Hertz. The
number is written as a floating point number or as an integer. If
omitted, the default value is 8000 Hz.
Encoding considerations:
This type is only defined for transfer via RTP [N-4].
Security considerations:
See the "Security Considerations" section (section 7) in this
document.
Interoperability considerations: none
Published specification: This document.
Applications which use this media:
The telephone-event audio subtype supports the transport of events
occurring in telephone systems over the Internet.
Additional information:
1. Magic number(s): N/A
2. File extension(s): N/A
3. Macintosh file type code: N/A
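The syntax of the "events" parameter described above is simple enough to parse in a few lines. The following sketch is one possible reading of that description, not a normative parser. The function name is hypothetical, and the upper bound of 255 is an assumption taken from the 8-bit width of the event field.

```python
from typing import Set


def parse_events(value: str) -> Set[int]:
    """Expand an "events" parameter such as "0-15,66,70" into event numbers."""
    if any(ch.isspace() for ch in value):
        raise ValueError("white space is not allowed in the events parameter")

    def number(text: str) -> int:
        # Accept plain decimal digits only (no sign, no underscores).
        if not (text.isascii() and text.isdigit()):
            raise ValueError("not an event number: %r" % text)
        n = int(text)
        if n > 255:   # assumption: event codes fit the 8-bit event field
            raise ValueError("event number out of range: %d" % n)
        return n

    supported: Set[int] = set()
    for element in value.split(","):
        first, hyphen, last = element.partition("-")
        if hyphen:
            low, high = number(first), number(last)
            if low > high:
                raise ValueError("descending range: %r" % element)
            supported.update(range(low, high + 1))
        else:
            supported.add(number(first))
    return supported
```

For example, parse_events("0-15,66,70") yields the sixteen events 0 through 15 together with events 66 and 70.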
6.2 audio/tone
MIME media type name: audio
MIME subtype name: tone
Required parameters: none
Optional parameters:
The "rate" parameter describes the sampling rate, in Hertz. The
number is written as a floating point number or as an integer. If
omitted, the default value is 8000 Hz.
Encoding considerations:
This type is only defined for transfer via RTP [N-4].
audio/tone MIME body parts contain binary data. A content-
transfer-encoding of "binary" is strongly encouraged for messaging
environments which support binary transport. A content-transfer-
encoding of base-64 (and the associated transformation) is
strongly encouraged for messaging environments which do not
support binary transfer.
Security considerations:
See the "Security Considerations" section (section 7) in this
document.
Interoperability considerations: none
Published specification: This document.
Applications which use this media: The tone audio subtype supports
the transport of pure composite tones, for example those commonly
used in the current telephone system to signal call progress.
Additional information:
1. Magic number(s): N/A
2. File extension(s): N/A
3. Macintosh file type code: N/A
7. Security Considerations
RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP
specification (RFC 3550 [N-4]), and any appropriate RTP profile (for
example RFC 3551 [N-5]). This implies that confidentiality of the
media streams is achieved by encryption. Because the data
compression used with this payload format is applied end-to-end,
encryption may be performed after compression so there is no conflict
between the two operations.
This payload type does not exhibit any significant non-uniformity in
the receiver-side computational complexity of packet processing that
could give rise to a denial-of-service threat.
In older networks employing in-band signaling and lacking appropriate
tone filters, the tones in section 3.5 may be used to commit toll
fraud.
Additional security considerations are described in RFC 2198 [N-2].
A security review of this payload format found no additional
considerations.
8. IANA Considerations
This document defines two new RTP payload formats, named telephone-
event and tone, and associated Internet media (MIME) types,
audio/telephone-event and audio/tone. It also defines a number of
codepoints for events.
Within the audio/telephone-event type, events MUST be registered with
IANA. Registrations are subject to approval by the current chair of
the IETF audio/video transport working group, or by an expert
designated by the transport area director if the AVT group has
closed.
The meaning of new events MUST be documented either as an RFC or an
equivalent standards document produced by another standardization
body, such as ITU-T.
9. Changes Since RFC 2833
9.1 Changes Before The -04 Version
- RFC 2833 had assigned only two code points to the three MF signals
S1, S2 and S3. S3 has been moved to code point 174.
- The test tone descriptions were confusing; now, there are just two
test tone entries, for the 2010 Hz and 1780 Hz tones.
- MFC R2 forward and backward tones were added to the trunk event
list.
- Added the "trunk unavailable" event (Rajesh Kumar).
- Clarified that the duration timestamp is unsigned and that events
exceeding the maximum duration expressible in the duration field
should be split into several events, i.e., with a new start time.
- Distinguished states from events. States are sent with an
estimated duration, and can be superseded if the state changes
before the duration has expired. A special duration value of 0
indicates an infinite duration.
- Clarified how very long events that exceed the maximum expressible
duration value should be handled.
9.2 Changes In The -04 Version
- Updated RTP and AVP RFC references.
- The -04 version includes a major reorganization of prior text and
addition of extensive material, both tutorial and normative, on
the use of the named events. A detailed description of the
changes is provided below in Appendix A.
- Removed the conformance statements present in section 3.3 in
previous versions.
- Moved the flash event from the "DTMF" to the "Basic Line Event"
category.
- Added a number of V.18 event codepoints.
9.3 Changes In The -05 Version
- Added a complete accounting for the changes from RFC 2833.
- Added a backward compatibility provision in 2.5.1.1.
- Deleted second paragraph of 2.5.2.2 because it also appears in
section 5.2.
- Added text to 2.5.2.2 regarding missed Marker bit.
- Deleted redundant text in the middle of the third paragraph of
section 3.1.
- Added a note to the "980 or 1180 Hz" outcome for V.18 in section
3.2.6 indicating that the distinction between 110 and 300 bit/s
V.21 is very difficult.
- Added the Metering Pulse event to Table 10 and new section 3.5.7
to document it.
- Transposed Table 14 to fix the assignment of R2 backward register
signals to frequencies.
- Corrected the statement before Table 7 (V.18 event codepoints) to
indicate that all V.18 codepoints are new since RFC 2833.
- Editorial fixes (mostly correction of cross-references). Fixed
duplicate section number for 2.5.2.3.
- Updated boilerplate to meet current requirements.
- Added the current editor as an author.
10. Acknowledgements
The suggestions of the Megaco working group are gratefully
acknowledged. Detailed advice and comments were provided by Hisham
Abdelhamid, Flemming Andreasen, Fred Burg, Steve Casner, Dan
Deliberato, Fatih Erdin, Bill Foster, Mike Fox, Mehryar Garakani,
Gunnar Hellstrom, Rajesh Kumar, Terry Lyons, Steve Magnell, Zarko
Markov, Kai Miao, Satish Mundra, Kevin Noll, Vern Paxson, Oren Peleg,
Colin Perkins, Raghavendra Prabhu, Moshe Samoha, Todd Sherer, Adrian
Soncodi, Yaakov Stein, Mira Stevanovic, Alex Urquizo and Herb
Wildfeur.
11. Authors
Henning Schulzrinne
Dept. of Computer Science
Columbia University
1214 Amsterdam Avenue
New York, NY 10027
USA
electronic mail: schulzrinne@cs.columbia.edu
Scott Petrack
eDial
USA
electronic mail: scott.petrack@edial.com
Tom Taylor
Nortel Networks
1852 Lorraine Ave.
Ottawa, Ontario
Canada K1H 6Z8
Phone: +1 613 763-1496
E-mail: taylor@nortelnetworks.com
12. References
12.1 Normative References
[N-1] S. Bradner, "Key words for use in RFCs to indicate
requirement levels", RFC 2119, Internet Engineering Task
Force, Mar. 1997.
[N-2] C. E. Perkins, I. Kouvelas, O. Hodson, V. J. Hardman, M.
Handley, J. C. Bolot, A. Vega-Garcia, and S. Fosse-Parisis,
"RTP payload for redundant audio data", RFC 2198, Internet
Engineering Task Force, Sept. 1997.
[N-3] M. Handley and V. Jacobson, "SDP: session description
protocol", RFC 2327, Internet Engineering Task Force, Apr.
1998.
[N-4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson,
"RTP: a transport protocol for real-time applications", RFC
3550, Internet Engineering Task Force, Jul. 2003.
[N-5] H. Schulzrinne, "RTP profile for audio and video conferences
with minimal control", RFC 3551, Internet Engineering Task
Force, Jul. 2003.
[N-6] S. Casner, P. Hoschka, "MIME Type Registration of RTP Payload
Formats", RFC 3555, Internet Engineering Task Force, Jul.
2003.
[N-7] International Telecommunication Union, "Technical
characteristics of tones for the telephone service",
Recommendation E.180/Q.35, ITU-T, Geneva, Switzerland, Mar.
1998.
[N-8] International Telecommunication Union, "Various tones used in
national networks", Recommendation Supplement 2 to
Recommendation E.180, ITU-T, Geneva, Switzerland, Jan. 1994.
This publication has now been replaced by a list periodically
updated and available through the "International Numbering
Resources" link on the ITU-T home page. The latest version
(dated Feb. 2003) appears as an annex to ITU-T Operational
Bulletin 781. While a price is posted for the Operational
Bulletin, the list itself is available at no charge through
the "Lists annexed..." link on the Operational Bulletin page.
[N-9] International Telecommunication Union, "Application of tones
and recorded announcements in telephone services",
Recommendation E.182, ITU-T, Geneva, Switzerland, Mar. 1998.
[N-10] International Telecommunication Union, "Echo suppressors",
Recommendation G.164, ITU-T, Geneva, Switzerland, Nov. 1988.
[N-11] International Telecommunication Union, "Echo cancellers",
Recommendation G.165, ITU-T, Geneva, Switzerland, Mar. 1993.
[N-12] International Telecommunication Union, "AAL type 2 service
specific convergence sublayer for trunking", Recommendation
I.366.2, ITU-T, Geneva, Switzerland, Feb. 1999.
[N-13] International Telecommunication Union, "Technical features of
push-button telephone sets", Recommendation Q.23, ITU-T,
Geneva, Switzerland, Nov. 1988.
[N-14] International Telecommunication Union, "Multifrequency push-
button signal reception", Recommendation Q.24, ITU-T,
Geneva, Switzerland, Nov. 1988.
[N-15] International Telecommunication Union, "Specifications for
signaling system no. 5", Recommendation Q.140-Q.180, ITU-T,
Geneva, Switzerland, Nov. 1988.
[N-16] International Telecommunication Union, "Specifications of
Signalling System R1", Recommendation Q.310-Q.332, ITU-T,
Geneva, Switzerland, Nov. 1988.
[N-17] International Telecommunication Union, "Specifications of
signalling system R2", Recommendation Q.400-Q.490, ITU-T,
Geneva, Switzerland, Nov. 1988.
[N-18] International Telecommunication Union, "Telephone user part
signalling procedures", Recommendation Q.724, ITU-T, Geneva,
Switzerland, Nov. 1988.
[N-19] International Telecommunication Union, "Procedures for
document facsimile transmission in the general switched
telephone network", Recommendation T.30, ITU-T, Geneva,
Switzerland, July 2003.
[N-20] International Telecommunication Union, "Procedures for
starting sessions of data transmission over the public
switched telephone network", Recommendation V.8, ITU-T,
Geneva, Switzerland, Nov. 2000.
[N-21] International Telecommunication Union, "Procedures for the
identification and selection of common modes of operation
between data circuit-terminating equipments (DCEs) and
between data terminal equipments (DTEs) over the public
switched telephone network and on leased point-to-point
telephone-type circuits", Recommendation V.8bis, ITU-T,
Geneva, Switzerland, Nov. 2000.
[N-22] International Telecommunication Union, "Operational and
interworking requirements for DCEs operating in the text
telephone mode", Recommendation V.18, ITU-T, Geneva,
Switzerland, Nov. 2000. See also Recommendation V.18
Amendment 1, Nov. 2002.
[N-23] International Telecommunication Union, "300 bits per second
duplex modem standardized for use in the general switched
telephone network", Recommendation V.21, ITU-T, Geneva,
Switzerland, Nov. 1988.
[N-24] International Telecommunication Union, "1200 bits per second
duplex modem standardized for use in the general switched
telephone network and on point-to-point 2-wire leased
telephone-type circuits", Recommendation V.22, ITU-T, Geneva,
Switzerland, Nov. 1988.
[N-25] International Telecommunication Union, "Automatic answering
equipment and general procedures for automatic calling
equipment on the general switched telephone network including
procedures for disabling of echo control devices for both
manually and automatically established calls", Recommendation
V.25, ITU-T, Geneva, Switzerland, Oct. 1996. See also
Corrigendum 1 to Recommendation V.25, Jul. 2001.
[N-26] Bellcore, "Functional criteria for digital loop carrier
systems", Technical Requirement TR-NWT-000057, Telcordia
(formerly Bellcore), Morristown, New Jersey, Jan. 1993.
12.2 Informative References
[I-1] G. Hellstrom, "RTP Payload for Text Conversation", RFC 2793,
Internet Engineering Task Force, May 2000.
[I-2] R. Kreuter, "RTP Payload for a 64 kbit/s transparent call",
Work in progress, Internet Engineering Task Force, December
2003.
[I-3] International Telecommunication Union, "Pulse code modulation
(PCM) of voice frequencies", Recommendation G.711, ITU-T,
Geneva, Switzerland, Nov. 1988.
[I-4] International Telecommunication Union, "Speech coders : Dual
rate speech coder for multimedia communications transmitting
at 5.3 and 6.3 kbit/s", Recommendation G.723.1, ITU-T,
Geneva, Switzerland, Mar. 1996.
[I-5] International Telecommunication Union, "Coding of speech at 8
kbit/s using conjugate-structure algebraic-code-excited
linear-prediction (CS-ACELP)", Recommendation G.729, ITU-T,
Geneva, Switzerland, Mar. 1996.
[I-6] International Telecommunication Union, "Terminal for low bit-
rate multimedia communication", Recommendation H.324, ITU-T,
Geneva, Switzerland, Mar. 2002.
[I-7] International Telecommunication Union, "ISDN user-network
interface layer 3 specification for basic call control",
Recommendation Q.931, ITU-T, Geneva, Switzerland, May 1998.
[I-8] International Telecommunication Union, "Procedures for real-
time Group 3 facsimile communication over IP networks",
Recommendation T.38, ITU-T, Geneva, Switzerland, Jul. 2003.
[I-9] International Telecommunication Union, "International
interworking for videotex services", Recommendation T.101,
ITU-T, Geneva, Switzerland, Nov. 1994.
[I-10] International Telecommunication Union, "Data protocols for
multimedia conferencing", Recommendation T.120, ITU-T,
Geneva, Switzerland, Jul. 1996.
[I-11] International Telecommunication Union, "A 2-wire modem for
facsimile applications with rates up to 14 400 bit/s",
Recommendation V.17, ITU-T, Geneva, Switzerland, Feb. 1991.
[I-12] International Telecommunication Union, "600/1200-baud modem
standardized for use in the general switched telephone
network", Recommendation V.23, ITU-T, Geneva, Switzerland,
Nov. 1988.
[I-13] International Telecommunication Union, "4800/2400 bits per
second modem standardized for use in the general switched
telephone network", Recommendation V.27ter, ITU-T, Geneva,
Switzerland, Nov. 1988.
[I-14] International Telecommunication Union, "9600 bits per second
modem standardized for use on point-to-point 4-wire leased
telephone-type circuits", Recommendation V.29, ITU-T, Geneva,
Switzerland, Nov. 1988.
[I-15] International Telecommunication Union, "A modem operating at
data signalling rates of up to 33 600 bit/s for use on the
general switched telephone network and on leased point-to-
point 2-wire telephone-type circuits", Recommendation V.34,
ITU-T, Geneva, Switzerland, Feb. 1998.
[I-16] International Telecommunication Union, "A digital modem and
analogue modem pair for use on the Public Switched Telephone
Network (PSTN) at data signalling rates of up to 56 000 bit/s
downstream and up to 33 600 bit/s upstream", Recommendation
V.90, ITU-T, Geneva, Switzerland, Sep. 1998.
[I-17] International Telecommunication Union, "A digital modem
operating at data signalling rates of up to 64 000 bit/s for
use on a 4-wire circuit switched connection and on leased
point-to-point 4-wire digital circuits", Recommendation V.91,
ITU-T, Geneva, Switzerland, May 1999.
[I-18] International Telecommunication Union, "Enhancements to
Recommendation V.90", Recommendation V.92, ITU-T, Geneva,
Switzerland, Nov. 2000.
[I-19] International Telecommunication Union, "Modem-over-IP
networks: Procedures for the end-to-end connection of V-
series DCEs", Recommendation V.150.1, ITU-T, Geneva,
Switzerland, Jan. 2003.
[I-20] R. Kocen and T. Hatala, "Voice over frame relay
implementation agreement", Implementation Agreement FRF.11,
Frame Relay Forum, Foster City, California, Jan. 1997.
[I-21] ANSI/TIA, "A Frequency Shift Keyed Modem for Use on the
Public Switched Telephone Network", ANSI TIA-825-A-2003,
Telecommunications Industry Association, Washington, D.C.
USA, Apr. 2003.
[I-22] J. G. van Bosse, Signaling in Telecommunications Networks.
Telecommunications and Signal Processing, New York, New York:
Wiley, 1998.
[I-23] Siemens, "MFC signaling systems", Jan. 1983. Siemens topics.
Appendix A: Detailed Delta Between This Document And RFC 2833
This appendix provides a detailed analysis of the differences between
RFC 2833 and the present document and assesses their normative
effect. The first table indicates the source for the content of each
section of the present document, highlighting the changes which have
normative effect. The second table indicates text present in RFC
2833 which was dropped from the present document. In both tables,
"paragraph" is abbreviated to "para" and "section" to "sec" for
brevity. "RFC" indicates RFC 2833.
Note that "new" text means text that has been added in any version of
the present document, not just that added in version -04.
Table A-1: Source of text in the present document
Sec. of draft    Source    Normative change
1.1 Sec 1.1 of RFC. Added convention for reference No
numbering and spelling out of abbreviations.
1.2 First para from sec 1 of RFC. Added a couple of No
phrases and the final sentence on out-of-band
events. Second para is new.
1.3 First four paras are new, and so is the final one. No
The remaining content comes from RFC sec 2 (paras 4-
6) and sec 3.1 (paras 4 and 5).
1.4 The opening paragraph, bullet 1), the last sentence No
of the first para of bullet 2), and the second para
of bullet 3) are new. The remainder of bullets 2)
and bullet 3) come from RFC sec 2 (paras 1-3).
2.1 First para is drawn from RFC sec 3.1 para 2 and sec No
3.3 para 6. Second para first part is from RFC sec
3.1 para 2. Last sentence is new.
2.2.1 First two sentences from RFC sec 3.4 para 1. Yes
Remainder is new.
2.2.2 First sentence from RFC sec 3.4 para 2. Remainder Yes
is new.
2.3 From RFC Figure 1. No
2.3.1 New text. IANA registry is new. Reference to Yes
codepoints in sec 3 of draft reflects RFC sec 3.5
"Events".
2.3.2 Originates from RFC sec 3.5 "E Bit". New text Yes
restricting "E" bit to end of last sub-event. Note
on total event duration from original para dropped
in favour of new text in last para of sec 2.5.2.2.
2.3.3 From RFC sec 3.5 "R Bit". No
2.3.4 From RFC sec 3.5 "Volume". Dropped note on range of Yes
volumes of DTMF digits. Normative changes: extended
applicability of volume field to other tones as
indicated in the codepoint tables. Added "MUST"s
for inapplicable events.
2.3.5 First two sentences from RFC sec 3.5 "Duration", but Yes
modified to handle sub-events. Last para from same
source. Remainder dealing with sub-events, states,
and soft states is new.
2.4 Mostly new text. The second para comes from RFC sec No
3.9 with a little rewording. The default rate of
8000 Hz comes from RFC sec 3.1.
2.4.1 First two paras are new, as well as the sentence No
following the "a=fmtp" line. Remainder comes from
RFC sec 3.9. "m=" and "a=rtpmap" lines have been
added to the SDP example. MIME line has been
corrected to include range 0-15 rather than 0-11.
2.5 New introductory text. No
2.5.1.1 New text including SHOULD and MUST terminology. Yes
2.5.1.2 First para from RFC sec 3.1 para 2. No
Second para is based on RFC sec 3.6 paras 1 and 3. No
Third para is new but implied by the RFC definitions No
of the E and Marker bits.
Fourth para dealing with states is new. Yes
Fifth para is roughly based on RFC sec 3.6 para 1. No
Sixth para is from RFC sec 3.6 para 4. No
Seventh para on state updates is new. Yes
Final para on timing is from RFC sec 3.6 para 1. No
2.5.1.3 New text. Yes
2.5.1.4 First para is derived from the last part of RFC sec No
3.6 para 3. Comment on RFC 2198 application is new.
Second para is from RFC sec 3.5 para 7. Last Yes
sentence is new normative behaviour.
2.5.1.5 All new text. Yes
2.5.1.6 First sentence comes from RFC sec 3.6, third para. Yes
Remainder is new.
2.5.2.1 First sentence is from RFC sec 3.9 first para. Yes
Second sentence is new normative text.
2.5.2.2 First para comes from RFC sec 3.1 para 1. No
Paras 2 to 5 come from RFC sec 3.5.8. The second No
sentence of para 2 is new text.
New text beginning "This algorithm is not a license Yes
..." has been added to para 4.
Paras 6, 7, and 8 are new. Yes
Para 9 comes from RFC sec 3.4 para 1. No
Para 10 is new. Yes
2.5.2.3 New. Yes
2.5.2.4 New. No
2.5.2.5 New. Yes
2.6 New text. No
2.6.1 From RFC sec 3.7 para 1, with references added to No
the playout algorithms described in sec 2.5.2.2.
2.6.2 Paras 1 to 4 come from RFC sec 3.7 paras 2-5, with a No
small addition to para 1. The first part of para 5
comes from RFC sec 3.1 para 3. The rest of that
para is new text.
3 New introductory text. No
3.1 Para 1 is new. Para 2 comes from RFC sec 3.3 para 1 No
with some small additions. Para 3 and most of para
4 are new. Para 5 is a reinterpretation of
information given in RFC sec 3.6 para 2. Table 1 is
from RFC sec 3.10, except that the Flash event has
been moved to Table 8.
3.2 New introductory text. Para following dashed Yes
bullets comes from RFC sec 3.3 para 3. The next
para has a "MUST", the only truly normative content
of this section.
3.2.1 All new text down to the table. The last four paras Yes
before the table have normative content.
Table 2 extracts the V.8bis events from RFC sec 3.11 Yes
Table 3. It adds information on frequencies, type,
and volume. The addition of the option to set
Volume is a normative change.
The signal definitions below the table amplify and No
in some cases correct those given in RFC sec 3.11.
3.2.2 All new text before the table. The initial para Yes
amplifies the description of V.21 provided in RFC
sec 3.11. The remaining paras provide new normative
content. Table 3 extracts the V.21 events from RFC
sec 3.11 Table 3. The option to set Volume is a
normative change.
3.2.3 All new text before the table. The two paras Yes
following the bulleted description of protocol flow
add new normative content.
The ANSam and /ANSam events in Table 4 are extracted Yes
from RFC sec 3.11 Table 3. The CI event is new.
The option to set Volume is new.
The ANSam and /ANSam descriptions are modifications Yes
of the descriptions of those events provided in RFC
sec 3.11. The text describing the reporting of
ANSam and /ANSam in the final para defines new
normative behaviour.
3.2.4 Para 1 is new text. The definitions of ANS and CT Yes
are taken from RFC sec 3.11, except that a sentence
was added to distinguish ANS from ANSam. The
remaining paras before the table are new. New
normative text is present in the last two paras
before the table.
Table 5 extracts the V.25-related events from RFC Yes
sec 3.11 Table 3. The option to set Volume is new.
3.2.5 The text before the table is all new, except that Yes
the description of CT is a modification of the
description provided in RFC sec 3.11. The six last
paras before the table have new normative content.
CNG in Table 6 is extracted from RFC sec 3.11 Table Yes
3. CED and /CED are new names as used in T.30 for
the ANS and /ANS signals respectively from that
table. The V.21 preamble flag is a new event. The
option to set Volume is new.
3.2.6 This section and the events it contains are entirely Yes
new. Aside from the table and succeeding event
definitions, there is related normative content in
the second para before the table and in the paras
beginning with bullet 3) following the table.
3.3 The introductory text before the event definitions Yes
is new. New normative content is defined in the
para relating to the Flash event.
The event definitions and corresponding entries in Yes
Table 8 were taken from RFC sec 3.12. The semantics
of 'state' type events and the option to set Volume
are new normative content.
3.4 Most of the first para is new. The second para Yes
comes from RFC sec 3.3 para 4. The event
definitions are new, and implicitly normative. The
events themselves as listed in Table 9 are from RFC
sec 3.13 Table 5. The option to set Volume is new.
3.5 The introductory text before the table is new except Yes
for para 2, which comes from RFC sec 1 para 3. The
requirements regarding "Reserved" and "Unassigned"
codes in the last para before the table are new
normative text.
Table 10 comes from RFC sec 3.14 Table 6. A number Yes
of changes have been made to this table, as follows:
Event    Old                              New
Code
138      MF K0 or KP (start-of-pulsing)   MF Code 11/KP3P/ST3P
139      MF K1                            MF KP/KP1
140      MF K2                            MF KP2/ST2P
141      MF S0 to ST (end-of-pulsing)     MF ST
142      MF S1                            MF Code 12/STP
143      MF S3?? (possible typo)          Reserved
160      Wink                             Reserved
161      Wink off                         Reserved
162      Incoming seizure                 Reserved
163      Seizure                          Reserved
164      Unseize circuit                  Reserved
165      Continuity test                  Reserved
166      Default continuity tone          Reserved
167      Continuity tone (single tone)    Continuity check-tone
168      Continuity test send             Continuity verify-tone
170      Continuity verified              Reserved
171      Loopback                         Reserved
172      Old milliwatt tone (1000 Hz)     Reserved
173      New milliwatt tone (1004 Hz)     Reserved
174      (unassigned)                     Metering pulse
175      (unassigned)                     Trunk unavailable
176-190  (unassigned)                     MFC forward 1...15
191-205  (unassigned)                     MFC backward 1...15
3.5.1 New text. New normative aspects are the mapping of Yes
ABCD events to line signals, recommendations for
line signal reporting in the paragraph following
that, the mapping from event codes to register
signals (which is consistent with Table 10), and the
final two paras of the section.
3.5.2 New text. The suggestion to map line signals to on- Yes
hook/off-hook is consistent with RFC sec 3.14 para
1, and is therefore not a normative change. The
mapping between event codes and register signals
involves detailed changes as described above for
Table 10.
3.5.3 New text accompanying new events. Normative aspects Yes
include the mapping of line signals to ABCD bits,
and the new forward and backward register signalling
events also covered in Table 10.
3.5.4 The first three paras come from RFC sec 3.14 paras Yes
2-4. The remainder is new normative text.
3.5.5 New text. The details contradict the tone Yes
descriptions in RFC sec 3.14 in two particulars: the
check tone frequency is 2000 Hz rather than 2010 Hz
(but the difference is within tolerance), and in the
two-tone system 1780 Hz is the verify tone rather
than the check tone. The differences may be due to
typos in RFC 2833 or because RFC 2833 used national
standards.
The whole area of continuity testing has been much
simplified in the present document.
3.5.6 New text and new event. Yes
3.5.7 New text and new event. Yes
4.1 Taken from RFC sec 4.1. Fourth para slightly No
changed.
4.2 Taken from RFC sec 4.2. No
4.3.1 Taken from RFC sec 4.3. No
4.3.2 New. Yes
4.3.3 Taken from RFC sec 4.4. New normative text has been Yes
added to the descriptions of the "duration" and
"frequency" fields.
4.3.4 New. No
4.4 New introductory text. No
4.4.1 All new, including a number of RECOMMENDED/SHOULDs. Yes
4.4.2 New. Yes
5.1 Paras 1 and 2 from RFC sec 2 para 8. Para 2 has No
been made slightly more detailed. The final comment
is new.
5.2 "Events and audio" and "Events and audio with No
redundancy" alternatives taken from RFC sec 3.2 para
1. Other two alternatives are new. Final para is
taken from RFC sec 3.2 para 2.
5.3 All new. Next-to-last para has a MUST. Yes
5.4.1 Introductory text and figure come from RFC sec 3.8. No
Text and table following the figure are new.
5.4.2 Taken from RFC sec 5. No
6.1 From RFC sec 6.1. No
6.2 From RFC sec 6.2. A note on binary transport has Yes
been added to "Encoding considerations".
7 From RFC sec 7. No
8 From RFC sec 8. Modified to indicate that all of No
the event codes, not just additional ones, are to be
carried in the IANA registry.
9 New. No
10 From RFC sec 9. Updated to acknowledge No
contributions to the present document as well as the
original RFC 2833.
11 From RFC sec 10. Updated and new author added. No
12 From RFC sec 11. Normative and informative Yes
references distinguished and many new references
added in each category.
Table A-2: Text in RFC 2833 dropped from the present document
RFC         Omitted Text                                      Norm
Section                                                       Change?
1           The payload formats described here may be useful  No
            in at least three applications: DTMF handling
            for gateways and end systems, as well as "RTP
            trunks".
1           Thus, the gateway needs to remove the in-band     No
            signaling information from the bit stream. It
            can now either carry it out-of-band in a
            signaling transport mechanism yet to be defined,
            or it can use the mechanism described in this
            memorandum. (If the two trunk end points are
            within reach of the same media gateway
            controller, the media gateway controller can
            also handle the signaling.) Carrying it in-band
            may simplify the time synchronization between
            audio packets and the tone or signal
            information. This is particularly relevant where
            duration and timing matter, as in the carriage
            of DTMF signals.
3.1         The payload format for named telephone events     No
            described below is suitable for both gateway and
            end-to-end scenarios.
3.3         A compliant implementation MUST support the       Yes
            events listed in Table 1 with the exception of
            "flash". If it uses some other, out-of-band
            mechanism for signaling line conditions, it does
            not have to implement the other events.
3.3         Note that end systems that emulate telephones     No
            only need to support the events described in
            Sections 3.10 and 3.12, while systems that
            receive trunk signaling need to implement those
            in Sections 3.10, 3.11, 3.12 and 3.14, since MF
            trunks also carry most of the "line" signals.
            Systems that do not support fax or modem
            functionality do not need to render fax-related
            events described in Section 3.11.
3.5         The range of valid DTMF is from 0 to -36 dBm0     No
Volume      (must accept); lower than -55 dBm0 must be
            rejected (TR-TSY-000181, ITU-T Q.24A).
3.9         Since all implementations MUST be able to         Yes
            receive events 0 through 15, listing these
            events in the a=fmtp line is OPTIONAL.
3.11        [Table 2]                                         No
3.14        Note that trunk can also carry line events        No
            (Section 3.12), as MF signaling does not include
            backward signals [15].
3.14        [Definitions of dropped events Wink, Incoming     Yes
            Seizure, Seizure, Unseize Circuit, Wink Off,
            Continuity Tone Send, Continuity Tone Detect.]
4.5         This payload format uses the reliability          Yes
            mechanism described in Section 3.7.
6.1         All implementations MUST support events 0         Yes
Optional    through 15, so that the parameter can be omitted
Parameters  if the implementation only supports these
            events.
Disclaimer of validity:
"The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org."
Copyright Notice
"Copyright (C) The Internet Society (2004). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights."
Disclaimer
"This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."