Audio/Video Transport (avt)                               H. Schulzrinne
Internet-Draft                                               Columbia U.
Expires: July 14, 2005                                        S. Petrack
                                                                   eDial
                                                               T. Taylor
                                                                  Nortel
                                                        January 13, 2005


    Definition of Events For Modem, FAX, and Text Telephony Signals
                   draft-ietf-avt-rfc2833bisdata-00

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 14, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This memo defines event codes for modem, FAX, and text telephony
   signals when carried in the telephony event RTP payload.  In doing
   so, it extends the set of telephony events defined in RFC XXXX
   (currently draft-ietf-avt-rfc2833bis-07).



Schulzrinne, et al.      Expires July 14, 2005                  [Page 1]


Internet-Draft    Modem, FAX, and Text Telephony Events     January 2005


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1   Terminology  . . . . . . . . . . . . . . . . . . . . . . .  3
     1.2   Overview . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Definitions of Events For Control of Data, FAX, and Text
       Telephony Sessions . . . . . . . . . . . . . . . . . . . . . .  5
     2.1   V.8bis Events  . . . . . . . . . . . . . . . . . . . . . .  5
     2.2   V.21 Events  . . . . . . . . . . . . . . . . . . . . . . .  9
     2.3   V.8 Events . . . . . . . . . . . . . . . . . . . . . . . . 11
     2.4   V.25 Events  . . . . . . . . . . . . . . . . . . . . . . . 14
     2.5   T.30 Events  . . . . . . . . . . . . . . . . . . . . . . . 17
     2.6   V.18 Events  . . . . . . . . . . . . . . . . . . . . . . . 20
   3.  Application Considerations . . . . . . . . . . . . . . . . . . 26
     3.1   Strategies For Handling FAX and Modem Signals  . . . . . . 26
     3.2   Example of V.8 Negotiation . . . . . . . . . . . . . . . . 27
       3.2.1   Simultaneous Transmission of Events and
               Retransmitted Events Using RFC 2198 Redundancy . . . . 31
       3.2.2   Simultaneous Transmission of Events and Voice Band
               Data Using RFC 2198 Redundancy . . . . . . . . . . . . 33
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . . 36
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 37
   6.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 39
     6.1   Normative References . . . . . . . . . . . . . . . . . . . 39
     6.2   Informative References . . . . . . . . . . . . . . . . . . 40
       Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 42
       Intellectual Property and Copyright Statements . . . . . . . . 43


1.  Introduction

1.1  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 [1] and
   indicate requirement levels for compliant implementations.

   In addition to those defined for specific events, this document uses
   the following abbreviations:

   FAX   facsimile
   HDLC  High-level Data Link Control
   PSTN  Public Switched (circuit) Telephone Network

1.2  Overview

   This document extends the set of telephony events defined within the
   framework of RFC XXXX [5] to include the control events and tones
   that can appear on a subscriber line serving a FAX machine, a modem,
   or a text telephony device.  Their purpose is to support negotiation,
   start-up and takedown of FAX, modem, or text telephony sessions and
   transitions between operating modes.  The actual FAX and modem
   content is carried by other payload types (e.g., G.711 [19], T.38
   [21], or, in specific circumstances, V.150.1 [32] modem relay, RFC
   2793 [16], or CLEARMODE [18]).  The events are organized into
   several groups, corresponding to the ITU-T Recommendations in which
   they are defined.

   NOTE: implementors SHOULD NOT rely on the descriptions of the various
   modem protocols described below without consulting the original
   references (generally ITU-T Recommendations).  The descriptions are
   provided in this document to give a context for the use of the events
   defined here.  They frequently omit important details needed for
   implementation.

   The typical application of these events is to allow the Internet to
   serve as a bridge between terminals operating on the PSTN.  This
   application is characterized as follows:

   o  each gateway will act both as sender and as receiver;

   o  time constraints apply to the exchange of signals, making the
      early identification and reporting of events desirable so that
      receiver playout can proceed in timely fashion;

   o  transfer of the events must be reliable.

   In some cases, an implementation may simply ignore certain events,
   such as FAX tones, that do not make sense in a particular
   environment.  Section 2.4.1 of RFC XXXX [5] specifies how an
   implementation can use the SDP "fmtp" parameter within an SDP
   description [3] to indicate which events it is prepared to handle.

   Regardless of which events they support, implementations MUST be
   prepared to send and receive data signals using payload types other
   than telephone-event, simultaneously with the use of the latter.
   This is discussed further in Section 3.1.

   In many cases, continuity of playout is critical.  In general this is
   achieved through a combination of buffering at the receiving end and
   maintenance of a constant packetization interval at the sending end.
   As a general principle, the packetization period for a data session
   SHOULD NOT increase at any point during the session.  It MAY
   decrease: for instance, the first packet of the session MAY cover a
   longer period than subsequent packets, thus providing some buffering
   against late delivery of subsequent packets.  Exceptions to the
   general principle are permissible when the timing requirements
   imposed by the data protocol are not stringent.  V.8bis is an example
   of a protocol where packetization intervals can vary between the
   messages and their preparatory signals because of loose time
   constraints and a large buffering period imposed by the protocol
   itself.  Nevertheless, a near-constant packetization rate is
   desirable within individual V.8bis messages.

   A further word on time constraints is in order.  Time constraints
   governing the duration of tones do not pose a problem when using the
   telephone-events payload type: the payload specifies the duration and
   the receiving gateway can play out the tones accordingly.  Problems
   come when time constraints are specified for the duration of silence
   between tones.  A silent period of "at least x ms" is not a
   problem -- event notifications can be received late, but they can
   still be played out at their specified durations.

   The problem comes with requirements of silence for "exactly" some
   period or for "at most" some period.  The most general constraint of
   the latter type has to do with the operation of echo suppressors
   (ITU-T Rec.  G.164 [6]) and echo cancellers (ITU-T Rec.  G.165 [7]).
   These devices may re-activate after as little as 100 ms of no signal
   on the line.  As a result, in any situation where echo suppressors or
   cancellers must be disabled for signalling to work, tone events must
   be reported quickly enough to ensure that these devices do not become
   re-enabled.  This principle is reflected in the succeeding sections.
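
   As a rough illustration of the 100 ms constraint, the following
   Python sketch scans a list of reported tone events and flags any
   silent gap long enough to risk re-activating echo control.  The
   function name, the tuple layout, and the use of milliseconds are
   assumptions made for this example only; none of them is defined by
   this document.

```python
# Illustrative only: names and data layout are assumptions, not part of
# the telephone-event payload definition.
ECHO_REACTIVATION_MS = 100  # "as little as 100 ms of no signal"


def risky_gaps(events, limit_ms=ECHO_REACTIVATION_MS):
    """Return (gap_start_ms, gap_length_ms) for each silent gap between
    consecutive tone events that lasts limit_ms or longer.

    events: iterable of (start_ms, duration_ms) tuples.
    """
    gaps = []
    ordered = sorted(events)
    for (start, duration), (next_start, _) in zip(ordered, ordered[1:]):
        gap_start = start + duration
        gap = next_start - gap_start
        if gap >= limit_ms:
            gaps.append((gap_start, gap))
    return gaps
```

   A gateway might run a check of this kind over its playout schedule
   to decide whether an event report has to be sent earlier.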




2.  Definitions of Events For Control of Data, FAX, and Text Telephony
   Sessions

2.1  V.8bis Events

   Recommendation V.8bis [11] is a general procedure for two endpoints
   to establish each other's capabilities and to transition between
   different operating modes, both at call startup and after the call
   has been established.  It supports many of the same terminals as V.8
   [10] (see below), but allows more detailed parameter negotiation.  It
   lacks support for some of the older V-series modems defined in V.8,
   but adds capabilities for simultaneous voice and data, H.324 [20]
   multilink, and T.120 [23] conferencing.  The ability to change
   operating modes in mid-call (e.g., to provide alternating voice and
   data) is unavailable in V.8.

   V.8bis distinguishes between "signals" and "messages".  The V.8bis
   "signals" -- ESi/ESr, MRe/MRd, and CRe/CRd -- consist of tones, as
   described in the next paragraph.  The V.8bis "messages" -- MS, CL,
   CLR, ACK(1), ACK(2), NAK(1), NAK(2), NAK(3), and NAK(4) -- consist of
   sequences of bits transported over V.21 [13] modulation.

   Signals are intended to be comprehensible at the receiver even in the
   presence of voice content.  They consist of two tone segments.  The
   first segment consists of a dual frequency tone held for 400 ms, and
   has the function of preparing the receiver and any in-line echo
   suppressor or canceller for what follows.  The specific frequencies
   depend only on whether the signal is from the initiator or the
   responder in a transaction.  The second segment follows immediately
   after the first, and is a single tone held for 100 ms.  The frequency
   used indicates the specific signal of the six signals defined.  The
   complete V.8bis strategy for dealing with echo suppressors or
   cancellers is described in Rec.  V.8bis Appendix III.  The only
   silent period constraints imposed are of the "at least" type, posing
   no difficulties for the use of the telephone-events payload.

   V.8bis messages can be transmitted only when voice content is absent.
   The V.8bis protocol uses signals to ensure that the connection is
   operating in non-voice mode before passing messages.  At the physical
   level, V.8bis messages use V.21 [13] frequency-shift signalling to
   transfer message content.  V.21 is described in the next section.
   V.8bis uses V.21 in half-duplex mode, assigning the lower channel to
   the initiator and the upper channel to the responder.

   The V.21 signals are preceded by a 100 ms preamble of 1650 Hz tone
   (the V.21 upper channel mark tone), which must be omitted if the
   preceding signal was ESi or ESr.  (The second segment of ESr is also
   1650 Hz.)  The sender MAY report this preamble tone either as a
   single extended V.21 upper channel "1" event, or as a series of "1"
   events of normal duration.  It is not necessary to provide an event
   report before the preamble has completed, since the receiver will
   still be playing out the preceding V.8bis signal when this happens
   (see below).

   The events associated with V.8bis signals are described further
   below.  No events are defined for V.8bis messages, only for the
   individual bits transmitted using V.21, so a brief description
   follows:

   o  the V.8bis CL message describes the sending terminal's
      capabilities

   o  the CLR message also describes capabilities, but indicates that
      the sender wants to receive a CL in return

   o  the MS establishes a particular operating mode

   o  the ACK and NAK messages are used to terminate the message
      transactions.

   The V.8bis messages are organized as a sequence of octets.  The first
   two to five octets are HDLC flags (0x7E).  Then comes a message type
   identifier (four bits), a V.8bis version identifier (four bits), zero
   to two more octets of identifying information, followed by zero or
   more information field parameters in the form of bit maps.  An
   individual bit map is one to five octets in length.  Up to 64 octets
   of non-standard information may also be present.  The information
   fields are followed by a checksum and one to three HDLC flags.

   Applications supporting V.8bis signalling using the telephone-events
   payload MUST transfer V.8bis messages in the form of sequences of
   bits, using the V.21 bit events defined in the next section.  The
   transmitted information MUST include the complete contents of the
   message: the initial HDLC flags, the information field, the checksum,
   and the terminating HDLC flags.

   Transmission MUST also include the extra "0" bits added according to
   the procedures of Rec.  V.8bis clause 7.2.8 to prevent false
   recognition of HDLC flags at the receiver.  Implementors should note
   that these extra "0" bits mean that in general V.8bis messages as
   transmitted on the wire will not come out to an even multiple of
   octets.  Sending implementations MAY choose to vary the packetization
   interval to include exactly one octet of information plus any extra
   "0" bits inserted into that octet; the resulting variation will be
   insignificant compared with the amount of buffering caused by the
   preceding V.8bis signal (see below).
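
   The effect of the extra "0" bits on message length can be sketched
   as follows.  The sketch ASSUMES that clause 7.2.8 applies the
   conventional HDLC rule of inserting a "0" after every run of five
   consecutive "1" bits in the content between flags; Rec. V.8bis
   remains the authoritative source for the procedure.  The function
   name is hypothetical.

```python
def stuff_bits(bits):
    """Insert a 0 after every run of five consecutive 1 bits.

    Assumption: conventional HDLC zero-bit insertion, applied to the
    message content between the opening and closing flags (the flags
    themselves are sent unmodified).
    """
    out = []
    run = 0  # length of the current run of 1 bits
    for b in bits:
        out.append(b)
        if b == 1:
            run += 1
            if run == 5:
                out.append(0)  # inserted bit breaks up the run
                run = 0
        else:
            run = 0
    return out
```

   One octet of all ones grows from eight to nine bits under this
   rule, which is why the transmitted message is in general not a
   whole number of octets.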



   The power levels of the V.8bis and V.21 signals are subject to
   national regulation.  Thus it seems suitable to model V.8bis events
   as tones for which the volumes SHOULD be specified by the sender.  If
   the receiver is rendering the V.8bis tones as audio content for
   onward transmission, the receiver MAY use the volumes contained in
   the event reports, or MAY modify the volumes to match downstream
   national requirements.

   Table 1 summarizes the event codes defined for V.8bis signalling in
   this document.  The individual events are described following the
   table.  The sender SHALL set the RTP timestamp for these events to
   indicate the time at which the beginning of segment 1 was detected.
   The sender SHOULD send an interim report for the event as soon as it
   has been identified.  The end of the event SHALL be indicated when
   the end of segment 2 has been detected.

   Note: since the sender cannot identify the specific event until
   segment 2 has been detected, the receiver will receive the first
   report of the event more than 400 ms after it has begun.  This has
   the implication that the receiver MUST be able to buffer more than
   400 ms of the V.21 events which follow (i.e., more than 120 events at
   the nominal V.21 rate of 300 bits/s).
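
   The buffering figure above is simple arithmetic, shown here as a
   small Python helper.  The function name and its integer millisecond
   argument are assumptions made for illustration.

```python
def v21_events_in(interval_ms, bit_rate=300):
    """Number of V.21 bit events needed to cover interval_ms at
    bit_rate bits/s, rounded up to a whole event."""
    return (interval_ms * bit_rate + 999) // 1000
```

   For the 400 ms of segment 1 this gives 120 events; a receiver that
   also waits for the 100 ms of segment 2 would need room for 150.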

   +------------+----------+----------+----------+----------+----------+
   | Event      |    Freq. |    Freq. |    Event |     Type |  Volume? |
   |            |     (Hz) |     (Hz) |     Code |          |          |
   |            |  Segment |  Segment |          |          |          |
   |            |        1 |        2 |          |          |          |
   +------------+----------+----------+----------+----------+----------+
   | CRdi       |   1375 + |     1900 |       41 |     tone |      yes |
   |            |     2002 |          |          |          |          |
   |            |          |          |          |          |          |
   | CRdr       |   1529 + |     1900 |       42 |     tone |      yes |
   |            |     2225 |          |          |          |          |
   |            |          |          |          |          |          |
   | CRe        |   1375 + |      400 |       43 |     tone |      yes |
   |            |     2002 |          |          |          |          |
   |            |          |          |          |          |          |
   | ESi        |   1375 + |      980 |       44 |     tone |      yes |
   |            |     2002 |          |          |          |          |
   |            |          |          |          |          |          |
   | ESr        |   1529 + |     1650 |       45 |     tone |      yes |
   |            |     2225 |          |          |          |          |
   |            |          |          |          |          |          |
   | MRdi       |   1375 + |     1150 |       46 |     tone |      yes |
   |            |     2002 |          |          |          |          |
   |            |          |          |          |          |          |
   | MRdr       |   1529 + |     1150 |       47 |     tone |      yes |
   |            |     2225 |          |          |          |          |
   |            |          |          |          |          |          |
   | MRe        |   1375 + |      650 |       48 |     tone |      yes |
   |            |     2002 |          |          |          |          |
   +------------+----------+----------+----------+----------+----------+

                   Table 1: Events for V.8bis signals
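
   Table 1 can be read as a lookup from the detected frequencies to an
   event code.  The following Python sketch shows one way to express
   that lookup.  It assumes the detector has already snapped the
   measured frequencies to the nominal values in the table; the names
   V8BIS_SIGNALS and classify_v8bis are hypothetical.

```python
# (segment 1 frequencies in Hz, segment 2 frequency in Hz)
#     -> (signal name, event code), transcribed from Table 1.
V8BIS_SIGNALS = {
    ((1375, 2002), 1900): ("CRdi", 41),
    ((1529, 2225), 1900): ("CRdr", 42),
    ((1375, 2002), 400): ("CRe", 43),
    ((1375, 2002), 980): ("ESi", 44),
    ((1529, 2225), 1650): ("ESr", 45),
    ((1375, 2002), 1150): ("MRdi", 46),
    ((1529, 2225), 1150): ("MRdr", 47),
    ((1375, 2002), 650): ("MRe", 48),
}


def classify_v8bis(segment1_hz, segment2_hz):
    """Return (name, event_code) for a V.8bis signal, or None if the
    frequency combination does not appear in Table 1.

    segment1_hz: the two nominal segment 1 frequencies, in any order.
    segment2_hz: the nominal segment 2 frequency.
    """
    key = (tuple(sorted(segment1_hz)), segment2_hz)
    return V8BIS_SIGNALS.get(key)
```

   Because segment 1 only distinguishes initiator from responder, the
   lookup cannot succeed until segment 2 has been identified.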

   CRdi:

      V.8bis [11] Capabilities Request (CRd) signal when used to
      initiate a transaction (Rec.  V.8bis Table 7, transactions 2, 3,
      8, and 9).  This signal requests that the remote station
      transition from telephony mode to an information transfer mode and
      requests the transmission of a capabilities list message by the
      remote station.  CRdi is sent by the calling station at call
      startup (transactions 2 and 3), by the initiating station
      subsequent to call startup (also transactions 2 and 3), or by the
      answering station in response to MRd at call startup if the
      answering station originally issued an MRe and now wants to know
      the calling station's capabilities (transactions 8 and 9).

   CRe:

      V.8bis Capabilities Request (CRe) signal, used specifically by an
      automatic answering station to initiate V.8bis signalling (Rec.
      V.8bis Table 7, transactions 2, 3, 12, and 13).  Like CRdi, this
      signal requests that the remote station transition from telephony
      mode to an information transfer mode and requests the transmission
      of a capabilities list message by the remote station.

   CRdr:

      V.8bis Capabilities Request (CRd) signal when used by the calling
      station as a response to MRe or CRe during call startup to allow
      the calling station to control the outcome of the message
      transaction (Rec.  V.8bis Table 7, transactions 10-13).  Like CRdi
      and CRe, this signal requests that the remote station transition
      from telephony mode to an information transfer mode and requests
      the transmission of a capabilities list message by the remote
      station.

   ESi:

      V.8bis Escape Signal (ESi).  This signal requests that the remote
      station transition from telephony mode to an information transfer
      mode.  ESi is used to precede a message which initiates a V.8bis
      transaction if the transaction is not initiated by MRx or CRx
      (Rec.  V.8bis Table 7, transactions 4, 5, and 6).  It is intended
      to allow the responding station to detect the arrival of an
      initiating signal in the presence of local voice or other audio.
      PSTN connections with network echo suppressors may be accommodated
      by inserting a 1.5 s silent interval between the ESi signal and
      the transmission of the MS, CL or CLR message.

   ESr:

      V.8bis Escape Signal (ESr) has the same meaning as ESi, but is
      used as a response to MRe or CRe to prepare the way for an MS, CL,
      or CLR message (Rec.  V.8bis Table 7, transactions 1, 2, and 3).
      Used in this way, it turns off any announcement being generated by
      the automatic answering station during message transmission.

   MRdi:

      V.8bis Mode Request (MRd) signal when used to initiate a
      transaction (Rec.  V.8bis Table 7, transaction 1).  This signal
      requests that the remote station transition from telephony mode to
      an information transfer mode and requests the transmission of a
      mode select message by the remote station.  In particular, signal
      MRdi is sent by the initiating station during the course of a
      call, or by the calling station at call establishment.

   MRe:

      V.8bis Mode Request (MRe) signal, sent by an automatic answering
      station during call setup.  Like MRdi, this signal requests
      that the remote station transition from telephony mode to an
      information transfer mode and requests the transmission of a mode
      select message by the remote station.

   MRdr:

      V.8bis Mode Request (MRd) signal when used to respond to an MRe in
      order to give the calling station control over the outcome of the
      message transaction (Rec.  V.8bis Table 7, transactions 7, 8, and
      9).  It has the same meaning as MRdi and MRe.

2.2  V.21 Events

   V.21 [13] is a modem protocol offering data transmission at a maximum
   rate of 300 bits/s.  Two channels are defined, supporting full duplex
   data transmission if required.  One channel uses frequencies 980 Hz
   for "1" and 1180 Hz for "0"; the other channel uses frequencies 1650
   Hz for "1" and 1850 Hz for "0".  The modem can operate synchronously
   or asynchronously.

   V.21 is used by other protocols (e.g., V.8bis, V.18, T.30) for
   transmission of control data, and is also used in its own right
   between text terminals.  The telephone-events payload type SHOULD NOT
   be used to carry user data as opposed to control data -- other
   payload types such as RFC 2793 [16], or V.150.1 modem relay [32] are
   more suitable for that purpose.  The V.21 events are summarized in
   Table 2.

   Sending implementations MUST report a completed event for every bit
   transmitted (i.e., rather than at transitions between "0" and "1").
   Bit events are assumed to begin and end with the clock interval for
   the event, neglecting the rise and fall times between bit
   transitions.  (Thus it is important for a gateway to determine the
   actual bit rate in use before beginning to report V.21 events.  This
   consideration only comes into play when dealing with certain types of
   text terminal which use V.21 signals at 110 bits/s rather than the
   nominal 300 bits/s.  See Section 2.6.)

   Implementations SHOULD pack multiple events into one packet, using
   the procedures of section 2.5.1.5 of RFC XXXX [5].  Eight to ten bits
   is a reasonable packetization interval.
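
   As an illustration of packing several bit events into one packet,
   the Python sketch below builds a payload by concatenating one
   four-octet event block per bit.  It ASSUMES the RFC 2833 block
   layout (event, E bit, R bit, 6-bit volume, 16-bit duration) and
   simple concatenation of completed events; the function name and the
   default volume are placeholders, and RTP header construction is
   omitted.

```python
import struct


def pack_events(events, volume=10):
    """Concatenate telephone-event blocks for completed events.

    events: list of (event_code, duration_in_timestamp_units).
    volume: 6-bit volume field; the default is a placeholder value.
    Every block is written with E=1 (event ended) and R=0.
    """
    payload = b""
    for code, duration in events:
        flags_volume = 0x80 | (volume & 0x3F)  # E bit set, R bit clear
        payload += struct.pack("!BBH", code, flags_volume, duration)
    return payload
```

   Ten bit events packed this way occupy 40 octets of payload.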

   Reliable transmission of V.21 events is important, to prevent data
   corruption.  Reporting an event per bit rather than per transition
   increases reporting redundancy and thus reporting reliability, since
   each event completion is transmitted three times as described in
   section 2.5.1.4 of RFC XXXX [5].  To reduce the number of packets
   required for reporting, implementations SHOULD carry the
   retransmitted events using RFC 2198 [2] redundancy encoding.  This is
   illustrated in the example in Section 3.2.1.

   The time to transmit one V.21 bit at the nominal rate of 300 bits/s
   is 3.33 ms, or 26.67 timestamp units at the default 8000 Hz sampling
   rate for the telephone-events payload type.  Because this duration is
   not an integral number of timestamp units, accurate reporting of the
   beginning of the event and the event duration is impossible.  Sending
   gateways SHOULD round V.21 event starting times to the nearest whole
   timestamp unit.

   When sending multiple consecutive V.21 events in a succession of
   packets, the sending gateway MUST ensure that individual event
   durations reported do not cause the last event of one packet to
   overlap with the first event of the next, taking into account the
   respective initial event timestamps.  To accomplish this, the sending
   gateway MUST derive the individual event durations as the succession
   of differences between the event starting times (so that, at 8000 Hz,
   every third event has reported duration 26 units, the remainder 27
   units).

   Where a receiving gateway recognizes that a packet reports a
   consecutive series of V.21 bit events, it SHOULD play them out at a
   uniform rate despite the possible one-timestamp-unit discrepancies in
   their reported spacing and duration.
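
   The sender-side rule described above (round each starting time to
   the nearest timestamp unit, then report durations as differences
   between successive starting times) can be sketched in Python as
   follows.  The function name and argument names are assumptions for
   illustration; integer arithmetic is used so that the rounding is
   exact.

```python
def v21_event_timing(n_bits, first_timestamp=0, bit_rate=300,
                     clock_hz=8000):
    """Return [(start, duration), ...] in timestamp units for n_bits
    consecutive V.21 bit events.

    Starting times are the ideal bit boundaries rounded to the nearest
    whole timestamp unit; each duration is the difference between
    successive starting times, so consecutive events never overlap.
    """
    def boundary(k):
        # round(k * clock_hz / bit_rate) without floating point
        return first_timestamp + (2 * k * clock_hz + bit_rate) // (2 * bit_rate)

    starts = [boundary(k) for k in range(n_bits + 1)]
    return [(starts[k], starts[k + 1] - starts[k]) for k in range(n_bits)]
```

   At 8000 Hz and 300 bits/s the first six events come out with
   durations 27, 26, 27, 27, 26, 27, for a total of 160 units (20 ms).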

   +------------------+------------+-----------+-----------+-----------+
   | Event            |  Frequency |     Event |      Type |   Volume? |
   |                  |       (Hz) |      Code |           |           |
   +------------------+------------+-----------+-----------+-----------+
   | V.21 channel 1,  |       1180 |        37 |      tone |       yes |
   | "0" bit          |            |           |           |           |
   |                  |            |           |           |           |
   | V.21 channel 1,  |        980 |        38 |      tone |       yes |
   | "1" bit          |            |           |           |           |
   |                  |            |           |           |           |
   | V.21 channel 2,  |       1850 |        39 |      tone |       yes |
   | "0" bit          |            |           |           |           |
   |                  |            |           |           |           |
   | V.21 channel 2,  |       1650 |        40 |      tone |       yes |
   | "1" bit          |            |           |           |           |
   +------------------+------------+-----------+-----------+-----------+

                    Table 2: Events for V.21 signals
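
   Table 2 amounts to a mapping from (channel, bit value) to event
   code.  A minimal Python sketch of that mapping follows; the
   dictionary and function names are hypothetical.

```python
# (channel, bit value) -> event code, transcribed from Table 2.
V21_EVENT_CODES = {
    (1, 0): 37,  # channel 1, "0" bit, 1180 Hz
    (1, 1): 38,  # channel 1, "1" bit,  980 Hz
    (2, 0): 39,  # channel 2, "0" bit, 1850 Hz
    (2, 1): 40,  # channel 2, "1" bit, 1650 Hz
}


def bits_to_events(channel, bits):
    """Map a sequence of bit values (0 or 1) sent on the given V.21
    channel (1 or 2) to one event code per bit."""
    return [V21_EVENT_CODES[(channel, b)] for b in bits]
```

   For example, an HDLC flag (0x7E, bits 0 1 1 1 1 1 1 0) sent on
   channel 2 maps to the event codes 39, 40, 40, 40, 40, 40, 40, 39.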


2.3  V.8 Events

   V.8 [10] is an older general negotiation and control protocol,
   supporting startup for the following terminals: H.324 [20]
   multimedia, V.18 [12] text, T.101 [22] videotext, T.30 [9] send or
   receive FAX, and a long list of V-series modems including V.34 [28],
   V.90 [29], V.91 [30], and V.92 [31].  In contrast to V.8bis [11], in
   V.8 only the calling terminal can determine the operating mode.

   V.8 does not use the same terminology as V.8bis.  Rather, it defines
   four signals which consist of bits transferred by V.21 [13] at 300
   bits/s: the call indicator signal (CI), the call menu signal (CM),
   the CM terminator (CJ), and the joint menu signal (JM).  In addition,
   it uses tones defined in V.25 [15] and T.30 [9] (described below),
   and one tone (ANSam) defined in V.8 itself.  The calling terminal
   sends using the V.21 low channel; the answering terminal uses the
   high channel.

   The basic protocol sequence is subject to a number of variations to
   accommodate different terminal types.  A pure V.8 sequence is as
   follows:

   1.  After an initial period of silence, the calling terminal
       transmits the V.8 CI signal.  It repeats CI at least three times,
       continuing with occasional pauses until it detects ANSam tone.
       The CI indicates whether the calling terminal wants to function
       as H.324, V.18, T.30 send, T.30 receive, or a V-series modem.

   2.  The answering terminal transmits ANSam after detecting CI.  ANSam
       will disable any G.164 [6] echo suppressors on the circuit after
       400 ms and any G.165 [7] echo cancellers after one second of
       ANSam playout.

   3.  On detecting ANSam, the calling terminal pauses at least half a
       second, then begins transmitting CM to indicate detailed
       capabilities within the chosen mode.

   4.  After detecting at least two identical sequences of CM, the
       answering terminal begins to transmit JM, indicating its own
       capabilities (or offering an alternative terminal type if it
       cannot support the one requested).

   5.  After detecting at least two identical sequences of JM, the
       calling terminal completes the current octet of CM, then
       transmits CJ to acknowledge the JM signal.  It pauses exactly 75
       ms, then starts operating in the selected mode.

   6.  The answering terminal transmits JM until it has detected CJ.  At
       that point it stops transmitting JM immediately, pauses exactly
       75 ms, then starts operating in the selected mode.


   The CI, CM, and JM signals all consist of a fixed sequence of ten "1"
   bits followed by a signal-dependent pattern of ten synchronization
   bits, followed by one or more octets of variable information.  Each
   octet is preceded by a "0" start bit and followed by a "1" stop bit.
   The combination of the synchronization pattern and V.21 channel
   uniquely identifies the message type.  The CJ signal consists of
   three successive octets of all zeros with stop and start bits but
   without the preceding "1"s and synchronizing pattern of the other
   signals.
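
   The framing just described can be sketched in Python as follows.
   Octets are supplied as lists of eight bits already in transmission
   order, so the sketch makes no assumption about bit ordering within
   an octet.  The CI synchronization pattern is the one quoted in this
   section; the function names are hypothetical.

```python
CI_SYNC = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # '00000 00001'


def frame_octet(octet_bits):
    """Wrap eight bits with a "0" start bit and a "1" stop bit."""
    if len(octet_bits) != 8:
        raise ValueError("an octet must contain exactly eight bits")
    return [0] + list(octet_bits) + [1]


def frame_v8_signal(sync_bits, octets):
    """Build a CI, CM, or JM bit sequence: ten "1" bits, the ten-bit
    synchronization pattern, then each octet with start/stop bits."""
    if len(sync_bits) != 10:
        raise ValueError("synchronization pattern must be ten bits")
    bits = [1] * 10 + list(sync_bits)
    for octet in octets:
        bits += frame_octet(octet)
    return bits


def frame_cj():
    """Build CJ: three all-zero octets with start and stop bits and no
    preamble or synchronization pattern."""
    bits = []
    for _ in range(3):
        bits += frame_octet([0] * 8)
    return bits
```

   Each bit of the resulting sequence would then be reported as one
   V.21 bit event on the channel appropriate to the sending terminal.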

   Applications supporting the telephone-events payload for V.8
   signalling MUST report each instance of a CM, JM, and CJ signal
   respectively as a series of V.21 bit events (Section 2.2).  If both
   gateways support the CI event in Table 3, the synchronization part of
   the CI signal (ten '1's followed by '00000 00001') MAY be reported
   either as a CI event or as a series of V.21 bit events.  If the CI
   event is reported, the remainder of the CI signal (the call function
   octet and its start and stop bit) MUST be reported as a series of
   V.21 bit events.

   As a general principle, the packetization interval SHOULD remain
   constant for V.8 and subsequent data signalling, once a gateway has
   sent out the first packet for that session.  The packetization period
   for the initial packet MAY be larger than that for the remainder of
   the session.

   The overlapping nature of V.8 signalling means that there is no risk
   of silence exceeding 100 ms once ANSam has disabled any echo control
   circuitry.  However, the 75 ms pause before entering operation in the
   selected data mode will require both the calling and the answering
   gateways to recognize the completion of CJ, so they can change from
   playout of telephone-events to playout of the data-bearing payload
   after the 75 ms period.

   +------------+------------------+-----------+-----------+-----------+
   | Event      |   Frequency (Hz) |     Event |      Type |   Volume? |
   |            |                  |      Code |           |           |
   +------------+------------------+-----------+-----------+-----------+
   | ANSam      |        2100 x 15 |        34 |      tone |       yes |
   |            |                  |           |           |           |
   | /ANSam     |  2100 x 15 phase |        35 |      tone |       yes |
   |            |             rev. |           |           |           |
   |            |                  |           |           |           |
   | CI         |      (V.21 bits) |        53 |      tone |       yes |
   +------------+------------------+-----------+-----------+-----------+

                    Table 3: Events for V.8 signals
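
   For orientation, the sketch below packs one of the events of Table 3
   into the four-octet telephone-event payload layout of RFC 2833 [17]
   (event, E bit, R bit, volume, duration).  The function name, the
   sample volume, and the 8000 Hz timestamp clock are assumptions made
   for the example; only the event codes are taken from Table 3.

```python
import struct

V8_EVENT_CODES = {"ANSam": 34, "/ANSam": 35, "CI": 53}  # Table 3


def pack_telephone_event(event, volume, duration, end=False):
    """Pack one telephone-event payload (RFC 2833 layout).

    volume:   power level as 0..63, that is, dBm0 with the sign dropped.
    duration: event duration in RTP timestamp units, 0..65535.
    """
    if not 0 <= event <= 255:
        raise ValueError("event code out of range")
    if not 0 <= volume <= 63:
        raise ValueError("volume out of range")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration out of range")
    # Second octet: E bit, R (reserved, sent as zero) bit, 6-bit volume.
    flags_volume = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", event, flags_volume, duration)


# 50 ms of ANSam at an assumed 8000 Hz clock, reported at -13 dBm0.
payload = pack_telephone_event(V8_EVENT_CODES["ANSam"], 13, 400)
```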

   ANSam:

      Modified answer tone ANSam consists of a sinewave signal at 2100
      Hz, amplitude-modulated by a sine wave at 15 Hz.  The beginning of
      the event is at the later of the beginning of the tone or the
      occurrence of a phase reversal terminating a /ANSam event.  The
      end of the event is at the sooner of the ending of the tone or the
      occurrence of a phase reversal (marking the beginning of a /ANSam
      event).  Phase reversals are used to disable echo cancellation; if
      they are being applied, they occur at 450 ms intervals.

      The modulated envelope for the ANSam tone ranges in amplitude
      between 0.8 and 1.2 times its average amplitude.  The average
      transmitted power is governed by national regulations.  Thus it
      makes sense to indicate the volume of the signal.

   /ANSam:

      /ANSam reports the same physical signal as ANSam, but is reported
      following a phase reversal in that signal.  It begins with the
      phase reversal and ends at the sooner of the end of the tone or
      another phase reversal (marking the beginning of a new ANSam
      event).

   CI:

      CI reports the occurrence of the V.21 bit pattern '11111 11111
      00000 00001' indicating the beginning of a V.8 CI signal.  The
      event begins at the beginning of the first bit and ends at the end
      of the last one.
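
   The ANSam signal described above can be modelled with a few lines of
   Python.  This is a sketch for test-signal generation only: the
   function names are invented here, and the phase reversals are
   assumed to be 180 degree reversals applied at exactly 450 ms
   intervals from the start of the tone.

```python
import math

CARRIER_HZ = 2100.0
ENVELOPE_HZ = 15.0
REVERSAL_PERIOD_S = 0.450


def ansam_envelope(t, mean_amplitude=1.0):
    """Envelope at time t: 0.8 to 1.2 times the mean amplitude."""
    swing = 0.2 * math.sin(2 * math.pi * ENVELOPE_HZ * t)
    return mean_amplitude * (1.0 + swing)


def ansam_sample(t, mean_amplitude=1.0, phase_reversals=True):
    """One sample of the tone at time t seconds after tone start."""
    carrier = math.sin(2 * math.pi * CARRIER_HZ * t)
    # Assumed 180 degree reversals: the carrier sign flips every 450 ms.
    if phase_reversals and int(t // REVERSAL_PERIOD_S) % 2 == 1:
        carrier = -carrier
    return ansam_envelope(t, mean_amplitude) * carrier


# One second of envelope values at an assumed 8000 Hz sampling rate.
envelope = [ansam_envelope(n / 8000.0) for n in range(8000)]
```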

   The sender MUST allow at least one packetization interval worth of
   ANSam to accumulate before sending the initial report.  Moreover, an
   ANSam event packet SHOULD NOT be sent until it is possible to
   discriminate between an ANSam event and an ANS event (see V.25
   events, below).  The sender MUST provide subsequent ANSam updates
   (including transitions to and from /ANSam) at a constant
   packetization interval, to avoid unwanted gaps in playout at the
   receiving end.  If a phase reversal is detected between updates for
   the ANSam, the sender MUST maintain the reporting interval through
   the reversal.  Thus in general the packet that reports the end of
   ANSam will also report an initial segment of /ANSam.  Similar
   behaviour governs the transition from /ANSam to ANSam if another
   phase reversal is detected.
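
   The reporting rule above can be pictured with the following sketch,
   which cuts a tone into constant packetization intervals and lists
   which event each interval has to cover.  The function name and the
   (name, start, end) tuple layout are assumptions of this example, and
   the sketch deliberately ignores the longer initial interval that the
   sender is allowed.

```python
def ansam_reports(tone_ms, reversals_ms, interval_ms):
    """Split an ANSam tone into per-packet event segments.

    tone_ms:      total tone duration, measured from tone start.
    reversals_ms: sorted times of phase reversals within the tone.
    interval_ms:  constant packetization interval.
    Returns a list of packets; each packet is a list of
    (event_name, start_ms, end_ms) segments covered by that packet.
    """
    inside = [r for r in reversals_ms if 0 < r < tone_ms]
    boundaries = [0] + inside + [tone_ms]
    packets = []
    start = 0
    while start < tone_ms:
        end = min(start + interval_ms, tone_ms)
        segments = []
        for i in range(len(boundaries) - 1):
            seg_start = max(start, boundaries[i])
            seg_end = min(end, boundaries[i + 1])
            if seg_start < seg_end:
                # Even-numbered stretches are ANSam, odd ones /ANSam.
                name = "ANSam" if i % 2 == 0 else "/ANSam"
                segments.append((name, seg_start, seg_end))
        packets.append(segments)
        start = end
    return packets


packets = ansam_reports(1000, [450, 900], 200)
```

   With a 200 ms interval and reversals at 450 and 900 ms, the third
   interval covers both the end of ANSam and the first 150 ms of
   /ANSam, which is the situation the text describes.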

   If the sending gateway is reporting a CI signal as the combination of
   the CI event and subsequent V.21 bit events, it MAY wait until the
   end of the CI event before it sends out any report of that event.
   Note that if the CI signal is sent at all, it will typically be
   repeated several times.  After the initial occurrence, the sender
   SHOULD report subsequent CI signals entirely in the form of V.21 bits
   to avoid the risk of gaps in playout between CI signals.  (It is
   unclear whether modem implementations would tolerate such gaps.)

2.4  V.25 Events

   V.25 [15] is a start-up protocol predating V.8 [10] and V.8bis [11].
   It specifies the exchange of two tone signals: CT and ANS.

   CT (calling tone) consists of a series of interrupted bursts of 1300
   Hz tone, on for a duration of not less than 0.5 s and not more than
   0.7 s and off for a duration of not less than 1.5 s and not more than
   2.0 s [15].  Modems not starting with the V.8 CI signal often use
   this tone.
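
   A gateway that has already measured the on and off times of a
   1300 Hz burst could check them against the CT cadence limits quoted
   above along the following lines.  The names are hypothetical, and a
   real detector would normally add its own measurement tolerance.

```python
CT_ON_RANGE_S = (0.5, 0.7)    # burst duration limits
CT_OFF_RANGE_S = (1.5, 2.0)   # silence duration limits


def is_ct_cadence(on_s, off_s):
    """True if one burst/silence pair fits the CT cadence limits."""
    on_ok = CT_ON_RANGE_S[0] <= on_s <= CT_ON_RANGE_S[1]
    off_ok = CT_OFF_RANGE_S[0] <= off_s <= CT_OFF_RANGE_S[1]
    return on_ok and off_ok
```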

   ANS (answering tone) is a 2100 Hz tone used to disable echo
   suppression for data transmission [15], [9].  For FAX machines,
   Recommendation T.30 [9] refers to this tone as called terminal
   identification (CED) answer tone.  ANS differs from V.8 ANSam in
   that, unlike the latter, it has constant amplitude.

   V.25 specifically includes procedures for disabling echo suppressors
   as defined by ITU-T Rec.  G.164 [6].  However, G.164 echo suppressors
   have now for the most part been replaced by G.165 [7] echo
   cancellers, which require phase reversals in the disabling tone (see
   ANSam above).  As a result, Recommendation V.25 was modified in July,
   2001 to say that phase reversal in the ANS tone is required if echo
   cancellers are to be disabled.

   One possible V.25 sequence is as follows:

   1.  The calling terminal starts generating CT as soon as the call is
       connected.

   2.  The called terminal waits in silence for 1.8 to 2.5 s after
       answer, then begins to transmit ANS continuously.  If echo
       cancellers are on the line the phase of the ANS signal is
       reversed every 450 ms.  ANS will not reach the calling terminal
       until the echo control equipment has been disabled.  Since this
       takes about a second it can only happen in the gap between one
       burst of CT and the next.

   3.  Following detection of ANS, the calling terminal may stop
       generating CT immediately or wait until the end of the current
       burst to stop.  In any event, it must wait at least 400 ms (at
       least 1 s if phase reversal of ANS is being used to disable echo
       cancellers) after stopping CT before it can generate the calling
       station response tone.  This tone is modem-specific, not
       specified in V.25.

   4.  The called terminal plays out ANS for 2.6 to 4.0 seconds or until
       it has detected calling station response for 100 ms.  It waits
       55-95 ms (nominal 75 ms) in silence.  (Note that the upper limit
       of 95 ms is rather close to the point at which echo control may
       reestablish itself.)  If the reason for ANS termination was
       timeout rather than detection of calling station response, the
       called terminal begins to play out ANS again to maintain
       disabling of echo control until the calling station responds.

   The events defined for V.25 signalling are shown in Table 4.

   +---------------+---------------+------------+-----------+----------+
   | Event         | Frequency     | Event Code |      Type |  Volume? |
   |               | (Hz)          |            |           |          |
   +---------------+---------------+------------+-----------+----------+
   | Answer tone   | 2100          |         32 |      tone |      yes |
   | (ANS)         |               |            |           |          |
   |               |               |            |           |          |
   | /ANS          | 2100 ph. rev. |         33 |      tone |      yes |
   |               |               |            |           |          |
   | CT            | 1300          |         49 |      tone |      yes |
   +---------------+---------------+------------+-----------+----------+

                    Table 4: Events for V.25 signals

   ANS:

      The beginning of the event is at the later of the beginning of the
      2100 Hz tone or the phase reversal terminating a /ANS event.  The
      end of the event is at the sooner of the ending of the tone or the
      occurrence of a phase reversal (marking the beginning of a /ANS
      event).

   /ANS:

      /ANS reports the same physical signal as ANS, but is reported
      following a phase reversal in that signal.  It begins with the
      phase reversal and ends at the sooner of the end of the tone or
      another phase reversal (marking the beginning of a new ANS event).

   CT:

      The beginning of the CT event is at the beginning of an individual
      burst of the 1300 Hz tone.  The end of the event is at the end of
      that tone burst.  The gateway at the calling end SHOULD use a
      packetization interval smaller than the nominal duration of a CT
      burst, to ensure that CT playout at the called end precedes the
      sending of ANS from that end.

   The initial interval for each occurrence of ANS and CT (following
   silence) SHOULD be larger than the interval for subsequent updates
   for the same tone event.  (The large tolerance on inter-tone-burst
   timing means that changes of packetization interval across the
   periods of silence are safe.)  An initial ANS event packet SHOULD NOT
   be sent until it is possible to discriminate between an ANS event and
   an ANSam event (see V.8 events, above).  After the initial report, to
   preserve the continuity of tone playout, the packetization interval
   should be constant until all reports for that tone event have been
   sent.  If a phase reversal is detected between updates for the ANS,
   the sender MUST maintain the reporting interval through the
   transition to /ANS and vice versa.  Thus except by coincidence the
   packet that reports the end of ANS will also report an initial
   segment of /ANS.
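
   One possible way to tell ANS from ANSam before the first report is
   to look at the envelope of the 2100 Hz tone.  The sketch below
   assumes that a separate detector already supplies non-negative
   envelope magnitudes, that at least one full 15 Hz period has to be
   observed, and that a modulation depth threshold of 0.1 (half the
   nominal ANSam depth of 0.2) is adequate.  All three are assumptions
   of this example and are not requirements of this memo.

```python
import math


def modulation_depth(envelope):
    """Return (max - min) / (max + min) of the envelope samples."""
    hi, lo = max(envelope), min(envelope)
    if hi + lo <= 0:
        raise ValueError("envelope must contain positive magnitudes")
    return (hi - lo) / (hi + lo)


def classify_answer_tone(envelope, sample_rate_hz, threshold=0.1):
    """Return "ANSam", "ANS", or None if too little has been seen."""
    # Wait for one period of the 15 Hz modulation (about 67 ms).
    min_samples = math.ceil(sample_rate_hz / 15)
    if len(envelope) < min_samples:
        return None
    depth = modulation_depth(envelope)
    return "ANSam" if depth >= threshold else "ANS"
```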

2.5  T.30 Events

   ITU-T Recommendation T.30 [9] defines the procedures used by Group
   III FAX terminals.  The pre-message procedures for which the events
   of this section are defined are used to identify terminal
   capabilities at each end and negotiate operating mode.  Post-message
   procedures are also included, to handle cases such as multiple
   document transmission.  FAX terminals support a wide variety of
   protocol stacks, so T.30 has a number of options for control
   protocols and sequences.

   T.30 defines two tone signals used at the beginning of a call.  The
   CNG signal is sent by the calling terminal.  It is a pure 1100 Hz
   tone played in bursts: 0.5 s on, 3 s off.  It continues until timeout
   or until the calling terminal detects a response.  Its primary
   purpose is to let human operators at the called end know that a FAX
   terminal has been activated at the calling end.

   The called terminal waits in silence for at least 200 ms.  It then
   may return CED tone (which is physically identical to V.25 ANS), or
   else V.8 ANSam if it has V.8 capability.  If ANSam is returned and
   the calling terminal has V.8 capability, it transmits CI to begin a
   V.8 negotiation.  Otherwise, the called terminal stops transmitting
   CED after 2.6 to 4 seconds, waits 75 +/- 20 ms in silence, then
   enters the negotiation phase.

   In the negotiation phase the terminals exchange binary messages using
   V.21 signals, high channel frequencies only.  Each message is
   preceded by a one-second (nominal) preamble consisting entirely of
   HDLC flag octets (0x7E).  This flag has the function of preparing
   echo control equipment for the message which follows.

   The pre-transfer messages exchanged using the V.21 coding are:

   Digital Identification Signal (DIS):

      Characterizes the standard ITU-T capabilities of the called
      terminal.  This is always the first message sent.

   Digital Transmit Command (DTC):

      A possible response to the DIS signal by the calling terminal.  It
      requests the called terminal to be the transmitter of the FAX
      content.

   Digital Command Signal (DCS):

      A command message sent by the transmitting terminal to indicate
      the options to be used in the transmission and request that the
      other end prepare to receive FAX content.  This is sent by the
      calling end if it will transmit, or by the called end in response
      to a DTC from the calling end.  It is followed by a training
      signal, also sent by the transmitting terminal.

   Confirmation To Receive (CFR):

      A digital response confirming that the entire pre-message
      procedure including training has been completed and the message
      transmissions may commence.

   Each message may consist of multiple frames bounded by HDLC flags.
   The messages are organized as a series of octets, but like V.8bis,
   T.30 calls for the insertion of extra "0" bits to prevent spurious
   recognition of HDLC flags.

   T.30 also provides for the transmission of control messages after
   document transmission has completed (e.g., to support transmission of
   multiple documents).  The transition to and from the modem used for
   document transmission (V.17 [24], V.27ter [26], V.29 [27], V.34 [28])
   is preceded by 75 ms (nominal) of silence.

   Applications supporting T.30 signalling using the telephone-events
   payload MUST report the preamble preceding each message as a single
   preamble event rather than a series of bits.  However, the T.30
   control message following the preamble MUST be reported in the form
   of a sequence of V.21 bit events (Section 2.2).  The transmitted
   information MUST include the complete contents of the message: the
   initial HDLC flags, the information field, the checksum, the
   terminating HDLC flags, and the extra "0" bits added to prevent false
   recognition of HDLC flags at the receiver.  Implementors should note
   that these extra "0" bits mean that in general T.30 messages as
   transmitted on the wire will not come out to an even multiple of
   octets.
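
   The effect of the extra "0" bits can be seen from a minimal sketch
   of HDLC zero-bit insertion, in which a "0" is inserted after every
   run of five consecutive "1" bits between the flags.  The function
   names are invented for this example, and the bits are assumed to be
   supplied as a string in transmission order.

```python
HDLC_FLAG = "01111110"


def hdlc_stuff(bits):
    """Insert a "0" after every run of five consecutive "1" bits."""
    out = []
    ones = 0
    for bit in bits:
        out.append(bit)
        if bit == "1":
            ones += 1
            if ones == 5:
                out.append("0")
                ones = 0
        else:
            ones = 0
    return "".join(out)


def frame_bits(body_bits, leading_flags=1, trailing_flags=1):
    """Flags are sent unstuffed; only the body between them is stuffed."""
    body = hdlc_stuff(body_bits)
    return HDLC_FLAG * leading_flags + body + HDLC_FLAG * trailing_flags
```

   For example, a body of eight "1" bits becomes nine bits after
   stuffing, so the framed message is 25 bits long rather than a
   multiple of eight.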

   The training signal sent by the transmitting terminal before CFR
   consists of a steady string of V.21 high channel zeros (1850 Hz tone)
   for 1.5 s.  This SHOULD also be sent as a series of V.21 bit events.
   However, if the sending gateway is capable of recognizing the
   transition from the end of the DCS to the start of training, it MAY
   report the training signal as a single extended V.21 (high channel)
   '0' event.  If it does so, it MUST maintain the packetization rate
   used for the preceding V.21 signalling to prevent an unwanted gap in
   the playout of the training signal.

   The events defined for T.30 signalling are shown in Table 5.  The CED
   and /CED events represent exactly the same tone signals as V.25 ANS
   and /ANS, and are given the same codepoints; they are reproduced here
   only for convenience.

   +---------------+---------------+------------+-----------+----------+
   | Event         | Frequency     | Event Code |      Type |  Volume? |
   |               | (Hz)          |            |           |          |
   +---------------+---------------+------------+-----------+----------+
   | CNG (Calling  | 1100          |         36 |      tone |      yes |
   | tone)         |               |            |           |          |
   |               |               |            |           |          |
   | CED (Called   | 2100          |         32 |      tone |      yes |
   | tone)         |               |            |           |          |
   |               |               |            |           |          |
   | /CED          | 2100 ph. rev. |         33 |      tone |      yes |
   |               |               |            |           |          |
   | V.21 preamble | (V.21 bits)   |         54 |      tone |      yes |
   | flag          |               |            |           |          |
   +---------------+---------------+------------+-----------+----------+

                    Table 5: Events for T.30 signals

   CNG:

      The beginning of the CNG event is at the beginning of an
      individual burst of the 1100 Hz tone.  The end of the event is at
      the end of that tone burst.

   CED:

      The beginning of the event is at the later of the beginning of the
      2100 Hz tone or the phase reversal terminating a /CED event.  The
      end of the event is at the sooner of the ending of the tone or the
      occurrence of a phase reversal (marking the beginning of a /CED
      event).


   /CED:

      /CED reports the same physical signal as CED, but is reported
      following a phase reversal in that signal.  It begins with the
      phase reversal and ends at the sooner of the end of the tone or
      another phase reversal (marking the beginning of a new CED event).

   V.21 preamble flag:

      This event begins with the first V.21 bits transmitted after a
      period of silence.  It ends when a pattern of V.21 bits other than
      an HDLC flag is observed.  This means that the V.21 preamble event
      absorbs the initial HDLC flags of the following message.
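
   The boundary of the preamble event can be illustrated as follows.
   The sketch assumes that the demodulated bits are available as a
   string that starts at the first bit of the first flag, and that
   consecutive flags do not share "0" bits; the function name is
   hypothetical.

```python
HDLC_FLAG = "01111110"


def preamble_flag_bits(bits):
    """Count the leading bits that belong to the preamble event."""
    count = 0
    # Consume whole flags until some other 8-bit pattern is seen.
    while bits[count:count + 8] == HDLC_FLAG:
        count += 8
    return count
```

   Everything from the returned offset onward would then be reported as
   V.21 bit events.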

   The initial interval for each occurrence of CNG and CED (following
   silence) SHOULD be larger than the interval for subsequent updates
   for the same tone event.  (The large tolerance in T.30 timings means
   that changes of packetization interval across the periods of silence
   are safe.)  An initial CED event packet SHOULD NOT be sent until it
   is possible to discriminate between a CED event and an ANSam event
   (see V.8 events, above).  After the initial report, to preserve the
   continuity of tone playout, the packetization interval should be
   constant until all reports for that tone event have been sent.  If a
   phase reversal is detected between updates for the CED, the sender
   MUST maintain the reporting interval through the transition to /CED
   and vice versa.  Thus the packet that reports the end of CED will
   generally also report an initial segment of /CED.

   As with CNG and CED, the sending gateway SHOULD use a larger
   packetization interval for the initial report of the V.21 preamble
   flag event than the interval for subsequent updates for the same
   event.  Updates to the preamble event SHOULD be reported at the
   packetization interval that will be used for the subsequent reporting
   of V.21 bit events, to provide a smooth transition to the latter.
   However, since the total preamble duration is in the order of a
   second, it is reasonable to update at a slower rate until perhaps 750
   ms of duration has accumulated, then move to a faster rate.
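
   A possible update schedule for the preamble event is sketched
   below.  Only the one second total and the 750 ms switch point come
   from the text; the 250 ms and 50 ms intervals, and the function
   name, are example values chosen for illustration.

```python
def preamble_update_times(total_ms=1000, slow_ms=250, fast_ms=50,
                          switch_ms=750):
    """Return the cumulative durations (ms) at which updates are sent."""
    times = []
    elapsed = 0
    while elapsed < total_ms:
        # Slow updates first, then the rate used for V.21 bit events.
        step = slow_ms if elapsed < switch_ms else fast_ms
        elapsed = min(elapsed + step, total_ms)
        times.append(elapsed)
    return times


schedule = preamble_update_times()
```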

2.6  V.18 Events

   ITU-T Recommendation V.18 [12] defines a terminal for text
   conversation, possibly in combination with voice.  What follows is a
   description of the use of telephone events for V.18 startup.  In all
   cases, once the startup procedures have been completed, the gateways
   SHOULD use another payload type to transfer the content of the text
   conversation.

   V.18 is intended to interoperate with a variety of legacy text
   terminals, so its start-up sequence can consist of a series of
   stimuli designed to determine what is at the other end.  Two V.18
   terminals talking to each other will use V.8 to negotiate startup,
   and continue at the physical level with V.21 at 300 bits/s carrying
   7-bit characters bounded by start and stop bits.  The V.18 terminal
   is also designed to interoperate with:

   o  Baudot [33], a five bit character encoding nominally operating at
      45.45 or 50 bits/s with frequencies 1800 Hz = "0", 1400 Hz = "1";

   o  Q.23 [8] (DTMF), which uses combinations of "*" and "#" as escapes
      to achieve a full repertoire of characters; these combinations are
      documented in V.18 Annex B;

   o  EDT, which is V.21 [13] operating at 110 bits/s in half-duplex
      mode (lower channel only); characters are 7 bit IA5 plus initial
      start bit, trailing parity bit, and two stop bits;

   o  Bell 103 mode (documented in Recommendation V.18 Annex D), which
      is structurally similar to V.21, but uses different frequencies:
      lower channel, 1070 Hz = "0", 1270 Hz = "1"; upper channel, 2025
      Hz = "0", 2225 Hz = "1"; characters are US ASCII framed by one
      start bit, one trailing parity bit, and one stop bit;

   o  V.23 [25] based videotex, in Minitel and Prestel versions.  V.23
      offers a forward channel operating at 1200 bits/s if possible
      (2100 Hz = "0", 1300 Hz = "1") or otherwise at 600 bits/s (1700 Hz
      = "0", 1300 Hz = "1"), and a 75 bits/s backward channel which is
      transmitting 390 Hz (continuous "1"s) except when "0" is to be
      transmitted (450 Hz);

   o  a non-V.18 text terminal using V.21 [13] at 300 bits/s.
      Characters are 7 bit national (e.g., US ASCII) with a start bit,
      parity, and one stop bit.
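
   The character framings listed above translate directly into
   character rates.  The following sketch covers only the modes for
   which the list gives complete framing, and it assumes that the Bell
   103 US ASCII characters are 7 bits long; the table and function
   names are invented for the example.

```python
# (line rate in bits/s, start bits, data bits, parity bits, stop bits)
FRAMING = {
    "EDT": (110, 1, 7, 1, 2),
    "Bell 103": (300, 1, 7, 1, 1),   # 7-bit US ASCII assumed
    "V.21 text": (300, 1, 7, 1, 1),
}


def bits_per_character(mode):
    """Total line bits used to carry one character."""
    _, start, data, parity, stop = FRAMING[mode]
    return start + data + parity + stop


def characters_per_second(mode):
    """Maximum character rate at the nominal line rate."""
    return FRAMING[mode][0] / bits_per_character(mode)
```

   Under these assumptions EDT carries 11 bits per character, or 10
   characters per second at 110 bits/s.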

   The startup sequences for all these different terminal types are
   naturally quite different.  The V.18 initial startup sequence
   addresses itself to V.8-capable terminals and V.21 terminals and, by
   the combination of signals, to V.23 videotex terminals.  During the
   initial startup sequence the V.18 terminal listens for frequency
   responses characterizing the other terminal types.  If it does not
   make contact in the preliminary step it probes for each type
   specifically.  By the nature of the application, V.18 has been
   designed to provide an extremely robust startup capability.

   V.18 startup is described in more detail below.  The point here is
   that gateways intending to serve V.18 MUST be prepared to transfer
   information using payload types other than telephone-events from the
   start of the session.  Events have been defined as shown in Table 6
   to allow the sending gateway to indicate the nature of the modulated
   content it is receiving.  However, the alternative payload type used
   to transfer the content may (for example, in the case of RFC 2793
   [16]) be independent of the type of modulation received at the
   sending gateway.  A receiving gateway MUST NOT rely on the receipt of
   a V.18-related event to control playout at its end if content is
   available in another payload type.

   Note that none of the codepoints in Table 6 were defined in RFC 2833
   [17].

   +----------+----------+------------+----------+----------+----------+
   | Event    | Bit Rate | Frequency  |    Event |     Type |  Volume? |
   |          | bits/s   | (Hz)       |     Code |          |          |
   +----------+----------+------------+----------+----------+----------+
   | ANS2225  | N/A      | 2225       |       52 |     tone |      yes |
   |          |          |            |          |          |          |
   | V21L110  | 110      | 980/1180   |       55 |    other |       no |
   |          |          |            |          |          |          |
   | B103L300 | 300      | 1070/1270  |       56 |    other |       no |
   |          |          |            |          |          |          |
   | V23Main  | 600/1200 | 1700-2100/ |       57 |    other |       no |
   |          |          | 1300       |          |          |          |
   |          |          |            |          |          |          |
   | V23Back  | 75       | 450/390    |       58 |    other |       no |
   |          |          |            |          |          |          |
   | Baud4545 | 45.45    | 1800/1400  |       59 |    other |       no |
   |          |          |            |          |          |          |
   | Baud50   | 50       | 1800/1400  |       60 |    other |       no |
   +----------+----------+------------+----------+----------+----------+

                 Table 6: Events for V.18 interworking

   ANS2225:

      This 2225 Hz answer tone is described in ITU-T Recommendation
      V.18, Annex D [12] for Bell 103 class modems operating in the text
      telephone mode.  It is also referred to in ITU-T Recommendation
      V.22 [14].  This is a pure tone with no amplitude modulation and
      no semantics attached to phase reversals, if there are any.  It is
      necessary to accommodate it for completeness, and for compliance
      with various legal ordinances.  A distinct codepoint was allocated
      to this event since it must be differentiated from the normal,
      2100 Hz answer tone when reproduced at the far-end gateway.


   V21L110:

      indicates that the sending device has detected V.21 modulation
      operating in the lower channel at 110 bits/s.

   B103L300:

      indicates that the sending device has detected Bell 103 class
      modulation operating in the low channel at 300 bits/s.

   V23Main:

      indicates that the sending device has detected V.23 modulation
      operating in the high speed channel.

   V23Back:

      indicates that the sending device has detected V.23 modulation
      operating in the 75 bit/s back-channel.

   Baud4545:

      indicates that the sending device has detected Baudot modulation
      operating at 45.45 bits/s.

   Baud50:

      indicates that the sending device has detected Baudot modulation
      operating at 50 bits/s.

   The V.18 startup procedure for the calling terminal requires it to
   transmit a V.18 sequence in the following cycle:

   1.  Silence for one second.

   2.  Repeat the following steps three times:

       A.  Four repetitions of V.8 CI (Table 3) on the V.21 low channel,
           without preamble.  The call function octet for a V.18 text
           terminal is defined in V.8 to be '01000 00101'.

       B.  Silence for two seconds.

   3.  Play out the XCI signal, a three second string of V.23 bit
       patterns defined in clause 3.13 of Recommendation V.18 and using
       the V.23 1200 bits/s upper channel.  The sending gateway MUST
       provide the pattern using an alternate payload type, but MAY also
       send the V23Main event defined in Table 6 for the duration of XCI
       playout.  The receiving gateway MUST be prepared to play out the
       pattern from that alternate payload type without relying on
       receipt of the V23Main event.
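
   The calling cycle in the numbered steps above can be laid out as a
   simple timeline.  In this sketch the CI duration of 100 ms is an
   assumption (30 bits sent at 300 bits/s), the repetition of steps 2
   and 3 is modelled by the hypothetical "rounds" parameter, and the
   function name is invented for the example.

```python
def v18_calling_cycle(rounds=1, ci_ms=100):
    """Return a list of (signal, duration_ms) for the calling cycle."""
    timeline = [("silence", 1000)]            # step 1
    for _ in range(rounds):
        for _ in range(3):                    # step 2, three times
            timeline.extend([("CI", ci_ms)] * 4)
            timeline.append(("silence", 2000))
        timeline.append(("XCI", 3000))        # step 3
    return timeline


cycle = v18_calling_cycle()
total_ms = sum(duration for _, duration in cycle)
```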

   The second and third steps are repeated until a response is detected.
   The following responses are possible:

   o  2100 Hz modulated (ANSam) as defined in ITU-T Recommendation V.8;
      this would indicate a V.8-capable terminal.  The V.18 terminal
      completes a V.8 negotiation to start up.  The gateways MUST use
      the events as defined for V.8 to sustain this negotiation.

   o  2100 Hz (ANS) as defined in ITU-T V.25; this could indicate a
      V.18, V.21 (300 bits/s), or V.23 terminal.  The calling V.18
      terminal transmits a 40-bit pattern (TXP) using the V.21 low
      channel and monitors the frequencies returned.  The calling end
      gateway SHOULD send the TXP pattern as a sequence of V.21 low
      channel bit events.  An answering V.18 terminal will return TXP,
      so the calling end gateway MUST be prepared to play the
      corresponding V.21 sequence back to the calling terminal.

   o  2225 Hz; this indicates a Bell 103 class terminal in answer mode.
      The gateway at the answering end MUST report this as the ANS2225
      event defined in this section.  The event begins when the 2225 Hz
      tone is detected.  Event updates should be provided at reasonable
      intervals until the tone is taken away.

   o  1300 Hz; provided this persists for at least 1.7 s, it indicates a
      V.23-based terminal operating at 600 or 1200 bits/s.  The calling
      terminal will enter V.23 mode, transmitting on the 75 bits/s V.23
      back-channel.  The gateway at the answering end, which receives
      the 1300 Hz tone, MAY also report the V23Main event.  When the
      calling V.18 terminal responds, the gateway at the calling end MAY
      also report the V23Back event.

   o  1650 Hz; if this persists at least 500 ms, it indicates a V.21
      (300 bits/s) terminal.  The calling V.18 terminal will enter into
      that mode of operation.

   o  1400 or 1800 Hz; this indicates a Baudot terminal.  The calling
      terminal will determine the line rate and enter into Baudot mode.
      Either gateway MAY send the Baud4545 or Baud50 event as applicable
      if and when it identifies the nature of the signals being passed.

   o  DTMF tones; these indicate a DTMF terminal.  The calling terminal
      will enter DTMF mode.

   o  980 or 1180 Hz; these indicate a V.21-based terminal running at
      either 110 or 300 bits/s, and using the low channel.  The calling
      terminal does timing to make the distinction.  Note that this is
      very difficult, and in practice the sending gateway is often
      informed in advance (e.g., through provisioning) what line speed
      is being used.  If it observes continuous 980 Hz for at least
      1.5 s, the calling terminal enters V.21 (300 bit/s) mode using the
      high channel for transmission.  The gateway at the answering end
      SHOULD NOT use V.21 events to report the initial signals from the
      answering terminal.  The tones payload type defined in this
      document MAY be used instead.  A gateway receiving V.21 signals at
      110 bits/s MAY report the V21L110 event once it has made a
      definitive determination of the line speed.

   o  1270 Hz; this indicates a Bell 103 terminal operating in calling
      mode (lower channel).  The V.18 terminal enters Bell 103 mode
      using the higher channel.  The gateways MUST transmit the Bell 103
      modem content using an alternative payload type, and MAY report
      the B103L300 event where it applies to the modulation received
      from the terminal at their end.  (Table 6 defines no event for the
      Bell 103 upper channel.)

   o  390 Hz (only when sending XCI); this indicates a V.23 terminal
      using the 75 bits/s channel.  The V.18 terminal enters V.23 mode
      using the high-speed (1200 bits/s) channel.  The gateway at the
      answering end MAY report the V23Back event.  The gateway at the
      calling end MAY report the V23Main event.
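
   The list of responses can be condensed into a lookup such as the
   following.  This is a sketch under simplifying assumptions: a
   separate detector is taken to have already resolved the nominal
   frequency, its duration, and whether a 2100 Hz tone is amplitude
   modulated; DTMF responses are left out because they are not single
   frequencies; and the function name and return strings are invented
   for the example.

```python
def classify_v18_response(freq_hz, duration_s=0.0, modulated=False,
                          sending_xci=False):
    """Map a detected response tone to the terminal type it implies."""
    if freq_hz == 2100:
        return "V.8" if modulated else "V.25 ANS"
    if freq_hz == 2225:
        return "Bell 103 answer mode"
    if freq_hz == 1300:
        # Must persist for at least 1.7 s.
        return "V.23" if duration_s >= 1.7 else None
    if freq_hz == 1650:
        # Must persist for at least 500 ms.
        return "V.21" if duration_s >= 0.5 else None
    if freq_hz in (1400, 1800):
        return "Baudot"
    if freq_hz in (980, 1180):
        return "V.21 low channel"
    if freq_hz == 1270:
        return "Bell 103 calling mode"
    if freq_hz == 390:
        # Only meaningful while XCI is being sent.
        return "V.23 75 bits/s" if sending_xci else None
    return None
```

   A result of None means that no conclusion can be drawn yet from the
   tone on its own.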

   Similar logic governs the actions taken by a V.18 terminal operating
   in answer mode.

3.  Application Considerations

3.1  Strategies For Handling FAX and Modem Signals

   As described in Section 1.2, the typical data application involves a
   pair of gateways interposed between two terminals, where the
   terminals are in the PSTN.  The gateways are likely to be serving a
   mixture of voice and data traffic, and need to adopt payload types
   appropriate to the media flows as they occur.  If voice compression
   is in use for voice calls, this means that the gateways need the
   flexibility to switch to other payload types when data streams are
   recognized.

   Within the established IETF framework, this implies that the gateways
   must negotiate the potential payloads (voice, telephone-events,
   tones, voice-band data, T.38 FAX [21], and possibly RFC 2793 [16]
   text and CLEARMODE [18] octet streams) as separate payload types.
   From a timing point of view, this is most easily done at the
   beginning of a call, but results in an over-allocation of resources
   at the gateways and in the intervening network.

   One alternative is to use named events to buy time while out-of-band
   signals are exchanged to update to the new payload type applicable to
   the session.  Thanks to the events defined in this document, this is
   a viable approach for sessions beginning with V.8, V.8bis, T.30, or
   V.25 control sequences.

   Named data-related events also allow gateways to optimize their
   operation when data signals are received in a relatively general
   form.  One example is the use of V.8-related events to deduce that
   the voice-band data being sent in a G.711 payload comes from a
   higher-speed modem and therefore requires disabling of echo
   cancellers.

   All of the control procedures described in the sub-sections of
   Section 2 eventually give way to data content.  As mentioned above,
   this content will be carried by other payload types.  Receiving
   gateways MUST be prepared to switch to the other payload type within
   the time constraints associated with the respective applications.
   (For several of the procedures documented above, the sender provides
   75 ms of silence between the initial control signalling and the
   sending of data content.)  In some cases (V.8bis [11], T.30 [9]),
   further control signalling may happen after the call has been
   established.

   A possible strategy is to send both telephone-events and the data
   payload in an RFC 2198 [2] redundancy arrangement.  The receiving
   gateway then propagates the data payload whenever no event is in
   progress.  For this to work, the data payload and events (when
   present) MUST cover exactly the same time period; otherwise spurious
   events will be detected downstream.  An example of this mode of
   operation is shown below.
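
   The coverage requirement can be checked mechanically: the durations
   of the events in the newest event block must add up to the period
   covered by the data block.  The following Python sketch assumes
   8-bit G.711 samples at 8000 Hz, so that one octet of data equals one
   timestamp unit.  The function names are illustrative assumptions.

```python
def events_match_data(event_durations, data_octets):
    """True if the events cover exactly the period of the data block.

    event_durations: durations, in timestamp units, of the consecutive
    events in the newest event block.  data_octets: length of the G.711
    block (assumption: one octet per timestamp unit, 8000 Hz PCMU).
    """
    return sum(event_durations) == data_octets


def select_playout(event_durations, data_block):
    """Propagate the data payload only when no event is in progress."""
    if event_durations:
        if not events_match_data(event_durations, len(data_block)):
            raise ValueError("events and data cover different periods")
        return "events"
    return "data"
```

   For instance, nine V.21 bit events with durations 27, 26, 27, 27,
   26, 27, 27, 26, and 27 units match a 240-octet (30 ms) G.711 block.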

   Note that there are a number of cases where no control sequence will
   precede the data content.  This is true, for example, for a number of
   legacy text terminal types.  In such instances, the events defined in
   Section 2.6 in particular MAY be sent to help the remote gateway
   optimize its handling of the alternative payload.

3.2  Example of V.8 Negotiation

   This section presents an example of the use of the event codes
   defined in Section 2.  The basic scenario is the startup sequence for
   duplex V.34 modem operation.  It is assumed that once the initial V.8
   sequence is complete the gateways will enter into voice band data
   operation using G.711 encoding to transmit the modem signals.  The
   basic packet sequence is indicated in Table 7.  Sample packets are
   then shown in detail for two variants on event transmission strategy:

   o  simultaneous transmission of events and retransmitted events using
      RFC 2198 [2] redundancy;

   o  simultaneous transmission of events, retransmitted events, and
      voice band data using RFC 2198 redundancy.

   For simplicity and semi-realism, the times shown for the example
   scenario assume a fixed lag at each gateway of 20 ms between the
   packet side of the gateway and the local user equipment and vice
   versa (i.e., minimum of 40 ms between packet received and packet sent
   specifically in response to the received packet).  A propagation
   delay of 5 ms is assumed between gateways.  It is assumed that the
   event packetization interval is 30 ms, a reasonable compromise
   between packet volume and buffering delay, particularly for V.21
   events.

   At the basic V.8 protocol level, the table assumes that the answering
   modem waits 0.2 s (200 ms) from the beginning of the call to start
   transmitting ANSam.  The calling modem waits 1 s (1000 ms) from the
   time it begins to receive ANSam until it begins to send the V.8 CM
   signal.  Both modems wait 75 ms from the time they finish sending and
   receiving CJ respectively until they begin sending V.34 modem
   signals.
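
   The first few entries of Table 7 follow directly from these
   assumptions.  The following Python sketch reproduces the
   arithmetic; the constant and variable names are illustrative.

```python
GATEWAY_LAG_MS = 20      # terminal side <-> packet side, each way
PROPAGATION_MS = 5       # between the two gateways
PACKET_INTERVAL_MS = 30  # event packetization interval
ANSAM_START_MS = 200     # answering modem starts sending ANSam
CM_WAIT_MS = 1000        # calling modem waits after ANSam begins

# Called gateway detects ANSam, then sends the first event packet.
detected = ANSAM_START_MS + GATEWAY_LAG_MS
first_packet = detected + PACKET_INTERVAL_MS
# Calling gateway receives it and plays ANSam out to its terminal.
received = first_packet + PROPAGATION_MS
heard = received + GATEWAY_LAG_MS
# Calling modem starts CM; its gateway detects and packetizes it.
cm_sent = heard + CM_WAIT_MS
cm_detected = cm_sent + GATEWAY_LAG_MS
cm_first_packet = cm_detected + PACKET_INTERVAL_MS

print(detected, first_packet, received, heard,
      cm_sent, cm_detected, cm_first_packet)
# prints: 220 250 255 275 1275 1295 1325
```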

   +-----------+-------------------------------------------------------+
   | Time (ms) | Event                                                 |
   +-----------+-------------------------------------------------------+
   |     220.0 | The called gateway detects the start of ANSam from    |
   |           | its end.                                              |
   |           |                                                       |
   |     250.0 | The called gateway sends out the first ANSam event    |
   |           | packet.  M bit is set, timestamp is ts0 + 1760 (where |
   |           | ts0 is the timestamp value at the start of the call). |
   |           | The initial ANSam event continues until a phase shift |
   |           | is detected at 670.0 ms (see below).  Up to this      |
   |           | time, the called gateway sends out further ANSam      |
   |           | event updates, with the same initial timestamp, M bit |
   |           | off, and cumulative duration increasing by 240 units  |
   |           | each time.                                            |
   |           |                                                       |
   |     255.0 | The calling gateway receives the first ANSam event    |
   |           | report and begins playout of ANSam tone at its end.   |
   |           |                                                       |
   |     275.0 | The calling terminal receives the beginning of ANSam  |
   |           | tone and starts its timer.  It will begin sending the |
   |           | CM signal 1 s later (at 1275.0 ms into the call).     |
   |           |                                                       |
   |     670.0 | The called gateway detects a phase shift in the       |
   |           | incoming signal, marking a change from ANSam to       |
   |           | /ANSam.  This happens to coincide with the end of a   |
   |           | packetization interval.  For the sake of the example, |
   |           | assume that the called gateway does not detect this   |
   |           | in time for the event report it sends out.            |
   |           |                                                       |
   |     700.0 | The called gateway issues its next-scheduled event    |
   |           | report packet, indicating an initial report for       |
   |           | /ANSam (M-bit set, timestamp ts0 + 5360, duration 240 |
   |           | timestamp units).  The packet also carries the first  |
   |           | retransmission of the final ANSam report, total       |
   |           | duration 3600 units, this time with the E-bit set.    |
   |           |                                                       |
   |    1120.0 | The called gateway detects another phase shift and    |
   |           | subsequently reports the end of /ANSam and the        |
   |           | beginning of ANSam.                                   |
   |           |                                                       |
   |    1295.0 | The calling gateway begins to receive the CM signal   |
   |           | from the calling modem.                               |
   |           |                                                       |
   |    1325.0 | The calling gateway sends a packet containing the     |
   |           | first 9 bits of the CM signal.                        |
   |           |                                                       |
   |    1445.0 | The calling gateway sends out a packet containing the |
   |           | last 4 bits of the first CM signal, plus the first 5  |
   |           | bits of the next repetition of that signal.  CM bits  |
   |           | will continue to be transmitted from the calling      |
   |           | gateway until 2015.0 ms (see below), for a total of   |
   |           | 24 packets.  (The final packet also carries the       |
   |           | beginning of the CJ signal.)                          |
   |           |                                                       |
   |    1570.0 | The called gateway detects another phase shift and    |
   |           | subsequently reports the end of ANSam and the         |
   |           | beginning of /ANSam.                                  |
   |           |                                                       |
   |    1596.7 | The called gateway completes playout of the final bit |
   |           | of the second occurrence of the CM signal.            |
   |           |                                                       |
   |    1636.7 | The called gateway detects end of /ANSam (and         |
   |           | beginning of JM) from the called modem. The next      |
   |           | packet is not yet due to go out.                      |
   |           |                                                       |
   |    1660.0 | The called gateway sends out a packet combining the   |
   |           | final /ANSam event report (E-bit set and total        |
   |           | duration 533 timestamp units) with the first 7 bits   |
   |           | of the JM signal.  The M-bit for the packet is set    |
   |           | and the packet timestamp is ts0 + 12560 (the start of |
   |           | the now-discontinued /ANSam event).                   |
   |           |                                                       |
   |    1690.0 | The called gateway sends out a packet containing the  |
   |           | next nine bits of JM signal.  The M-bit is set and    |
   |           | the timestamp is ts0 + 13280 (beginning of the first  |
   |           | bit in the packet).  JM will continue to be           |
   |           | transmitted until 2170.0 ms (see below), for a total  |
   |           | of 18 packets (plus two for final retransmissions).   |
   |           |                                                       |
   |    1938.3 | The calling gateway completes playout of the final    |
   |           | packet of the second occurrence of the JM signal.     |
   |           |                                                       |
   |    1995.0 | The calling gateway begins to receive the initial     |
   |           | bits of the CJ signal.                                |
   |           |                                                       |
   |    2015.0 | The calling gateway sends a packet containing the     |
   |           | final 3 bits of the first decad of a CM signal and    |
   |           | first 6 bits of a CJ signal.                          |
   |           |                                                       |
   |    2095.0 | The calling gateway receives the last bit of the CJ   |
   |           | signal.  A period of silence lasting 75 ms begins at  |
   |           | the calling end.  It is not yet time to send out an   |
   |           | event report.                                         |
   |           |                                                       |
   |    2105.0 | The calling gateway sends out a packet containing the |
   |           | final 6 bits of the CJ signal.                        |
   |           |                                                       |
   |    2130.0 | The called gateway finishes playing out the last CJ   |
   |           | signal bit sent to it.                                |
   |           |                                                       |
   |    2135.0 | The calling gateway sends a packet containing no new  |
   |           | events, but retransmissions of the last 15 bits of    |
   |           | the CJ signal (in two generations).                   |
   |           |                                                       |
   |    2165.0 | The calling gateway sends out a packet containing no  |
   |           | new events, but retransmissions of the final 6 bits   |
   |           | of the CJ signal.                                     |
   |           |                                                       |
   |    2170.0 | The called gateway sends out the last packet          |
   |           | containing bits of the JM signal (except for          |
   |           | retransmissions).  Note that according to the V.8     |
   |           | specification these bits do not in general complete a |
   |           | JM signal or even an "octet" of that signal (although |
   |           | they happen to do so in this example).  A 75 ms       |
   |           | period of silence begins at the called end.           |
   |           |                                                       |
   |    2170.0 | The calling gateway begins to receive V.34 signalling |
   |           | from the calling modem.                               |
   |           |                                                       |
   |    2175.0 | The calling gateway finishes playing out the last JM  |
   |           | signal bit sent to it.                                |
   |           |                                                       |
   |    2195.0 | The calling gateway sends out a first packet of V.34  |
   |           | signalling as voice band data (PCMU).  Timestamp is   |
   |           | ts0 + 17360 and M-bit is set to indicate the          |
   |           | beginning of content after silence. The packet        |
   |           | contains 200 8-bit samples.  Packetization interval   |
   |           | is shown here as continuing to be 30 ms.  It could be |
   |           | less, but MUST NOT be more because that would make    |
   |           | the silent period too long.                           |
   |           |                                                       |
   |    2200.0 | The called gateway sends a packet containing no new   |
   |           | events, but retransmissions of the last 18 bits of    |
   |           | the JM signal (in two generations).                   |
   |           |                                                       |
   |    2225.0 | The calling gateway sends out the second packet of    |
   |           | V.34 signalling as voice band data (PCMU).  Timestamp |
   |           | is ts0 + 17560 and M-bit is not set. The packet       |
   |           | contains 240 8-bit samples.                           |
   |           |                                                       |
   |    2230.0 | The called gateway sends out a packet containing no   |
   |           | new events, but retransmissions of the final 9 bits   |
   |           | of the JM signal.                                     |
   |           |                                                       |
   |    2245.0 | The called gateway begins to receive V.34 signalling  |
   |           | from the called modem.                                |
   |           |                                                       |
   |    2255.0 | The calling gateway sends out a third packet of V.34  |
   |           | signalling as voice band data (PCMU).  Timestamp is   |
   |           | ts0 + 17800 and M-bit is not set. The packet contains |
   |           | 240 8-bit samples.                                    |
   |           |                                                       |
   |    2260.0 | The called gateway sends out a first packet of V.34   |
   |           | signalling as voice band data (PCMU).  Timestamp is   |
   |           | ts0 + 17960 and M-bit is set to indicate the          |
   |           | beginning of content after silence. The packet        |
   |           | contains 120 samples.  Packetization interval is      |
   |           | shown here as continuing to be 30 ms.  It could be    |
   |           | less, but MUST NOT be more because that would make    |
   |           | the silent period too long.                           |
   |           |                                                       |
   |     . . . | . . .                                                 |
   +-----------+-------------------------------------------------------+

                Table 7: Events for Example V.8 Scenario
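
   The timestamp offsets, durations, and packet counts quoted in Table
   7 can be checked with a few lines of arithmetic.  The following
   Python sketch assumes the 8000 Hz clock rate used by the payload
   types in this example; the helper names are illustrative.

```python
CLOCK_HZ = 8000  # clock rate of telephone-event/8000 and pcmu/8000


def ms_to_units(ms):
    """Convert milliseconds to RTP timestamp units (nearest unit)."""
    return round(ms * CLOCK_HZ / 1000)


def packets_between(first_ms, last_ms, interval_ms=30):
    """Number of packets sent every interval_ms from first to last."""
    return (last_ms - first_ms) // interval_ms + 1


print(ms_to_units(220))             # 1760: first ANSam timestamp
print(ms_to_units(670 - 220))       # 3600: total ANSam duration
print(ms_to_units(1570))            # 12560: start of the final /ANSam
print(packets_between(1325, 2015))  # 24 CM packets
print(packets_between(1660, 2170))  # 18 JM packets
```

   The 533-unit duration of the final /ANSam event corresponds to
   66 2/3 ms (shown in the table as 1636.7 - 1570.0), rounded to the
   nearest timestamp unit.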


3.2.1  Simultaneous Transmission of Events and Retransmitted Events
      Using RFC 2198 Redundancy

   Negotiation of the transmission mode being described in this section
   would use SDP similar to the following:

      m=audio 12343 RTP/AVP 99
      a=rtpmap:99 pcmu/8000
      m=audio 12343 RTP/AVP 100 101
      a=rtpmap:100 red/8000/1
      a=fmtp:100 101/101/101
      a=rtpmap:101 telephone-event/8000
      a=fmtp:101 0-15,32-49,52-60

   This indicates two media streams, the first for G.711 (i.e., voice or
   voice-band data), the second for triply-redundant telephone events.
   As RFC 2198 notes, it is also possible for the sender to send
   telephone-event payloads without redundancy in the second stream,
   although the redundant form is the primary transmission mode.  (It
   would be reasonable to send the interim ANSam reports without
   redundancy.)  The set of telephone events supported includes the DTMF
   events (not relevant in this example), and all of the data events
   defined in this document.  In fact, only event codes 34-35 and 37-40
   are used in the example.
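
   A receiver can expand the event list in the fmtp line into a set
   and confirm that it covers the events it expects.  The following
   Python sketch is one way to do so; the function name is an
   illustrative assumption, and no validation of malformed input is
   attempted.

```python
def parse_event_list(fmtp_value):
    """Expand a telephone-event list such as '0-15,32-49' to a set."""
    events = set()
    for item in fmtp_value.split(","):
        low, _, high = item.strip().partition("-")
        # A single value such as "36" has no upper bound of its own.
        events.update(range(int(low), int(high or low) + 1))
    return events


supported = parse_event_list("0-15,32-49,52-60")
used_in_example = {34, 35, 37, 38, 39, 40}
print(used_in_example <= supported)  # prints: True
```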

   For the purpose of illustrating the use of RFC 2198 redundancy as
   well as showing the basic composition of the event reports, the
   second packet reporting JM signal bits (sent by the called gateway at
   1690.0 ms) seems to be a good choice.  This packet will also carry
   the second retransmission of the final /ANSam event report and the
   first retransmission of the initial 7 bits of the JM signal.  The
   detailed content of the packet is shown in Figure 1.  To see the
   contents of the successive generations more clearly, they are
   presented as if they were aligned on successive 32-bit boundaries.
   In fact, they are all offset by one octet, following on consecutively
   from the RFC 2198 header.

   The M-bit is set in the RTP header for the packet, as required for
   the coding of multiple events in the primary block of data.  In fact,
   RFC 2198 implies that this is the correct behaviour, but does not say
   so explicitly.  The E-bit is set for every event.  It is possible
   that it would not be set for the final event in the primary block.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|X| CC=0  |1|  PT=100     |   sequence number = seq0 + 48 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              timestamp = ts0 + 13280                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           synchronization source (SSRC) identifier            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| block PT=101|  timestamp offset = 720   | block length =  4 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| block PT=101|  timestamp offset = 187   | block length = 28 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0| block PT=101|     (begin block for /ANSam ...)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      /ANSam block (second retransmission)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 35  |1|R| volume    |       duration = 533        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              First 7 bits of JM (="1111111" in V.21 high channel)
                    (first retransmission)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /    (5 similar events, durations 27,26,27,27,26 respectively)  /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Next 9 bits of JM (="111000000" in V.21 high channel)
                    (new content)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /     (7 similar events, codes 40,40,39,39,39,39,39 and         /
      /      durations 26,27,27,26,27,27,26 respectively)             /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 39  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 1: Packet Contents, Redundant Events Only
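
   The alternating 27- and 26-unit durations in Figure 1 arise because
   a V.21 bit at 300 bits/s lasts 80/3 timestamp units at 8000 Hz.
   The following Python sketch reproduces them by rounding each bit
   boundary to the nearest unit, and shows how one 4-octet event might
   be packed.  It assumes that JM begins when the final /ANSam event
   ends, 533 1/3 units after ts0 + 12560.  The function names, and the
   volume value of 0, are illustrative assumptions.

```python
import struct

CLOCK_HZ = 8000
BIT_RATE = 300  # V.21 bits/s


def bit_durations(start_units_x3, n_bits):
    """Durations (timestamp units) of n_bits consecutive V.21 bits.

    start_units_x3 is three times the start time in timestamp units,
    so that start times falling on thirds of a unit stay integral.
    Each bit boundary is rounded to the nearest unit; the durations
    are the differences between successive rounded boundaries.
    """
    step_x3 = 3 * CLOCK_HZ // BIT_RATE  # 80, i.e. 80/3 units per bit
    edges = [(2 * (start_units_x3 + k * step_x3) + 3) // 6
             for k in range(n_bits + 1)]
    return [b - a for a, b in zip(edges, edges[1:])]


def pack_event(event, duration, end=True, volume=0):
    """Pack one event: event (8 bits), E, R, volume (6), duration (16)."""
    flags = (0x80 if end else 0) | (volume & 0x3F)  # R bit left at 0
    return struct.pack("!BBH", event, flags, duration)


# JM starts at ts0 + 13093 1/3, i.e. 39280 thirds of a unit after ts0.
print(bit_durations(39280, 7))  # [27, 27, 26, 27, 27, 26, 27]
# The new content starts at the packet timestamp, ts0 + 13280.
print(bit_durations(3 * 13280, 9))
# [27, 26, 27, 27, 26, 27, 27, 26, 27]
```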

   Since all of the events in the above packet are consecutive and
   adjacent, it would have been permissible according to the
   telephone-events payload specification to carry them as a simple
   event payload without the RFC 2198 header.  The advantage of using
   the RFC 2198 header is that the receiving gateway can skip over the
   retransmitted events when processing the packet, unless it needs
   them.

3.2.2  Simultaneous Transmission of Events and Voice Band Data Using RFC
      2198 Redundancy

   Negotiation of the transmission mode being described in this section
   would use SDP similar to the following:

      m=audio 12343 RTP/AVP 99 100 101
      a=rtpmap:99 red/8000/1
      a=fmtp:99 100/101/101/101
      a=rtpmap:100 pcmu/8000
      a=rtpmap:101 telephone-event/8000
      a=fmtp:101 0-15,32-49,52-60

   This indicates one media stream, with G.711 (i.e., voice or
   voice-band data) as the primary content, along with three blocks of
   telephone events.  RFC 2198 requires that the more voluminous
   representation (i.e., the G.711) be the primary one.  The most recent
   block of events covers the same time period as the voice-band data.
   The other two blocks provide the first and second retransmissions of
   the events as in the previous example.  Because G.711 is the primary
   content, the M-bit for the packets will in general not be set, except
   after periods of silence.

   Figure 2 shows the detailed packet content for the same sample point
   as in the previous figure, but including the G.711 content.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|X| CC=0  |0|  PT=99      |   sequence number = seq0 + 48 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              timestamp = ts0 + 13280                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           synchronization source (SSRC) identifier            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| block PT=101|  timestamp offset = 720   | block length =  4 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| block PT=101|  timestamp offset = 187   | block length = 28 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1| block PT=101|  timestamp offset = 0     | block length = 36 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0| block PT=100|     (begin block for /ANSam ...)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      /ANSam block (second retransmission)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 35  |1|R| volume    |       duration = 533        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              First 7 bits of JM (="1111111" in V.21 high channel)
                    (first retransmission)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /    (5 similar events, durations 27,26,27,27,26 respectively)  /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Next 9 bits of JM (="111000000" in V.21 high channel)
                    (new content)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 40  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /     (7 similar events, codes 40,40,39,39,39,39,39 and         /
      /      durations 26,27,27,26,27,27,26 respectively)             /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     event = 39  |1|R| volume    |       duration = 27         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             30 ms of G.711-encoded voice-band data (240 samples)

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Sample 1    |   Sample 2    |   Sample 3    |   Sample 4    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /                            . . .                              /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |   Sample 237  |   Sample 238  |   Sample 239  |   Sample 240  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  Figure 2: Packet Contents With Voice-Band Data Combined With Events
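
   The block headers in Figures 1 and 2 follow the RFC 2198 layout: a
   4-octet header per redundant block, then a 1-octet header for the
   primary block.  The following Python sketch packs such headers and
   adds up the size of the packet in Figure 2.  The function and
   variable names are illustrative assumptions.

```python
import struct


def red_header(block_pt, ts_offset, block_len):
    """4-octet RFC 2198 header of a redundant block (F bit set).

    Layout: F (1 bit), block PT (7), timestamp offset (14), length (10).
    """
    if not (0 <= block_pt < 128 and 0 <= ts_offset < 1 << 14
            and 0 <= block_len < 1 << 10):
        raise ValueError("field out of range")
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | block_len
    return struct.pack("!I", word)


def primary_header(block_pt):
    """Final 1-octet header (F bit clear) preceding the primary block."""
    return bytes([block_pt & 0x7F])


RTP_HEADER_LEN = 12             # fixed RTP header with CC=0
event_block_lens = [4, 28, 36]  # the three event blocks of Figure 2
g711_len = 240                  # 30 ms of PCMU

total = (RTP_HEADER_LEN + 4 * len(event_block_lens) + 1
         + sum(event_block_lens) + g711_len)
print(red_header(101, 720, 4).hex())  # e50b4004
print(total)                          # 333
```

   The 14-bit timestamp offset limits how old a redundant block can
   be: 16383 units, or just over 2 s at 8000 Hz.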


4.  Security Considerations

   The events defined in this document are for the setup and control of
   sessions between data terminals or FAX devices.  They carry
   information of no particular value to any attacker.  The primary
   threat is denial of service, either through injection of
   inappropriate signals at vulnerable points in the control sequence,
   or through blocking of enough event packets to disrupt that sequence.
   To meet the injection threat, gateways SHOULD authenticate incoming
   media packets.  Alternative means for doing this are at the transport
   level (i.e., by sending the packets in IPsec streams) or through use
   of media encryption negotiated by SDP exchanges.

   See RFC XXXX [5] for further discussion of security issues associated
   with gateways and telephone events.

5.  IANA Considerations

   This document adds the events in Table 8 to the registry established
   by RFC XXXX [5].

   +--------+-------------------------------------------+--------------+
   |  Event | Event Name                                |    Reference |
   |  Code  |                                           |              |
   +--------+-------------------------------------------+--------------+
   |   32   | ANS (V.25 Answer tone). Also known as CED |   <This RFC> |
   |        | (T.30 Called tone).                       |              |
   |        |                                           |              |
   |   33   | /ANS (V.25 Answer tone after phase        |   <This RFC> |
   |        | shift). Also known as /CED (T.30 Called   |              |
   |        | tone after phase shift)                   |              |
   |        |                                           |              |
   |   34   | ANSam (V.8 amplitude modified Answer      |   <This RFC> |
   |        | tone)                                     |              |
   |        |                                           |              |
   |   35   | /ANSam (V.8 amplitude modified Answer     |   <This RFC> |
   |        | tone after phase shift)                   |              |
   |        |                                           |              |
   |   36   | CNG (T.30 Calling tone)                   |   <This RFC> |
   |        |                                           |              |
   |   37   | V.21 channel 1 (low channel), "0" bit     |   <This RFC> |
   |        |                                           |              |
   |   38   | V.21 channel 1, "1" bit                   |   <This RFC> |
   |        |                                           |              |
   |   39   | V.21 channel 2, "0" bit                   |   <This RFC> |
   |        |                                           |              |
   |   40   | V.21 channel 2, "1" bit                   |   <This RFC> |
   |        |                                           |              |
   |   41   | V.8bis CRdi signal                        |   <This RFC> |
   |        |                                           |              |
   |   42   | V.8bis CRdr signal                        |   <This RFC> |
   |        |                                           |              |
   |   43   | V.8bis CRe signal                         |   <This RFC> |
   |        |                                           |              |
   |   44   | V.8bis ESi signal                         |   <This RFC> |
   |        |                                           |              |
   |   45   | V.8bis ESr signal                         |   <This RFC> |
   |        |                                           |              |
   |   46   | V.8bis MRdi signal                        |   <This RFC> |
   |        |                                           |              |
   |   47   | V.8bis MRdr signal                        |   <This RFC> |
   |        |                                           |              |
   |   48   | V.8bis MRe signal                         |   <This RFC> |
   |        |                                           |              |
   |   49   | CT (V.25 Calling Tone)                    |   <This RFC> |
   |        |                                           |              |
   |   52   | ANS2225: 2225 Hz indication for text      |   <This RFC> |
   |        | telephony                                 |              |
   |        |                                           |              |
   |   53   | CI (V.8 Call Indicator signal preamble)   |   <This RFC> |
   |        |                                           |              |
   |   54   | V.21 preamble flag (T.30)                 |   <This RFC> |
   |        |                                           |              |
   |   55   | V21L110: 110 bits/s V.21 indication for   |   <This RFC> |
   |        | text telephony                            |              |
   |        |                                           |              |
   |   56   | B103L300: Bell 103 low channel indication |   <This RFC> |
   |        | for text telephony                        |              |
   |        |                                           |              |
   |   57   | V23Main: V.23 main channel indication for |   <This RFC> |
   |        | text telephony                            |              |
   |        |                                           |              |
   |   58   | V23Back: V.23 back channel indication for |   <This RFC> |
   |        | text telephony                            |              |
   |        |                                           |              |
   |   59   | Baud4545: 45.45 bits/s Baudot indication  |   <This RFC> |
   |        | for text telephony                        |              |
   |        |                                           |              |
   |   60   | Baud50: 50 bits/s Baudot indication for   |   <This RFC> |
   |        | text telephony                            |              |
   +--------+-------------------------------------------+--------------+

  Table 8: Data-Related Additions To RFC XXXX Telephony Event Registry
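
   As an illustration only, the registry rows above can be held in a
   lookup table and applied to a received telephone-event block.  The
   sketch below is not part of this specification.  The names
   DATA_EVENTS and parse_event are hypothetical, the short labels are
   abbreviations of the table descriptions, and the 4-octet block
   layout (event, E bit, R bit, volume, duration) is assumed from
   RFC 2833 [17] rather than defined in this section.  Only the codes
   listed in Table 8 from 34 onward are included.

```python
import struct

# Event codes copied from Table 8 (34 onward). Codes 50 and 51 do not
# appear in the table, so they are deliberately absent here.
DATA_EVENTS = {
    34: "ANSam", 35: "/ANSam", 36: "CNG",
    37: "V.21 channel 1, '0' bit", 38: "V.21 channel 1, '1' bit",
    39: "V.21 channel 2, '0' bit", 40: "V.21 channel 2, '1' bit",
    41: "V.8bis CRdi", 42: "V.8bis CRdr", 43: "V.8bis CRe",
    44: "V.8bis ESi", 45: "V.8bis ESr",
    46: "V.8bis MRdi", 47: "V.8bis MRdr", 48: "V.8bis MRe",
    49: "CT", 52: "ANS2225", 53: "CI", 54: "V.21 preamble flag",
    55: "V21L110", 56: "B103L300", 57: "V23Main", 58: "V23Back",
    59: "Baud4545", 60: "Baud50",
}

NOT_LISTED = "not listed in this sketch"


def parse_event(block: bytes) -> dict:
    """Decode one 4-octet telephone-event block (layout assumed from RFC 2833)."""
    if len(block) < 4:
        raise ValueError("a telephone-event block is 4 octets long")
    # Network byte order: event (8 bits), flags+volume (8 bits), duration (16 bits).
    event, flags, duration = struct.unpack("!BBH", block[:4])
    return {
        "event": event,
        "name": DATA_EVENTS.get(event, NOT_LISTED),
        "end": bool(flags & 0x80),  # E bit: this block ends the event
        "volume": flags & 0x3F,     # low 6 bits; the R bit (0x40) is ignored
        "duration": duration,       # in RTP timestamp units
    }
```

   For example, parse_event(bytes([36, 0x8A, 0x03, 0x20])) reports
   event 36 (CNG) with the end bit set, a volume field of 10, and a
   duration of 800 timestamp units.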

6.  References

6.1  Normative References

   [1]   Bradner, S., "Key words for use in RFCs to indicate requirement
         levels", RFC 2119, March 1997.

   [2]   Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley,
         M., Bolot, J., Vega-Garcia, A. and S. Fosse-Parisis, "RTP
         payload for redundant audio data", RFC 2198, September 1997.

   [3]   Handley, M. and V. Jacobson, "SDP: Session Description
         Protocol", RFC 2327, April 1998.

   [4]   Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
         "RTP: A Transport Protocol for Real-Time Applications", STD 64,
         RFC 3550, July 2003.

   [5]   Schulzrinne, H., Petrack, S. and T. Taylor, "RTP Payload for
         DTMF Digits, Telephony Tones and Telephony Signals", Work in
         progress: draft-ietf-avt-rfc2833bis-07.txt, January 2005.

   [6]   International Telecommunication Union, "Echo suppressors",
         ITU-T Recommendation G.164, November 1988.

   [7]   International Telecommunication Union, "Echo cancellers", ITU-T
         Recommendation G.165, March 1993.

   [8]   International Telecommunication Union, "Technical features of
         push-button telephone sets", ITU-T Recommendation Q.23,
         November 1988.

   [9]   International Telecommunication Union, "Procedures for document
         facsimile transmission in the general switched telephone
         network", ITU-T Recommendation T.30, July 2003.

   [10]  International Telecommunication Union, "Procedures for starting
         sessions of data transmission over the public switched
         telephone network", ITU-T Recommendation V.8, November 2000.

   [11]  International Telecommunication Union, "Procedures for the
         identification and selection of common modes of operation
         between data circuit-terminating equipments (DCEs) and between
         data terminal equipments (DTEs) over the public switched
         telephone network and on leased point-to-point telephone-type
         circuits", ITU-T Recommendation V.8bis, November 2000.

   [12]  International Telecommunication Union, "Operational and
         interworking requirements for DCEs operating in the text
         telephone mode", ITU-T Recommendation V.18, November 2000.

         See also Recommendation V.18 Amendment 1, November 2002.

   [13]  International Telecommunication Union, "300 bits per second
         duplex modem standardized for use in the general switched
         telephone network", ITU-T Recommendation V.21, November 1988.

   [14]  International Telecommunication Union, "1200 bits per second
         duplex modem standardized for use in the general switched
         telephone network and on point-to-point 2-wire leased
         telephone-type circuits", ITU-T Recommendation V.22, November
         1988.

   [15]  International Telecommunication Union, "Automatic answering
         equipment and general procedures for automatic calling
         equipment on the general switched telephone network including
         procedures for disabling of echo control devices for both
         manually and automatically established calls", ITU-T
         Recommendation V.25, October 1996.

         See also Corrigendum 1 to Recommendation V.25, July 2001.

6.2  Informative References

   [16]  Hellstrom, G., "RTP Payload for Text Conversation", RFC 2793,
         May 2000.

         An update to this RFC, draft-ietf-avt-rfc2793bis-09.txt, has
         been approved and awaits publication as an RFC at time of
         writing.

   [17]  Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits,
         Telephony Tones and Telephony Signals", RFC 2833, May 2000.

   [18]  Kreuter, R., "RTP payload format for a 64 kbit/s transparent
         call", Work in progress: draft-ietf-avt-rtp-clearmode-05.txt,
         April 2004.

   [19]  International Telecommunication Union, "Pulse code modulation
         (PCM) of voice frequencies", ITU-T Recommendation G.711,
         November 1988.

   [20]  International Telecommunication Union, "Terminal for low
         bit-rate multimedia communication", ITU-T Recommendation H.324,
         March 2002.

   [21]  International Telecommunication Union, "Procedures for
         real-time Group 3 facsimile communication over IP networks",
         ITU-T Recommendation T.38, July 2003.

   [22]  International Telecommunication Union, "International
         interworking for videotex services", ITU-T Recommendation
         T.101, November 1994.

   [23]  International Telecommunication Union, "Data protocols for
         multimedia conferencing", ITU-T Recommendation T.120, July
         1996.

   [24]  International Telecommunication Union, "A 2-wire modem for
         facsimile applications with rates up to 14 400 bit/s", ITU-T
         Recommendation V.17, February 1991.

   [25]  International Telecommunication Union, "600/1200-baud modem
         standardized for use in the general switched telephone
         network", ITU-T Recommendation V.23, November 1988.

   [26]  International Telecommunication Union, "4800/2400 bits per
         second modem standardized for use in the general switched
         telephone network", ITU-T Recommendation V.27ter, November
         1988.

   [27]  International Telecommunication Union, "9600 bits per second
         modem standardized for use on point-to-point 4-wire leased
         telephone-type circuits", ITU-T Recommendation V.29, November
         1988.

   [28]  International Telecommunication Union, "A modem operating at
         data signalling rates of up to 33 600 bit/s for use on the
         general switched telephone network and on leased point-to-point
         2-wire telephone-type circuits", ITU-T Recommendation V.34,
         February 1998.

   [29]  International Telecommunication Union, "A digital modem and
         analogue modem pair for use on the Public Switched Telephone
         Network (PSTN) at data signalling rates of up to 56 000 bit/s
         downstream and up to 33 600 bit/s upstream", ITU-T
         Recommendation V.90, September 1998.

   [30]  International Telecommunication Union, "A digital modem
         operating at data signalling rates of up to 64 000 bit/s for
         use on a 4-wire circuit switched connection and on leased
         point-to-point 4-wire digital circuits", ITU-T Recommendation
         V.91, May 1999.

   [31]  International Telecommunication Union, "Enhancements to
         Recommendation V.90", ITU-T Recommendation V.92, November 2000.

   [32]  International Telecommunication Union, "Modem-over-IP networks:
         Procedures for the end-to-end connection of V-series DCEs",
         ITU-T Recommendation V.150.1, January 2003.

   [33]  Telecommunications Industry Association, "A Frequency Shift
         Keyed Modem for Use on the Public Switched Telephone Network",
         ANSI TIA-825-A-2003, April 2003.


Authors' Addresses

   Henning Schulzrinne
   Columbia U.
   Dept. of Computer Science
   Columbia University
   1214 Amsterdam Avenue
   New York, NY  10027
   US

   EMail: schulzrinne@cs.columbia.edu


   Scott Petrack
   eDial
   266 Second Ave
   Waltham, MA  02451
   US

   EMail: scott.petrack@edial.com


   Tom Taylor
   Nortel
   1852 Lorraine Ave
   Ottawa, Ontario  K1H 6Z8
   CA

   EMail: taylor@nortel.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.