draft-ietf-rohc-sigcomp-00

Network Working Group                        H. Hannu (Editor), Ericsson
INTERNET-DRAFT                                             Z. Liu, Nokia
Expires: April 2002                         R. Price, Siemens/Roke Manor
                                               J. Rosenberg, Dynamicsoft
                                            J. Christoffersson, Ericsson
                                                       C. Clanton, Nokia
                                                   S. Forsgren. Ericsson
                                                            K. Le, Nokia
                                                         K. Leung, Nokia
                                                    K. Svanbro, Ericsson

                                                        October 19, 2001



                    Signaling Compression (SigComp)
                     draft-ietf-rohc-sigcomp-00.txt




   Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/lid-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is a submission of the IETF ROHC WG. Comments should be
   directed to its mailing list, rohc@cdt.luth.se.











Hannu, Liu, Price, Rosenberg, et al.                            [Page 1]


INTERNET-DRAFT                  SigComp               October 19 , 2001


Abstract

   The Session Initiation Protocol (SIP), along with many other IP
   protocols used for multimedia communications, such as RTSP, are
   textual protocols engineered for bandwidth rich links. With the
   planned usage of these protocols in wireless handsets as part of 2.5G
   and 3G wireless, the large size of these messages is problematic.
   This draft provides a robust and efficient message compression
   scheme, suitable for compression of ASCII based protocols' messages.


0. Document History

   - October 19, 2001, version 00

   First version. The draft describes the current ideas, from people
   involved in the ROHC WG, of how to perform compression of application
   signaling messages.


TABLE OF CONTENTS

   1.  Introduction..................................................3

   2.  Terminology...................................................3

   3.  High-level description........................................3

   4.  SigComp Components............................................6

   5.  Protocol Component............................................6

   6.  Compression Framework Component...............................12

   7.  Security considerations.......................................21

   8.  IANA considerations...........................................21

   9.  Authors addresses.............................................22

   10. Intellectual Property Right Considerations....................22

   11. References....................................................22












Hannu, Liu, Price, Rosenberg, et al.                            [Page 2]


INTERNET-DRAFT                  SigComp               October 19 , 2001


1. Introduction

   The Session Initiation Protocol (SIP) [SIP], along with many other IP
   protocols used for multimedia communications, such as RTSP [RTSP],
   are textual protocols engineered for bandwidth rich links. As a
   result, these messages have not been optimized for message size.
   Typical SIP messages are from a few hundred bytes to as high as 2000.
   To date, this has not been a significant problem.

   With the planned usage of these protocols in wireless handsets as
   part of 2.5G and 3G wireless, the large size of these messages is
   problematic. The problem is not bandwidth efficiency (the media
   stream still consumes the majority of the bandwidth), but rather
   latency. With low-rate IP connectivity, store-and-forward delays are
   significant. For CDMA2000, for example, the basic channel will
   support only 9.6 kbps. At this rate, transmission of each byte
   requires 0.8ms. A 500 byte SIP message requires nearly half a second
   to transmit. Taking into account retransmits, and the multiplicity of
   messages that are required in some flows, call setup and feature
   invocation are adversely affected. Therefore, we believe there is
   merit in investigating improvements in message sizes.
   This document defines SigComp, an efficient and robust scheme for
   message compression when the transmission path between the compressor
   and decompressor is unreliable, i.e. prone to errors, losses and
   misordering. SigComp fulfills the requirements stated in [REQ]


2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC-2119].

   Byte buffer

   The SigComp decompressor maintains a byte buffer containing any
   previously received text strings that might be useful for future
   compression.

   Token

   A token is an instruction transmitted from the compressor to the
   decompressor.


3. High-level description

   SigComp is useful for compression of ASCII-based application
   signaling protocol messages. Compression and decompression is
   performed at two points, e-net and e-ms, see Figure 1. The compressor
   uses the context to compress a message to a SigComp message. The




Hannu, Liu, Price, Rosenberg, et al.                            [Page 3]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   reverse process is done at the decompressor. The context of SigComp
   is defined as the information necessary to perform compression and
   decompression. The correct context to use is identified by the Source
   and Destination addresses of the IP header. By using this
   identification method there is no need to add an explicit context
   identifier in the SigComp message.
   To improve the compression efficiency, SigComp uses previous messages
   in the compression/decompression process of later message. To be
   robust against context inconsistency in this process, SigComp has an
   internal acknowledgement scheme.


                +------------------+            +--------------------+
                | +--------------+ |            | +----------------+ |
                | |   Context    | |            | |    Context     | |
                | +--------------+ |            | +----------------+ |
                |      |       |   |            |       |        |   |
     original   |      |       |   | compressed |       |        |   |
     message    | +----------+ |   |  message   | +------------+ |   |
    ------------+>|Compressor|-) - + - - - - - -+>|Decompressor|-)---+->
                | +----------+ |   |            | +------------+ |   |
                |              |   |            |                |   |
   decompressed |              |   | compressed |                |   |
     message    | +--------------+ |  message   | +----------------+ |
   <------------+-| Decompressor |<+- - - - - - +-|   Compressor   |<+--
                | +--------------+ |            | +----------------+ |
                +------------------+            +--------------------+
                      E-ms                             E-net

                  Figure 1.  SigComp High-level view.

   Although Figure 1. implies that the context is shared between
   compressor and decompressor located in the same entity, this must not
   necessarily be the case. SigComp can be applied in an environment
   where shared context is not possible or unwanted, with just one
   change: The context is identified by the IP-source and IP-destination
   address pair. Thus, if the IP-source and IP-destination addresses
   would switch, it will generate a new context.
   However, it is recommended that SigComp is used with shared context
   because of the increase in compression ratio this gives.


3.1. Context data

   The Context of SigComp is defined as the information necessary to
   perform compression and decompression. The context at compressor and
   the corresponding decompressor must be kept consistent. The context
   is built up by the parts described in the following subsections.







Hannu, Liu, Price, Rosenberg, et al.                            [Page 4]


INTERNET-DRAFT                  SigComp               October 19 , 2001


3.1.1. Static Context Data

   The context of SigComp is populated with static information to use in
   the compression/decompression process, i.e. static context data. The
   static information comes from knowing what protocols are to be
   compressed. Information such as protocol-specific header field names,
   Methods, Status codes etc. are typical static context data. It does
   not change over time, is relevant for all users of SigComp, and can
   be applied to all sessions. Thus, at least the static context data
   can be applied when compressing messages.


3.1.2. User specific Context Data

   To further increase the compression efficiency SigComp has the
   possibility to pre-populate the context with information useful in
   the compression process. This type of information is regarded as user
   specific context data as it is not defined within SigComp.
   Information about commonly used connections, such as SIP URLs and
   appliction settings (speech codecs, etc) are typical examples of user
   specific context data.

   Note: How pre-population is performed is yet to be determined.


3.1.3. Dynamic Context Data

   Messages associated with protocols such as SIP tend to have similar
   characteristics within a session. Therefore SigComp makes use of
   messages sent from/to a user. To be able to decompress SigComp
   messages correctly, information in previous messages must not be used
   in the compression process until the message(s) has been
   acknowledged. This part of the context is referred to as the dynamic
   context data part.


3.1.4. Cross session Context

   The messages also have similar characteristics between sessions. The
   idea is to take advantage of this with Cross session context.
   Basically this means that the context is kept between sessions, e.g.
   between two SIP INVITES.

   Note: How to utilize Cross Session Context is yet to be determined.


3.1.5. Keeping Context

   TBW.






Hannu, Liu, Price, Rosenberg, et al.                            [Page 5]


INTERNET-DRAFT                  SigComp               October 19 , 2001


4. SigComp Components

   The SigComp compression scheme is divided into two components,
   Protocol and Compression framework component. Basically, the protocol
   component sees to that the contexts are consistent even for a certain
   amount of message loss and misordering and also make it possible to
   use previously sent and received message in the
   compression/decompression of later messages. The compression
   framework component defines the structure of the part of the context
   that is used in the compression/decompression process and what
   information to update the context with.
   Figure 2 depicts which parts of the SigComp message that relates to
   which component.

   +-----------------+
   | SigComp Header  |   * Protocol Component
   +-----------------+
   |                 |
   :   Compressed    :   * Compression framework Component
   |   information   |
   +-----------------+

   Figure 2. SigComp message.


5. Protocol Component

   This chapter describes the protocol component of SigComp.
   The protocol component include functions, such as SigComp message
   acknowledgement scheme and context verification function.
   The four main task that the protocol component performs are:

   1) Process a message and sending it compressed, as a SigComp message.
   2) Receiving a SigComp message and pass it on uncompressed.
   3) Acknowledge received SigComp messages.
   4) Receiving acknowledgements for SigComp messages.

   The sections in this chapter describes functions that are needed for
   the SigComp protocol component to perform its tasks.


5.1. SigComp header

   To achieve robustness and keep track of sent and received messages, a
   header is added to the compressed message. The SigComp header consist
   of the following fields:

   * Message IDentification (MID) number: The MID number is used to keep
   track of sent and received messages, and is useful for detecting
   message loss and/or misordering.





Hannu, Liu, Price, Rosenberg, et al.                            [Page 6]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   * CRC-verification of messages (CRC-M): This CRC is calculated over
   the uncompressed message, e.g. the SIP INVITE message, and is used to
   verify the correctness of decompressed SigComp messages.

   * Acknowledgement: A SigComp acknowledgement can either be carried
   within a SigComp message (Piggybacked) or be sent by itself
   (Standalone).

   Note: It is assumed that the total length of  a SigComp message is
   provided by the lower layer. Since the length of the header part is
   self-contained the length of the compressed message can be derived by
   subtracting the header length from the total SigComp message length.
   The compressed information length could be zero in the case of a
   Standalone acknowledgement.

   Formats are described in the following sub-sections.


5.1.1 Normal message formats

   These are the normal formats to use. Format 1 should be used when
   there has been no gap in the received MID numbers up to the
   generation of this SigComp message. Format 2 should be used to
   acknowledge received SigComp messages when there has been a gap in
   the received MID numbers.

   Format 1:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   |      MID      | Cumulative ACK|
   +---+---+---+---+---+---+---+---+
   |             CRC-M             |
   +---+---+---+---+---+---+---+---+
   /      Compressed message       /
   /   according to section 6      /
   +---+---+---+---+---+---+---+---+

   Format 2:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   1   0 |      MID      |
   +---+---+---+---+---+---+---+---+
   |             CRC-M             |
   +---+---+---+---+---+---+---+---+
   |      AC       |     RMID (1)  |
   +---+---+---+---+---+---+---+---+
   |    RMID (2)   |     RMID (3)  |
   +---+---+---+---+---+---+---+---+
   :                               :




Hannu, Liu, Price, Rosenberg, et al.                            [Page 7]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   /                               /
   :                               :
   +---+---+---+---+---+---+---+---+
   |    RMID (C-1) |    RMID (C)   |
   +---+---+---+---+---+---+---+---+
   /      Compressed message       /
   /   according to section 6      /
   +---+---+---+---+---+---+---+---+

   * MID: "0000" to "1101". The Message IDentification (MID) number is
   commonly increased with one for each sent message. However, there is
   the exception when the next MID to be assigned is not "free", see
   section X.

   * Cumulative ACK: Acknowledges all SigComp messages with MID number
   equal to or less than the value of this field. The value of this
   field must not be set to a value so that non-received SigComp
   messages are acknowledged. A received acknowledgement with higher
   value than the maximum MID must be ignored and not be regarded as
   acknowledgement for SigComp messages.

   * CRC-M: TBD

   AC:   Number of received SigComp message being acknowledged (RMID),
         which are included in the List. If AC is an even number the
         last RMID, RMID(C), is only padding, in order to get octet
         aligned.

   RMID: Received MID of a SigComp message, which are being
         acknowledged.


5.1.2 Extended message formats

   Format 3: Reserved.

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   1   1 |      TBD      |
   +---+---+---+---+               +
   /              TBD              /
   :                               :
   /                               /
   +---+---+---+---+---+---+---+---+

   Note: The usage of this format is yet to be determined.


5.2. Acknowledgement procedure

   There is one basic rule for the acknowledgement procedure:




Hannu, Liu, Price, Rosenberg, et al.                            [Page 8]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   "All SigComp messages should be acknowledged".

   Two functions are needed:
   1) A mapping function between sending MID numbers and acknowledged
   MID numbers, called MAPP-Function.

   2) A function that keeps track of when a received SigComp MID number
   is not to be acknowledged anymore, called TRACK-Function.

   It is up to the implementation to realize the two functions.

   The acknowledgement procedure is best described using an example, see
   Figure 3. There is one purpose with the procedure:
   "Make it possible to use information in previous messages in the
   compression and decompression process of future messages".

              e-A                            e-B
               |                              |
    Step (0)   |<----------- MID 1B ----------|
               |                              |
    Step (1)   |-------- MID 1A, ACK 1B  ---->|
               |                              |
    Step (2)   |<--------MID 2B, ACK 1A ------|
               |                              |

   Figure 3. Example of the acknowledgement procedure.

   Step (0): A SigComp message with Message IDenitifcation number 1 is
   sent from e-B to e-A, denoted MID 1B in Figure 3. MID 1B must be
   acknowledged by e-A until e-A is positive that e-B knows that MID 1B
   is received.

   Step (1): Entity e-A acknowledges MID 1B with SigComp message MID 1A.
   A mapping between MID 1A and MID 1B is stored at e-A. Note that MID
   1A do not have to carry compressed information, it can be a
   Standalone Acknowledgement.

   Step (2): Entity e-B acknowledges MID 1A and stores a mapping between
   MID 2B and MID 1A. When Entity e-A receives MID 2B it uses the
   mapping to find out which SigComp message(s) from e-B was
   acknowledged by MID 1A. The mapping is then removed and MID 1B does
   not have to be acknowledged anymore.

   To summarize:
   When an entity is to generate an acknowledgement it uses the TRACK-
   Function to find out the MID number it should acknowledge.
   Upon the reception of a SigComp message the header is scanned to find
   the acknowledgements and the MAPP-Function is used to find out the
   corresponding received SigComp messages that do not have to be
   acknowledged anymore.





Hannu, Liu, Price, Rosenberg, et al.                            [Page 9]


INTERNET-DRAFT                  SigComp               October 19 , 2001


5.2.1. Reuse of Message IDentification numbers

   This section describe when a MID can be reused to avoid
   misinterpretations in case of wraparound of MID numbers combined with
   message loss and/or misordering.

   Basically a MID number MUST not be reused until the SigComp message
   using that particular MID number has been acknowledged or that the
   message is regarded as lost.

   Figure 4 is a continuation of Figure 3 and depicts when it is ok to
   reuse a MID number that has been acknowledged.

              e-A                            e-B
               |                              |
    Step (3)   |-------- MID 2A, ACK 2B  ---->|
               |                              |

   Figure 4. Continuation of Figure 3.

   Step (3): When Entity e-B receives an acknowledgement for MID 2B it
   knows that the mapping for MID 1B is removed at e-A. Therefore it
   is safe for entity e-B to reuse MID 1B.

   In the case of MID number reuse when the previous message using that
   MID number has not been acknowledged. Then the previous message
   should be regarded as lost, if the entity has received
   acknowledgments for higher MID numbers than the MID number the entity
   wishes to reuse. However, as SigComp should stand against moderate
   misordering according to [REQ], a MID number X should not be regarded
   as lost until an acknowledgement for message(s) with MID numbers >
   (X+2) has been received.


5.2.2. Specification of ordering constraints

   Inconsistency in the dynamic context data can happen if the context
   is shared and messages, which will update the dynamic context data,
   are sent concurrently by both entities. The approach to deal with
   this problem is to specify ordering constraints of all update
   messages sent by both entities so that the order is total and the
   same at both entities, at least for the eligible updates. Three rules
   are defined, local, causality and concurrent rule. With the first two
   rules and the use of the 3-way handshake, it is possible to ensure
   consistency during updates of dynamic context data, but there is one
   case that these do not resolve; dynamic context data updates issued
   concurrently by both entities.

   With an update request in the description of the rules below means;
   Every message that carries information that is used to update the
   dynamic context data.




Hannu, Liu, Price, Rosenberg, et al.                           [Page 10]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   * Local Rule: When a dynamic context data update request R' is sent
   before R" at the same entity, R' is scheduled to perform an update
   before R" at both entities.  A sending entity schedules its own
   update requests according to their sequence numbers, whereas a
   receiving entity schedules these requests according to the request
   sequence numbers, which can be inferred from the sequence numbers of
   those SigComp messages containing these requests.

   * Causality rule:  When a SigComp message containing a dynamic
   context data update request R' piggybacks with acknowledgements, R'
   is scheduled to perform an update after all the updates corresponding
   to the acknowledged update requests.

   * Concurrent rule: The problem explained above can be eliminated by
   requiring the maintenance of a total order relationship for a
   sequence of all known context update requests at each entity. This
   requires that one entity is the master and one is the slave.

   An example showing how the scheme with the total order relationship
   solves the inconsistent context update problem is shown in Figure 5.
   Suppose all messages exchanged between Entity X (a master entity) and
   Entity Y (a slave entity) are employed for subsequent
   compression/decompression, and all received messages are acknowledged
   and piggybacked in all subsequent messages sent by the receiver.
   Whenever concurrent dictionary updates are happened, update requests
   initiated from the master entity are ordered first.  For example, the
   update requests for Messages 1, 2, and 3 are ordered before those for
   Messages 101, 102, and 103 respectively in the list of the total
   order relationship for a sequence of dictionary updates (TOL).
   DD is a logical representation of the Dynamic Context Data, TSS is
   the logical representation of sent SigComp messages and TRS is the
   logical representation storage of received SigComp messages.

   Since the update request for Message 1 is ordered before that for
   Message 101 according to the total order relationship, the update
   request for Message 101 at Entity Y is blocked until that for Message
   1 becomes ready to be moved, as indicated by the reception of the
   acknowledgement piggybacked with Message 3.


                        |                          |
      DD  = {}          |                          |  DD  = {}
      TSS = {1}         |                          |  TSS = {101}
      TRS = {}          |                          |  TRS = {}
      TOL = {1}         |                          |  TOL = {}
                        |      [1]          [101]  |
                        |-----------\  /-----------|
                        |            \/            |
                        |            /\            |
                        |<----------/  \---------->|
                        |                          |




Hannu, Liu, Price, Rosenberg, et al.                           [Page 11]


INTERNET-DRAFT                  SigComp               October 19 , 2001


      DD  = {}          |                          |  DD  = {}
      TSS = {1, 2}      |                          |  TSS = {101, 102}
      TRS = {101}       |                          |  TRS = {1}
      TOL = {1, 101, 2} |                          |  TOL = {1}
                        |      [2]          [102]  |
                        |-----------\  /-----------|
                        |            \/            |
                        |            /\            |
                        |<----------/  \---------->|
                        |                          |
      DD  = {1}         |                          |  DD  = {}
      TSS = {2, 3}      |                          |  TSS = {101 - 103}
      TRS = {101, 102}  |                          |  TRS = {1, 2}
      TOL = {101, 2,    |                          |  TOL = {1, 101, 2}
             102, 3}    |                          |
                        |   [3]             [103]  |
                        |-----------\  /-----------|
                        |            \/            |
                        |            /\            |
                        |<----------/  \---------->|
                        |                          |
      DD  = {1, 101, 2} |                          |  DD  = {1, 101}
      TSS = {3}         |                          |  TSS = {102, 103}
      TRS = {102, 103}  |                          |  TRS = {2, 3}
      TOL = {102,3,103} |                          |  TOL = {2, 102, 3}
                        |                          |

                   CD Entity X                CD Entity Y

   Figure 5. An Example of Proposed Scheme with Total Order Relationship
   for Dynamic Context Data Updates.


6. Compression Framework Component

   This chapter introduces a simple but flexible dictionary structure
   for the SigComp compression scheme.

   The goal with the flexible dictionary structure is to standardize a
   decompressor capable of interoperating with a wide range of
   compression algorithms. Consequently this chapter describes the
   decompressor operation only, i.e. the actions which the decompressor
   takes upon receiving a certain instruction from the compressor.


6.1.  Information stored in the SigComp dictionary

   An important feature of SigComp is that it offers a standard
   decompressor which can interoperate with a wide range of compression
   algorithms. The precise method for compression is left as an





Hannu, Liu, Price, Rosenberg, et al.                           [Page 12]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   implementation decision, and in fact the standard decompressor can
   interoperate with any of the following classes of algorithm:

   *    Generic text compressor (for example [DEFLATE] or a similar
        algorithm)

   *    Protocol-aware compressor offering excellent performance for
        one signaling protocol

   *    Hybrid compressor with similar performance to [DEFLATE] for
        generic text strings and superior performance for certain
        signaling protocols

   The choice of which instructions to send to the decompressor are left
   as a local implementation decision at the compressor. The only
   requirement is that of transparency, i.e. the compressor MUST NOT
   send instructions which cause the decompressor to incorrectly
   decompress a given signaling message.

   Note however that it is perfectly acceptable for the compressor to
   send tokens which update the dictionary at the decompressor, but
   which cause no decompressed message to be outputted. Indeed, this is
   a useful technique for pre-populating the dictionary with well-known
   text strings.


6.1.1.  Structure of SigComp dictionary

   The SigComp dictionary consists of a simple byte buffer designed to
   hold the current uncompressed message, the current compressed
   message, and any other previously received text strings that might be
   useful for future compression.

   The size buffer_size of the byte buffer is negotiated by an
   externally defined mechanism (e.g. by the underlying SigComp protocol
   used to transport the compressed messages, or alternatively by the
   mechanism used to negotiate use of SigComp itself). Entries in the
   byte buffer are referred to as buffer[n] where 0 =< n < buffer_size.

   As all of the SigComp tokens currently use 2-byte indices into the
   byte buffer, the maximum size of the buffer is 64K.


6.1.2.  Important entries in the byte buffer

   The first few bytes in the SigComp byte buffer are used to store some
   important 2-byte integers. These integers are given the following
   names:







Hannu, Liu, Price, Rosenberg, et al.                           [Page 13]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   Position in buffer:          Name:

        0 - 1                   first_token
        2 - 3                   compressed_start
        4 - 5                   compressed_length
        6 - 7                   uncompressed_start
        8 - 9                   uncompressed_length
        10 - 11                 circular_buffer
        12 - 13                 result_integer
        14 - 15                 stack_free
        16 - 17                 stack[0]
        18 - 19                 stack[1]
        20 - 21                 stack[2]
           :                        :

   The MSBs of the integer are always stored before the LSBs. So, for
   example, the MSBs of first_token are stored in buffer[0] whilst the
   LSBs are stored in buffer[1].

   The use of each integer is described in the following sections of the
   draft.


6.1.3.  Decompressor actions upon receiving a SigComp message

   When a SigComp context is initialized all entries in the byte buffer
   are set to 0. Upon receiving a SigComp message, the decompressor
   strips off the underlying protocol header and then performs the
   following actions:

   1.)   The message is copied directly into the byte buffer beginning
   at the byte specified in compressed_start. The length of the
   compressed message in bytes (known from the underlying protocol used
   to transport the compressed message) is then copied into
   compressed_length.

   Note that the buffer is circular, so once a byte is copied into
   buffer[buffer_size - 1], the next byte is copied into
   buffer[circular_buffer]. The parameter circular_buffer (see Section
   3.2) can be set to prevent the first part of the buffer from being
   overwritten by new compressed messages. Typically this area of the
   buffer is used to hold important tokens and text strings that should
   be kept from one compressed message to the next.

   2.)   Next, the tokens contained within the byte buffer are executed
   beginning at the byte specified in first_token. The tokens are
   executed consecutively unless indicated explicitly (for example when
   the decompressor encounters a SWITCH token). If the byte
   buffer[buffer_size - 1] is ever reached, the next byte is found in
   buffer[circular_buffer].





Hannu, Liu, Price, Rosenberg, et al.                           [Page 14]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   3.)   When the decompressor reaches buffer[0], instead of executing
   the token contained within buffer[0] it stops token execution and
   outputs the uncompressed message. The location of the uncompressed
   message is specified by uncompressed_start and uncompressed_length.

   As stated before the buffer is circular, so once a byte is copied
   from buffer[buffer_size - 1], the next byte is copied from
   buffer[circular_buffer].
   Note that the byte buffer is not reset between SigComp messages
   unless explicitly requested by the underlying protocol.

   Note also that if uncompressed_length is set to 0 then the
   decompressor does not output an uncompressed message. This is very
   useful for populating the byte buffer with well-known text strings,
   before any actual decompression of signaling messages takes place.


6.1.4.  Decompression failure

   If the compressed messages received by the decompressor are corrupted
   (either accidentally or maliciously) then one of three possibilities
   might occur:

   *    A decompressed message is outputted that is incorrect.

   *    A token is encountered that cannot be processed successfully by
        the decompressor (for example a RETURN token when no CALL token
        has reviously been encountered).

   *    The decompressor never finishes decompressing a message.

   To counter the first possibility the underlying protocol SHOULD
   include a checksum to ensure that each message is decompressed
   successfully. If the decompressed message fails the checksum then
   "decompression failure" has occurred. The decompressor does not
   output an uncompressed message, and ignores any future compressed
   message except those which explicitly request the byte buffer to e
   reinitialized.

   If a token is encountered that cannot be successfully processed then
   decompression failure occurs automatically.

   To counter the third possibility, decompression failure SHOULD also
   occur after a certain number of tokens have been processed for a
   given compressed message. The maximum number of tokens to process is
   currently left as an implementation decision (but might in future be
   negotiated).








Hannu, Liu, Price, Rosenberg, et al.                           [Page 15]


INTERNET-DRAFT                  SigComp               October 19 , 2001


6.2.  Library of tokens

   The SigComp decompressor currently understands seven types of token,
   chosen to support the widest possible range of compression algorithms
   with the minimum possible overhead.

   All tokens are stored as a single byte to indicate the token type,
   followed by 0 or more bytes containing the parameters required by the
   token. The following token types are currently available:

   COPY
   ADD / SUBTRACT
   LSHIFT / RSHIFT
   COMPARE
   SWITCH
   CALL ... RETURN
   HUFFMAN

   Each token is explained in more detail below:


6.2.1.  COPY

   The COPY token instructs the decompressor to copy a string of bytes
   from one part of the byte buffer to another.

   A COPY token is stored in the byte buffer as 7 consecutive bytes as
   described below. A simple mnemonic description of the COPY token is
   also provided, which can be useful when writing example lists of
   tokens to be transmitted within a compressed message.

   Mnemonic description:

   COPY (position, length, destination)

   Exact byte description:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 0 0| Position MSB  | Position LSB  |  Length MSB   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Length LSB   |Destination MSB|Destination LSB|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The meaning of the three parameters is explained below:

   Position:    2-byte integer indicating the location of the first
                byte in the string to be copied.

   Length:      2-byte integer indicating the number of bytes to be




Hannu, Liu, Price, Rosenberg, et al.                           [Page 16]


INTERNET-DRAFT                  SigComp               October 19 , 2001


                copied.

   Destination: 2-byte integer indicating the location to which the
                first byte in the string will be copied.

   Note that once a byte is copied into buffer[buffer_size - 1], the
   next byte is copied into buffer[circular_buffer].


6.2.2.  ADD / SUBTRACT

   The ADD token instructs the decompressor to add the two 2-byte
   integers at the specified locations in the byte buffer (addition
   performed modulo 2^16) and to store the result in the location of the
   first integer.

   ADD (index_1, index_2)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 0 1|  Index 1 MSB  |  Index 1 LSB  |  Index 2 MSB  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+
   |  Index 2 LSB  |
   +-+-+-+-+-+-+-+-+

   The SUBTRACT token is the same as the ADD token except that the
   second integer is subtracted from the first (subtraction performed
   modulo 2^16).

   SUBTRACT (index_1, index_2)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 1 0|  Index 1 MSB  |  Index 1 LSB  |  Index 2 MSB  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+
   |  Index 2 LSB  |
   +-+-+-+-+-+-+-+-+


6.2.3.  LSHIFT / RSHIFT

   The LSHIFT token instructs the decompressor to left shift the 2-byte
   integer at the specified location. As always, the MSBs of the integer
   are stored before the LSBs.

   LSHIFT (index)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 1 1|   Index MSB   |   Index LSB   |




Hannu, Liu, Price, Rosenberg, et al.                           [Page 17]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Note that the least significant bit of the integer becomes zero, and
   the most significant bit is discarded.

   The RSHIFT token performs a right shift of the 2-byte integer at the
   specified location.

   RSHIFT (index)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 0 0|   Index MSB   |   Index LSB   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


6.2.4.  COMPARE

   The COMPARE token instructs the decompressor to compare the two 2-
   byte integers at the specified locations in the byte buffer.

   COMPARE (index_1, index_2)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 0 1|  Index 1 MSB  |  Index 1 LSB  |  Index 2 MSB  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+
   |  Index 2 LSB  |
   +-+-+-+-+-+-+-+-+

   If the first integer referenced by the COMPARE token is less than the
   second integer then the decompressor sets result_integer = 0. If both
   integers are equal then it sets result_integer = 1. If the first
   integer is greater than the second then the decompressor sets
   result_integer = 2.


6.2.5.  SWITCH

   The SWITCH token performs a conditional jump based on the contents of
   result_integer.

   SWITCH (index_0, index_1, ... , index_n - 1)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 1 0|  Index 0 MSB  |  Index 0 LSB  |  Index 1 MSB  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   +-+-+-+-+-+-+-+-+-+-+-+-+
   |  Index 1 LSB  |  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+





Hannu, Liu, Price, Rosenberg, et al.                           [Page 18]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   When a SWITCH token is encountered, the decompressor reads the
   integer contained within result_integeer. Suppose that this integer
   is j. The decompressor then continues token execution at the byte
   position specified by index j.

   If result_integer specifies an index which beyond the size of the
   byte buffer, a bad compressed message has been received and
   decompression failure occurs.


6.2.6.  CALL ... RETURN

   The CALL and RETURN tokens provide support for compression algorithms
   with a nested structure.

   CALL (index)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 1 1|   Index MSB   |   Index LSB   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   RETURN

   +-+-+-+-+-+-+-+-+
   |0 0 0 0 1 0 0 0|
   +-+-+-+-+-+-+-+-+

   When the decompressor reaches a CALL token, it finds the byte
   position of the token immediately following the CALL token and copies
   this integer into stack[stack_free] ready for later retrieval. It
   then increases stack_free by 1 and continues token execution at the
   byte position specified in the CALL token.

   When the decompressor reaches a RETURN token it decreases stack_free
   by 1, and then continues token execution at the byte position
   specified in stack[stack_free].

   If stack_free ever becomes more than buffer_size - 1 or less than 0
   then a bad compressed message has been received and decompression
   failure occurs (see Section 3.4.).


6.2.7.  HUFFMAN

   The HUFFMAN token maps a shorthand Huffman code onto its uncompressed
   equivalent.

   HUFFMAN (position, bit offset, destination, n, length 0, length 1,
   ... , length n - 1)

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 1 0 0 1| Position MSB  | Position LSB  |  Bit Offset   |




Hannu, Liu, Price, Rosenberg, et al.                           [Page 19]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Destination MSB|Destination LSB|   MSB of n    |   LSB of n    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Length 0    |   Length 1    |      ...      | Length n - 1  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The meaning of the first three parameters is explained below:

   Position:    2-byte integer indicating the byte location of the
                Huffman code to be decompressed.

   Bit Offset:  1-byte integer indicating the bit offset at which the
                Huffman code begins within the byte specified above.

   Destination: 2-byte integer indicating the location to which the
                uncompressed value will be copied.

   Following the [DEFLATE] convention a bit offset of 0 indicates the
   least significant bit of a byte, whilst a bit offset of 7 indicates
   the most significant bit.

   For example, suppose that an 8-bit Huffman code begins at byte
   position 0 and bit offset 2. In this case the 8 bits of the Huffman
   code can be found in the following locations (Bit 0 is the first bit
   in the Huffman code and Bit 7 is the last bit):

    MSB         LSB MSB         LSB

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |5 4 3 2 1 0    |            7 6|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Byte 0          Byte 1

   The remaining parameters specify the actual Huffman codes and their
   uncompressed equivalents. Note that the Huffman codes are downloaded
   to the decompressor using "Canonical" Huffman as described in Section
   3.2.2 of [DEFLATE]. This format is very efficient because it can
   specify a set of Huffman codes by sending only their lengths.

   The parameter n specifies the total number of Huffman codes.
   Following this parameter is the length of each Huffman code in bits,
   where each code can be between 1 and 255 bits in lengths.

   The length j specifies the length of the Huffman code which maps onto
   the uncompressed 2-byte integer j. If length j is set to 0 then there





Hannu, Liu, Price, Rosenberg, et al.                           [Page 20]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   is no Huffman code mapping onto the integer j, so it cannot be
   communicated to the decompressor.

   The actual Huffman codes themselves are assigned as per [DEFLATE],
   using the following two rules:

   *    All codes of a given bit length have lexicographically
        consecutive values, in the same order as the symbols they
        represent;

   *    Shorter codes lexicographically precede longer codes.

   As an example, suppose that the uncompressed 2-byte integers 0, 1, 2
   and 3 have Huffman codes of lengths 2, 1, 3 and 3 bits respectively.
   Then the actual Huffman codes have the following values:

   Uncompressed integer         Code length (bits)      Huffman code

            0                           2                    10
            1                           1                    0
            2                           3                    110
            3                           3                    111

   When the decompressor encounters a HUFFMAN token it reads the Huffman
   code at the specified position and bit offset, and maps it to the
   corresponding 2-byte uncompressed integer. The decompressor then
   copies this uncompressed integer to the specified destination in the
   byte buffer. Finally, the decompressor updates the position and bit
   offset parameters with the location of the next Huffman code (ready
   to decompress it at a later stage).

   If the bit offset does not take a value between 0 and 7 inclusive
   then a bad compressed message has been received and decompression
   failure occurs (see Section 6.1.4.). Decompression failure also
   occurs if a Huffman code is encountered for which no corresponding
   uncompressed integer has been defined.


7. Security considerations

   TBW


8. IANA considerations

   TBW









Hannu, Liu, Price, Rosenberg, et al.                           [Page 21]


INTERNET-DRAFT                  SigComp               October 19 , 2001


9. Authors' Addresses

   Hans Hannu
   Box 920
   Ericsson Erisoft AB
   SE-971 28 Lulea, Sweden

   Phone: +46 920 20 21 84
   Fax: +46 920 20 20 99
   EMail: hans.hannu@ericsson.com

   Zhigang Liu
   2-700
   Mobile Networks Laboratory
   Nokia Research Center
   6000 Connection Drive Irving, TX 75039, USA

   Phone: +1 972 894-5935
   Fax: +1 972 894-4589
   EMail: zhigang.liu@nokia.com

   Richard Price
   Roke Manor Research Ltd
   Romsey, Hants, SO51 0ZN
   United Kingdom

   Phone: +44 1794 833681
   Email: richard.price@roke.co.uk


   more authors...


10. Intellectual Property Right Considerations

   TBW


11. References

   [DEFLATE]   "DEFLATE Compressed Data Format Specification version
               1.3", RFC 1951, P. Deutsch, May 1996

   [RTSP]      H. Schulzrinne, A. Rao and R. Lanphier, Real Time
               Streaming Protocol (RTSP), RFC 2326, April 1998.

   [REQ]       H. Hannu (Editor), Signaling Compression Requirements &
              Assumptions, Internet Draft (work in progress),September
              2001. <draft-ietf-rohc-signaling-req-assump-02.txt>






Hannu, Liu, Price, Rosenberg, et al.                           [Page 22]


INTERNET-DRAFT                  SigComp               October 19 , 2001


   [SIP]       M. Handley, H. Schulzrinne, E. Schooler and J. Rosenberg,
               SIP: Session Initiation Protocol, RFC 2543, August 2000.

This Internet-Draft expires in April 2002.



















































Hannu, Liu, Price, Rosenberg, et al.                           [Page 23]

Document	Document type	This is an older version of an Internet-Draft that was ultimately published as RFC 3320. Expired & archived
	Select version	00 01 02 03 04 05 06 07 RFC 3320
	Compare versions
	Author
	Replaces	draft-ietf-rohc-sigcomp-algorithm draft-ietf-rohc-sigcomp-udvm
	RFC stream
	Other formats	txt pdf bibtex bibxml
	Additional resources	Mailing list discussion