Network Working Group                  Richard Price, Siemens/Roke Manor
INTERNET-DRAFT                                      Hans Hannu, Ericsson
Expires: September 2002                  Carsten Bormann, TZI/Uni Bremen
                                           Jan Christoffersson, Ericsson
                                                      Zhigang Liu, Nokia
                                         Jonathan Rosenberg, dynamicsoft

                                                           March 1, 2002


                      Signaling Compression (SigComp)
                     <draft-ietf-rohc-sigcomp-05.txt>


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is a submission of the IETF ROHC WG. Comments should be
   directed to its mailing list, rohc@ietf.org.


Abstract

   This document defines SigComp, a solution for compressing messages
   generated by application protocols such as [SIP] and [RTSP]. The
   architecture and pre-requisites of SigComp are outlined, along with
   the format of the SigComp message.

   Decompression functionality for the SigComp solution is provided by a
   "Universal Decompressor Virtual Machine" optimized for the task of
   running decompression algorithms. The UDVM can be configured to
   understand the output of many well-known compressors such as
   [DEFLATE].




Price, Hannu, et al.                                            [Page 1]


INTERNET-DRAFT                  SigComp                   March 1, 2002


Table of contents

   1.  Introduction..................................................2
   2.  Terminology...................................................3
   3.  SigComp Architecture..........................................6
   4.  SigComp message flow..........................................11
   5.  SigComp compressor............................................14
   6.  State handling and announcement...............................16
   7.  Overview of the UDVM..........................................20
   8.  Decompressing a SigComp message...............................23
   9.  UDVM instruction set..........................................26
   10. Security considerations.......................................38
   11. IANA considerations...........................................40
   12. Acknowledgements..............................................41
   13. Authors' addresses............................................41
   14. References....................................................42
   Appendix A. Document history......................................43

1.  Introduction

   The Session Initiation Protocol [SIP], along with many other
   application protocols used for multimedia communications such as
   [RTSP], is a textual protocol engineered for bandwidth-rich links. As
   a result, SIP messages have not been optimized in terms of size.
   Typical SIP messages range from a few hundred bytes to as many as two
   thousand. To date, this has not been a significant problem.

   With the planned usage of these protocols in wireless handsets as
   part of 2.5G and 3G cellular networks, the large size of these
   messages is problematic. With low-rate IP connectivity, store-and-
   forward delays are significant. Taking into account retransmits, and
   the multiplicity of messages that are required in some flows, call
   setup and feature invocation are adversely affected. Therefore, we
   believe there is merit in reducing these message sizes.

   This document outlines the architecture and pre-requisites of the
   SigComp solution, the format of the SigComp message, algorithm
   upload, and the Universal Decompressor Virtual Machine (UDVM) that
   provides decompression functionality.

   SigComp is offered to applications as a "shim" layer between the
   application and the transport. The service provided is that of the
   underlying transport plus compression. Both connection-oriented and
   connectionless transports are supported by SigComp.

   This document focuses on the signaling scenario where an end-terminal
   communicates with a proxy. However, SigComp may be applicable to other
   scenarios with multiple endpoints compressing and decompressing data.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC-2119].

   SigComp

     The overall solution for signaling compression, comprising the
     compressor, decompressor, dispatchers and state handler.

   Application

     For the purpose of this document, an application is text-based
     protocol software that:

     a) sends application data to the compressor dispatcher
     b) receives data from the decompressor dispatcher
     c) authenticates the sender of a decompressed message and gives
        permission for state to be saved in the sender's name

   Application message

     An uncompressed message provided to or from the application.

   Endpoint

     One instance of an application plus a SigComp layer. Each endpoint
     is capable of sending and/or receiving SigComp messages.

   Endpoint identity

     A unique indicator assigned to each endpoint by the application
     (for example a URI). The application authenticates the sender of
     a decompressed message, and provides their endpoint identity to the
     SigComp state handler.

   Transport

     Mechanism for passing data between two endpoints. SigComp is
     capable of sending messages over a wide range of transports
     including TCP, UDP and [SCTP].

   Message-based transport

     A transport that carries data as a set of bounded messages.

   Stream-based transport

     A transport that carries data as a continuous stream with no
     message boundaries.

   Application-defined parameters

     Parameters that must be agreed upon by the local and remote
     endpoints invoking SigComp. Values for the application-defined
     parameters are typically fixed to meet the requirements of a
     particular signaling application.

   SigComp message

     May contain a compressed application message in the form of UDVM
     bytecode. In case of a message-based transport such as UDP, a
     SigComp message corresponds to exactly one (UDP) datagram. For a
     stream-based transport such as TCP, each SigComp message is
     separated by a reserved delimiter.

   Standalone SigComp message

     A SigComp message that does not include any compressed application
     data. Certain signaling applications may not allow standalone
     SigComp messages due to security requirements.

   Compressor

     Entity that invokes an encoder, and keeps track of states that can
     be used for compression. It is responsible for supplying UDVM
     bytecode to the remote decompressor in order for compressed
     data to be decompressed.

   Encoder

     Encodes data according to a particular compression algorithm.

   Compressor dispatcher

     Entity that receives uncompressed application messages, invokes a
     compressor, and forwards the resulting SigComp messages to a remote
     endpoint.

   Decompressor

     Entity that is responsible for converting a SigComp message into
     uncompressed data. Decompression functionality is provided by the
     UDVM.

   Decompressor dispatcher

     Entity that receives SigComp messages, invokes a decompressor, and
     forwards the decompressed application messages to an application.

   Virtual machine

     A machine architecture designed to be implemented in software
     (although silicon implementations are of course possible).

   Universal Decompressor Virtual Machine (UDVM)

     The virtual machine described in this document. The UDVM is used
     for decompression of SigComp messages.

   Bytecode

     Machine code that can be executed by a virtual machine. UDVM
     bytecode is a combination of UDVM instructions and compressed data.

   Per-message compression

     Compression that does not reference data from previous messages.
     SigComp can decompress a message of this type using only the
     application-defined parameters and the data in the message itself.

   Dynamic compression

     Compression relative to messages sent prior to the current
     compressed message. SigComp stores and retrieves this data using
     the state handler.

   State

     Data saved for retrieval by later SigComp messages. An item of
     state typically reflects the contents of the UDVM memory after
     decompressing a message, but state can also be created by the
     compressor or by the application.

   State handler

     Entity responsible for storing and accessing state information
     once permission is granted by the application.

   State identifier

     Reference used to access an item of state previously created by the
     compressor, the decompressor or the application.

   CPU cycles

     A measure of the amount of "CPU power" required to execute a UDVM
     instruction (the simplest UDVM instructions require a single CPU
     cycle). An upper limit is placed on the number of cycles that can
     be used to decompress each bit in a compressed message.

3.  SigComp Architecture


   In the SigComp architecture, compression and decompression are
   performed at two communicating entities. SigComp is offered to
   applications as a "shim" layer between the application and the
   underlying transport, and so these entities are endpoints when viewed
   from a transport layer perspective. Note however that from the
   application perspective SigComp is applied on a per-hop basis.

   Figure 1 shows the layout of a communicating endpoint that implements
   a SigComp layer. The figure does not mandate any particular
   implementation, but is shown to the reader for the sake of clarity.

   The SigComp layer is further decomposed into the following components:

   - A compressor dispatcher: this is the interface from the
     application. The compressor dispatcher receives an application
     message and an identifier for the receiving endpoint. Based on the
     endpoint identity the compressor dispatcher invokes a particular
     compressor, which returns a SigComp message that is forwarded to
     the remote SigComp endpoint.

   - A decompressor dispatcher: this is the interface towards the
     application. A SigComp message is received by the decompressor
     dispatcher and an instance of a decompressor is invoked. Once the
     dispatcher has received the (decompressed) application data it
     forwards the message to the application.

   - One or more compressors: a distinct compressor is invoked for each
     remote endpoint with which the local application wishes to
     communicate. A compressor receives an (uncompressed) application
     message from the compressor dispatcher, compresses the
     message, and returns a SigComp message to the compressor
     dispatcher. During the compression process the compressor may
     invoke the state handler to restore previous state or save new
     state. Each compressor chooses a certain algorithm to encode the
     data (e.g. [DEFLATE]).

   - One or more decompressors: since SigComp can run over an insecure
     transport layer, a distinct decompressor must be invoked on a
     per-message basis. A decompressor receives a SigComp message from
     the decompressor dispatcher, decompresses the message, and returns
     the (decompressed) application message to the decompressor
     dispatcher. During the decompression process, the decompressor may
     invoke the state handler to restore previous state or save new
     state.

   - State handler: this entity contains enough logic to store and
     retrieve states. State is information that is stored between
     SigComp messages: this data can be saved either by a compressor, a
     decompressor or an application. For security purposes the state
     handler must always ask the application to grant permission for new
     states to be saved. State creation and retrieval are further
     described in Chapter 6.


               +---------------------------------------------+
               |                                             |
               |                 Application                 |
               |                                             |
               |                                             |
               +---------------------------------------------+
                      |                    |         ^
            Message & |           Endpoint |         | Decompressed
             endpoint |           identity |         | message
             identity |                    |         |
                      |                    |         |
       +-- -- -- -- --|-- -- -- -- -- -- --|-- -- -- |- -- -- -- -- +
       |              |                    |         |              |
                      v                    v         |
       |    +--------------+  +--------------+  +--------------+    |
    SigComp |              |  |              |  |              | SigComp
    message |  Compressor  |  |    State     |  | Decompressor | message
    <-------|  dispatcher  |  |   handler    |  |  dispatcher  |<-------
       |    |              |  |              |  |              |    |
            +--------------+  +--------------+  +--------------+
       |           ^  ^          ^  ^  ^  ^          ^  ^           |
                   |  |          |  |  |  |          |  |
       |           |  |          |  |  |  |          |  |           |
                   |  |          |  |  |  |          |  |
       |           v  |          |  |  |  |          v  |           |
            +--------------+     |  |  |  |     +--------------+
       |    | Compressor 1 |     |  |  |  |     |Decompressor 1|    |
            |              |<----+  |  |  +---->|              |
       |    |  (Encoder)   |        |  |        |    (UDVM)    |    |
            |              |        |  |        |              |
       |    +--------------+        |  |        +--------------+    |
                      |             |  |                |
       |              v             |  |                v           |
            +--------------+        |  |        +--------------+
       |    | Compressor 2 |        |  |        |Decompressor 2|    |
            |              |<-------+  +------->|              |
       |    |  (Encoder)   |                    |    (UDVM)    |    |
            |              |                    |              |
       |    +--------------+                    +--------------+    |

       |                        SigComp layer                       |
       +-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- +

    Figure 1: High-level architectural overview of one SigComp endpoint

   Note that it is possible for SigComp to decompress messages from
   multiple endpoints at different physical locations in a network, as
   the architecture is designed to prevent data from one endpoint
   interfering with data from a different endpoint. A consequence of
   this design choice is that it is difficult for a malicious user to
   disrupt SigComp operation by inserting false compressed messages on
   the transport layer.

   Each decompressor in the architecture of Figure 1 is an instance of
   the Universal Decompressor Virtual Machine (UDVM). Figure 2 gives a
   more detailed view of a UDVM, including all of the interfaces between
   the UDVM and its environment.


   +----------------+                                 +----------------+
   |                |     Request compressed data     |                |
   |                |-------------------------------->|                |
   |                |<--------------------------------|                |
   |                |     Provide compressed data     |                |
   |                |                                 |   Dispatcher   |
   |                |                                 |                |
   |                |    Output decompressed data     |                |
   |                |-------------------------------->|                |
   |                |                                 |                |
   |                |                                 +----------------+
   |      UDVM      |
   |                |                                 +----------------+
   |                |    Request state information    |                |
   |                |-------------------------------->|                |
   |                |<--------------------------------|                |
   |                |    Provide state information    |                |
   |                |                                 |     State      |
   |                |                                 |    Handler     |
   |                |   Make state creation request   |                |
   |                |-------------------------------->|                |
   |                |      Forward announcement       |                |
   |                |                                 |                |
   +----------------+                                 +----------------+

         Figure 2: Interfaces between the UDVM and its environment

   Note that for simplicity, the UDVM indicates when it requires
   additional compressed data or state information using an explicit
   instruction. It then pauses and waits for the information to be
   supplied before continuing with the next instruction. This prevents
   the arrival of more data from interfering with the operation of the
   UDVM (e.g. by accidentally overwriting UDVM memory that is currently
   in use).

3.1.  Requirements on application

   From an application perspective the SigComp layer appears as a new
   transport, with similar behavior to the original transport used to
   carry uncompressed data (for example SigComp/UDP behaves similarly to
   native UDP).

   If the application wishes to mix SigComp messages with other types of
   data (e.g. uncompressed data, or SigComp data for a different
   application) on the same transport then the transport must
   distinguish between the different types of data. This means that a
   new port will need to be reserved or discovered for the SigComp
   messages destined for a particular application. For example SIP uses
   port 5060 for TCP and port 5061 for TLS/TCP, so it could similarly
   reserve another port for SigComp/TCP.

   In the interests of security, a new interface is required to the
   signaling application in order to leverage the authentication
   functions built into the application itself. When the application
   receives a decompressed message it determines the identity of the
   sending endpoint and supplies this information to the state handler.

3.2.  Application-defined parameters

   When an application invokes SigComp, a number of parameters are
   provided by the application to control the maximum size of compressed
   messages, the UDVM memory size etc. The local and remote applications
   that wish to communicate MUST initially agree on a common set of
   values for these parameters.

   Note that the majority of application-defined parameters are set to
   fixed values for a particular signaling application. However,
   endpoints implementing SigComp will typically have a wide range of
   capabilities, each offering a different amount of working memory,
   processing power and so on. In order to support this wide variation
   in endpoint capabilities, SigComp includes a mechanism for modifying
   the following application-defined parameters on the fly:

   UDVM_version
   UDVM_memory_size
   cycles_per_bit
   cycles_per_message
   Initial state

   The SigComp announcement mechanism is described further in Section
   6.3.

   The advantage of building the announcement mechanism into SigComp is
   that it avoids the need for any form of negotiation to be performed
   by the application itself. Instead, it is sufficient to initialize
   all of the application-defined parameters to fixed values and modify
   them later using SigComp itself.

   Each application-defined parameter is described below.

   Note that unless otherwise indicated, all of the parameters can be
   stored as 2-byte integers.

   UDVM_version

     The UDVM_version parameter specifies the level of functionality
     available at the UDVM. The basic version of the UDVM (Version 0)
     is defined in this document.

   maximum_expansion_size

     The maximum_expansion_size parameter prevents the generation of
     excessively large SigComp messages. If set to 0 then the parameter
     is ignored by SigComp. For any other value, an uncompressed
     message of k bytes must yield a SigComp message no larger than
     (k + maximum_expansion_size) bytes. Note that any value
     other than 0 bans the creation of standalone SigComp messages (i.e.
     messages that do not contain a compressed application message).
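
     As a minimal sketch of the rule above (not part of the SigComp
     definition), the check might look as follows in Python. The
     function name is hypothetical, and representing a standalone
     SigComp message by passing None as the uncompressed length is an
     assumption made only for this illustration.

```python
from typing import Optional


def expansion_allowed(uncompressed_len: Optional[int],
                      sigcomp_len: int,
                      maximum_expansion_size: int) -> bool:
    """Return True if a SigComp message respects maximum_expansion_size.

    uncompressed_len is None for a standalone SigComp message, i.e. one
    that carries no compressed application message (an assumption of
    this sketch, not a convention taken from the text).
    """
    if maximum_expansion_size == 0:
        # A value of 0 means the parameter is ignored.
        return True
    if uncompressed_len is None:
        # Any non-zero value bans standalone SigComp messages.
        return False
    return sigcomp_len <= uncompressed_len + maximum_expansion_size


# Hypothetical values: a 100-byte application message, 16 bytes of slack.
print(expansion_allowed(100, 116, 16))
print(expansion_allowed(100, 117, 16))
```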

   maximum_compressed_size

     The maximum_compressed_size parameter limits the size of one
     compressed message. SigComp rejects any message larger than the
     specified value.

   maximum_uncompressed_size

     The maximum_uncompressed_size parameter limits the size of one
     uncompressed message. SigComp rejects any message larger than the
     specified value.

   minimum_hash_size

     The minimum_hash_size parameter specifies the minimum size of the
     state identifier when creating new state information. This value
     needs to be sufficiently large to prevent malicious users from
     guessing a state identifier by brute force.

   UDVM_memory_size

     The UDVM_memory_size parameter specifies the total number of
     bytes in the UDVM memory.

   cycles_per_bit

     The cycles_per_bit parameter specifies the number of "CPU cycles"
     that can be used to decompress a single bit of data. One CPU cycle
     typically corresponds to a single UDVM instruction, although some
     of the high-level instructions may require additional cycles.

   cycles_per_message

     The cycles_per_message parameter specifies the number of additional
     CPU cycles made available at the start of a compressed message.
     These cycles can be useful when decompressing algorithms that
     upload additional data on a per-message basis, for example a new
     set of Huffman codes as with [DEFLATE].

     The total maximum number of "CPU cycles" available for each
     compressed message is specified by the following formula:

     maximum_cycles = message_size * cycles_per_bit + cycles_per_message
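
     As a worked illustration of this formula, the following Python
     sketch computes the cycle budget for one message. It assumes that
     message_size is measured in bits, which the text implies but does
     not state; the function name and the example values are
     hypothetical.

```python
def maximum_cycles(message_size_bits: int,
                   cycles_per_bit: int,
                   cycles_per_message: int) -> int:
    """Cycle budget for one compressed message, per the formula above."""
    if min(message_size_bits, cycles_per_bit, cycles_per_message) < 0:
        raise ValueError("parameters must be non-negative")
    return message_size_bits * cycles_per_bit + cycles_per_message


# Hypothetical values: a 100-byte (800-bit) message, 16 cycles per bit
# and 1000 additional cycles per message.
budget = maximum_cycles(100 * 8, 16, 1000)
print(budget)  # 800 * 16 + 1000 = 13800
```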

   maximum_state_size

     The maximum_state_size parameter specifies the maximum amount of
     state information that can be saved by a local endpoint, for each
     remote endpoint with which it communicates. Note that the amount of
     state information is expressed as a multiple of the parameter
     UDVM_memory_size, because an item of state generally
     reflects the contents of the UDVM memory.

   Initial state

     The application can store useful information in the form of state.
     This predefined state is used to offer a range of well-known
     decompression algorithms to the compressor, which can choose to
     avoid uploading bytecode for a new algorithm if it supports one of
     the well-known algorithms. Each item of initial state can be made
     mandatory for every instance of the application, or it can be made
     optional (in which case support for the relevant state will need to
     be advertised before the state can be used).


4.  SigComp message flow

   This chapter describes the SigComp message flow and the operation of
   the compressor and decompressor dispatchers.

4.1.  Message exchange

   The local SigComp layer may send compressed data to a remote SigComp
   layer, and the local SigComp layer may also receive compressed data.
   Note however that compression in one direction does not necessarily
   imply compression in the reverse direction. Furthermore, even in the
   case that there are two unidirectional compressed flows between two
   SigComp layers, there is no need to use the same compression
   algorithm at both compressors.

4.2.  SigComp message format

   In every SigComp message the first few bytes are interpreted as a
   state identifier that accesses some previously stored state
   information.

   This state information includes all of the data needed to decompress
   the SigComp message: including the decompression algorithm that will
   be applied to the remainder of the message, as well as any additional
   information that is required (e.g. one or more previously received
   messages if dynamic compression is in use).

   The format of the basic SigComp message is given in Figure 4:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1   1   1   1   1 |  length   |
   +---+---+---+---+---+---+---+---+
   |                               |
   :   state_identifier (n-bytes)  :
   |                               |
   +---+---+---+---+---+---+---+---+
   |                               |
   :   Remaining SigComp message   :
   |                               |
   +---+---+---+---+---+---+---+---+

    Figure 4: Basic SigComp message

   The length field is a 3-bit value (MSBs before LSBs) that indicates
   the length of the state identifier. The actual size n of the state
   identifier is calculated as follows:

                   n  =  minimum_hash_size + length - 1

   The state identifier is then extracted from the SigComp message, and
   the state it references is accessed and executed as defined by the
   STATE-EXECUTE instruction of Chapter 9.

   If the length value is set to 0 then no state is accessed; instead
   the entire SigComp message is copied into the UDVM memory beginning
   at Address 6, and then executed starting from Address 6.

   All other addresses in the UDVM memory are initialized to 0.

   Decompression failure occurs if the SigComp message is too short to
   contain the expected state identifier, or if the requested state does
   not exist. See Section 8.2 for further details.
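
   The header processing described in this section can be sketched in
   Python as follows. This is an illustration under stated assumptions
   rather than a normative procedure: the function and exception names
   are hypothetical, and treating a first byte that does not begin with
   five 1 bits as a failure is an assumption, since the text does not
   cover that case. The lookup of the state itself is left out.

```python
from typing import Optional, Tuple


class DecompressionFailure(Exception):
    """Raised when a SigComp message cannot be processed."""


def parse_sigcomp_header(message: bytes,
                         minimum_hash_size: int) -> Tuple[Optional[bytes], bytes]:
    """Split a SigComp message into (state_identifier, remainder).

    Returns (None, message) when the length field is 0, in which case
    no state is accessed and the entire message is loaded as bytecode.
    """
    if not message:
        raise DecompressionFailure("empty message")
    first = message[0]
    if first >> 3 != 0b11111:
        # Assumption: other prefixes are not described in the text.
        raise DecompressionFailure("unexpected prefix bits")
    length = first & 0b111            # 3-bit length field
    if length == 0:
        return None, message
    n = minimum_hash_size + length - 1
    if len(message) < 1 + n:
        raise DecompressionFailure("message too short for state identifier")
    return message[1:1 + n], message[1 + n:]


def load_without_state(message: bytes, udvm_memory_size: int) -> bytearray:
    """Copy the whole message to Address 6; all other bytes stay 0."""
    if 6 + len(message) > udvm_memory_size:
        raise DecompressionFailure("message does not fit in UDVM memory")
    memory = bytearray(udvm_memory_size)
    memory[6:6 + len(message)] = message
    return memory
```

   For example, with minimum_hash_size set to 6, a first byte of 0xFA
   (length field 2) selects a 7-byte state identifier.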




4.3.  Interfaces to and from the compressor dispatcher

   When the application provides a message to be compressed, it MUST
   also provide an "endpoint identity" that distinguishes the endpoint
   from other endpoints.

   The exact format of the endpoint identity is unimportant, provided
   that distinct endpoints have distinct endpoint identities.

   The SigComp layer contains one compressor for each remote endpoint
   with which the local application is communicating; the dispatcher
   forwards each new application message to the appropriate compressor
   (invoking a new compressor if a new endpoint identity is
   encountered).

   Note that the application MUST indicate to the compressor dispatcher
   when it no longer wishes to communicate with a particular endpoint,
   so that the resources taken by the corresponding compressor can be
   reclaimed.
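
   The dispatcher behaviour described above can be sketched in Python
   as follows. The class, its method names and the callable-based
   compressor interface are all hypothetical, and the placeholder
   "compressor" merely tags each message with a counter so that the
   per-endpoint separation is visible; it performs no real compression
   and does not emit valid SigComp messages.

```python
from typing import Callable, Dict, Hashable

Compressor = Callable[[bytes], bytes]


class CompressorDispatcher:
    """Keeps one compressor per remote endpoint identity."""

    def __init__(self,
                 compressor_factory: Callable[[], Compressor],
                 send: Callable[[Hashable, bytes], None]) -> None:
        self._factory = compressor_factory
        self._send = send
        self._compressors: Dict[Hashable, Compressor] = {}

    def submit(self, endpoint_identity: Hashable, message: bytes) -> None:
        """Compress a message with the compressor for this endpoint."""
        compressor = self._compressors.get(endpoint_identity)
        if compressor is None:
            # New endpoint identity: invoke a new compressor.
            compressor = self._factory()
            self._compressors[endpoint_identity] = compressor
        self._send(endpoint_identity, compressor(message))

    def close(self, endpoint_identity: Hashable) -> None:
        """Reclaim the compressor once the application is finished."""
        self._compressors.pop(endpoint_identity, None)


def make_placeholder_compressor() -> Compressor:
    """Stand-in for a real compressor; prefixes a message counter."""
    count = 0

    def compress(message: bytes) -> bytes:
        nonlocal count
        count += 1
        return bytes([count % 256]) + message

    return compress
```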

4.4.  Interfaces to and from the decompressor dispatcher

   To ensure that SigComp can run over an insecure transport layer, the
   decompressor dispatcher invokes a new decompressor for each new
   SigComp message. Resources for the decompressor are released as soon
   as the message is decompressed.

   Upon the arrival of a SigComp message the decompressor dispatcher
   invokes an instance of the UDVM and loads it with the indicated state
   as per Section 4.2. The message is then decompressed by the UDVM,
   returned to the decompressor dispatcher, and passed on to the
   receiving application.

   Note that when the UDVM is invoked it does not receive any compressed
   data by default, but instead requests new data explicitly using a
   specific instruction. Therefore, the dispatcher is responsible for
   buffering each SigComp message and passing the data to the UDVM when
   it is requested. If the UDVM requests additional compressed data that
   is not yet available then it pauses and waits until enough data has
   been received by the dispatcher.
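
   A minimal Python sketch of this buffering behaviour is given below.
   The class and method names are hypothetical, and returning None to
   signal that the UDVM must pause is an assumption of the sketch
   rather than an interface defined by SigComp.

```python
from typing import Optional


class InputBuffer:
    """Dispatcher-side buffer that hands data to the UDVM on request."""

    def __init__(self) -> None:
        self._data = bytearray()

    def feed(self, chunk: bytes) -> None:
        """Store compressed data as it arrives from the transport."""
        self._data += chunk

    def request(self, n: int) -> Optional[bytes]:
        """Return the next n bytes, or None if the UDVM must wait."""
        if len(self._data) < n:
            return None
        out = bytes(self._data[:n])
        del self._data[:n]
        return out
```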

   Uncompressed data is also output by the UDVM using a specific
   instruction. Note that the UDVM has no awareness of whether the
   underlying transport is message-based or stream-based, and so it
   always outputs uncompressed data as a stream. It is the
   responsibility of the dispatcher to provide the uncompressed message
   to the application in the expected form (i.e. as a stream or as a set
   of distinct, bounded messages).

   For a stream-based transport, the dispatcher delimits messages by
   parsing the compressed data stream for instances of 0xFF and taking
   the following actions:

   Occurs in data stream:     Meaning:

   0xFF 00                    one 0xFF byte in the data stream
   0xFF 01                    same, but the next byte is quoted (could
                              be another 0xFF)
      :                                           :
   0xFF 7F                    same, but the next 127 bytes are quoted
   0xFF 80 to 0xFF FE         reserved
   0xFF FF                    message boundary

   The escape sequences above provide byte stuffing (if a compression
   algorithm generates compressed data containing the byte 0xFF then it
   should be replaced by the two-byte sequence 0xFF 00 to avoid
   accidentally inserting a message delimiter into the compressed
   data stream).
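
   The delimiting rules in the table above can be sketched as follows.
   This is an illustration under stated assumptions, not a normative
   implementation: the function names stuff_message and split_stream
   are hypothetical, and "the next n bytes are quoted" is assumed to
   mean that the n bytes after the escape sequence are copied through
   without being examined for further escape sequences.

```python
def stuff_message(message):
    """Escape every 0xFF byte as 0xFF 00 and append the 0xFF FF boundary."""
    out = bytearray()
    for byte in message:
        out.append(byte)
        if byte == 0xFF:
            out.append(0x00)
    out += b"\xff\xff"
    return bytes(out)


def split_stream(stream):
    """Split a byte stream into the messages delimited by 0xFF FF."""
    messages = []
    current = bytearray()
    i = 0
    while i < len(stream):
        byte = stream[i]
        i += 1
        if byte != 0xFF:
            current.append(byte)
            continue
        if i >= len(stream):
            raise ValueError("stream ends inside an escape sequence")
        code = stream[i]
        i += 1
        if code == 0xFF:
            # Message boundary: the delimiter is not part of the message.
            messages.append(bytes(current))
            current = bytearray()
        elif code <= 0x7F:
            # One literal 0xFF, then `code` quoted bytes taken as-is
            # (assumed reading of the table).
            if i + code > len(stream):
                raise ValueError("stream ends inside a quoted run")
            current.append(0xFF)
            current += stream[i:i + code]
            i += code
        else:
            raise ValueError("reserved escape sequence 0xFF %02X" % code)
    return messages
```

   A dispatcher on a stream-based transport would run something like
   split_stream over its receive buffer and hand each resulting message
   to a separate UDVM instance.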


5.  SigComp compressor

   An important feature of SigComp is that if two endpoints cannot agree
   on a common algorithm with which to send and receive data, it is
   possible for the compressor to upload bytecode for its own choice of
   algorithm to the decompressor. In particular this means that it is
   not necessary to force all compressors to use the same default
   algorithm; instead each implementer has the freedom to pick one of
   the predefined algorithms or to upload their own if needed.

   The overall requirement placed on the compressor is that of
   transparency, i.e. the compressor MUST NOT send bytecode that causes
   the UDVM to incorrectly decompress a given message.

   The following more specific requirements are also placed on the
   compressor (they can be considered particular instances of the
   transparency requirement):

   *    It is RECOMMENDED that the compressor supply a CRC over the
        uncompressed message to ensure that successful decompression has
        occurred. A UDVM instruction is provided to verify this CRC.

   *    If the transport is message-based then the compressor MUST
        preserve the boundaries between messages.

   *    If the transport is stream-based but the application defines its
        own internal message boundaries, then the compressor SHOULD
        preserve the boundaries between messages by using the
        "end-of-message" byte string 0xFFFF reserved by SigComp.

   *    The compressor MUST NOT exceed the maximum_compressed_size and
        MUST ensure that the message can be decompressed using no more
        than the resources available at the remote decompressor.

   The reason for preserving the message boundaries over a stream-based
   transport is to ensure that damage to one compressed message does not
   affect the decompression of subsequent messages. Moreover, the
   application typically vetoes state creation requests on a per-message
   basis.

5.1.  Supplying bytecode to the UDVM

   A compressor MUST be certain that compressed data can be decompressed
   before the data is to be sent, i.e. the UDVM instructions for
   decompression MUST be available at the remote decompressor. Several
   options exist for ensuring that this bytecode is available:

   1. Each SigComp message sent from the compressor contains the
      necessary UDVM instructions for decompression.

   2. By setting up a reliable connection, such as a TCP connection,
      between a compressor and its remote decompressor the UDVM
      instructions can be transferred and saved as state.

   3. If there are predefined UDVM codes for well-known algorithms, a
      compressor only needs to send the state identifier of that UDVM
      decompression algorithm code to its remote decompressor. The
      decompressor can then populate the UDVM locally.

   In order to save delay for "time-critical" sessions, the UDVM
   instructions should be uploaded prior to any initiation of "time-
   critical" sessions.

5.2.  Compression failure

   The compressor SHOULD make every effort to successfully compress an
   application message, but in certain cases this might not be possible
   (particularly if a low maximum_compressed_size has been set by the
   application). In this case a "compression failure" is called.
   Reasons for compression failure include the following:

   *    A compressed or uncompressed message exceeds the maximum size
        defined by the application.

   *    The maximum_compressed_size is exceeded for a certain message.

   *    Insufficient resources are available at the compressor or at the
        remote decompressor.

   If a compression failure occurs when compressing a message then the
   compressor informs the dispatcher and takes no further action. The
   dispatcher MUST report this failure to the application. The
   application may then try other methods to deliver the message.


6.  State handling and state announcement

   This chapter defines the behavior of the SigComp state handler. The
   function of the state handler is to retain information between
   successive SigComp messages; it is the only SigComp entity that is
   capable of this function, and so it is of particular importance from
   a security perspective.

6.1.  Storing and retrieving state

   To provide security against the malicious insertion or modification
   of SigComp messages, the UDVM memory is reset after decompressing
   each message. This ensures that damaged SigComp messages do not
   prevent the successful decompression of subsequent valid messages.

   Note however that the overall compression ratio is often
   significantly higher if messages can be compressed relative to the
   information stored in previous messages. For this reason it is
   possible to create "state" information for access when a later
   message is being decompressed.

   Both the creation and access of state are designed to be secure
   against malicious tampering with the compressed data. State can only
   be created when a complete message has been successfully
   decompressed, and the state handler MUST NOT save state without
   permission from the application.

   Upon receiving a decompressed message, the application may supply the
   state handler with the identity of the sending endpoint. Supplying
   this identity grants permission for the state handler to do the
   following:

   *    An item of state can be saved using the memory reserved for the
        specified endpoint.

   *    Announcement information can be taken into account
        when sending SigComp messages to the specified endpoint.

   This is especially useful if the application has an authentication
   mechanism that can be applied to determine whether the decompressed
   data is legitimate.

   Also note that state is not deleted when it is accessed. So even if a
   malicious user manages to access state information, subsequent
   messages compressed relative to this state can still be successfully
   decompressed. Instead, the state handler is responsible for deleting
   state information once it determines that the state will no longer be
   needed.

   Each item of state stores the following information:

   Name:                      Type of data:

   state_identifier           16-byte value
   state_start                2-byte value
   state_instruction          2-byte value
   state_length               2-byte value
   state_value                String of bytes

   The state_identifier must be supplied to retrieve an item of state
   from the state handler. State can be accessed using the UDVM
   instructions STATE-REFERENCE and STATE-EXECUTE, and can be created
   using the END-MESSAGE instruction.

   The state_value is a byte string that contains the actual value that
   is copied from/to the UDVM memory. The state_length specifies the
   number of bytes contained within state_value, and state_start gives
   the UDVM memory address to which the state_value is copied when it is
   accessed.

   Finally, state_instruction specifies the memory address of the next
   UDVM instruction to execute when state is accessed.

   The kind of information which is included in the state_value is up to
   a particular compressor and the uploaded instructions in the remote
   UDVM. However a compressor MUST NOT use a state that is not known to
   be established at the remote decompressor.
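
   The structure of a state item and the effect of accessing it can be
   sketched as follows. The names StateItem and access_state, the use
   of a dictionary as the store, and the exception types are
   assumptions made for illustration. The sketch copies state_value
   linearly and ignores any circular-buffer behavior.

```python
from dataclasses import dataclass


@dataclass
class StateItem:
    state_identifier: bytes   # 16-byte value
    state_start: int          # UDVM address that receives state_value
    state_instruction: int    # address of the next instruction to run
    state_length: int         # number of bytes in state_value
    state_value: bytes

    def __post_init__(self):
        if len(self.state_identifier) != 16:
            raise ValueError("state_identifier must be 16 bytes")
        if len(self.state_value) != self.state_length:
            raise ValueError("state_length must match state_value")


def access_state(store, memory, state_identifier):
    """Copy the state into UDVM memory; return the next instruction address."""
    item = store.get(state_identifier)
    if item is None:
        # Accessing non-existent state is a decompression failure.
        raise LookupError("non-existent state")
    end = item.state_start + item.state_length
    if end > len(memory):
        raise IndexError("state does not fit in UDVM memory")
    memory[item.state_start:end] = item.state_value
    # The item stays in the store: state is not deleted when accessed.
    return item.state_instruction
```

   Here store is a plain dictionary keyed by state_identifier and
   memory is a bytearray standing in for the UDVM memory.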

6.2.  Saving and deleting states

   The state handler for each endpoint is expected to offer memory to
   store UDVM-created state. Every remote endpoint that wishes to
   communicate with the local endpoint expects to be able to store a
   fixed amount of state; the number of bytes that it can store is given
   by the formula UDVM_memory_size * maximum_state_size.

   Note that each item of state costs (state_length + 22) bytes to
   store.

   The state handler keeps track of which endpoint created each item of
   state; when a particular endpoint exceeds its allocated memory limit
   then sufficient items of state created by the same endpoint are
   deleted (oldest state first) until enough memory is available to
   accommodate the new state.

   The application MUST indicate to the state handler when it no longer
   wishes to communicate with a particular endpoint, so that the
   resources taken by the corresponding state can be reclaimed.
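
   The storage rules of this section can be sketched as a per-endpoint
   store that charges (state_length + 22) bytes per item and deletes
   the oldest items first. The class name EndpointStateMemory is
   hypothetical. Rejecting an item that could never fit, and replacing
   an item that is saved again under the same identifier, are
   assumptions not stated in the text.

```python
from collections import OrderedDict

# 16-byte identifier plus three 2-byte fields (start, instruction, length).
STATE_OVERHEAD = 22


class EndpointStateMemory:
    """State created by one remote endpoint, kept oldest first."""

    def __init__(self, capacity):
        self.capacity = capacity      # bytes reserved for this endpoint
        self.items = OrderedDict()    # state_identifier -> state_value
        self.used = 0

    @staticmethod
    def cost(state_value):
        return len(state_value) + STATE_OVERHEAD

    def save(self, state_identifier, state_value):
        needed = self.cost(state_value)
        if needed > self.capacity:
            return False              # assumption: can never fit
        if state_identifier in self.items:
            # Assumption: saving again replaces the earlier copy.
            self.used -= self.cost(self.items.pop(state_identifier))
        while self.used + needed > self.capacity:
            _, oldest = self.items.popitem(last=False)
            self.used -= self.cost(oldest)
        self.items[state_identifier] = state_value
        self.used += needed
        return True
```

   When the application indicates that it no longer wishes to
   communicate with an endpoint, the whole object for that endpoint can
   simply be discarded.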

6.3.  Announcement

   The announcement information is used to modify the value of certain
   application-defined parameters. Since these parameter values are
   saved between SigComp messages, they are considered to be part of the
   overall state and hence are supplied from the UDVM to the state
   handler.

   The following list of parameters is passed to the state handler using
   the appropriate UDVM instruction (namely the END-MESSAGE
   instruction):

         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |            length             |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |         UDVM_version          |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |       UDVM_memory_size        |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |        cycles_per_bit         |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |      cycles_per_message       |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |              n                |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |          id_length_1          |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                               |
         :          id_value_1           :
         |                               |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    :         :
                    :         :
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |          id_length_n          |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                               |
         :          id_value_n           :
         |                               |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         |                               |
         :           reserved            :
         |                               |
         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 5: Announcement information

   If the application does not return a valid endpoint identifier then
   the announcement information is automatically discarded by the state
   handler. Otherwise it is passed to the compressor responsible for
   sending messages to the given endpoint.

   The reserved field allows for additional items of data to be added to
   the announcement information in future.

   Note that the length field specifies the total length of the
   announcement information including the reserved field. As usual, MSBs
   are stored preceding LSBs.

   The remaining items of data are explained in greater detail below:

6.3.1.  UDVM version

   The next 2 bytes of the announcement information specify whether only
   the basic version of the UDVM is available, or whether an upgraded
   version of the UDVM is available offering additional instructions
   etc.

   The basic version of the UDVM is Version 0, which is the version
   described in this document. Upgraded versions MUST be backwards-
   compatible with the basic version in the following sense:

   *    If some UDVM bytecode reaches the END-MESSAGE or DECOMPRESSION-
        FAILURE instructions when running on Version 0 of the UDVM, then
        the upgraded version MUST run the bytecode in an identical
        manner.

   This condition ensures that all bytecode that is valid for Version 0
   of the UDVM will continue to be valid for upgraded versions of the
   UDVM. However, bytecode that is invalid on Version 0 of the UDVM
   (i.e. bytecode that produces a decompression failure that is not
   manually triggered) may become valid on upgraded versions.

   The simplest way to upgrade the UDVM in a backwards-compatible manner
   is to add additional UDVM instructions, as this will not affect the
   operation of existing UDVM bytecode.

6.3.2.  Memory size and CPU cycles

   The next 6 bytes of data specify new values for the application-
   defined parameters UDVM_memory_size, cycles_per_bit and
   cycles_per_message.

   Note that this data can only be used to increase the amount of
   resources available at the remote UDVM. If the data specifies a
   parameter value that is smaller than the value already possessed by
   the state handler, the parameter keeps its original value (i.e. the
   announcement data for this parameter is simply ignored).

   In particular, only allowing the parameter values to increase means
   that the announcement mechanism is robust against message loss or
   reordering.

   The parameters can only be restored to their original values if reset
   or renegotiated by the application.
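
   The increase-only rule amounts to taking the maximum of the stored
   and announced values. The sketch below uses a hypothetical function
   apply_announcement and plain dictionaries to show why the result
   does not depend on the order in which announcements arrive.

```python
def apply_announcement(current, announced):
    """Return the parameters after applying one announcement.

    A parameter is only ever raised; an announced value that is smaller
    than the stored one is ignored.
    """
    return {
        name: max(value, announced.get(name, value))
        for name, value in current.items()
    }


initial = {"UDVM_memory_size": 2048, "cycles_per_bit": 16,
           "cycles_per_message": 1000}
first = {"UDVM_memory_size": 4096, "cycles_per_bit": 8,
         "cycles_per_message": 1000}
second = {"UDVM_memory_size": 8192, "cycles_per_bit": 32,
          "cycles_per_message": 500}

in_order = apply_announcement(apply_announcement(initial, first), second)
reordered = apply_announcement(apply_announcement(initial, second), first)
```

   Both in_order and reordered end up with the same values, and losing
   either announcement only leaves a parameter lower than it could have
   been, never higher than what the remote endpoint offered.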

6.3.3.  State identifiers

   The list of state identifiers indicates that the sending endpoint
   supports one or more optional mechanisms (including well-known
   decompression algorithms, dictionaries of common SIP phrases,
   feedback mechanisms etc.).

   The integer n specifies the number of state identifiers to follow.
   The field id_length_j specifies the length in bytes of id_value_j,
   where acceptable values for id_length_j range from 1 to 16 inclusive.
   If a value outside this range is received then the subsequent state
   identifiers are ignored by the state handler.

   Each id_value_j indicates support for one optional mechanism at the
   sending endpoint. The optional mechanisms themselves, and their
   corresponding state identifiers, are beyond the scope of this
   document.
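
   A parser for the announcement information of Figure 5 might look
   like the sketch below. Several points are assumptions: every fixed
   field (including n and each id_length_j) is taken to be 2 bytes wide
   as drawn in the figure, the length field is taken to count the whole
   structure starting from its own first byte, and identifier values
   are taken to be unpadded. The function name parse_announcement is
   hypothetical, and the reserved field is not interpreted.

```python
import struct


def parse_announcement(data):
    """Parse announcement information laid out as in Figure 5."""
    if len(data) < 12:
        raise ValueError("announcement information is too short")
    # Six 2-byte fields, MSBs stored before LSBs (big-endian).
    (length, version, memory_size,
     cycles_per_bit, cycles_per_message, n) = struct.unpack(">6H", data[:12])
    if length < 12 or length > len(data):
        raise ValueError("inconsistent length field")
    body = data[:length]
    pos = 12
    identifiers = []
    for _ in range(n):
        if pos + 2 > len(body):
            break
        (id_length,) = struct.unpack(">H", body[pos:pos + 2])
        pos += 2
        if not 1 <= id_length <= 16 or pos + id_length > len(body):
            # Out-of-range length: the remaining identifiers are ignored.
            break
        identifiers.append(body[pos:pos + id_length])
        pos += id_length
    return {
        "UDVM_version": version,
        "UDVM_memory_size": memory_size,
        "cycles_per_bit": cycles_per_bit,
        "cycles_per_message": cycles_per_message,
        "state_identifiers": identifiers,
    }
```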


7.  Overview of the UDVM

   Decompression functionality for SigComp is provided by a "Universal
   Decompressor Virtual Machine" (UDVM). The UDVM is a virtual machine
   much like the Java Virtual Machine but with a key difference: it is
   designed solely for the purpose of running decompression algorithms.

   The motivation for creating the UDVM is to provide unlimited
   flexibility when choosing how to compress a given item of data.
   Rather than picking one of a small number of pre-negotiated
   compression algorithms, the implementer has the freedom to select an
   algorithm of their choice. The compressed data is then combined with
   a set of UDVM instructions that allow the original data to be
   extracted, and the result is output as UDVM bytecode.

   Since the UDVM is optimized specifically for running decompression
   algorithms, the code size of a typical algorithm is small (often
   under 100 bytes). Moreover, the UDVM approach does not add
   significant extra processing or memory requirements compared to
   running a fixed pre-programmed decompression algorithm.

   This chapter describes some basic features of the UDVM, including the
   well-known variables and instruction operands.

   Recall that the amount of memory available to the UDVM is specified
   by the application-defined parameter UDVM_memory_size. Any attempt to
   read memory addresses beyond the overall memory size MUST cause a
   decompression failure (see Section 8.2).

7.1.  Well-known variables

   The first few variables in the UDVM memory have special tasks, for
   example specifying the location of the stack used by the CALL and
   RETURN instructions. Each of these well-known variables is a 2-byte
   integer.

   The following list gives the name of each well-known variable and the
   memory address at which the variable can be found:

   Name:           Starting memory address:

   byte_copy_left             0
   byte_copy_right            2
   stack_location             4

   The MSBs of each variable are always stored before the LSBs. So, for
   example, the MSBs of stack_location are stored at Address 4 whilst
   the LSBs are stored at Address 5.
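
   Reading and writing a 2-byte variable with the MSBs first can be
   sketched as below. The helper names read_word and write_word are
   hypothetical, and IndexError stands in for the decompression failure
   required on out-of-range access.

```python
# Addresses of the well-known variables listed above.
BYTE_COPY_LEFT = 0
BYTE_COPY_RIGHT = 2
STACK_LOCATION = 4


def read_word(memory, address):
    """Return the 2-byte value at `address` (MSBs first)."""
    if address < 0 or address + 2 > len(memory):
        raise IndexError("address beyond the UDVM memory size")
    return (memory[address] << 8) | memory[address + 1]


def write_word(memory, address, value):
    """Store a 2-byte value at `address` (MSBs first)."""
    if address < 0 or address + 2 > len(memory):
        raise IndexError("address beyond the UDVM memory size")
    memory[address] = (value >> 8) & 0xFF
    memory[address + 1] = value & 0xFF
```

   For example, write_word(memory, STACK_LOCATION, 0x1234) stores 0x12
   at Address 4 and 0x34 at Address 5.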

   The use of each well-known variable is described in the following
   sections of the document.

7.2.  Instruction operands

   Each of the UDVM instructions is followed by 0 or more bytes
   containing the operands required by the instruction.

   To reduce the code size of a typical UDVM program, each operand for a
   UDVM instruction is compressed using variable-length encoding. The
   aim is to store more common operand values using fewer bits than
   rarely occurring values.

   Three different types of operand are available: the literal, the
   reference and the multitype. The operand types that follow each UDVM
   instruction are specified in Chapter 9.

   The UDVM bytecode for each operand type is illustrated in Figure 7 to
   Figure 9, together with the integer values represented by the
   bytecode.

   Note that the MSBs in the bytecode are illustrated as preceding the
   LSBs. Also, any string of bits marked with k consecutive "n"s is to
   be interpreted as an integer N from 0 to 2^k - 1 inclusive (with the
   MSBs of n illustrated as preceding the LSBs).

   The decoded integer value of the bytecode can be interpreted in two
   ways. In some cases it is taken to be the actual value of the
   operand. In other cases it is taken to be a memory address at which
   the 2-byte operand value can be found (MSBs found at the specified
   address, LSBs found at the following address). The latter case is
   denoted by memory[X] where X is the address and memory[X] is the 2-
   byte value starting at Address X.

   The simplest operand type is the literal (#), which encodes a
   constant integer from 0 to 65535 inclusive. A literal operand may
   require between 1 and 3 bytes depending on its value.

   Bytecode:                  Operand value:      Range:

   0nnnnnnn                        N                   0 - 127
   10nnnnnn nnnnnnnn               N                   0 - 16383
   11000000 nnnnnnnn nnnnnnnn      N                   0 - 65535

               Figure 7: Bytecode for a literal (#) operand
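
   Decoding a literal operand according to Figure 7 can be sketched as
   follows. The function name decode_literal and the use of ValueError
   for an unknown operand type are assumptions for illustration.

```python
def decode_literal(code, pos=0):
    """Decode a literal (#) operand; return (value, next position)."""
    first = code[pos]
    if (first & 0x80) == 0x00:
        # 0nnnnnnn
        return first, pos + 1
    if (first & 0xC0) == 0x80:
        # 10nnnnnn nnnnnnnn
        return ((first & 0x3F) << 8) | code[pos + 1], pos + 2
    if first == 0xC0:
        # 11000000 nnnnnnnn nnnnnnnn
        return (code[pos + 1] << 8) | code[pos + 2], pos + 3
    # Any other first byte is not listed in Figure 7.
    raise ValueError("unknown operand type")
```

   The value 127 therefore needs one byte (0x7F), the value 256 needs
   two (0x81 0x00), and the value 65535 needs three (0xC0 0xFF 0xFF).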

   The second operand type is the reference ($), which is always used to
   access a 2-byte value located elsewhere in the UDVM memory. The
   bytecode for a reference operand is decoded to be a constant integer
   from 0 to 65535 inclusive, which is interpreted as the memory address
   containing the actual value of the operand.

   Note that reference operands can always take values from 0 to 65535
   inclusive, as they reference 2-byte values.

   Bytecode:                  Operand value:      Range:

   0nnnnnnn                        memory[2 * N]       0 - 65535
   10nnnnnn nnnnnnnn               memory[2 * N]       0 - 65535
   11000000 nnnnnnnn nnnnnnnn      memory[N]           0 - 65535

              Figure 8: Bytecode for a reference ($) operand
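
   A reference operand can be decoded in the same way, except that the
   result is a memory address rather than a value. The sketch below
   returns the address so that an instruction can both read and
   overwrite the 2-byte value stored there. The function name
   decode_reference is hypothetical.

```python
def decode_reference(code, pos=0):
    """Decode a reference ($) operand; return (address, next position)."""
    first = code[pos]
    if (first & 0x80) == 0x00:
        # 0nnnnnnn refers to memory[2 * N]
        return 2 * first, pos + 1
    if (first & 0xC0) == 0x80:
        # 10nnnnnn nnnnnnnn refers to memory[2 * N]
        n = ((first & 0x3F) << 8) | code[pos + 1]
        return 2 * n, pos + 2
    if first == 0xC0:
        # 11000000 nnnnnnnn nnnnnnnn refers to memory[N]
        return (code[pos + 1] << 8) | code[pos + 2], pos + 3
    raise ValueError("unknown operand type")
```

   Because the two short forms encode 2 * N, they can only address
   even locations; an odd address requires the 3-byte form.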

   The third kind of operand is the multitype (%), which can be used to
   encode both actual values and memory addresses. The multitype operand
   also offers efficient encoding for small integer values (both
   positive and negative) and for powers of 2.

   Bytecode:                  Operand value:      Range:

   00nnnnnn                        N                   0 - 63
   01nnnnnn                        memory[2 * N]       0 - 65535
   1000011n                        2 ^ (N + 6)        64 , 128
   10001nnn                        2 ^ (N + 8)    256 , ... , 32768
   111nnnnn                        N + 65504       65504 - 65535
   1001nnnn nnnnnnnn               N + 61440       61440 - 65535
   101nnnnn nnnnnnnn               N                   0 - 8191
   110nnnnn nnnnnnnn               memory[N]           0 - 65535
   10000000 nnnnnnnn nnnnnnnn      N                   0 - 65535
   10000001 nnnnnnnn nnnnnnnn      memory[N]           0 - 65535

              Figure 9: Bytecode for a multitype (%) operand
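
   The multitype forms of Figure 9 can be decoded by inspecting the
   first byte, as in the sketch below. The names decode_multitype and
   read_word are hypothetical. Treating the first bytes 0x82 to 0x85,
   which match no row of the figure, as an unknown operand type is an
   assumption.

```python
def read_word(memory, address):
    """Return the 2-byte value at `address` (MSBs first)."""
    if address + 2 > len(memory):
        raise IndexError("address beyond the UDVM memory size")
    return (memory[address] << 8) | memory[address + 1]


def decode_multitype(code, pos, memory):
    """Decode a multitype (%) operand; return (value, next position)."""
    b = code[pos]
    if b < 0x40:                                  # 00nnnnnn
        return b, pos + 1
    if b < 0x80:                                  # 01nnnnnn
        return read_word(memory, 2 * (b & 0x3F)), pos + 1
    if b == 0x80:                                 # 10000000 + 2 bytes
        return (code[pos + 1] << 8) | code[pos + 2], pos + 3
    if b == 0x81:                                 # 10000001 + 2 bytes
        address = (code[pos + 1] << 8) | code[pos + 2]
        return read_word(memory, address), pos + 3
    if b in (0x86, 0x87):                         # 1000011n
        return 1 << ((b & 0x01) + 6), pos + 1
    if 0x88 <= b <= 0x8F:                         # 10001nnn
        return 1 << ((b & 0x07) + 8), pos + 1
    if 0x90 <= b <= 0x9F:                         # 1001nnnn nnnnnnnn
        return (((b & 0x0F) << 8) | code[pos + 1]) + 61440, pos + 2
    if 0xA0 <= b <= 0xBF:                         # 101nnnnn nnnnnnnn
        return ((b & 0x1F) << 8) | code[pos + 1], pos + 2
    if 0xC0 <= b <= 0xDF:                         # 110nnnnn nnnnnnnn
        address = ((b & 0x1F) << 8) | code[pos + 1]
        return read_word(memory, address), pos + 2
    if b >= 0xE0:                                 # 111nnnnn
        return (b & 0x1F) + 65504, pos + 1
    raise ValueError("unknown operand type")      # 0x82 to 0x85
```

   Since arithmetic is performed modulo 2^16, the values 65504 to 65535
   act as the small negative integers -32 to -1, which is why a single
   byte such as 0xFF (65535) is useful.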

7.3.  Byte copying

   A number of UDVM instructions require a string of bytes to be copied
   to and from areas of the UDVM memory. This section defines how the
   byte copying operation should be performed.

   In general, the string of bytes is copied in ascending order of
   memory address. So if a byte is copied from/to Address n then the
   next byte is copied from/to Address n + 1. As usual, if a byte is
   read from an address beyond the overall memory size then
   decompression failure occurs.

   Note however that if a byte is copied from/to the memory address
   specified in byte_copy_right, the byte copy operation continues by
   copying the next byte from/to the memory address specified in
   byte_copy_left. This is useful for setting up a "circular buffer"
   within the UDVM memory.

   Note that the string of bytes is copied on a purely byte-by-byte
   basis. In particular, some of the later bytes to be copied may
   themselves have been written into the UDVM memory by the byte copying
   operation currently being performed.

   Equally, it is possible for a byte copying operation to overwrite the
   instruction that called the byte copy. If this occurs then the byte
   copying operation MUST be completed as if the original instruction
   were still in place in the UDVM memory (this also applies if
   byte_copy_left or byte_copy_right are overwritten).
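
   The byte copying rules of this section can be sketched as follows.
   The names byte_copy and DecompressionFailure are hypothetical.
   byte_copy_left and byte_copy_right are read once before copying
   starts, so overwriting them during the copy has no effect on the
   operation in progress. Treating a write beyond the memory size as a
   failure, like a read, is an assumption.

```python
class DecompressionFailure(Exception):
    """Raised when the UDVM must abort decompression."""


def read_word(memory, address):
    return (memory[address] << 8) | memory[address + 1]


def byte_copy(memory, source, destination, length):
    """Copy `length` bytes one at a time, wrapping at byte_copy_right."""
    left = read_word(memory, 0)     # byte_copy_left
    right = read_word(memory, 2)    # byte_copy_right
    for _ in range(length):
        if source >= len(memory) or destination >= len(memory):
            raise DecompressionFailure("address beyond the memory size")
        # Strictly byte by byte: a byte written earlier in this same
        # operation may be read again later.
        memory[destination] = memory[source]
        # After touching byte_copy_right, continue at byte_copy_left.
        source = left if source == right else source + 1
        destination = left if destination == right else destination + 1
```

   Copying from an address to the address immediately after it repeats
   a single byte, which is how run-length style expansion falls out of
   the byte-by-byte rule.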


8.  Decompressing a SigComp message

   This chapter lists the steps involved in the decompression of a
   single SigComp message.

8.1.  Invoking the UDVM

   Whenever the dispatcher receives a message to be decompressed, it
   invokes a new instance of the UDVM. The UDVM_memory_size is
   initialized using the corresponding application-defined parameter.
   The following steps are then taken:

   1.)   The number of remaining CPU cycles is set equal to the
   application-defined parameter cycles_per_message.

   Notes:

   The amount of compressed data available to the UDVM is exactly one
   compressed message. If the transport is stream-based then SigComp
   uses the reserved byte string 0xFFFF to delimit the compressed
   messages: the dispatcher takes the data between a pair of neighboring
   reserved byte strings to be a single compressed message. The reserved
   byte string itself is not considered to be part of the compressed
   message.

   The compressed data is not provided to the UDVM by default. Instead,
   the UDVM requests compressed data using the INPUT instructions
   (useful when running over a stream-based transport since there is no
   need to wait for the entire compressed message before decompression
   can begin).

   The dispatcher MUST NOT make more than one compressed message
   available to a given instance of the UDVM. In particular, the
   dispatcher MUST NOT concatenate two messages to form a single
   compressed message. This is because compressed messages are typically
   padded with trailing zero bits so that they are a whole number of
   bytes long. Concatenating two messages would cause these padding bits
   to be incorrectly interpreted as compressed data.

   2.)   Next, the instructions contained within the UDVM memory are
   executed beginning at the address specified by the state as per
   Section 4.2.

   Notes:

   The instructions are executed consecutively unless otherwise
   indicated (for example when the UDVM encounters a JUMP instruction).

   If the next instruction to be executed lies outside the available
   memory then decompression failure occurs (see Section 8.2).

   3.)   Each time an instruction is executed the number of available
   CPU cycles is decreased by the amount specified in Chapter 9.
   Additionally, if the UDVM requests n bits of compressed data (using
   one of the INPUT instructions) then the number of available CPU
   cycles is increased by n * cycles_per_bit.

   Notes:

   This means that the total number of CPU cycles available for
   processing a compressed message is given by the formula:

    maximum_cycles = cycles_per_message + message_size * cycles_per_bit

   The reason that this total is not allocated to the UDVM when it is
   invoked is that the UDVM can begin to decompress a message that has
   only been partially received. So the total message size may not be
   known when the UDVM is initialized.
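
   The cycle accounting described in steps 1 and 3 can be sketched as a
   running budget. The class name CycleBudget and its method names are
   hypothetical.

```python
class DecompressionFailure(Exception):
    """Raised when the UDVM must abort decompression."""


class CycleBudget:
    def __init__(self, cycles_per_message, cycles_per_bit):
        # Step 1: start with cycles_per_message cycles.
        self.remaining = cycles_per_message
        self.cycles_per_bit = cycles_per_bit

    def charge(self, cost):
        """Deduct the cost of one executed instruction."""
        self.remaining -= cost
        if self.remaining < 0:
            raise DecompressionFailure("CPU cycles exhausted")

    def credit_input(self, bits):
        """Add cycles when the UDVM requests `bits` of compressed data."""
        self.remaining += bits * self.cycles_per_bit
```

   With cycles_per_message = 10 and cycles_per_bit = 2, an instruction
   costing 4 cycles followed by a request for 8 bits leaves 22 cycles.
   The total granted so far is 10 + 8 * 2 = 26, which matches the
   formula for maximum_cycles with message_size = 8 bits.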

   4.)   The UDVM stops executing instructions when it encounters an
   END-MESSAGE instruction or if decompression failure occurs.

   Notes:

   The UDVM passes uncompressed data to the dispatcher using the OUTPUT
   instruction. The OUTPUT instruction can be used to output a partially
   decompressed message; it is a dispatcher decision whether to use the
   data immediately or whether to buffer and wait until the entire
   message has been decompressed.

   The UDVM passes state creation requests to the state handler using
   the END-MESSAGE instruction. This means that it is only possible to
   make a state creation request once the message has been decompressed,
   which is necessary since the application typically determines the
   validity of these requests based on the contents of the decompressed
   message.

8.2.  Decompression failure

   If a compressed message given to the UDVM is corrupted (either
   accidentally or maliciously) then the UDVM may terminate with a
   decompression failure.

   Reasons for decompression failure include the following:

   *    A compressed or uncompressed message exceeds the maximum size
        defined by the application.

   *    The UDVM exceeds the available CPU cycles for decompressing a
        message.

   *    The UDVM attempts to read a memory address beyond the overall
        memory size.

   *    An unknown instruction type is encountered.

   *    An unknown operand type is encountered.

   *    An instruction is encountered that cannot be processed
        successfully by the UDVM (for example a RETURN instruction when
        no CALL instruction has previously been encountered).

   *    The UDVM attempts to access non-existent state.

   *    A manual decompression failure is triggered using the
        DECOMPRESSION-FAILURE instruction.

   If a decompression failure occurs when decompressing a message then
   the UDVM informs the dispatcher and takes no further action. It is
   the responsibility of the dispatcher to decide how to cope with the
   decompression failure. In general a dispatcher SHOULD discard the
   compressed message and any decompressed data that has been output.

9.  UDVM instruction set

   The UDVM currently understands 30 instructions, chosen to support the
   widest possible range of compression algorithms with the minimum
   possible overhead.

   Figure 10 lists the different instructions and the bytecode values
   used to store the instructions at the UDVM. The cost of each
   instruction in CPU cycles is also given:

   Instruction:     Bytecode value:   Cost in CPU cycles:

   DECOMPRESSION-FAILURE     0          1
   AND                       1          1
   OR                        2          1
   NOT                       3          1
   ADD                       4          1
   SUBTRACT                  5          1
   MULTIPLY                  6          1
   DIVIDE                    7          1
   SORT-ASCENDING            8          1 + k * ceiling(log2(k))
   SORT-DESCENDING           9          1 + k * ceiling(log2(k))
   MD5                       10         1 + length
   LOAD                      11         1
   MULTILOAD                 12         1 + n
   COPY                      13         1 + length
   COPY-LITERAL              14         1 + length
   COPY-OFFSET               15         1 + length + offset
   JUMP                      16         1
   COMPARE                   17         1
   CALL                      18         1
   RETURN                    19         1
   SWITCH                    20         1 + n
   CRC                       21         1 + length
   END-MESSAGE               22         1 + state_length
   OUTPUT                    23         1 + output_length
   NBO                       24         1
   INPUT-BYTECODE            25         1 + length
   INPUT-FIXED               26         1
   INPUT-HUFFMAN             27         1 + n
   STATE-REFERENCE           28         1 + state_length
   STATE-EXECUTE             29         1 + state_length

      Figure 10: UDVM instructions and corresponding bytecode values

   Each UDVM instruction costs a minimum of 1 CPU cycle. Certain high-
   level instructions may cost additional cycles depending on the value
   of one of the instruction operands.

   The only exception when calculating the number of CPU cycles is that
   the STATE-EXECUTE instruction takes (1 + state_length) cycles even
   though it does not have a state_length operand; instead the value of
   state_length is provided by the state handler as part of the state
   being accessed.
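
   The variable costs in Figure 10 can be computed as in the sketch
   below. The function names are hypothetical. Taking the logarithm
   term as 0 when k is less than 2 is an assumption, since log2(k) is
   not defined for k = 0.

```python
def ceiling_log2(k):
    """Return ceiling(log2(k)) for k >= 1, and 0 for k < 2 (assumption)."""
    if k < 2:
        return 0
    return (k - 1).bit_length()


def sort_cost(k):
    """Cost of SORT-ASCENDING and SORT-DESCENDING."""
    return 1 + k * ceiling_log2(k)


def linear_cost(length):
    """Cost of instructions listed as 1 + length, such as COPY or CRC."""
    return 1 + length


def copy_offset_cost(length, offset):
    """Cost of COPY-OFFSET."""
    return 1 + length + offset
```

   For example, sorting lists of k = 8 words costs 1 + 8 * 3 = 25
   cycles, and k = 5 costs 1 + 5 * 3 = 16 cycles because
   ceiling(log2(5)) is 3.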

   All instructions are stored as a single byte to indicate the
   instruction type, followed by 0 or more bytes containing the operands
   required by the instruction. The instruction specifies which of the
   three operand types of Section 7.2 is used in each case. For example,
   the ADD instruction is followed by two operands as shown below:

   ADD ($operand_1, %operand_2)

   When converted into bytecode the number of bytes required by the ADD
   instruction depends on the size of each operand value, and whether
   the second (multitype) operand contains the operand value itself or a
   memory address where the actual value of the operand can be found.
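
   As a worked example, the sketch below assembles the bytecode for an
   ADD instruction whose second operand is an actual value. The
   function names are hypothetical. The multitype encoder only uses the
   forms 00nnnnnn, 101nnnnn nnnnnnnn and 10000000 nnnnnnnn nnnnnnnn, so
   it does not always produce the shortest possible encoding.

```python
ADD = 4  # bytecode value of the ADD instruction (Figure 10)


def encode_reference(address):
    """Encode a reference ($) operand for the given memory address."""
    if not 0 <= address <= 65535:
        raise ValueError("address out of range")
    if address % 2 == 0 and address // 2 < 128:
        return bytes([address // 2])                    # 0nnnnnnn
    if address % 2 == 0 and address // 2 < 16384:
        n = address // 2
        return bytes([0x80 | (n >> 8), n & 0xFF])       # 10nnnnnn nnnnnnnn
    return bytes([0xC0, address >> 8, address & 0xFF])  # 11000000 + 2 bytes


def encode_multitype_value(value):
    """Encode an actual value as a multitype (%) operand."""
    if not 0 <= value <= 65535:
        raise ValueError("value out of range")
    if value < 64:
        return bytes([value])                           # 00nnnnnn
    if value < 8192:
        return bytes([0xA0 | (value >> 8), value & 0xFF])
    return bytes([0x80, value >> 8, value & 0xFF])


def encode_add(address, value):
    """Bytecode for ADD ($operand_1, %operand_2)."""
    return (bytes([ADD]) + encode_reference(address)
            + encode_multitype_value(value))
```

   Adding 5 to the word at Address 64 takes three bytes (0x04 0x20
   0x05), whereas adding 1000 to the word at the odd Address 65 takes
   six (0x04 0xC0 0x00 0x41 0xA3 0xE8).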

   The instruction set available for the UDVM offers a mix of low-level
   and high-level instructions. The high-level instructions can all be
   emulated using the low-level instructions provided, but given a
   choice it is generally preferable to use a single instruction rather
   than a large number of general-purpose instructions. The resulting
   bytecode will be more compact (leading to a higher overall
   compression ratio) and decompression will typically be faster because
   the implementation of the compression-specific instructions can be
   optimized for the UDVM.

   Each instruction is explained in more detail below:

9.1.  Mathematical instructions

   The following instructions provide a number of mathematical
   operations including bit manipulation, arithmetic, sorting and MD5
   hashing.

9.1.1.  Bit manipulation

   The AND, OR and NOT instructions provide simple bit manipulation on
   2-byte words.

   AND ($operand_1, %operand_2)
   OR ($operand_1, %operand_2)
   NOT ($operand_1)

   After the operation is complete, the value of the first operand is
   overwritten with the result. Note that since this operand is a
   reference, the memory address specified by the operand is always
   overwritten and not the operand itself.

9.1.2.  Arithmetic

   The ADD, SUBTRACT, MULTIPLY and DIVIDE instructions perform
   arithmetic on 2-byte words.

   ADD ($operand_1, %operand_2)
   SUBTRACT ($operand_1, %operand_2)
   MULTIPLY ($operand_1, %operand_2)
   DIVIDE ($operand_1, %operand_2)

   After the operation is complete, the first operand is overwritten
   with the result.

   Note that in all cases the arithmetic operation is performed modulo
   2^16. So for example, subtracting 1 from 0 gives the result 65535.

   For the SUBTRACT instruction the second operand is subtracted from
   the first. Similarly, for the DIVIDE instruction the first operand is
   divided by the second operand. Note that if the second operand does
   not divide exactly into the first operand then the remainder is
   ignored.
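
   As an illustration only, the arithmetic rules above can be sketched
   in Python. The function names are hypothetical and not part of
   SigComp. Raising an error for a zero divisor is also an assumption,
   since this section does not define that case.

```python
WORD = 1 << 16  # all UDVM arithmetic is performed modulo 2^16


def udvm_add(a, b):
    return (a + b) % WORD


def udvm_subtract(a, b):
    # The second operand is subtracted from the first.
    return (a - b) % WORD


def udvm_multiply(a, b):
    return (a * b) % WORD


def udvm_divide(a, b):
    # Assumption: behavior for a zero divisor is not given in this section.
    if b == 0:
        raise ZeroDivisionError("divisor is zero")
    # The first operand is divided by the second; the remainder is ignored.
    return a // b
```

   With these helpers, udvm_subtract(0, 1) gives 65535 and
   udvm_divide(7, 2) gives 3.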

9.1.3.  Sorting

   The SORT-ASCENDING and SORT-DESCENDING instructions sort lists of 2-
   byte words.

   SORT-ASCENDING (%start, %n, %k)
   SORT-DESCENDING (%start, %n, %k)

   The start operand specifies the starting memory address of the block
   of data to be sorted.

   The block of data itself is divided into n lists each containing k
   words. The SORT-ASCENDING instruction applies a certain permutation
   to the lists, such that the first list is sorted into ascending order
   (treating each data word as an integer). The same permutation is
   applied to all n lists, so lists other than the first will not
   necessarily be sorted into order.

   For example, the first list might contain a set of integers to be
   sorted, whilst the second list might be used to keep track of the
   integers:

      Before sorting              After sorting

   List 1        List 2        List 1        List 2

      8             1             1             2
      1             2             1             3
      1             3             3             4
      3             4             8             1

   In the case of two words of data with the same value, the original
   ordering of the list is preserved.

   The SORT-DESCENDING instruction behaves as above, except that the
   first list is sorted into descending order.
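
   The sorting behavior can be sketched as follows. This is a minimal
   Python illustration that holds the n lists as Python lists rather
   than reading them from UDVM memory; the function name udvm_sort is
   hypothetical.

```python
def udvm_sort(lists, descending=False):
    """Reorder every list by the permutation that sorts lists[0]."""
    first = lists[0]
    if descending:
        key = lambda i: -first[i]
    else:
        key = lambda i: first[i]
    # sorted() is stable, so words with equal values keep their
    # original relative order.
    order = sorted(range(len(first)), key=key)
    return [[lst[i] for i in order] for lst in lists]
```

   Applied to the example above, udvm_sort([[8, 1, 1, 3], [1, 2, 3, 4]])
   returns [[1, 1, 3, 8], [2, 3, 4, 1]].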

9.1.4.  MD5

   The MD5 instruction calculates an MD5 hash over the specified area of
   UDVM memory.

   MD5 (%position, %length, %destination)

   The position and length operands define the string of bytes over
   which the MD5 hash is calculated. Byte copying rules are enforced as
   per Section 7.3.

   The destination operand gives the starting address to which the
   resulting 16-byte hash will be copied.

9.2.  Memory management instructions

   The following instructions are used to manipulate the UDVM memory.
   Bytes can be copied from one area of memory to another, and areas of
   memory can be write-protected to make it easier for UDVM code to be
   compiled.

9.2.1.  LOAD

   The LOAD instruction sets a 2-byte variable to a certain specified
   value. The format of a LOAD instruction is as follows:

   LOAD (%address, %value)

   The first operand specifies the starting address of the 2-byte
   variable, whilst the second operand specifies the value to be loaded
   into this variable. As usual, MSBs are stored before LSBs in the UDVM
   memory.

9.2.2.  MULTILOAD

   The MULTILOAD instruction sets a contiguous block of 2-byte variables
   to specified values.

   MULTILOAD (%address, #n, %value_0, ..., %value_n-1)

   The first operand specifies the starting address of the contiguous
   variables, whilst the operands value_0 through to value_n-1 specify
   the values to load into these variables (in the same order as they
   appear in the instruction).
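
   A minimal Python sketch of LOAD and MULTILOAD is given below,
   assuming the UDVM memory is modeled as a bytearray. The function
   names are hypothetical, and the operand n is implied by the length
   of the values argument.

```python
import struct


def load(memory, address, value):
    # ">H" is a big-endian 2-byte word: MSBs are stored before LSBs.
    struct.pack_into(">H", memory, address, value % 65536)


def multiload(memory, address, values):
    # Consecutive 2-byte variables, in the order the values are given.
    for i, value in enumerate(values):
        load(memory, address + 2 * i, value)
```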

9.2.3.  COPY

   The COPY instruction is used to copy a string of bytes from one part
   of the UDVM memory to another.

   COPY (%position, %length, %destination)

   The position operand specifies the memory address of the first byte
   in the string to be copied, and the length operand specifies the
   number of bytes to be copied.

   The destination operand gives the address to which the first byte in
   the string will be copied.

   Note that byte copying is performed as per the rules of Section 7.3.

9.2.4.  COPY-LITERAL

   A modified version of the COPY instruction is given below:

   COPY-LITERAL (%position, %length, $destination)

   The COPY-LITERAL instruction behaves as a COPY instruction except
   that after copying, the destination operand is replaced with the
   memory address immediately following the address to which the final
   byte was copied. If the final byte was copied to the memory address
   specified in byte_copy_right, the destination operand is set to the
   memory address specified in byte_copy_left.

9.2.5.  COPY-OFFSET

   A further version of the COPY-LITERAL instruction is given below:

   COPY-OFFSET (%offset, %length, $destination)

   The COPY-OFFSET instruction behaves as a COPY-LITERAL instruction
   except that an offset operand is given instead of a position operand.

   To derive a suitable position operand, starting at the memory address
   specified by destination, the UDVM counts backwards a total of offset
   memory addresses. If the memory address specified in byte_copy_left
   is reached, the next memory address is taken to be byte_copy_right.

   The COPY-OFFSET instruction then behaves as a COPY-LITERAL
   instruction, taking the position operand to be the last memory
   address reached in the above step.
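
   The COPY-LITERAL and COPY-OFFSET behavior can be sketched in Python
   as shown below. This is an illustration under stated assumptions:
   memory is a bytearray, left and right stand for the values of
   byte_copy_left and byte_copy_right, bytes are copied one at a time,
   and the read position wraps in the same way as the destination. The
   full byte copying rules are those of Section 7.3, which this sketch
   does not reproduce.

```python
def next_address(addr, left, right):
    # After byte_copy_right, copying continues at byte_copy_left.
    return left if addr == right else (addr + 1) % 65536


def previous_address(addr, left, right):
    # Counting backwards from byte_copy_left continues at byte_copy_right.
    return right if addr == left else (addr - 1) % 65536


def copy_literal(memory, position, length, destination, left, right):
    for _ in range(length):
        memory[destination] = memory[position]
        position = next_address(position, left, right)
        destination = next_address(destination, left, right)
    return destination  # new value of the $destination operand


def copy_offset(memory, offset, length, destination, left, right):
    position = destination
    for _ in range(offset):
        position = previous_address(position, left, right)
    return copy_literal(memory, position, length, destination, left, right)
```

   Because bytes are copied one at a time, a COPY-OFFSET whose length
   exceeds its offset repeats the most recent bytes, which is how
   LZ77-style algorithms expand runs.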

9.3.  Program flow instructions

   The following instructions alter the flow of UDVM code. Each
   instruction jumps to one of a number of memory addresses based on a
   certain specified criterion. Note that all of the instructions give
   the memory addresses in the form of deltas relative to the memory
   address of the instruction. The actual memory address is calculated
   as follows:

   memory_address = (memory_address_of_instruction + delta) modulo 2^16

   Note that certain I/O instructions (see Section 9.4) can also alter
   program flow.

9.3.1.  JUMP

   The JUMP instruction moves program execution to the specified memory
   address.

   JUMP (%delta)

   Note that if the address (specified as a delta from the address of
   the JUMP instruction) lies beyond the overall UDVM memory size then
   decompression failure occurs.

9.3.2.  COMPARE

   The COMPARE instruction compares two operands and then jumps to one
   of three specified memory addresses depending on the result.

   COMPARE (%operand_1, %operand_2, %delta_1, %delta_2, %delta_3)

   If operand_1 < operand_2 then the UDVM continues instruction
   execution at the (relative) memory address specified by delta_1. If
   operand_1 = operand_2 then it jumps to the address specified by
   delta_2. If operand_1 > operand_2 then it jumps to the address
   specified by delta_3.
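
   The three-way branch and the relative addressing used by all program
   flow instructions can be sketched in Python. The function names are
   hypothetical, and instruction_address stands for the memory address
   of the instruction being executed.

```python
def jump_target(instruction_address, delta):
    # Deltas are relative to the address of the instruction itself.
    return (instruction_address + delta) % 65536


def compare(instruction_address, operand_1, operand_2,
            delta_1, delta_2, delta_3):
    if operand_1 < operand_2:
        delta = delta_1
    elif operand_1 == operand_2:
        delta = delta_2
    else:
        delta = delta_3
    return jump_target(instruction_address, delta)
```

   Because the sum is taken modulo 2^16, a backward jump of 10 bytes is
   expressed as a delta of 65526.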

9.3.3.  CALL and RETURN

   The CALL and RETURN instructions provide support for compression
   algorithms with a nested structure.

   CALL (%delta)

   RETURN

   The CALL and RETURN instructions make use of a stack of 2-byte
   variables stored at the memory address specified by the well-known
   variable stack_location. The stack contains the following variables:

   Name:           Starting memory address:

   stack_free            stack_location
   stack[0]              stack_location + 2
   stack[1]              stack_location + 4
   stack[2]              stack_location + 6
      :                       :

   The MSBs of these variables are stored before the LSBs in the UDVM
   memory.

   When the UDVM reaches a CALL instruction, it finds the memory address
   of the instruction immediately following the CALL instruction and
   copies this 2-byte value into stack[stack_free] ready for later
   retrieval. It then increases stack_free by 1 and continues
   instruction execution at the (relative) memory address specified by
   the operand.

   When the UDVM reaches a RETURN instruction it decreases stack_free by
   1, and then continues instruction execution at the byte position
   stored in stack[stack_free].

   If the variable stack_free is ever increased beyond 65535 or
   decreased below 0 then a bad compressed message has been received and
   decompression failure occurs (see Section 8.2).

   Decompression failure also occurs if one of the above instructions is
   encountered and the value of stack_location is smaller than 6 (this
   prevents the stack from overwriting the well-known variables).
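
   The stack handling described above can be sketched in Python as
   follows. The sketch assumes that memory is a bytearray large enough
   to hold the stack and that the value of stack_location has already
   been read from its well-known variable; all function names are
   hypothetical.

```python
class DecompressionFailure(Exception):
    pass


def read_word(memory, address):
    return (memory[address] << 8) | memory[address + 1]


def write_word(memory, address, value):
    memory[address] = (value >> 8) & 0xFF   # MSBs first
    memory[address + 1] = value & 0xFF


def call(memory, stack_location, return_address, call_address, delta):
    if stack_location < 6:
        raise DecompressionFailure("stack overlaps well-known variables")
    stack_free = read_word(memory, stack_location)
    if stack_free == 65535:
        raise DecompressionFailure("stack_free would exceed 65535")
    # stack[i] is stored at stack_location + 2 + 2 * i.
    write_word(memory, stack_location + 2 + 2 * stack_free, return_address)
    write_word(memory, stack_location, stack_free + 1)
    return (call_address + delta) % 65536


def ret(memory, stack_location):
    if stack_location < 6:
        raise DecompressionFailure("stack overlaps well-known variables")
    stack_free = read_word(memory, stack_location)
    if stack_free == 0:
        raise DecompressionFailure("stack_free would fall below 0")
    stack_free -= 1
    write_word(memory, stack_location, stack_free)
    return read_word(memory, stack_location + 2 + 2 * stack_free)
```

   Here return_address is the address of the instruction immediately
   following the CALL, and both functions return the address at which
   execution continues.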

9.3.4.  SWITCH

   The SWITCH instruction performs a conditional jump based on the value
   of one of its operands.

   SWITCH (#n, %j, %delta_0, %delta_1, ... , %delta_n-1)

   When a SWITCH instruction is encountered the UDVM reads the value of
   j. It then continues instruction execution at the (relative) address
   specified by delta_j.

   If j specifies a value of n or more, a bad compressed message has
   been received and decompression failure occurs.

9.3.5.  CRC

   The CRC instruction verifies a string of bytes using a 2-byte CRC.

   CRC (%value, %position, %length, %delta)

   The actual CRC calculation is performed using the generator
   polynomial x^16 + x^12 + x^5 + 1, which coincides with the 2-byte
   Frame Check Sequence (FCS) of [RFC-1662].

   The position and length operands define the string of bytes over
   which the CRC is evaluated. Byte copying rules are enforced as per
   Section 7.3.

   Important note: Since a CRC calculation is always performed over a
   bitstream, for interoperability it is necessary to define the order
   in which bits are supplied within each individual byte. In this case
   the MSBs of the byte MUST be supplied to the CRC calculation before
   the LSBs.

   The value operand contains the expected integer value of the 2-byte
   CRC. If the calculated CRC matches the expected value then the UDVM
   continues at the following instruction. Otherwise the UDVM jumps to
   the (relative) memory address specified by delta.
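
   A bitwise CRC over the generator polynomial x^16 + x^12 + x^5 + 1
   with the MSBs of each byte supplied first can be sketched in Python
   as shown below. This section does not state the initial register
   value or any final inversion, so the init parameter here is an
   assumption and the result is not claimed to match the CRC value a
   conformant UDVM would compute. The function names are hypothetical.

```python
POLY = 0x1021  # x^16 + x^12 + x^5 + 1 (the x^16 term is implicit)


def crc16_msb_first(data, init=0x0000):
    # Assumption: the initial register value is not given in this section.
    crc = init
    for byte in data:
        crc ^= byte << 8          # feed the byte in, MSB first
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc_matches(value, data, init=0x0000):
    # True: continue at the following instruction.
    # False: jump to the (relative) address given by delta.
    return crc16_msb_first(data, init) == value
```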

9.4.  I/O instructions

   The following instructions allow the UDVM to interface with its
   environment. Note that in the overall SigComp architecture all of
   these interfaces pass to the decompressor dispatcher or to the state
   handler.

9.4.1.  END-MESSAGE

   The END-MESSAGE instruction successfully terminates the UDVM and
   passes state information to the state handler.

   END-MESSAGE (%hash_length, %state_start, %state_length,
   %state_instruction, %announcement_location)

   Note that the actual uncompressed message is outputted separately
   using the OUTPUT instruction; this conserves memory at the UDVM
   because there is no need to buffer an entire uncompressed message
   before it can be passed to the dispatcher.

   Note that if the announcement_location operand is set to 0 then no
   announcement information is provided, otherwise it points to the
   starting memory address of the announcement information as per
   Section 6.3.

   The END-MESSAGE instruction requests the creation of state using the
   operands state_start and state_length, which together denote a byte
   string state_value. Provided that the application gives permission,
   state_value is byte copied from the UDVM memory (obeying the rules of
   Section 7.3) and stored together with a 16-byte state identifier that
   can be used to access the state by a later compressed message.

   To provide security against malicious access, the identifier for any
   item of state created by the UDVM is derived from the [MD5] hash of
   the state_value to be stored. The state identifier is constructed by
   taking the 16-byte [MD5] hash and replacing all but the hash_length
   most significant bytes with zeroes. Note that if hash_length is 16
   then the unmodified [MD5] hash is the state identifier. Decompression
   failure occurs if hash_length is less than the application-defined
   parameter minimum_hash_size or greater than 16.

   If a state identifier already exists (hash collision occurs), the
   decompressor should check whether the requested state is identical to
   the established state, and count the state creation request as
   successful if this is the case.

   If not then the state creation request is unsuccessful. The existing
   state MUST NOT be replaced with the requested state to be saved. This
   is to avoid the situation where a compressed message cannot be
   decompressed because a needed item of state has been replaced
   (possibly by a malicious sender).
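
   The derivation of the state identifier and the handling of an
   existing identifier can be sketched in Python. This is a simplified
   illustration: the state handler is modeled as a dict named store
   that maps each 16-byte identifier to state_value only, application
   permission is assumed to have been given, and the function names are
   hypothetical.

```python
import hashlib


class DecompressionFailure(Exception):
    pass


def state_identifier(state_value, hash_length, minimum_hash_size):
    if hash_length < minimum_hash_size or hash_length > 16:
        raise DecompressionFailure("hash_length out of range")
    digest = hashlib.md5(state_value).digest()
    # Keep the hash_length most significant bytes, zero the rest.
    return digest[:hash_length] + bytes(16 - hash_length)


def create_state(store, state_value, hash_length, minimum_hash_size):
    identifier = state_identifier(state_value, hash_length,
                                  minimum_hash_size)
    existing = store.get(identifier)
    if existing is None:
        store[identifier] = state_value
        return True
    # Never replace existing state; succeed only if it is identical.
    return existing == state_value
```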

9.4.2.  DECOMPRESSION-FAILURE

   The DECOMPRESSION-FAILURE instruction triggers a manual decompression
   failure. This is useful if the UDVM program discovers that it cannot
   successfully decompress the message (e.g. by using the CRC
   instruction).

   This instruction has no operands.

9.4.3.  OUTPUT

   The OUTPUT instruction provides successfully decompressed data to the
   dispatcher.

   OUTPUT (%output_start, %output_length)

   The operands define the starting memory address and length of the
   byte string to be provided to the dispatcher. Note that the OUTPUT
   instruction can be used to output a partially decompressed message;
   each time the instruction is encountered it appends a byte string to
   the end of the data previously passed to the dispatcher via the
   OUTPUT instruction.

   The string of data is byte copied from the UDVM memory obeying the
   rules of Section 7.3.

   Decompression failure occurs if the cumulative number of bytes
   provided to the dispatcher exceeds the application-defined parameter
   maximum_uncompressed_size.

   Since there is technically a difference between outputting a 0-byte
   decompressed message, and not outputting a decompressed message at
   all, the OUTPUT instruction needs to distinguish between the two
   cases. Thus, if the UDVM terminates before encountering an OUTPUT
   instruction it is considered not to have outputted a decompressed
   message. If it encounters one or more OUTPUT instructions, each of
   which provides 0 bytes of data to the dispatcher, then it is
   considered to have outputted a 0-byte decompressed message.
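
   One way to keep the two cases apart is sketched below in Python. The
   Dispatcher class is a hypothetical stand-in for the receiving side
   of the OUTPUT instruction, with None meaning that no OUTPUT
   instruction has been encountered.

```python
class DecompressionFailure(Exception):
    pass


class Dispatcher:
    def __init__(self, maximum_uncompressed_size):
        self.maximum_uncompressed_size = maximum_uncompressed_size
        self.message = None  # None: no decompressed message output yet

    def output(self, data):
        current = self.message if self.message is not None else b""
        if len(current) + len(data) > self.maximum_uncompressed_size:
            raise DecompressionFailure("maximum_uncompressed_size exceeded")
        # Each OUTPUT appends to the data passed previously.
        self.message = current + bytes(data)
```

   After a single call to output(b"") the message is the empty byte
   string rather than None, which corresponds to a 0-byte decompressed
   message.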

9.4.4.  NBO

   The NBO instruction modifies the order in which compressed bits are
   passed to the UDVM.

   As the INPUT-FIXED and INPUT-HUFFMAN instructions read individual
   bits from within a byte, to avoid ambiguity it is necessary to define
   the order in which these bits are read. The default operation is to
   read the MSBs before the LSBs, but if the NBO instruction is
   encountered then the LSBs are read before the MSBs. Both cases are
   illustrated below:

    MSB         LSB MSB         LSB     MSB         LSB MSB         LSB

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 2 3 4 5 6 7|8 9 ...        |   |7 6 5 4 3 2 1 0|        ... 9 8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Byte 0          Byte 1              Byte 0          Byte 1

           Default operation            After NBO instruction

   The NBO instruction can only be used before bitwise compressed data
   is passed to the UDVM. Therefore, a decompression failure occurs if
   it is encountered after an INPUT-FIXED or an INPUT-HUFFMAN
   instruction has been used.

9.4.5.  INPUT-BYTECODE

   The INPUT-BYTECODE instruction requests a certain number of bytes of
   compressed data from the dispatcher.

   INPUT-BYTECODE (%length, %destination, %delta)

   The length operand indicates the requested number of bytes of
   compressed data, and the destination operand specifies the starting
   memory address to which they should be copied. Byte copying is
   performed as per the rules of Section 7.3.

   If the instruction requests data that lies beyond the end of the
   compressed message, no data is returned. Instead the UDVM moves
   program execution to the memory address specified by the formula
   (memory_address_of_INPUT-BYTECODE_instruction + delta) modulo 2^16.

   The INPUT-BYTECODE instruction can only be used before bitwise
   compressed data is passed to the UDVM. Therefore, a decompression
   failure occurs if it is encountered after an INPUT-FIXED or an INPUT-
   HUFFMAN instruction has been used.

9.4.6.  INPUT-FIXED

   The INPUT-FIXED instruction requests a certain number of bits of
   compressed data from the dispatcher.

   INPUT-FIXED (%length, %destination, %delta)

   The length operand indicates the requested number of bits. If this
   operand does not lie between 1 and 16 inclusive then a decompression
   failure occurs.

   The destination operand specifies the memory address to which the
   compressed data should be copied. Note that the requested bits are
   interpreted as a 2-byte integer ranging from 0 to 2^length - 1. Under
   default operation the MSBs of this integer are provided first, but if
   an NBO instruction has been executed then the LSBs are provided
   first.

   If the instruction requests data that lies beyond the end of the
   compressed message, no data is returned. Instead the UDVM moves
   program execution to the memory address specified by the formula
   (memory_address_of_INPUT-FIXED_instruction + delta) modulo 2^16.
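
   The effect of the bit order on INPUT-FIXED can be sketched in
   Python. The BitReader class is a hypothetical model of the
   compressed message as seen by the UDVM; returning None stands for
   the case where the UDVM jumps to the address derived from delta, and
   writing the result to the destination address is left out.

```python
class DecompressionFailure(Exception):
    pass


class BitReader:
    def __init__(self, data, nbo=False):
        self.data = data
        self.pos = 0      # number of bits consumed so far
        self.nbo = nbo    # True once an NBO instruction has been executed

    def _next_bit(self):
        byte = self.data[self.pos // 8]
        i = self.pos % 8
        self.pos += 1
        if self.nbo:
            return (byte >> i) & 1        # LSBs of each byte first
        return (byte >> (7 - i)) & 1      # MSBs of each byte first

    def input_fixed(self, length):
        if not 1 <= length <= 16:
            raise DecompressionFailure("length must be 1..16")
        if self.pos + length > 8 * len(self.data):
            return None  # beyond the end of the compressed message
        value = 0
        for k in range(length):
            bit = self._next_bit()
            if self.nbo:
                value |= bit << k             # LSBs are provided first
            else:
                value = (value << 1) | bit    # MSBs are provided first
        return value
```

   For the single byte 10110000 (binary), reading 3 bits and then 5
   bits gives 5 and 16 under default operation, and 0 and 22 after NBO.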

9.4.7.  INPUT-HUFFMAN

   The INPUT-HUFFMAN instruction requests a variable number of bits of
   compressed data from the dispatcher. The instruction initially
   requests a small number of bits and compares the result against a
   certain criterion; if the criterion is not met then additional bits
   are requested until the criterion is achieved.

   The INPUT-HUFFMAN instruction is followed by three mandatory operands
   plus n additional sets of operands. Every additional set contains
   four operands as shown below:

   INPUT-HUFFMAN (%destination, %delta, #n, %bits_1, %lower_bound_1,
   %upper_bound_1, %uncompressed_1, ... , %bits_n, %lower_bound_n,
   %upper_bound_n, %uncompressed_n)

   Note that if n = 0 then the INPUT-HUFFMAN instruction is ignored by
   the UDVM. If bits_1 = 0 or (bits_1 + ... + bits_n) > 16 then
   decompression failure occurs.

   In all other cases, the behavior of the INPUT-HUFFMAN instruction is
   defined below:

   1.)   Set j = 1.

   2.)   Request an additional bits_j compressed bits. Interpret the
   total (bits_1 + ... + bits_j) bits of compressed data requested so
   far as an integer H, with the first bit to be supplied as the MSB and
   the last bit to be supplied as the LSB (note that this is always the
   case, independently of whether the NBO instruction has been used).

   3.)   If data is requested that lies beyond the end of the compressed
   message, terminate the INPUT-HUFFMAN instruction and move program
   execution to the memory address specified by the formula
   (memory_address_of_INPUT-HUFFMAN_instruction + delta) modulo 2^16.

   4.)   If (H < lower_bound_j) or (H > upper_bound_j) then set j = j +
   1. Then go back to Step 2, unless j > n in which case decompression
   failure occurs.

   5.)   Copy (H + uncompressed_j - lower_bound_j) modulo 2^16 to the
   memory address specified by the destination operand.
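
   Steps 1 to 5 can be sketched in Python as shown below. The sketch
   assumes that the compressed bits arrive through an iterator named
   bits that yields 0 or 1 in the order they are supplied, and that the
   operand sets are given as a list of (bits_j, lower_bound_j,
   upper_bound_j, uncompressed_j) tuples. The returned tuple is a
   hypothetical way of reporting the outcome to the caller.

```python
class DecompressionFailure(Exception):
    pass


def input_huffman(bits, table):
    if not table:                        # n = 0: instruction is ignored
        return ("ignored", None)
    if table[0][0] == 0 or sum(entry[0] for entry in table) > 16:
        raise DecompressionFailure("invalid bits_j operands")
    h = 0
    for bits_j, lower, upper, uncompressed in table:
        for _ in range(bits_j):          # Step 2: request bits_j more bits
            bit = next(bits, None)
            if bit is None:              # Step 3: end of compressed message
                return ("jump", None)
            h = (h << 1) | bit           # first bit supplied is the MSB
        if lower <= h <= upper:          # Step 4 criterion met
            # Step 5: value to copy to the destination address
            return ("value", (h + uncompressed - lower) % 65536)
    raise DecompressionFailure("no operand set matched")  # j > n
```

   For example, the prefix code 0 = 65, 10 = 66, 11 = 67 is expressed
   by the table [(1, 0, 0, 65), (1, 2, 3, 66)].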

9.4.8.  STATE-REFERENCE

   The STATE-REFERENCE instruction retrieves some previously stored
   state information.

   STATE-REFERENCE (%id_start, %id_length, %state_start, %state_length,
   %state_destination)

   The id_start and id_length operands specify the location of the state
   identifier used to retrieve the state information. The state
   identifier is always 16 bytes long; if id_length is less than 16 then
   the remaining least significant bytes of the identifier are padded
   with zeroes.

   Decompression failure occurs if id_length is greater than 16.
   Decompression failure also occurs if no state information matching
   the state identifier can be found.

   Note that when accessing state information that has been previously
   created by the UDVM, the state identifier is always taken from an
   [MD5] hash of the state to be retrieved. However this is not
   necessarily the case for application-defined state as per Section
   3.2.

   The state_start and state_length operands define the starting byte
   and number of bytes to copy from the state_value contained in the
   identified item of state. If more state is requested than is actually
   available then decompression failure occurs.

   The state_destination operand contains a UDVM memory address. The
   requested state is byte copied to this memory address using the rules
   of Section 7.3.
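
   The lookup performed by STATE-REFERENCE can be sketched in Python.
   This illustration assumes that the state handler is a dict named
   store mapping 16-byte identifiers to state_value, that memory is a
   bytearray, and that the copy is a plain linear copy; the byte
   copying rules of Section 7.3 are not modeled, and treating a copy
   past the end of memory as a failure is an assumption.

```python
class DecompressionFailure(Exception):
    pass


def state_reference(store, memory, id_start, id_length,
                    state_start, state_length, state_destination):
    if id_length > 16:
        raise DecompressionFailure("id_length greater than 16")
    # Pad the remaining least significant bytes with zeroes.
    partial = bytes(memory[id_start:id_start + id_length])
    identifier = partial + bytes(16 - id_length)
    state_value = store.get(identifier)
    if state_value is None:
        raise DecompressionFailure("no matching state")
    if state_start + state_length > len(state_value):
        raise DecompressionFailure("more state requested than available")
    end = state_destination + state_length
    if end > len(memory):
        # Assumption: this sketch does not model Section 7.3 wrapping.
        raise DecompressionFailure("copy exceeds UDVM memory")
    memory[state_destination:end] = \
        state_value[state_start:state_start + state_length]
```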

9.4.9.  STATE-EXECUTE

   The STATE-EXECUTE instruction retrieves and runs some previously
   stored state information.

   STATE-EXECUTE (%id_start, %id_length)

   The id_start and id_length operands function as per the STATE-
   REFERENCE instruction.

   STATE-EXECUTE is similar to STATE-REFERENCE except that it does not
   require the amount of state being requested or the proposed
   destination for the state to be specified explicitly. Instead, it
   simply puts the state_value back into the UDVM memory using the
   operands state_start and state_length contained as part of the state
   information.

   The entire state_value (all state_length bytes of it) is byte copied
   into the memory address specified by state_start. The UDVM then jumps
   to the (absolute) memory address specified by state_instruction.

   Note that state_start, state_length and state_instruction are all
   stored together with state_value as part of an item of state
   information.


10. Security considerations

10.1.  Security goals

   The overall security goal of the SigComp architecture is not to
   create risks in addition to those already present in the application
   protocols. There is no intention for SigComp to enhance the security
   of the protocols, as it can always be circumvented by not using
   compression. More specifically, the high-level security goals can be
   described as:

   -- do not worsen the security of the existing application protocol

   -- do not create any new security issues

   -- do not hinder deployment of application security

10.2.  Security risks and mitigations

   This subsection identifies the potential security risks associated
   with the overall SigComp architecture, and details the proposed
   solution for each risk.


   ** Confidentiality risks

   *** Attacking SigComp by snooping into state of other users

   State can only be accessed using a state identifier, which is a
   (prefix of a) cryptographic hash of the state being referenced. This
   implies that the referencing packet already needs knowledge about the
   state. To enforce this, a reference length of 72 bits is defined.
   This also minimizes the probability of an accidental state collision.

   Generally, ways to obtain knowledge about the state identifier (e.g.,
   passive attacks) will also easily provide knowledge about the state
   referenced, so no new vulnerability results.

   The application needs to handle state identifiers with the same care
   it would handle the state itself.

   ** Integrity risks

   The SigComp approach assumes that there is appropriate integrity
   protection below and/or above the SigComp layer. However, the state
   establishment mechanism provides additional potential to compromise
   the integrity of the messages (which, however, would most likely be
   detectable at the application layer).

   *** Attacking SigComp by faking state or making unauthorized changes
   to state

   State cannot be destroyed or changed by a malicious sender -- it can
   only add new state. Faking state is only possible if the hash allows
   intentional collision.

   ** Availability risks (avoid DoS vulnerabilities)

   *** Use of SigComp as a tool in a DoS attack to another target

   SigComp cannot easily be used as an amplifier in a reflection attack,
   as it only generates one decompressed message per incoming compressed
   message. This packet is then handed to the application; the utility
   as a reflection amplifier is therefore limited by the utility of the
   application.

   However, it must be noted that SigComp can be used to generate larger
   packets as input to the application than have to be sent from the
   malicious sender; the sender can therefore send smaller packets (at a
   lower bandwidth) than are delivered to the application. Depending on
   the reflection characteristics of the application, this can be
   considered a mild form of amplification. The application MUST limit
   the number of packets reflected to a potential target -- even if
   SigComp is used to generate a large amount of information from a
   small incoming attack packet.

   *** Attacking SigComp as the DoS target by filling it with state

   Excessive state can only be installed by a malicious sender (or a set
   of malicious senders) with the consent of the application. The system
   consisting of SigComp and application is thus approximately as
   vulnerable as the application itself, unless it allows the
   installation of state from a message where it would not have
   installed state itself.

   If this is desirable to increase the compression ratio, the effect
   can be mitigated by adding feedback at the application level that
   indicates whether the state requested was actually installed. This
   allows a system under attack to degrade gracefully by no longer
   installing compressor state that is not matched by application state.

   *** Attacking the UDVM by faking state or making unauthorized changes
   to state

   (See "Integrity risks" above.)

   *** Attacking the UDVM by sending it looping code

   The application sets an upper limit to the number of "CPU cycles"
   that can be used per compressed message and per input bit in the
   compressed message. The damage inflicted by sending packets with
   looping code is therefore limited, although this may still be
   substantial if a large number of CPU cycles are offered by the UDVM.
   However, this would be true for any decompressor that can receive
   packets from anywhere.


11. IANA considerations

   The SigComp solution currently requires two identifiers to be
   assigned by IANA: the UDVM_version and the state identifier.

   Upgraded versions of the UDVM will contain additional instructions to
   improve the performance of the overall SigComp solution; new
   UDVM_version parameters will be needed in this case.


12. Acknowledgements

   Thanks to

            Abigail Surtees (abigail.surtees@roke.co.uk)
            Mark A West (mark.a.west@roke.co.uk)
            Lawrence Conroy (lwc@roke.co.uk)
            Christian Schmidt (christian.schmidt@icn.siemens.de)
            Max Riegel (maximilian.riegel@icn.siemens.de)
            Lars-Erik Jonsson (lars-erik.jonsson@epl.ericsson.se)
            Stefan Forsgren (stefan.forsgren@epl.ericsson.se)
            Krister Svanbro (krister.svanbro@epl.ericsson.se)
            Miguel Garcia (miguel.a.garcia@ericsson.com)
            Christopher Clanton (christopher.clanton@nokia.com)
            Khiem Le (khiem.le@nokia.com)
            Ka Cheong Leung (kacheong.leung@nokia.com)

   for valuable input and review.


13. Authors' addresses

   Richard Price         Tel: +44 1794 833681
   Email:                richard.price@roke.co.uk

   Roke Manor Research Ltd
   Romsey, Hants, SO51 0ZN
   United Kingdom


   Hans Hannu            Tel: +46 920 20 21 84
   Email:                hans.hannu@epl.ericsson.se

   Box 920
   Ericsson Erisoft AB
   SE-971 28 Lulea, Sweden


   Carsten Bormann       Tel: +49 421 218 7024
   Email:                cabo@tzi.org

   Universitaet Bremen TZI
   Postfach 330440
   D-28334 Bremen, Germany


   Jan Christoffersson   Tel: +46 920 20 28 40
   Email:                jan.christoffersson@epl.ericsson.se

   Box 920
   Ericsson Erisoft AB
   SE-971 28 Lulea, Sweden


   Zhigang Liu           Tel: +1 972 894-5935
   Email:                zhigang.liu@nokia.com

   Nokia Research Center
   6000 Connection Drive
   Irving, TX 75039
   USA


   Jonathan Rosenberg
   Email:                jdrosen@dynamicsoft.com

   dynamicsoft
   72 Eagle Rock Avenue
   First Floor
   East Hanover, NJ 07936

14. References

   [SIP]       "SIP: Session Initiation Protocol", Handley et al,
               RFC 2543, Internet Engineering Task Force, March 1999

   [RTSP]      "Real Time Streaming Protocol (RTSP)", H. Schulzrinne, A.
               Rao and R. Lanphier, RFC 2326, April 1998

   [HTTP]      "HyperText Transfer Protocol, HTTP/1.1", R. Fielding et
               al., RFC 2616, June 1999

   [SIPsrv]    "SIP: Locating SIP Servers", J. Rosenberg, H.
               Schulzrinne, draft-ietf-sip-srv-04.txt, January 2002,
               work in progress

   [DEFLATE]   "DEFLATE Compressed Data Format Specification version
               1.3", P. Deutsch, RFC 1951, Internet Engineering Task
               Force, May 1996

   [SCTP]      "Stream Control Transmission Protocol", Stewart et al,
               RFC 2960, Internet Engineering Task Force, October 2000

   [MD5]       "The MD5 Message-Digest Algorithm", R. Rivest, RFC 1321,
               Internet Engineering Task Force, April 1992

   [RFC-1662]  "PPP in HDLC-like Framing", Simpson et al, RFC 1662,
               Internet Engineering Task Force, July 1994

   [RFC-2026]  "The Internet Standards Process - Revision 3", Scott
               Bradner, RFC 2026, Internet Engineering Task Force,
               October 1996

   [RFC-2119]  "Key words for use in RFCs to Indicate Requirement
               Levels", Scott Bradner, RFC 2119, Internet Engineering
               Task Force, March 1997


Appendix A. Document history

   - October 19, 2001, version 00

   First version. The draft describes the current ideas, from people
   involved in the ROHC WG, of how to perform compression of
   application signaling messages.

   - October 31, 2001, version 01

   Second version. Additional section, 5.2.1, which describes when a
   message identifier can be reused.

   - November 21, 2001, version 02

   Third version. Section 6 has been moved to a separate draft. The
   third version describes a modular solution, providing flexibility
   for implementers to decide which functions they want to integrate.

   - January 28, 2002, version 03

   Fourth version. SigComp version 02 is divided into this draft, a UDVM
   draft and an extended operation mechanisms draft.
   Compressor/decompressor (UDVM) state approach has been introduced for
   security reasons.

   - February 14, 2002, version 04

   Fifth version. Describes the complete base SigComp solution including
   the UDVM.

   - March 1, 2002, version 05

   Sixth version. Comments from several authors and contributors have
   been taken into account. Announcement mechanism has been updated.


   This Internet-Draft expires in September 2002.