Robust Header Compression R. Finking
Internet-Draft Siemens/Roke Manor
Expires: December 25, 2005 G. Pelletier
Ericsson
June 23, 2005
Formal Notation for Robust Header Compression (ROHC-FN)
draft-ietf-rohc-formal-notation-09.txt
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 25, 2005.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract
This document defines ROHC-FN (RObust Header Compression - Formal
Notation): a formal notation to unambiguously specify header
compression field encodings, when defining new profiles within the
ROHC framework. ROHC-FN offers a library of encoding methods that
are often used in ROHC profiles, and can thereby help to simplify
future profile development work.
Finking & Pelletier Expires December 25, 2005 [Page 1]
Internet-Draft ROHC-FN June 2005
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Overview of ROHC-FN . . . . . . . . . . . . . . . . . . . . . 5
3.1 Scope of ROHC-FN . . . . . . . . . . . . . . . . . . . . . 5
3.2 Fundamentals of ROHC-FN . . . . . . . . . . . . . . . . . 6
3.2.1 Fields and Encodings . . . . . . . . . . . . . . . . . 6
3.2.2 Structures . . . . . . . . . . . . . . . . . . . . . . 8
3.3 Example using IPv4 . . . . . . . . . . . . . . . . . . . . 9
4. Normative Definition of ROHC-FN . . . . . . . . . . . . . . . 12
4.1 Overall Structure of a Specification . . . . . . . . . . . 12
4.2 Identifiers . . . . . . . . . . . . . . . . . . . . . . . 13
4.3 Constant Definitions . . . . . . . . . . . . . . . . . . . 14
4.4 Fields . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.4.1 Attribute References . . . . . . . . . . . . . . . . . 15
4.4.2 Negative Field Values . . . . . . . . . . . . . . . . 16
4.5 Expressions . . . . . . . . . . . . . . . . . . . . . . . 16
4.5.1 Integer Literals . . . . . . . . . . . . . . . . . . . 17
4.5.2 Boolean Literals . . . . . . . . . . . . . . . . . . . 17
4.5.3 Boolean Operators . . . . . . . . . . . . . . . . . . 17
4.5.4 Integer Operators . . . . . . . . . . . . . . . . . . 17
4.5.5 Comparison Operators . . . . . . . . . . . . . . . . . 18
4.6 Comments . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.7 Encoding Methods Library . . . . . . . . . . . . . . . . . 19
4.7.1 uncompressed_value . . . . . . . . . . . . . . . . . . 19
4.7.2 compressed_value . . . . . . . . . . . . . . . . . . . 20
4.7.3 irregular . . . . . . . . . . . . . . . . . . . . . . 21
4.7.4 static . . . . . . . . . . . . . . . . . . . . . . . . 21
4.7.5 lsb . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.7.6 crc . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.8 "let" Statements . . . . . . . . . . . . . . . . . . . . . 23
4.9 Structures . . . . . . . . . . . . . . . . . . . . . . . . 24
4.9.1 "this" . . . . . . . . . . . . . . . . . . . . . . . . 25
4.9.2 Simple Structures . . . . . . . . . . . . . . . . . . 25
4.9.3 Arguments and Structures . . . . . . . . . . . . . . . 28
4.9.4 Multiple Formats . . . . . . . . . . . . . . . . . . . 28
4.9.5 Control Fields . . . . . . . . . . . . . . . . . . . . 33
4.10 Profile-specific Encoding Methods . . . . . . . . . . . . 34
5. Security considerations . . . . . . . . . . . . . . . . . . . 35
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35
7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 35
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 35
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36
9.1 Normative References . . . . . . . . . . . . . . . . . . . 36
9.2 Informative References . . . . . . . . . . . . . . . . . . 36
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 36
A. Bit-level Worked Example . . . . . . . . . . . . . . . . . . . 36
A.1 Example Packet Format . . . . . . . . . . . . . . . . . . 37
A.2 Initial Encoding . . . . . . . . . . . . . . . . . . . . . 37
A.3 Basic Compression . . . . . . . . . . . . . . . . . . . . 38
A.4 Inter-packet compression . . . . . . . . . . . . . . . . . 39
A.5 Multiple Packet Formats . . . . . . . . . . . . . . . . . 41
A.6 Variable Length Discriminators . . . . . . . . . . . . . . 43
A.7 Default encoding . . . . . . . . . . . . . . . . . . . . . 46
A.8 Control Fields . . . . . . . . . . . . . . . . . . . . . . 48
A.9 Use Of "let" Statements As Conditionals . . . . . . . . . 50
Intellectual Property and Copyright Statements . . . . . . . . 53
1. Introduction
ROHC-FN is a formal notation designed to help with the definition of
ROHC [RFC3095] header compression profiles. ROHC-FN offers a library
of encoding methods that are often used in ROHC profiles, so new
profiles can be specified without the need to redefine this library
from scratch.
Informally, an encoding method is a function that maps between
uncompressed data and compressed data. The simplest encoding methods
only have one input and one output: the uncompressed field and the
compressed version of the field. More complex encoding methods can
handle multiple fields at the same time.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
o Control field
Control fields are transmitted from a ROHC compressor to a ROHC
decompressor, but are not part of the uncompressed header itself.
o Encoding method
Encoding methods are functions that can be applied to compress and
decompress fields in a protocol header.
o Field
With ROHC-FN, the protocol header to be compressed is divided into
a set of contiguous bit patterns known as fields. It should be
noted that the way the header is divided into fields is decided by
the profile designer, and it is not necessary for the field
divisions to be identical to the ones given by the
specification(s) for the protocol header being compressed.
o Library of encoding methods
The library of encoding methods contains a number of commonly used
encoding methods for compressing header fields.
o ROHC-FN specification
The specification of a ROHC profile's packet formats using
ROHC-FN.
o Profile
A ROHC [RFC3095] profile is a description of how to compress a
certain protocol stack over a certain type of link. Each profile
is built up of packet formats (defining the bits on the wire)
along with a set of rules that control compressor and decompressor
behaviour.
3. Overview of ROHC-FN
This section gives an overview of ROHC-FN. It also explains how
ROHC-FN can be used to specify the compression of header fields as
part of a ROHC profile.
3.1 Scope of ROHC-FN
This section describes the scope of ROHC-FN. It explains how the
formal notation relates to the ROHC framework and to specific ROHC
profiles.
The ROHC framework provides the general principles for performing
robust header compression. It defines the concept of a profile,
which makes ROHC a general platform for different compression
schemes. It sets link layer requirements, and in particular
negotiation requirements for all ROHC profiles. It defines a set of
common functions such as Context Identifiers (CIDs), padding and
segmentation. It also defines common packet formats (IR, IR-DYN,
Feedback, Short-CID expander, etc.), and finally it defines a
generic, profile independent, feedback mechanism.
A ROHC profile is a description of how to compress a certain protocol
stack over a certain type of link. For example, ROHC profiles are
available for RTP/UDP/IP and many other protocol stacks.
At a high level, each ROHC profile is built up of packet formats
(defining the bits on the wire) along with a set of rules that
control compressor and decompressor behaviour. The purpose of the
packet formats is to define how to compress and decompress headers.
The packet formats define one or more compressed versions of each
uncompressed header; inversely, the packet formats define how to
relate a compressed header back to the original uncompressed header.
The packet formats will typically define compression of headers
relative to a context of field values from previous headers in a
flow, improving the overall compression by taking into account
redundancies between headers of successive packets. Therefore, in
addition to defining the packet formats, a profile has to:
o specify how to manage these contexts at the compressor and the
decompressor,
o define when and what to send in potential feedback messages from
decompressor to compressor,
o outline compression strategy principles to make the profile robust
against bit errors and dropped packets.
All this is needed to ensure that the compressor and decompressor
contexts are kept synchronised, while still facilitating best
possible compression performance.
ROHC-FN is designed to help in the specification of the packet
formats used in ROHC profiles. It offers a library of encoding
methods for compressing fields, and a mechanism for combining these
encoding methods to create packet formats tailored to a specific
protocol stack. However, the scope of ROHC-FN is limited to
specifying the packet formats, while all the control logic for the
profile behaviour is to be defined by other means, to form a complete
profile specification.
3.2 Fundamentals of ROHC-FN
There are two fundamental elements to the formal notation:
1. Fields and their encodings, which define the mapping between a
header's uncompressed and compressed forms.
2. Structures, which define the way headers are broken down into
fields. Structures define lists of uncompressed fields and the
lists of compressed fields they map onto.
These two fundamental elements are at the core of the notation and
are outlined below.
3.2.1 Fields and Encodings
Headers are made up of fields. For example, the IP version number,
header length and sequence number are all fields used in real
protocols.
The bindings between the compressed and uncompressed values of a
field are specified with encoding methods, using the following
notation:
field ::= encoding_method
In the above statement, the symbol "::=" means "is encoded by". This
statement does not represent an assignment operation from the right
hand side to the left side. Instead, it is a two-way mapping in that
it both represents the compression and the decompression operation in
a single statement, through a process of two-way matching. Two-way
matching is a binary operation that attempts to make the operands the
same (similar to the unification process in logic). The operands
represent one unspecified data object and one specified object.
Values can be matched from either operand.
During compression, the uncompressed value of the field is already
defined. The given encoding matches the compressed value against
that. During decompression the compressed value of the field is
already defined, so the uncompressed value is matched to the
compressed value, using the given encoding method. Thus both
compression and decompression are defined by a single statement.
Fields have attributes. Attributes describe various properties of
the field, including the length of the field and where the field
appears in the header. For example:
field:uncomp_length
indicates how long this field is before it is compressed. See
Section 4.4 for more details on field attributes.
An encoding method (including any parameters specified) creates a
reversible binding between the attributes of a field. At the
compressor, a packet format can be used if a set of bindings can be
found that succeeds for all the attributes of all its fields. At
the decompressor, the operation is reversed using the same bindings,
and the attributes of each field are filled in according to the
specified bindings.
For example, the 'static' encoding method creates a binding between
the attribute corresponding to the uncompressed value of the field
and the attribute corresponding to the value of the field in the
context.
o For the compressor, the 'static' binding is successful when both
the context value and the uncompressed value are the same. If the
two values differ then the binding fails.
o For the decompressor, the 'static' binding succeeds for a packet
type only if a valid context entry containing the value of the
uncompressed field exists. Otherwise, the binding will fail.
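The two cases above can be sketched in Python. This is an illustration only, not part of the notation: the function names, the use of None to signal a failed binding, and the assumption that a successful 'static' binding contributes no compressed bits are all choices made for this sketch.

```python
from typing import Optional


def static_compress(uncomp_value: int,
                    context_value: Optional[int]) -> Optional[str]:
    """Compressor side of a 'static'-style binding (illustrative only).

    Returns the compressed bits as a string, or None if the binding fails.
    Assumption: a successful binding contributes zero compressed bits.
    """
    if context_value is not None and context_value == uncomp_value:
        return ""  # the context already holds the value, nothing to send
    return None  # values differ, or there is no context: binding fails


def static_decompress(context_value: Optional[int]) -> Optional[int]:
    """Decompressor side: succeeds only if a valid context entry exists."""
    # None stands for "no valid context entry", so the binding fails.
    return context_value
```

A compressor that gets None back for a field would have to choose a different packet format for that header.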
Both the compressed and uncompressed forms of each field are
represented in the same way: as an unsigned string of bits, most
significant bit first.
3.2.2 Structures
Structures provide a mechanism for combining fields and their
encoding methods into larger units. Structures are defined using the
"===" symbol. These can then be used as encoding methods in other
structures:
example_structure ===
{
  uc_format = field_1,
              field_2,
              field_3;
  control_fields = ctrl_field_1,
                   ctrl_field_2;
  default_methods =
  {
    field_1 ::= encoding_method_1;
    ctrl_field_2 ::= encoding_method_2;
  };
  co_format_0 = field_1,
                field_2,
                ctrl_field_1,
                ctrl_field_2
  {
    field_2 ::= encoding_method_2;
    ctrl_field_1 ::= encoding_method_3;
  };
  co_format_1 = field_1,
                field_2,
                field_3,
                ctrl_field_2,
                ctrl_field_3
  {
    field_2 ::= encoding_method_3;
    field_3 ::= encoding_method_4;
    ctrl_field_2 ::= encoding_method_5;
    ctrl_field_3 ::= encoding_method_6;
  };
};
In the example above, the comma separated list "uc_format" indicates
the order of fields in the uncompressed header. After this is
another comma separated list, "control_fields", which defines one or
more control fields. Following this is a list of default encoding
methods to use (see below). Finally, a number of packet formats for
the compressed data follow, each beginning with the reserved prefix
"co_format". These also have a comma separated list, which consists
of:
o fields that occur in the uncompressed header; or
o "control fields", that are additional information added to the
compressed packet during compression.
Each of these comma separated lists is a "field order list". In the
example, packet formats defined by "co_format" have, in addition to a
field order list, a list of field encodings. The field encodings
list follows immediately after the corresponding field order list.
This is typical usage. The field encodings list contains the
encoding methods for each field. The encoding methods are defined
inside braces for the fields in the preceding field order list.
Fields that have no encoding method defined in this field encodings
list are encoded using the default encoding methods specified in
"default_methods" (see Section 4.9.4.3).
Fields from the uncompressed header have the same name as they do in
the compressed header. If there are any fields which are present
exclusively in the compressed header but which do have an
uncompressed value, they must be declared in the "control_fields"
section of the structure (see Section 4.9.5 for more details on
defining control fields).
In the example above, ctrl_field_3 is unusual. All other fields
which appear in the compressed header are also found in the
uncompressed field order list, or the control field list. It is
possible to have fields which, like ctrl_field_3, have no
"uncompressed" value at all. Fields such as a checksum on the
compressed header fall into this category. Fields which have no
uncompressed value do not appear in an uncompressed field order list
and don't have to appear in the control field list either. Instead
they are only declared in the compressed field order lists where they
are used.
3.3 Example using IPv4
This section gives an overview of how the notation is used by means
of an example. The example will develop the formal notation for an
encoding method capable of compressing a single, well-known header:
the IPv4 header.
The first step is to specify the overall structure of the IPv4
header. To do this, we use a structure which we will call
"ipv4_header". Structures are defined in Section 4.9. This is
notated as follows:
ipv4_header ===
{
The statement above defines the encoding method "ipv4_header" as a
structure, the definition of which follows the opening brace.
Definitions within the pair of braces are local to "ipv4_header".
This scoping mechanism helps to clarify which fields belong to which
headers. It is also useful when compressing complex protocol stacks
with several headers and fields, which often share the same field
names.
The next step is to specify the fields contained in the uncompressed
IPv4 header. This is accomplished using ROHC-FN as follows:
uc_format = version, %[ 4 ]
header_length, %[ 4 ]
tos, %[ 6 ]
ecn, %[ 2 ]
length, %[ 16 ]
id, %[ 16 ]
reserved, %[ 1 ]
dont_frag, %[ 1 ]
more_fragments, %[ 1 ]
offset, %[ 13 ]
ttl, %[ 8 ]
protocol, %[ 8 ]
checksum, %[ 16 ]
src_addr, %[ 32 ]
dest_addr; %[ 32 ]
The numbers in square brackets give the field width in bits. Note
that these are merely comments and do not have any formal meaning.
The fields contained in the compressed header can then be specified.
Exactly what appears in this list of fields depends on the encoding
methods used to encode the uncompressed fields -- it may be possible
to compress certain fields down to 0 bits, in which case they do not
need to be sent in the compressed header at all.
co_format = src_addr, %[ 32 ]
dest_addr, %[ 32 ]
length, %[ 16 ]
id, %[ 16 ]
ttl, %[ 8 ]
protocol, %[ 8 ]
tos, %[ 6 ]
ecn, %[ 2 ]
dont_frag %[ 1 ]
{
Note that the order of the fields in the compressed header is
independent of the order of the fields in the uncompressed header.
The next step is to specify the encoding methods for each field in
the IPv4 header. These are taken from encoding methods in the
ROHC-FN library, as well as from additional encoding methods defined
in the profile specification itself. Since the intention here is to
illustrate the use of the notation, rather than to describe the
optimum method of compressing IPv4 headers, this example uses only
three predefined encoding methods.
The "uncompressed_value" encoding method (defined in Section 4.7.1)
can compress any field whose uncompressed length and value are fixed,
or can be calculated using an expression. No compressed bits need to
be sent because the uncompressed field can be reconstructed using its
known size and value. The "uncompressed_value" encoding method is
used to compress five fields in the IPv4 header, as described below:
version ::= uncompressed_value(4, 4);
header_length ::= uncompressed_value(4, 5);
reserved ::= uncompressed_value(1, 0);
more_fragments ::= uncompressed_value(1, 0);
offset ::= uncompressed_value(13, 0);
The first parameter indicates the length of the uncompressed field in
bits, and the second parameter gives its integer value.
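As a rough illustration of why no compressed bits are needed, the behaviour described above can be sketched in Python. The function names, the bit-string representation (a str of "0" and "1", most significant bit first) and the use of None for a failed binding are assumptions of this sketch, not definitions from the notation.

```python
from typing import Optional


def to_bits(value: int, length: int) -> str:
    """Render value as an unsigned bit string, most significant bit first."""
    if value < 0 or value >= (1 << length):
        raise ValueError("value does not fit in the given length")
    return format(value, "b").zfill(length) if length else ""


def uncompressed_value_compress(uncomp_bits: str, length: int,
                                value: int) -> Optional[str]:
    """Succeed with zero compressed bits only if the field has the
    expected length and value; otherwise the binding fails (None)."""
    return "" if uncomp_bits == to_bits(value, length) else None


def uncompressed_value_decompress(comp_bits: str, length: int,
                                  value: int) -> Optional[str]:
    """Rebuild the field purely from the known length and value."""
    return to_bits(value, length) if comp_bits == "" else None
```

For the "version" field, uncompressed_value_compress("0100", 4, 4) succeeds with an empty bit string, and the decompressor rebuilds "0100" from the two parameters alone.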
The "irregular" encoding method (defined in Section 4.7.3) can be
used to encode any field whose length is fixed, or can be calculated
using an expression. It is a general encoding method that can be
used for fields to which no other encoding method applies. All of
the bits in the uncompressed form of the field are present in the
compressed form as well; hence this encoding does not achieve any
compression.
tos ::= irregular(6);
ecn ::= irregular(2);
length ::= irregular(16);
id ::= irregular(16);
dont_frag ::= irregular(1);
ttl ::= irregular(8);
protocol ::= irregular(8);
src_addr ::= irregular(32);
dest_addr ::= irregular(32);
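A minimal Python sketch of the "irregular" behaviour described above: the compressed form carries exactly the same bits as the uncompressed form, and the binding fails if the field does not have the stated length. The function names and the None-on-failure convention are assumptions made for illustration.

```python
from typing import Optional


def irregular_compress(uncomp_bits: str, length: int) -> Optional[str]:
    """Copy the field unchanged into the compressed header.

    uncomp_bits is a string of "0"/"1" characters, most significant bit
    first. Returns None if the field does not have the stated length.
    """
    return uncomp_bits if len(uncomp_bits) == length else None


def irregular_decompress(comp_bits: str, length: int) -> Optional[str]:
    """Copy the compressed bits back out as the uncompressed field."""
    return comp_bits if len(comp_bits) == length else None
```

Because both directions are plain copies, a round trip through irregular_compress and irregular_decompress returns the original bits.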
Finally, the third encoding method,
"inferred_ip_v4_header_checksum", is specific to IPv4 headers:
checksum ::= inferred_ip_v4_header_checksum;
};
};
This is a specific encoding method for calculating the IP checksum
from the rest of the header values. Like the "uncompressed_value"
encoding method, no compressed bits need to be sent, since the field
value can be reconstructed at the decompressor.
However, unlike "uncompressed_value", the meaning of
"inferred_ip_v4_header_checksum" is not defined in the ROHC-FN
library of encoding methods, nor is it defined by another structure
elsewhere in the formal notation given in the example above. Its
definition can be given either using plain English text or using the
formal notation as part of the profile definition itself.
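To make the idea of an inferred checksum concrete, the following Python sketch recomputes the standard IPv4 header checksum (the one's complement of the one's complement sum of the header's 16-bit words, with the checksum field itself taken as zero). The function name is hypothetical, and the sketch assumes, rather than quotes, that this is the calculation "inferred_ip_v4_header_checksum" is meant to perform.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Recompute the IPv4 header checksum from the other header fields.

    The 16-bit checksum field (octets 10 and 11) is skipped, which is
    equivalent to treating it as zero during the calculation.
    """
    if len(header) < 20 or len(header) % 2 != 0:
        raise ValueError("expected a complete IPv4 header")
    total = 0
    for i in range(0, len(header), 2):
        if i == 10:
            continue  # skip the checksum field itself
        total += (header[i] << 8) | header[i + 1]
    # Fold the carries back in (one's complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Since the result depends only on fields the decompressor has already reconstructed, the checksum itself does not need to appear in the compressed header.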
Finally the definition of the structure is closed with a closing
brace. At this point, the above example has defined the format of
the compressed IPv4 header, and provided enough information to allow
an implementation to construct the compressed header from an
uncompressed header and vice versa.
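As a quick check on what this example achieves, the field widths given in the comments can be totalled. The short Python sketch below does so, assuming (as stated above) that every field in the compressed field order list is sent at its full width by "irregular" and that all other fields cost zero bits. The variable names are illustrative only.

```python
# Field widths in bits, taken from the comments in the uc_format list.
UC_FORMAT = {
    "version": 4, "header_length": 4, "tos": 6, "ecn": 2,
    "length": 16, "id": 16, "reserved": 1, "dont_frag": 1,
    "more_fragments": 1, "offset": 13, "ttl": 8, "protocol": 8,
    "checksum": 16, "src_addr": 32, "dest_addr": 32,
}

# Fields listed in the co_format field order list, all using "irregular".
CO_FORMAT = ["src_addr", "dest_addr", "length", "id", "ttl",
             "protocol", "tos", "ecn", "dont_frag"]

uncompressed_bits = sum(UC_FORMAT.values())
compressed_bits = sum(UC_FORMAT[name] for name in CO_FORMAT)
omitted = sorted(set(UC_FORMAT) - set(CO_FORMAT))

print(uncompressed_bits, compressed_bits, omitted)
```

Under these assumptions the uncompressed header is 160 bits and the compressed header is 121 bits. The 39 bits saved come from the five "uncompressed_value" fields and the inferred checksum.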
4. Normative Definition of ROHC-FN
This section gives the normative definition of ROHC-FN. ROHC-FN is a
referentially transparent, declarative language with no side effects.
4.1 Overall Structure of a Specification
The specification of a ROHC profile's packet formats using ROHC-FN is
called a ROHC-FN specification. ROHC-FN specifications are written
in the 7-bit ASCII character set and consist of a sequence of zero or
more constant definitions (Section 4.3), an optional global control
field list (Section 4.9.5) and one or more encoding method
definitions, given in the form of structures (Section 4.9). Each of
these is terminated by a semi-colon.
Structures define an encoding method by giving one or more formats
for uncompressed packets and one or more formats for compressed
packets. These formats are linked by "fields", each of which
describes a certain part of an uncompressed and/or a compressed
format. In addition to the packet formats each structure may contain
control fields and default field encodings sections. Each of these
sections inside a structure is terminated by a semi-colon.
The properties of a field are defined by defining an encoding method
for it and/or by use of "let" statements. Each of these is
terminated by a semi-colon. Encoding methods can be defined in
ROHC-FN using a structure or can be predefined encoding methods.
Predefined encoding methods can be defined in the text accompanying a
formal specification or they can be those defined in the present
document.
4.2 Identifiers
All identifiers must start with an alphabetic character.
Identifiers may be of any length and may contain any combination of
alphanumeric characters and underscores.
Identifiers for constants may not use lower case letters; all other
identifiers may not use upper case letters.
It is illegal to use any of the following as identifiers:
o "let", "this"
o "uncomp_hdr_start", "uncomp_length", "uncomp_value"
o "comp_hdr_start", "comp_length", "comp_value"
o identifiers starting either with "uc_format", "co_format",
"control_fields" or "default_methods"
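The identifier rules above can be expressed as a small checker. The Python sketch below is one possible reading of those rules, not a normative grammar: the function name and the three result labels are invented for illustration, and "alphabetic" is assumed to mean the ASCII letters A-Z and a-z.

```python
import re

# Reserved words and reserved prefixes listed in the text above.
_RESERVED = {
    "let", "this",
    "uncomp_hdr_start", "uncomp_length", "uncomp_value",
    "comp_hdr_start", "comp_length", "comp_value",
}
_RESERVED_PREFIXES = ("uc_format", "co_format",
                      "control_fields", "default_methods")

# A letter first, then any mix of letters, digits and underscores.
_SHAPE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def classify_identifier(name: str) -> str:
    """Return "constant", "other" or "invalid" for a candidate identifier."""
    if not _SHAPE.fullmatch(name):
        return "invalid"
    if name in _RESERVED or name.startswith(_RESERVED_PREFIXES):
        return "invalid"
    has_lower = any(c.islower() for c in name)
    has_upper = any(c.isupper() for c in name)
    if has_lower and has_upper:
        return "invalid"  # neither a constant nor any other identifier
    return "constant" if has_upper else "other"
```

For instance, "SOME_CONSTANT" is classified as a constant and "ip_seq_number" as an ordinary identifier, while "co_format_0" is rejected because of its reserved prefix.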
All identifiers used in ROHC-FN have a "scope". The scope of an
identifier defines the parts of the specification where that
identifier applies and from which it can be referred to. If an
identifier has "global" scope, then it applies throughout the
specification which contains it and can be referred to from anywhere
within it. If an identifier has "local" scope, then it only applies
to the structure in which it is defined and cannot be referenced
from outside that structure. An identifier with local scope can
therefore be used in multiple different local scopes to refer to
different items.
All instances of an identifier within its scope refer to the same
item. It is not possible to have different items referred to by a
single identifier within any given scope. For this reason, if there
is an identifier which has global scope it can not be used separately
in a local scope, since a globally scoped identifier is already
applicable in all local scopes.
The identifiers for each encoding method and each constant all have
global scope. Each field also has an identifier. The scope of field
identifiers is local, with the exception of global control fields
which have global scope.
4.3 Constant Definitions
Constant values can be defined using the "=" operator. Identifiers
for constants must be all upper case. For example:
SOME_CONSTANT = 3;
Constants are defined by an expression (see Section 4.5) on the right
hand side of the "=" operator. The expression must yield a constant
value. That is, the expression must be one whose terms are all
either constants or literals and not structure parameters or field
attributes.
Constants have global scope. Constants must be defined at the top
level, outside of any structure definition (noting that "=" has a
different meaning inside a structure, see Section 4.9). Because the
FN is referentially transparent, constants are entirely equivalent to
the value they refer to, and are completely interchangeable with that
value. Similarly, since the language has no side effects a constant
may never change its value.
4.4 Fields
Fields are the basic building blocks of a ROHC-FN specification.
Fields are the units which headers are divided into. Each field may
have two representations: a compressed representation and an
uncompressed representation. Both representations take the same
form, an unsigned string of bits, most significant bit first.
The properties of the compressed representation of a field are
defined by an encoding method. This encoding method entirely
characterises the relationship between the uncompressed and
compressed representation of that field. This is achieved by
specifying the relationships between the field's attributes.
The notation defines six field attributes, three for the uncompressed
representation and a corresponding three for the compressed
representation. The attributes available for each field are as
follows:
uncompressed attributes of a field:
o "uncomp_value", "uncomp_length" and "uncomp_hdr_start",
compressed attributes of a field:
o "comp_value", "comp_length" and "comp_hdr_start".
The two value attributes contain the respective numerical values of
the field, i.e. "uncomp_value" gives the numerical value of the
uncompressed aspect of the field, and the attribute "comp_value"
gives the numerical value of the compressed aspect of the field. The
numerical values are derived by interpreting the bit string
representations of the field as unsigned binary integers, most-
significant bit first.
The two length attributes indicate the length in bits of the
associated bit string; "uncomp_length" for the uncompressed
representation, and "comp_length" for the compressed representation.
Finally, the two "hdr_start" attributes indicate the offset in bits
of the start of the field from the start of the header;
"uncomp_hdr_start" for the position in the uncompressed header, and
"comp_hdr_start" for the position of the field in the compressed
header.
Attributes are undefined unless they are bound to a value, in which
case they become defined. Bindings are permanent: the defined value
of an attribute cannot be changed. Defined values are required for
all compressed attributes of fields which appear in the compressed
header and for all uncompressed attributes of fields which appear in
the uncompressed header. If two conflicting bindings are given for a
field attribute then the binding fails along with the packet format
in which the binding was defined.
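The bind-once behaviour of attributes can be modelled as follows. This Python sketch is purely illustrative: the class and exception names are invented, undefined attributes are represented by None, and binding an attribute a second time to the same value is assumed not to count as a conflict.

```python
class BindingConflict(Exception):
    """Raised when two conflicting bindings are given for one attribute."""


class Field:
    """Holds the six attributes of a field; each can be bound only once."""

    ATTRIBUTES = ("uncomp_value", "uncomp_length", "uncomp_hdr_start",
                  "comp_value", "comp_length", "comp_hdr_start")

    def __init__(self, name: str) -> None:
        self.name = name
        self._bound = {}  # attribute name -> bound integer value

    def bind(self, attribute: str, value: int) -> None:
        if attribute not in self.ATTRIBUTES:
            raise KeyError(attribute)
        if attribute in self._bound and self._bound[attribute] != value:
            # A conflicting binding makes the enclosing packet format fail.
            raise BindingConflict(f"{self.name}:{attribute}")
        self._bound[attribute] = value

    def get(self, attribute: str):
        """Return the bound value, or None while the attribute is undefined."""
        return self._bound.get(attribute)
```

A caller that tries several packet formats in turn could catch BindingConflict and move on to the next format.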
Note that uncompressed attributes do not always reflect an aspect of
the uncompressed header. Some fields do not originate from the
uncompressed header, but are control fields. In particular note that
the "uncomp_hdr_start" attribute has no useful meaning if the field
is a control field (see Section 4.9.5).
4.4.1 Attribute References
Attributes of a particular field are referred to formally by using
the field's name followed by a ":" and the attribute's identifier.
For example:
ip_seq_number:uncomp_value
gives the uncompressed value of the ip_seq_number field. The primary
reason for referencing attributes is for use in expressions, which
are explained in Section 4.5.
4.4.2 Negative Field Values
Since fields are represented using unsigned integers, which cannot be
negative, negative values assigned to a field simply wrap around at
zero. Negative values are therefore automatically represented in the
usual manner used in binary arithmetic: two's complement.
For example, if a field's "comp_length" attribute was 8 and its
"comp_value" attribute was -1, the compressed representation of the
field would wrap around zero and be 11111111 (binary). Adding 1 to
this value yields zero. However, the representation is still
unsigned, so 11111111 (binary) actually evaluates to 255
(decimal), not -1 as might be expected.
It should be noted that ROHC-FN does support negative values for use
in expressions (see Section 4.5), but the interpretation of bits on
the wire is always in unsigned form.
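The wrap-around can be reproduced with ordinary modular arithmetic. In the Python sketch below, the helper name is hypothetical; it reduces a possibly negative value modulo 2^length and renders it as an unsigned bit string, most significant bit first.

```python
def to_field_bits(value: int, length: int) -> str:
    """Wrap value into `length` bits and render it as an unsigned bit string."""
    if length == 0:
        return ""
    wrapped = value % (1 << length)  # the result is never negative in Python
    return format(wrapped, "b").zfill(length)


bits = to_field_bits(-1, 8)      # "11111111"
as_unsigned = int(bits, 2)       # 255, not -1
plus_one = to_field_bits(as_unsigned + 1, 8)  # wraps back to "00000000"
```

This matches the example above: -1 in an 8-bit field appears on the wire as 11111111, which reads back as 255.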
4.5 Expressions
ROHC-FN includes the usual infix style of expressions, with
parentheses "(" and ")" used for grouping. Expressions can be made
up of any of the components described in the following subsections.
In summary, the semantics of expressions are generally as in the
ANSI-C programming language, with the following additions and
exceptions:
o There is no limit on the range of integers.
o For modulo, the expression "mod(k,v)" is used instead of C
language "k % v". Note that the '%' is a comment character in
ROHC-FN.
o "x ^ y" evaluates to x raised to the power of y.
o "log2(w)" evaluates to the smallest integer k where w <= 2^k, i.e.
it returns the smallest number of bits in which value w can be
stored.
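The non-C operators in the list above can be mirrored in Python for experimentation. The helper names are invented for this sketch. "fn_mod" follows the definition x - y * (x / y) given in Section 4.5.4, with Python's "//" standing in for division that rounds towards negative infinity, and "fn_log2" is restricted to w >= 1, which is an assumption of the sketch.

```python
def fn_mod(k: int, v: int) -> int:
    """mod(k, v), defined as k - v * (k / v) with floor division."""
    return k - v * (k // v)


def fn_log2(w: int) -> int:
    """Smallest integer k such that w <= 2^k (sketch assumes w >= 1)."""
    if w < 1:
        raise ValueError("this sketch only handles w >= 1")
    return (w - 1).bit_length()


def fn_pow(x: int, y: int) -> int:
    """x ^ y in ROHC-FN is exponentiation, written x ** y in Python."""
    return x ** y
```

Under these definitions mod(-1, 8) is 7 rather than a negative number, log2(4) is 2 and log2(5) is 3. Python integers are unbounded, which matches the rule that there is no limit on the range of integers.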
Expressions may refer to any of the attributes of each field (as
described in Section 4.4), and also to any defined constant (see
Section 4.3).
If any of the attributes or constants used in the expression are
undefined, the value of the expression is undefined. Undefined
expressions cause the environment (e.g. the packet format) in which
they are used to fail if a defined value is required. Defined values
are required for all compressed attributes of fields which appear in
the compressed header and for all uncompressed attributes of fields
which appear in the uncompressed header.
Note that expressions cannot be used as encoding methods directly
because they do not completely characterise a field. Expressions
only specify a single value whereas a field is made up of several
values: its attributes. For example, the following is illegal:
tcp_list_length ::= (data_offset + 20) / 4;
There is only enough information here to define a single attribute of
"tcp_list_length". Although this makes no sense formally, this could
intuitively be read as defining the "uncomp_value" attribute.
However, that would still leave the length of the uncompressed field
undefined at the decompressor. Such usage is therefore prohibited.
4.5.1 Integer Literals
Integers can be expressed as decimal values, binary values (prefixed
by 0b), or hexadecimal values (prefixed by 0x). Negative integers
are prefixed by a "-" sign. For example "10", "0b1010" and "-0x0a"
are all valid integer literals, having the values ten, ten and minus
ten respectively.
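A parser for these literals might look like the following sketch
(Python, for illustration only). The function name
"parse_integer_literal" is hypothetical; the sketch assumes that the
"-" sign always precedes the base prefix, as in "-0x0a".

```python
def parse_integer_literal(text: str) -> int:
    """Parse a decimal, 0b-prefixed or 0x-prefixed literal.

    An optional leading "-" negates the value.
    """
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits.startswith("0b"):
        value = int(digits[2:], 2)    # binary literal
    elif digits.startswith("0x"):
        value = int(digits[2:], 16)   # hexadecimal literal
    else:
        value = int(digits, 10)       # decimal literal
    return -value if negative else value


for literal in ("10", "0b1010", "-0x0a"):
    print(literal, parse_integer_literal(literal))
```

The three literals from the text evaluate to 10, 10 and -10.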
4.5.2 Boolean Literals
The boolean literals are "false" and "true".
4.5.3 Boolean Operators
The following "boolean" operators are available, which take boolean
arguments and return a boolean result:
o &&, for logical "and". Returns true if both arguments are true.
Returns false otherwise.
o ||, for logical "or". Returns true if at least one argument is
true. Returns false otherwise.
o !, for logical "not". Returns true if its argument is false.
Returns false otherwise.
4.5.4 Integer Operators
The following "integer" operators are available, which take integer
arguments and return an integer result:
o ^, for exponentiation. "x ^ y" returns the value of "x" to the
power of "y".
o *, / for multiplication and division. "x * y" returns the product
of "x" and "y". "x / y" returns the quotient, rounded down to the
next integer (the next one towards negative infinity).
o +, - for addition and subtraction. "x + y" returns the sum of "x"
and "y". "x - y" returns the difference.
o mod(x, y) for modulo. "mod(x, y)" returns "x" modulo "y", i.e.
x - y * (x / y).
o log2(x) for logarithm to base 2. "log2(x)" returns the smallest
integer k where x <= 2^k, i.e. it returns the smallest number of
bits in which value x can be stored.
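The points where these operators differ from C can be sketched as
follows (Python, for illustration only). The names "fn_div", "fn_mod"
and "fn_log2" are hypothetical. Rejecting arguments below 1 in
"fn_log2" is an assumption, since no smallest k exists for them.

```python
def fn_div(x: int, y: int) -> int:
    """Quotient rounded towards negative infinity (C truncates towards 0)."""
    return x // y


def fn_mod(x: int, y: int) -> int:
    """mod(x, y), defined as x - y * (x / y) with the division above."""
    return x - y * fn_div(x, y)


def fn_log2(x: int) -> int:
    """Smallest integer k such that x <= 2^k."""
    if x < 1:
        raise ValueError("log2 is only sketched here for x >= 1")
    return (x - 1).bit_length()


print(fn_div(-7, 2))   # -4, whereas C would give -3
print(fn_mod(-7, 2))   # 1
print(fn_log2(8))      # 3
print(fn_log2(9))      # 4
print(2 ** 10)         # "2 ^ 10" in ROHC-FN: 1024
```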
4.5.5 Comparison Operators
The following "comparison" operators are available, which take
integer arguments and return a boolean result:
o ==, !=, for equality and its negative. "x == y" returns true if x
is equal to y. Returns false otherwise. "x != y" returns true if
x is not equal to y. Returns false otherwise.
o <, >, for less than and greater than. "x < y" returns true if x is
less than y. Returns false otherwise. "x > y" returns true if x
is greater than y. Returns false otherwise.
o >=, <=, for greater than or equal and less than or equal, the
inverse functions of <, >. "x >= y" returns false if x is less
than y. Returns true otherwise. "x <= y" returns false if x is
greater than y. Returns true otherwise.
4.6 Comments
Free English text can be inserted into a ROHC-FN specification to
explain why something has been done a particular way, to clarify the
intended meaning of the notation, or to elaborate on some point. To
this end comment syntax is provided.
The FN uses an end of line comment style, which makes use of the "%"
comment character. Any text between the "%" character and the end of
the line has no formal meaning. For example:
%-----------------------------------------------------------------
% IR-REPLICATE packet formats
%-----------------------------------------------------------------
% The following fields are included in all of the IR-REPLICATE
% packet formats:
%
uc_format = discriminator, %[ 8 ]
tcp_seq_number, %[ 32 ]
tcp_flags_ecn, %[ 2 ]
Comments do not affect the formal meaning of what is notated, but can
be used to improve readability. Their use is optional.
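A tool reading a ROHC-FN specification could discard comments with a
sketch like the following (Python, for illustration only). The
function name "strip_comment" is hypothetical, and the sketch assumes
that "%" never occurs inside a token.

```python
def strip_comment(line: str) -> str:
    """Drop everything from the first "%" to the end of the line."""
    return line.split("%", 1)[0].rstrip()


print(strip_comment("uc_format = discriminator, %[ 8 ]"))
print(repr(strip_comment("% IR-REPLICATE packet formats")))
```

The first call keeps only "uc_format = discriminator,"; the second
returns an empty string because the whole line is a comment.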
Comments can clarify the specification for the reader and can serve
several purposes for implementers. They should therefore not be
treated as less important than the formal notation when they are
inserted into a ROHC-FN specification, and they should be consistent
with the normative part of the specification.
4.7 Encoding Methods Library
ROHC [RFC3095] contains a number of different techniques for
compressing header fields (LSB encoding, value encoding, etc.). Most
of these techniques are part of the ROHC-FN library so that they can
be reused when creating new ROHC-FN specifications. The notation for
these is described below. As an alternative to library encoding
methods, encoding methods can be defined using structures (see
Section 4.9). It is also possible for a ROHC-FN specification to
define its own set of encoding methods using the formal notation or
using a textual definition (see Section 4.10).
4.7.1 uncompressed_value
The "uncompressed_value" encoding method is used to encode header
fields for which the uncompressed value can be defined using a
mathematical expression (including constant values):
field ::= uncompressed_value(<uncomp_length_expression>,
<uncomp_value_expression>);
where the value of the "uncomp_length_expression" binds with the
field's "uncomp_length" attribute, and the value of the
"uncomp_value_expression" binds with the field's "uncomp_value"
attribute. The "comp_length" attribute is bound to zero since the
field does not appear in the compressed header. Note however that it
is still legal to refer to it in a compressed format field order
list, but it has a length of zero. The "comp_value" attribute is not
bound by this encoding method.
As an example of the usage of "uncompressed_value" encoding, the IPv6
header version number is a four bit field that always has the value
6:
version ::= uncompressed_value(4, 6);
Here is another example of "uncompressed_value" encoding, using an
expression to calculate the length:
padding ::= uncompressed_value(nbits - 8, 0);
In this example the expression uses a structure parameter, "nbits"
(which specifies how many significant bits there are in the data) to
calculate how many pad bits to use. See Section 4.9.3 for more
information on structure parameters.
4.7.2 compressed_value
The "compressed_value" encoding method is used to define fields in
the compressed header for which there is no counterpart in the
uncompressed header. It can be used to specify compressed fields
whose value can be defined using a mathematical expression (including
constant values):
field ::= compressed_value(<comp_length_expression>,
<comp_value_expression>);
where the value of the "comp_length_expression" binds with the
field's "comp_length" attribute, and the value of the
"comp_value_expression" binds with the field's "comp_value"
attribute. The "uncomp_length" attribute is bound to zero since the
field does not appear in the uncompressed header. Note however that
it is still legal to refer to it in an uncompressed format field
order list, but it has a length of zero. The "uncomp_value"
attribute is not bound by this encoding method.
One possible use of this encoding method is to define padding in the
compressed header:
pad_to_octet_boundary ::= compressed_value(3, 0);
A more common use is to define a discriminator field to make it
possible to differentiate between different packet formats within a
structure (see Section 4.9). For convenience, the notation provides
syntax for specifying "compressed_value" encoding in the form of a
binary string. The binary string to be encoded is simply given in
single quotes. For example:
discriminator ::= '01101';
This has exactly the same meaning as:
discriminator ::= compressed_value(5, 13);
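The equivalence between the two forms can be checked with a short
sketch (Python, for illustration only). The function name
"binary_string_to_compressed_value" is hypothetical; rejecting an
empty string is an assumption.

```python
from typing import Tuple


def binary_string_to_compressed_value(literal: str) -> Tuple[int, int]:
    """Map a quoted binary string to (comp_length, comp_value)."""
    bits = literal.strip("'")
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of 0s and 1s")
    # The length is the number of digits; the value is read as unsigned.
    return len(bits), int(bits, 2)


print(binary_string_to_compressed_value("'01101'"))  # (5, 13)
```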
4.7.3 irregular
The "irregular" encoding method is used to encode a field in the
compressed packet with a bit pattern identical to the original field
in the uncompressed packet:
field ::= irregular(<expression>);
where the value of "expression" binds with both the
"uncomp_length" and "comp_length" attributes of the field, and the
"comp_value" and "uncomp_value" attributes are bound to each other.
For example, the checksum field of the TCP header is a sixteen-bit
field that does not follow any pattern:
tcp_checksum ::= irregular(16);
Note that the length does not have to be constant, for example the
length expression can be used to derive the length of the field from
the value of another field.
4.7.4 static
The "static" encoding method compresses a field whose length and
value are the same as for a previous header in the flow, i.e. where
the field completely matches an existing entry in the context:
field ::= static;
The field's "uncomp_value" and "uncomp_length" attributes bind with
their respective values in the context and the "comp_length"
attribute is bound to zero.
Since the field value is the same as a previous field value, the
entire field can be reconstructed from the context, so it is
compressed to zero bits and does not appear in the compressed header.
For example, the source port of the TCP header is a field whose value
does not change from one packet to the next for a given flow:
src_port ::= static;
4.7.5 lsb
The Least Significant Bit encoding method, "lsb", compresses a field
whose value differs by a small amount from the value stored in the
context. The least significant bits of the field value are
transmitted instead of the original field value.
field ::= lsb(num_lsbs_param, offset_param);
Here, "num_lsbs_param" is the number of least significant bits to
use, and "offset_param" is the interpretation interval offset. The
parameter "num_lsbs_param" binds with the "comp_length" attribute,
and the "uncomp_value" attribute binds with (context_value -
offset_param + comp_value).
The "lsb" encoding method can therefore compress a field whose value
lies between (context_value - offset_param) and (context_value -
offset_param + 2^num_lsbs_param - 1) inclusive. In particular, if
offset_param = 0 then the field value can only stay the same or
increase relative to the previous header in the flow. If
offset_param = -1 then it can only increase, whereas if offset_param
= 2^num_lsbs_param then it can only decrease.
The compressed field takes up the specified number of bits in the
compressed header (i.e. num_lsbs_param). For example, the TCP
sequence number:
tcp_sequence_number ::= lsb(14, 8192);
This takes up 14 bits, and can communicate any value which is between
8192 lower than the value of the field stored in context and 8191
above it.
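The interval arithmetic can be sketched as follows (Python, for
illustration only). The function names "lsb_encode" and "lsb_decode"
are hypothetical. The sketch assumes the [RFC3095] interpretation in
which the compressed value holds the low-order bits of the
uncompressed value and the decompressor selects the one value in the
interval whose low-order bits match.

```python
from typing import Optional


def lsb_encode(value: int, context_value: int,
               num_lsbs: int, offset: int) -> Optional[int]:
    """Return the low-order bits to send, or None if the method fails."""
    low = context_value - offset
    high = low + (1 << num_lsbs) - 1
    if not low <= value <= high:
        return None  # value lies outside the interpretation interval
    return value % (1 << num_lsbs)


def lsb_decode(comp_value: int, context_value: int,
               num_lsbs: int, offset: int) -> int:
    """Pick the value in the interval whose low-order bits match."""
    low = context_value - offset
    return low + (comp_value - low) % (1 << num_lsbs)


# tcp_sequence_number ::= lsb(14, 8192), with a context value of 100000
sent = lsb_encode(100005, 100000, 14, 8192)
print(sent)                                  # 1701
print(lsb_decode(sent, 100000, 14, 8192))    # 100005
print(lsb_encode(100000 + 8192, 100000, 14, 8192))  # None: out of range
```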
The compressor may not be able to determine the exact context value
that will be used by the decompressor, since some packets that would
have updated the context may have been lost or damaged. However,
from feedback received or by making assumptions, the compressor can
limit the candidate set of values. The compressor then chooses an
encoding such that no matter which context value in the candidate set
the decompressor uses, the resulting decompression is correct. If
that is not possible, the lsb encoding method fails (which typically
results in a less efficient packet format being chosen by the
compressor). As "reasonable" assumptions may not always be correct,
lsb encoding is intended to be used in conjunction with methods that
validate the output of the decompression process, such as the crc
method described in Section 4.7.6.
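One way to picture the compressor's choice is the sketch below
(Python, for illustration only). The function names "lsb_is_safe" and
"choose_lsb_parameters" are hypothetical, and the way the candidate
set is obtained is outside the scope of the sketch. An encoding is
treated as safe when the value lies inside the interpretation
interval of every candidate context value.

```python
from typing import Iterable, Optional, Sequence, Tuple


def lsb_is_safe(value: int, candidate_contexts: Iterable[int],
                num_lsbs: int, offset: int) -> bool:
    """True if every candidate context would decompress to "value"."""
    span = 1 << num_lsbs
    for context_value in candidate_contexts:
        low = context_value - offset
        if not low <= value < low + span:
            return False
    return True


def choose_lsb_parameters(value: int, candidate_contexts: Sequence[int],
                          options: Sequence[Tuple[int, int]]
                          ) -> Optional[Tuple[int, int]]:
    """Return the first (num_lsbs, offset) option that is safe, if any."""
    for num_lsbs, offset in options:
        if lsb_is_safe(value, candidate_contexts, num_lsbs, offset):
            return num_lsbs, offset
    return None  # lsb encoding fails; another format must be used


# The decompressor may hold either 100 or 103 as its context value.
print(choose_lsb_parameters(105, [100, 103], [(2, 0), (3, 0)]))  # (3, 0)
print(choose_lsb_parameters(120, [100, 103], [(2, 0), (3, 0)]))  # None
```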
4.7.6 crc
The "crc" encoding method provides a CRC calculated over a block of
data. The block of data is represented using either the
"uncomp_value" or "comp_value" attribute of a field. The "crc"
method takes a number of parameters:
o the number of bits for the CRC (num_bits),
o the bit-pattern for the polynomial (bit_pattern),
o the initial value for the CRC register (initial_value),
o the value of the block of data (block_data_value); and
o the size in octets of the block of data (block_data_length).
I.e.:
field ::= crc(num_bits, bit_pattern, initial_value,
block_data_value, block_data_length);
The CRC is calculated in least significant bit (LSB) order.
The following CRC polynomials are defined in [RFC3095], in Sections
5.9.1 and 5.9.2:
8-bit
C(x) = x^0 + x^1 + x^2 + x^8
bit_pattern = 0xe0
7-bit
C(x) = x^0 + x^1 + x^2 + x^3 + x^6 + x^7
bit_pattern = 0x79
3-bit
C(x) = x^0 + x^1 + x^3
bit_pattern = 0x06
For example:
% 3 bit CRC, C(x) = x^0 + x^1 + x^3
crc_field ::= crc(3, 0x6, 0xF, this:comp_value, this:comp_length);
Usage of the "this" keyword as shown in the example, is typical when
using "crc" encoding (see Section 4.9.1).
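A bit-by-bit calculation in LSB order can be sketched as follows
(Python, for illustration only). The sketch assumes that the bit
patterns listed above are the polynomials in reflected form, that the
block of data is taken as octets in network order, and that the
initial value is truncated to the width of the CRC; none of these
details are stated in this section.

```python
def crc(num_bits: int, bit_pattern: int, initial_value: int,
        block_data_value: int, block_data_length: int) -> int:
    """Compute a CRC over block_data_length octets, LSB first."""
    data = block_data_value.to_bytes(block_data_length, "big")
    register = initial_value & ((1 << num_bits) - 1)
    for octet in data:
        register ^= octet
        for _ in range(8):
            if register & 1:
                # Shift out a 1 bit and apply the polynomial.
                register = (register >> 1) ^ bit_pattern
            else:
                register >>= 1
    return register


good = crc(8, 0xE0, 0xFF, 0x1234, 2)
bad = crc(8, 0xE0, 0xFF, 0x1235, 2)   # one bit of the data flipped
print(good != bad)                    # True: the change is detected
print(crc(3, 0x06, 0x7, 0, 0))        # 7: no data leaves the initial value
```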
4.8 "let" Statements
A "let" statement shares some similarities with an encoding method.
Specifically, whereas an encoding method binds several field
attributes at once, a "let" statement typically binds just one of
them. In fact all encoding methods can be expressed in terms of a
collection of "let" statements. Here is an example "let" statement
which binds the "uncomp_value" attribute of a field to 5.
let(field:uncomp_value == 5);
A "let" statement must only be used inside a field encodings list
(see Section 4.9).
Like an encoding method, a "let" statement can only be successfully
used in a format if the binding it describes is achievable. A format
containing the example "let" statement above would not be usable if
the field had also been bound with "uncompressed_value" encoding
which gave it a different uncompressed value.
A "let" statement takes a boolean expression as a parameter. It can
be used to assert that the expression is true, in order to choose a
particular packet format from a list of possible formats specified in
a structure (see Section 4.9), or just to bind an expression as in
the example above. The general form of a "let" statement is
therefore:
let(<boolean expression>)
There are three possible conditions that the expression may be in:
1. The boolean expression evaluates to false, in which case the
packet format which contains the "let" statement cannot be used,
2. The boolean expression evaluates to true, in which case the
packet format is usable,
3. The value of the boolean expression is undefined. In this case
the packet format can be used.
In all three cases, any undefined terms become bound by the
expression. Generally speaking a "let" statement is either being
used as an assignment (condition 3 above) or else it is being used to
test if a particular packet format is usable, as is the case with
conditions 1 and 2.
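The three conditions can be sketched for the simplest kind of "let"
statement, one that compares a single attribute with a known value
(Python, for illustration only). The function name "let_equals" and
the use of a dictionary to hold bindings are assumptions made for the
sketch.

```python
from typing import Dict


def let_equals(bindings: Dict[str, int], attribute: str, value: int) -> bool:
    """Evaluate let(attribute == value); return True if the format is usable."""
    if attribute not in bindings:
        # Condition 3: the expression is undefined, so the attribute
        # becomes bound and the format remains usable.
        bindings[attribute] = value
        return True
    # Conditions 1 and 2: the expression is false or true.
    return bindings[attribute] == value


bindings: Dict[str, int] = {}
print(let_equals(bindings, "field:uncomp_value", 5))  # True, now bound to 5
print(let_equals(bindings, "field:uncomp_value", 5))  # True, already 5
print(let_equals(bindings, "field:uncomp_value", 6))  # False, format unusable
```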
4.9 Structures
Structures are used for defining new encoding methods in a formal
specification. They compose groups of individual fields into
contiguous blocks. Structures can be thought of as compound encoding
methods; they have names and may have parameters and can be used in
the same way as any other encoding method. Since structures can
contain references to other structures, complicated headers can be
broken down into manageable pieces.
This section describes the various features of structures, starting
out with the simplest.
4.9.1 "this"
Within a structure it is possible to refer to the field it is
encoding, using the keyword "this". This is useful for gaining
access to the attributes of the field being encoded. For example it
is often useful to know the total uncompressed length of the header
which is being encoded.
4.9.2 Simple Structures
A structure can be used to specify a single fixed encoding. This is
its simplest form. For example:
compound_encoding_method ===
{
uc_format = field_1, %[ 4 ]
field_2; %[ 12 ]
co_format = field_2, %[ 0 ]
field_1 %[ 4 ]
{
field_1 ::= irregular(4);
field_2 ::= uncompressed_value(12, 9);
};
};
The above begins with the structure's identifier,
"compound_encoding_method". The identifier is followed by "===",
which indicates that this is a structure definition. The definition
of the structure then follows inside curly braces, "{" and "}". The
first item in the definition is the "uc_format" field order list,
which gives the order of the fields in the uncompressed header. This
is followed by the compressed header field order list. This list is
in turn followed by the field encodings list for the compressed
header, which gives the encoding method for each field. The
different components of this example are described in more detail
below.
4.9.2.1 Uncompressed Format
The uncompressed field order list is defined by "uc_format", which
specifies the fields of the uncompressed header in the order that
they appear in the uncompressed header. In the example, this is
"field_1" followed by "field_2". This means that a field being
encoded by this structure is divided into two subfields, "field_1"
and "field_2". The total uncompressed length of these two fields
therefore equals the length of the field being encoded. Formally:
field_1:uncomp_length + field_2:uncomp_length == this:uncomp_length
In the example we have just two fields but any number of subfields
may be used. This relationship applies to however many fields are
actually used. Note that the arrangement of fields specified in the
uncompressed field order list is up to the notator. Any arrangement
of fields that correctly describes the content of the uncompressed
header may be chosen -- this need not be the same as the one
described in the specifications for the protocol header being
compressed. However, the bits of the uncompressed format must remain
in the same order.
For example, there may be a protocol whose header contains a 16 bit
sequence number, but whose sessions tend to be short lived. This
would mean that the high bits of the sequence number are almost
always constant. The "uc_format" could reflect this by splitting the
original uncompressed field into two fields, one field to represent
the insignificant almost-always-zero part of the sequence number, and
a second field to represent the significant part.
An uncompressed format may contain a field encodings list. Encoding
methods specified therein are used whenever a packet with that
uncompressed format is being encoded. The encoding of a packet with
a given uncompressed format can only succeed if all of its encoding
methods and "let" statements succeed (see Section 4.8).
The total length of an uncompressed header must be defined. The
length of each of the fields in an uncompressed header must also be
defined. This means that the bindings in the "uc_format",
"co_format", "control_fields" (see below) and "default_methods" (see
below) field encodings lists must between them define the
"uncomp_length" attribute of every field in an uncompressed header so
that there is an unambiguous mapping from the bits in the
uncompressed header to the fields listed in each "uc_format" field
order list.
4.9.2.2 Compressed Format
Similar to the uncompressed field order list, the compressed data
will appear in the order specified by the compressed field order list
given for a compressed format. Each individual field is encoded in
the manner given for that field in the field encodings list, which is
in braces and follows immediately after the compressed field order
list. The total length of the compressed data will be the total of
the compressed lengths of all the individual fields. In the example,
the annotation for these fields indicates that they are zero and 4
bits long, making a total of 4 bits.
Note that the order of the fields specified in a compressed format
field order list does not have to match the order they appear in the
"uc_format" field order list. It may be desirable to reorder the
fields in the compressed header to align the compressed header to the
octet boundary, or for other reasons. In the above example, the
order is in fact the opposite of that in the uncompressed header.
The field encodings list specifies that the encoding for "field_1" is
"irregular", and takes up four bits in both the compressed header and
uncompressed header. The encoding for "field_2" is
"uncompressed_value", which means that the field has a fixed value,
so it can be compressed to zero bits. The value it takes is 9, and
it is 12 bits wide in the uncompressed header.
Fields like "field_2", which compress to zero bits in length, may be
omitted from the compressed field order list. This is because their
position in the list is not significant. So, without changing the
meaning, the above example could be notated as follows:
compound_encoding_method ===
{
uc_format = field_1, %[ 4 ]
field_2; %[ 12 ]
co_format = field_1 %[ 4 ]
{
field_1 ::= irregular(4);
field_2 ::= uncompressed_value(12, 9);
};
};
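The behaviour that this structure describes can be sketched as
follows (Python, for illustration only). The function names are
hypothetical, and the sketch assumes that the first field in
"uc_format" occupies the most significant bits of the 16 bit
uncompressed block.

```python
from typing import Optional

FIELD_2_VALUE = 9  # from field_2 ::= uncompressed_value(12, 9)


def compress(uncompressed: int) -> Optional[int]:
    """Return the 4 compressed bits, or None if the encoding fails."""
    field_1 = (uncompressed >> 12) & 0xF   # irregular(4): sent as is
    field_2 = uncompressed & 0xFFF         # must equal the fixed value
    if field_2 != FIELD_2_VALUE:
        return None
    return field_1


def decompress(compressed: int) -> int:
    """Rebuild the 16 bit block from the 4 compressed bits."""
    return ((compressed & 0xF) << 12) | FIELD_2_VALUE


print(hex(compress(0xA009)))   # 0xa
print(hex(decompress(0xA)))    # 0xa009
print(compress(0xA00A))        # None: field_2 is not 9
```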
The total length of a compressed header must be defined. The length
of each of the fields in a compressed header must also be defined.
This means that the bindings in the "uc_format", "co_format",
"control_fields" (see below) and "default_methods" (see below) field
encodings lists must between them define the "comp_length" attribute
of every field in a compressed header so that there is an unambiguous
mapping from the bits in the compressed header to the fields listed
in each "co_format" field order list.
4.9.3 Arguments and Structures
Structures may take arguments, which have some control over the
mapping between compressed and uncompressed fields. These are
specified immediately after the structure name, in parentheses, as a
comma separated list. For example:
poor_mans_lsb(variable_length) ===
{
uc_format = constant_bits,
variable_bits;
co_format = variable_bits
{
constant_bits ::= static;
variable_bits ::= irregular(variable_length);
};
};
As with any encoding method, all arguments are values, rather than
fields. Although entire fields cannot be passed as arguments, it is
possible to pass their attributes instead.
4.9.4 Multiple Formats
Structures can also define multiple formats for a given header. This
allows different compression methods to be used depending on what is
the most efficient way of compressing a particular header.
For example, a field may have a fixed value most of the time, but the
fixed value may occasionally change. Using a single format for the
structure, this field would have to be encoded using "irregular" (see
Section 4.7.3), even though the value only changes rarely. However,
by using the structure to define multiple formats, we can provide two
alternative encodings; one for when the value remains fixed and
another for when the value changes.
This is the topic of the following sub-sections.
4.9.4.1 Naming Convention
When compressed formats are defined, they must be defined using names
beginning with the reserved prefix "co_format". Similarly
uncompressed formats must be defined using names beginning with
"uc_format".
Format names must be unique within the structure to which they
belong.
4.9.4.2 Format Discrimination
Each of the compressed formats has its own field order list and field
encodings list. A compressor may pick any of these alternative
formats to compress a header, as long as the field encodings it
employs can be used with the uncompressed header. For example, the
compressor could not choose to use a compressed format that had a
"static" encoding for a field whose value had just changed.
More formally, the compressor can choose any combination of an
uncompressed format and a compressed format for which all fields
"succeed", i.e. the encoding methods and let-statements succeed (see
Section 4.8). If there are multiple successful combinations, the
compressor can choose any one. Otherwise if there are no successful
combinations, the encoding method defined by the structure "fails".
Because the compressor has a choice, it must be possible for the
decompressor to discriminate between the different packet formats
that the compressor could have chosen. A simple approach to this
problem is for each compressed format to include a "discriminator"
that uniquely identifies that particular "co_format". A
discriminator is a control field; it is not derived from any of the
uncompressed field values (see Section 4.7.2).
4.9.4.3 Default Encoding Methods - default_methods
When using multiple packet formats, default bindings may be specified
for each field or attribute. The default encoding methods specify
the encoding method to use for a field if no encoding method is given
for that field elsewhere. This is helpful to keep the definition of
the packet formats concise, as the same encoding method need not be
repeated for every format.
Default bindings are optional and may be given for any combination of
fields and attributes which are in scope.
The syntax for specifying default bindings is similar to that used to
specify a compressed or uncompressed format. However there is no
field order list for the default encoding methods, only the field
encodings list is given. This is because the field order is
specified individually for each "co_format" and "uc_format". For
example:
default_methods =
{
field_1 ::= uncompressed_value(4,1);
field_2 ::= uncompressed_value(4,2);
field_3 ::= lsb(3,-1);
let(field_4:uncomp_length == 4);
};
Here default bindings are specified for fields 1 to 3. A default
binding for the "uncomp_length" attribute of field 4 is also
specified.
Fields for which there is a default encoding method do not need to be
specified in the field encodings list of any format that uses the
default encoding method for that field. Any format which does not
use the default encoding method must specify explicitly which
encoding method is used for that field.
If a field is omitted from a compressed format's field encodings list
so that the default encoding method is used, and the default encoding
method always compresses the field down to zero bits, the field can
also be omitted from the compressed format's field order list. Like
any other zero bit field, its position in the field order list is not
significant.
The field encodings list of default_methods may also contain default
bindings for individual attributes by using "let" statements. A
default binding for an individual attribute will only be used if
there is no binding given for that attribute nor the field to which
it belongs. If there is a "let" statement binding that attribute, or
an encoding method binding the field to which it belongs, the default
binding for the attribute will not be used. Note that this applies
even if the specified encoding method does not define the particular
attribute given in the default_methods section.
Assuming the default methods given in the example above, the first
three of the following four compressed packet formats would not use
the default binding for "field_4:uncomp_length":
co_format_1 = field_4
{
let(field_4:uncomp_length == 3); % set uncomp_length to 3
};
co_format_2 = field_4
{
field_4 ::= irregular(3); % set uncomp_length to 3
};
co_format_3 = field_4
{
field_4 ::= '1010'; % set uncomp_length to undefined
};
co_format_4 = field_4
{
let(field_4:uncomp_value == 12); % use default uncomp_length
};
The fourth format is the only one which uses the default binding for
"field_4:uncomp_length".
It is allowed to use one default binding but not use another. Using
one of the default bindings does not imply that they all have to be
used, even though they all appear in the "default_methods" field
encodings list together.
Note that a structure's default methods are only used for packet
formats which do not already specify an encoding for all of their
fields. For the packet formats that do use the default methods, only
those fields and attributes whose bindings are not specified are
looked up in the default methods.
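The lookup rule for a default attribute binding can be sketched as
follows, using the four example formats given earlier in this section
(Python, for illustration only). The function name
"uses_default_attribute" and the data layout are assumptions made for
the sketch.

```python
from typing import Dict, Set, Tuple


def uses_default_attribute(field_methods: Dict[str, str],
                           let_attributes: Set[Tuple[str, str]],
                           field: str, attribute: str) -> bool:
    """True if a format falls back to the default binding of an attribute."""
    if field in field_methods:
        return False   # an encoding method binds the field
    if (field, attribute) in let_attributes:
        return False   # a "let" statement binds this attribute
    return True


# (encoding methods, attributes bound by "let") for each example format
FORMATS = {
    "co_format_1": ({}, {("field_4", "uncomp_length")}),
    "co_format_2": ({"field_4": "irregular(3)"}, set()),
    "co_format_3": ({"field_4": "'1010'"}, set()),
    "co_format_4": ({}, {("field_4", "uncomp_value")}),
}

for name, (methods, lets) in FORMATS.items():
    print(name, uses_default_attribute(methods, lets,
                                       "field_4", "uncomp_length"))
```

Only "co_format_4" reports True, matching the description of the
example formats.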
4.9.4.4 Example of Multiple Formats
Putting this all together, here is a complete example of a structure
with multiple compressed formats:
test_multiple_formats ===
{
uc_format = field_1, %[ 4 ]
field_2, %[ 4 ]
field_3; %[ 24 ]
default_methods =
{
field_1 ::= static;
field_2 ::= uncompressed_value(4, 2);
field_3 ::= lsb(4, 0);
};
co_format_0 = discriminator, %[ 1 ]
field_3 %[ 4 ]
{
discriminator ::= '0';
};
co_format_1 = discriminator, %[ 1 ]
field_1, %[ 4 ]
field_3 %[ 24 ]
{
discriminator ::= '1';
field_1 ::= irregular(4);
field_3 ::= irregular(24);
};
};
Note the following:
o "field_1" and "field_3" both have default encoding methods
specified for them, which are used in "co_format_0" but are
overridden in "co_format_1"; "field_2" however is not overridden.
o "field_1" and "field_2" have default encoding methods which
compress to zero bits. When these are used in "co_format_0", the
field names do not appear in either the field order list or in the
field encodings list.
o "field_3" has an encoding method which does not compress to zero
bits, so whilst "field_3" is absent from the field encodings list
of "co_format_0", it still needs to appear in the field order
list to specify whereabouts it goes in the compressed packet.
o In the example, all the uncompressed header fields have default
encoding methods specified for them, but this is not a
requirement. It is perfectly allowable to only specify default
encodings for some or even none of the uncompressed header fields.
o In the example, all the default encoding methods are on fields from
the uncompressed header, but this is not a requirement. It is
also perfectly allowable to specify default encoding methods for
control fields.
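How a compressor and decompressor might use the two formats of
"test_multiple_formats" can be sketched as follows (Python, for
illustration only). The function names, the use of bit strings and
the preference for the shorter format are assumptions made for the
sketch; the context is reduced to the previous values of "field_1"
and "field_3".

```python
from typing import Optional, Tuple


def compress(f1: int, f2: int, f3: int,
             ctx_f1: int, ctx_f3: int) -> Optional[str]:
    """Return the compressed bits, or None if no format succeeds."""
    if f2 != 2:
        return None  # uncompressed_value(4, 2) is never overridden
    if f1 == ctx_f1 and 0 <= f3 - ctx_f3 <= 15:
        # co_format_0: static field_1, lsb(4, 0) for field_3
        return "0" + format(f3 & 0xF, "04b")
    # co_format_1: both fields sent as irregular
    return "1" + format(f1, "04b") + format(f3, "024b")


def decompress(bits: str, ctx_f1: int, ctx_f3: int) -> Tuple[int, int, int]:
    """Return (field_1, field_2, field_3) using the discriminator bit."""
    if bits[0] == "0":
        lsbs = int(bits[1:5], 2)
        f3 = ctx_f3 + (lsbs - ctx_f3) % 16   # value in [ctx, ctx + 15]
        return ctx_f1, 2, f3
    return int(bits[1:5], 2), 2, int(bits[5:29], 2)


short = compress(7, 2, 1005, ctx_f1=7, ctx_f3=1000)
print(short)                          # 01101
print(decompress(short, 7, 1000))     # (7, 2, 1005)
long = compress(8, 2, 1005, ctx_f1=7, ctx_f3=1000)
print(len(long))                      # 29
print(decompress(long, 7, 1000))      # (8, 2, 1005)
```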
4.9.5 Control Fields
Control fields are defined using the "control_fields" list. The
control fields list specifies all fields that do not appear in the
uncompressed header but which have an uncompressed value
(specifically those with an uncomp_length greater than zero). Such
fields may be used to help compress fields from the uncompressed
header more efficiently. A control field could be used to improve
efficiency by representing some commonality between a number of the
uncompressed fields, or by representing some information about the
flow that is not explicitly contained in the protocol headers.
For example in IP, the behaviour of the IP-ID field in a flow varies
depending on how the endpoints handle IP-IDs. Sometimes the
behaviour is effectively random, sometimes the IP-ID follows a
predictable sequence, and at other times it stays fixed at zero. The
type of IP-ID behaviour is information that is never communicated
explicitly in the uncompressed header. However, a profile can still
be designed to identify the behaviour and adjust the compression
strategy according to the identified behaviour, thereby improving the
compression performance. To do so, the ROHC-FN specification can
introduce an explicit field to communicate the IP-ID behaviour in
compressed headers. This is done by introducing a control field:
ipv4 ===
{
uc_format = version, %[ 4 ]
hdr_length, %[ 4 ]
protocol, %[ 8 ]
tos_tc, %[ 6 ]
ip_ecn_flags,%[ 2 ]
ttl_hopl, %[ 8 ]
df, %[ 1 ]
mf, %[ 1 ]
rf, %[ 1 ]
frag_offset, %[ 13 ]
ip_id, %[ 16 ]
src_addr, %[ 32 ]
dst_addr, %[ 32 ]
checksum, %[ 16 ]
length; %[ 16 ]
control_fields = ip_id_behavior; %[ 2 ]
:
:
};
The control_fields list is equivalent to the "uc_format" field order
list for fields that do not appear in the uncompressed header. It
defines a field that has the same properties (the same attributes
etc.) as fields appearing in the uncompressed header.
Control fields are initialised by using the appropriate encoding
methods and/or by using "let" statements. This may be done inside
the control_fields' own field encodings list. For example:
example_structure ===
{
  uc_format = field_1;
  control_fields = scaled_field
  {
    let(scaled_field:uncomp_value == field_1:uncomp_value / 8);
    let(scaled_field:uncomp_length == field_1:uncomp_length - 3);
  };
  co_format = scaled_field
  {
    scaled_field ::= lsb(4, 0);
  };
};
This control field is used to scale down a field in the uncompressed
header by a factor of 8 before encoding it with LSB encoding.
Scaling it down makes the LSB encoding more efficient.
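As an informal illustration (not part of ROHC-FN), the two "let"
bindings above can be mirrored in Python. The function name
scale_field and the sample values are assumptions made for this
sketch only; division by 8 is modelled as a right shift of three bits.

```python
def scale_field(uncomp_value: int, uncomp_length: int, shift: int = 3):
    """Return (value, length) of the scaled control field.

    Mirrors scaled_field:uncomp_value  == field_1:uncomp_value / 8 and
            scaled_field:uncomp_length == field_1:uncomp_length - 3.
    """
    if uncomp_length <= shift:
        raise ValueError("field is too short to be scaled down")
    return uncomp_value >> shift, uncomp_length - shift


# A hypothetical 16-bit field_1 holding 200 scales to the 13-bit value 25.
print(scale_field(200, 16))
```

Because the scaled field changes only once for every eight steps of
field_1, a given number of LSBs covers a wider range of the original
field.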
Control fields may also be used with global scope. In this case
their declaration must be outside of any structure. They are then
visible within any structure thus allowing information to be shared
between structures directly.
4.10 Profile-specific Encoding Methods
The library of encoding methods defined by ROHC-FN provides a basic
and generic set of field encoding methods. When using a ROHC-FN
specification in a ROHC profile, some additional encodings specific
to the particular protocol header being compressed may however be
needed, such as methods that infer the value of a field from other
values. These methods are specific to the properties of the protocol
being compressed, and will thus have to be defined within the profile
specification itself. Such profile-specific encoding methods,
defined either in ROHC-FN syntax or rigorously in plain text, can be
referred to in the ROHC-FN specification of the profile's packet
formats in the same way as any other method in the ROHC-FN library
(see Section 4.7).
5. Security considerations
This draft describes a formal notation similar to ABNF [RFC2234], and
hence is not believed to raise any security issues.
6. IANA Considerations
No information in this specification is currently subject to IANA
registration. However, users of the FN producing a ROHC profile
should note that a ROHC profile identifier must be reserved by the
IANA for all official ROHC profiles. See the IANA Considerations
section of [RFC3095] for more details.
7. Contributors
Although no longer listed as an author, Richard Price did almost all
of the foundational work on the formal notation and also produced the
original formal notation internet draft on which this document is
based. Many thanks to him for doing that groundwork on which this
document stands.
8. Acknowledgements
A number of important concepts and ideas have been borrowed from ROHC
[RFC3095].
Thanks to Lars-Erik Jonsson for his extensive and comprehensive
review comments and for supplying alternative text to problematic
parts of the document.
Thanks to Mark West and particularly to Kristofer Sandlund for their
cooperation and feedback from notating the TCP profile. Additional
thanks to Kristofer for his excellent review comments.
Thanks also to Eilert Brinkmann and Carsten Bormann for their
feedback and comments, and also for supplying the editor with a
ROHC-FN parser with which to formally check the grammar of all the
examples.
Thanks to Rob Hancock and Stephen McCann for early work on the formal
notation. The authors would also like to thank Christian Schmidt,
Qian Zhang, Hongbin Liao and Max Riegel for their comments and
valuable input.
Finally thanks to Stewart Sadler, Caroline Daniels and Alan Finney
for doing some excellent last minute review work.
9. References
9.1 Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
9.2 Informative References
[RFC2234] Crocker, D. and P. Overall, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997.
[RFC3095] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H.,
Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le,
K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K.,
Wiebke, T., Yoshimura, T., and H. Zheng, "RObust Header
Compression (ROHC): Framework and four profiles: RTP, UDP,
ESP, and uncompressed", RFC 3095, July 2001.
Authors' Addresses
Robert Finking
Siemens/Roke Manor
Roke Manor Research Ltd.
Romsey, Hampshire SO51 0ZN
UK
Phone: +44 (0)1794 833189
Email: robert.finking@roke.co.uk
URI: http://www.roke.co.uk
Ghyslain Pelletier
Ericsson
Box 920
Lulea SE-971 28
Sweden
Phone: +46 (0) 8 404 29 43
Email: ghyslain.pelletier@ericsson.com
Appendix A. Bit-level Worked Example
This section gives a worked example at the bit level, showing how a
simple ROHC-FN specification describes the compression of real data
from an imaginary protocol header. The example used has been kept
fairly simple, whilst still aiming to illustrate some of the
intricacies that arise in use of the notation. In particular, fields
have been kept short to make it possible to read the binary
representation of the headers by eye, without too much difficulty.
A.1 Example Packet Format
Our imaginary header is just 16 bits long, and consists of the
following fields:
1. version number - 2 bits
2. type - 2 bits
3. flow id - 4 bits
4. sequence number - 4 bits
5. flag bits - 4 bits
So for example 0101000100010000 indicates a packet with a version
number of one, a type of one, a flow id of one, a sequence number of
one, and all flag bits set to zero.
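To make the layout concrete, the following Python sketch splits such
a header into its fields. It is only an illustration; the names
FIELDS and parse_header are assumptions of this sketch and do not
appear in ROHC-FN.

```python
# Field layout of the imaginary 16-bit header: (name, width in bits).
FIELDS = [("version_no", 2), ("type", 2), ("flow_id", 4),
          ("sequence_no", 4), ("flag_bits", 4)]


def parse_header(bits: str) -> dict:
    """Split a 16-character bit string into integer field values."""
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected a 16-character string of 0s and 1s")
    values, pos = {}, 0
    for name, width in FIELDS:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values


print(parse_header("0101000100010000"))
```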
A.2 Initial Encoding
An initial definition based solely on the above information is:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              flag_bits;      %[ 4 ]
  co_format_initial = version_no,     %[ 2 ]
                      type,           %[ 2 ]
                      flow_id,        %[ 4 ]
                      sequence_no,    %[ 4 ]
                      flag_bits       %[ 4 ]
  {
    version_no  ::= irregular(2);
    type        ::= irregular(2);
    flow_id     ::= irregular(4);
    sequence_no ::= irregular(4);
    flag_bits   ::= irregular(4);
  };
};
This defines the packet format nicely, but doesn't actually offer any
compression. If we use it to encode the above header, we get:
Uncompressed header: 0101000100010000
Compressed header: 0101000100010000
This is because we have stated that all fields are irregular - i.e.
we haven't specified anything about their behaviour.
A.3 Basic Compression
In order to achieve any compression we need to notate more knowledge
about the header and its behaviour in a flow. For example, we may
know the following facts about the header:
1. version number - indicates which version of the protocol this is,
always one for this version of the protocol
2. type - may take any value.
3. flow id - may take any value.
4. sequence number - may take any value.
5. flag bits - contains three flags, a, b and c, each of which may
be set or clear, and a reserved flag bit, which is always clear
(i.e. zero).
We could notate this knowledge as follows:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  co_format_basic = version_no,     %[ 0 ]
                    type,           %[ 2 ]
                    flow_id,        %[ 4 ]
                    sequence_no,    %[ 4 ]
                    abc_flag_bits,  %[ 3 ]
                    reserved_flag   %[ 0 ]
  {
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= irregular(4);
    sequence_no   ::= irregular(4);
    abc_flag_bits ::= irregular(3);
    reserved_flag ::= uncompressed_value(1,0);
  };
};
Using this simple scheme, we have successfully encoded the fact that
one of the fields has a permanently fixed value of one, and therefore
contains no useful information. We have also encoded the fact that
the final flag bit is always zero, which again contains no useful
information. Both of these facts have been notated using the
uncompressed_value encoding method (see Section 4.7.1).
Note that we could have omitted the "0 bits" fields from the field
order list of "co_format_basic" if we wished. The only purpose of
that list is to indicate the order of the fields in the compressed
header. Since zero bit fields don't actually appear, they can be
omitted.
Using this new encoding on the above header, we get:
Uncompressed header: 0101000100010000
Compressed header: 0100010001000
This reduces the amount of data we need to transmit from 16 bits to
13, or roughly 20%.
However, this encoding fails to take advantage of relationships
between values of a field in one packet and its value in subsequent
packets. For example, every header in the following sequence is
compressed by the same amount despite the similarities between them:
Uncompressed header: 0101000100010000
Compressed header: 0100010001000
Uncompressed header: 0101000101000000
Compressed header: 0100010100000
Uncompressed header: 0110000101110000
Compressed header: 1000010111000
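The three compressed headers above can be reproduced with a few lines
of Python. This is a sketch of what "co_format_basic" does to a bit
string, not a ROHC implementation; the function name compress_basic
is an assumption of the sketch.

```python
def compress_basic(header: str) -> str:
    """Apply co_format_basic to a 16-character bit string."""
    version_no, rest, reserved_flag = header[0:2], header[2:15], header[15]
    # uncompressed_value(2,1) and uncompressed_value(1,0) send no bits,
    # but the format only applies when the fields hold those values.
    if version_no != "01" or reserved_flag != "0":
        raise ValueError("header does not match co_format_basic")
    # type, flow_id, sequence_no and abc_flag_bits are irregular, so
    # they are copied into the compressed header unchanged.
    return rest


for header in ("0101000100010000", "0101000101000000", "0110000101110000"):
    print(header, "->", compress_basic(header))
```

Every header shrinks by exactly three bits, whatever its relationship
to the header before it.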
A.4 Inter-packet compression
The profile we have defined so far has not compressed the sequence
number, flow ID or flag bit fields at all, since they can take any
value. However, the value of each of these fields in one header has a
very simple relationship to its value in previous headers:
o the sequence number is unusual: it increases by three each time;
o the flow_id stays the same: it always has the same value that it
did in the previous header in the flow;
o the abc_flag_bits stay the same most of the time: they usually
have the same value that they did in the previous header in the
flow.
An obvious way of notating this is as follows:
% This obvious encoding will not work (correct encoding below)
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  co_format_obvious = type,           %[ 2 ]
                      abc_flag_bits   %[ 3 ]
  {
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    sequence_no   ::= lsb(0,-3);
    abc_flag_bits ::= irregular(3);
    reserved_flag ::= uncompressed_value(1,0);
  };
};
The dependency on previous packets is notated using the static and
LSB encoding methods (see Section 4.7.4 and Section 4.7.5
respectively).
However there are a few problems with the above notation. Firstly,
and most importantly, the flow_id field is notated as "static" which
means that it doesn't change from packet to packet. However, the
notation does not indicate how to communicate the value of the field
initially. It's all very well saying "it's the same value as last
time", but there must have been a first time where we define what
that value is, so that it can be referred back to. The above
notation provides no way of communicating that. Similarly with the
sequence number - there needs to be a way of communicating its
initial value.
Secondly, the sequence number field is communicated very efficiently
in zero bits, but it is not at all robust against packet loss. If a
packet is lost then there is no way to handle the missing sequence
number.
Finally, although the flag bits are usually the same as in the
previous header in the flow, the profile doesn't make any use of this
fact; since they are sometimes not the same as those in the previous
header, it is not safe to say that they are always the same, so
static encoding can't be used exclusively. We solve all three of
these problems below, robustness first since it is simplest, and the
remainder in the following section.
When communicating sequence numbers, or any other field encoded with
LSB encoding, a very important consideration for the notator is how
robust against packet loss the compressed protocol should be. This
will vary a lot from protocol stack to protocol stack. For example
RTP has a high setup cost, so the compressed stream needs to be
robust against fairly high packet loss. Things are different with
TCP, where robustness to loss of just a few packets is sufficient.
For the example protocol we'll assume short, low overhead flows and
say we need to be robust to the loss of just one packet, which we can
achieve with two bits of LSB encoding (one bit isn't enough since the
sequence number increases by three each time - see Section 4.7.5).
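The reasoning about one bit versus two can be checked with a short
Python sketch. It assumes that lsb(k, p) uses the interpretation
interval of RFC 3095 LSB encoding, that is the 2^k consecutive values
starting at (reference - p), taken modulo the field size; Section
4.7.5 remains the authoritative definition. The function names are
assumptions of the sketch.

```python
def lsb_interval(ref: int, k: int, p: int, width: int = 4) -> list:
    """Values that k LSBs can convey, given the reference value ref."""
    return [(ref - p + i) % (1 << width) for i in range(1 << k)]


def lsb_decode(bits: int, ref: int, k: int, p: int, width: int = 4) -> int:
    """Return the one value in the interval whose k LSBs equal bits."""
    for candidate in lsb_interval(ref, k, p, width):
        if candidate % (1 << k) == bits:
            return candidate
    raise ValueError("no candidate matches the received bits")


# Last decompressed sequence number is 1. The next header carries 4;
# if that header is lost, the one after it carries 7.
print(lsb_interval(1, 0, -3))   # zero bits: only the next value
print(lsb_interval(1, 1, -3))   # one bit: 7 is out of reach
print(lsb_interval(1, 2, -3))   # two bits: 4 and 7 both covered
print(lsb_decode(0b11, ref=1, k=2, p=-3))
```

With zero bits the interval holds a single value, which is why the
lsb(0,-3) notation above cannot survive any packet loss.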
A.5 Multiple Packet Formats
To communicate initial values for the sequence number and flow ID
fields, and to take advantage of the fact that the flag bits are
usually the same as in the previous header, we need to depart from
the single packet format encoding we are currently using and instead
use multiple packet formats:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  co_format_irregular = discriminator,  %[ 1 ]
                        type,           %[ 2 ]
                        flow_id,        %[ 4 ]
                        sequence_no,    %[ 4 ]
                        abc_flag_bits   %[ 3 ]
  {
    discriminator ::= '0';
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= irregular(4);
    sequence_no   ::= irregular(4);
    abc_flag_bits ::= irregular(3);
    reserved_flag ::= uncompressed_value(1,0);
  };
  co_format_compressed = discriminator,  %[ 1 ]
                         type,           %[ 2 ]
                         sequence_no     %[ 2 ]
  {
    discriminator ::= '1';
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    sequence_no   ::= lsb(2,-3);
    abc_flag_bits ::= static;
    reserved_flag ::= uncompressed_value(1,0);
  };
};
Note that we have had to add a discriminator field, so that the
decompressor can tell which packet format has been used by the
compressor. The format with a static flow ID and LSB-encoded
sequence number is now 5 bits long: a saving of over 60% on the
13-bit single packet format, and of almost 70% on the 16-bit
uncompressed header. Note that despite having to add the
discriminator field, this format is still the same size as the
original incorrect naive notation, because this notation takes
advantage of the fact that the abc flag bits rarely change.
However, the original packet format (with an irregular flow ID and
sequence number) has also grown by one bit due to the addition of the
discriminator. An important consideration when creating multiple
packet formats is whether each format occurs frequently enough that
the average compressed header length is shorter as a result of its
usage. For example, if in fact the flag bits always changed between
packet headers, the static encoding could never be used; all we would
have achieved is to lengthen the irregular packet format by one bit.
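This trade-off can be put into numbers. The sketch below assumes a
hypothetical probability p_compressed that a header can use the 5-bit
format, with all other headers using the 14-bit irregular format, and
compares the mean length with the 13 bits of the earlier single
format. The function name and the sample probabilities are
assumptions made for illustration.

```python
from fractions import Fraction


def mean_length(p_compressed: Fraction,
                compressed_bits: int = 5,
                irregular_bits: int = 14) -> Fraction:
    """Mean compressed header length in bits for a two-format profile."""
    return (p_compressed * compressed_bits
            + (1 - p_compressed) * irregular_bits)


for p in (Fraction(0), Fraction(1, 9), Fraction(9, 10)):
    print(p, float(mean_length(p)))
```

Under these assumptions the two-format profile breaks even with the
13-bit single format when one header in nine can use the short
format, and is better whenever that proportion is higher.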
Using the above notation, we now get:
Uncompressed header: 0101000100010000
Compressed header: 00100010001000
Uncompressed header: 0101000101000000
Compressed header: 10100 ; 00100010100000
Uncompressed header: 0110000101110000
Compressed header: 11011 ; 01000010111000
The first header in the stream is compressed the same way as before,
except that it now has the extra 1 bit discriminator at the start
(0). When a second header arrives, with the same flow ID as the
first and its sequence number three higher, it can now be compressed
in two possible ways, either using "co_format_compressed" or in the
same way as previously, using "co_format_irregular".
Note that we show all possible encodings of a packet as defined by
the ROHC-FN specification, separated by semi-colons. Either of the
above encodings for the packet could be produced by a valid
implementation, although a good implementation would always aim to
make the compressed header size as small as possible and an optimum
implementation would pick the encoding which led to the best
compression of the entire packet stream (which is not necessarily the
smallest encoding for a particular packet).
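The example output for the two-format notation can be reproduced with
the following Python sketch. It lists every encoding the two formats
permit for a header, given the previous header as context. It assumes
that lsb(2,-3) covers the values from three to six above the previous
sequence number (modulo 16); the function names are assumptions of
the sketch.

```python
from typing import Optional


def split(header: str) -> dict:
    """Cut the 16-bit example header into its bit-string fields."""
    return {"version_no": header[0:2], "type": header[2:4],
            "flow_id": header[4:8], "sequence_no": header[8:12],
            "abc_flag_bits": header[12:15], "reserved_flag": header[15]}


def encodings(header: str, previous: Optional[str]) -> list:
    """Return every permitted compressed form, shortest first."""
    cur = split(header)
    if cur["version_no"] != "01" or cur["reserved_flag"] != "0":
        return []  # neither format can carry this header
    result = []
    if previous is not None:
        prev = split(previous)
        step = (int(cur["sequence_no"], 2)
                - int(prev["sequence_no"], 2)) % 16
        # co_format_compressed: static flow_id and flags, lsb(2,-3)
        if (cur["flow_id"] == prev["flow_id"]
                and cur["abc_flag_bits"] == prev["abc_flag_bits"]
                and 3 <= step <= 6):
            result.append("1" + cur["type"] + cur["sequence_no"][-2:])
    # co_format_irregular is always available
    result.append("0" + cur["type"] + cur["flow_id"]
                  + cur["sequence_no"] + cur["abc_flag_bits"])
    return result


flow = ["0101000100010000", "0101000101000000", "0110000101110000"]
previous = None
for header in flow:
    print(header, " ; ".join(encodings(header, previous)))
    previous = header
```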
A.6 Variable Length Discriminators
Suppose we do some analysis on flows of our example protocol and
discover that whilst it is usual for successive packets to have the
same flags, on the occasions when they don't, the packet is almost
always a "flags set" packet, in which all three of the abc flags are
set. To encode the flow more efficiently a packet format needs to be
written to reflect this.
This now gives a total of three packet formats, which means we need
three discriminators to differentiate between them. The obvious
solution here is to increase the number of bits in the discriminator
from one to two and for example use discriminators 00, 01, and 10.
However we can do slightly better than this.
Any uniquely identifiable discriminator will suffice, so we can use
00, 01 and 1. If the discriminator starts with 1, that's the whole
thing. If it starts with 0 the decompressor knows it has to check
one more bit to determine the packet kind.
Note that care must be taken when using variable length
discriminators. For example it would be erroneous to use 0, 01 and
10 as discriminators since after reading an initial 0, the
decompressor would have no way of knowing if the next bit was a
second bit of discriminator, or the first bit of the next field in
the packet stream. 0, 10 and 11 however would be OK as the first bit
again indicates whether or not there are further discriminator bits
to follow.
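The rule being applied is that no discriminator may be a prefix of
another. A small Python check makes this concrete; the function name
is_prefix_free is an assumption of this sketch.

```python
from itertools import permutations


def is_prefix_free(discriminators) -> bool:
    """True if no discriminator is a prefix of a different one."""
    return not any(b.startswith(a)
                   for a, b in permutations(discriminators, 2))


print(is_prefix_free(["00", "01", "1"]))    # set used in this section
print(is_prefix_free(["0", "01", "10"]))    # erroneous set
print(is_prefix_free(["0", "10", "11"]))    # acceptable alternative
```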
This gives us the following:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  co_format_irregular = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        flow_id,        %[ 4 ]
                        sequence_no,    %[ 4 ]
                        abc_flag_bits   %[ 3 ]
  {
    discriminator ::= '00';
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= irregular(4);
    sequence_no   ::= irregular(4);
    abc_flag_bits ::= irregular(3);
    reserved_flag ::= uncompressed_value(1,0);
  };
  co_format_flags_set = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        sequence_no     %[ 2 ]
  {
    discriminator ::= '01';
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    sequence_no   ::= lsb(2,-3);
    abc_flag_bits ::= uncompressed_value(3,7);
    reserved_flag ::= uncompressed_value(1,0);
  };
  co_format_flags_static = discriminator,  %[ 1 ]
                           type,           %[ 2 ]
                           sequence_no     %[ 2 ]
  {
    discriminator ::= '1';
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    sequence_no   ::= lsb(2,-3);
    abc_flag_bits ::= static;
    reserved_flag ::= uncompressed_value(1,0);
  };
};
Here is some example output:
Uncompressed header: 0101000100010000
Compressed header: 000100010001000
Uncompressed header: 0101000101000000
Compressed header: 10100 ; 000100010100000
Uncompressed header: 0110000101110000
Compressed header: 11011 ; 001000010111000
Uncompressed header: 0111000110101110
Compressed header: 011110 ; 001100011010111
Here we have a very similar sequence to last time, except that there
is now an extra message on the end which has the flag bits set. The
encoding for the first message in the stream is now one bit larger.
The encoding for the next two messages is the same as before, since
that packet format has not grown, thanks to the use of variable
length discriminators. Finally, the packet that comes through with
all the flag bits set can be encoded in just six bits, only one bit
more than the most common packet format. Without the extra packet
format, this last packet would have to be encoded using the longest
packet format and would have taken up 14 bits. This represents a
saving of almost 60% for this kind of packet.
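On the receiving side, the variable length discriminators of this
section can be read one bit at a time. The Python sketch below shows
one way a decompressor might identify the format of a compressed
header; the table FORMATS and the function identify_format are
assumptions of the sketch, with payload sizes taken from the field
order lists above.

```python
# Discriminator -> (format name, number of bits that follow it).
FORMATS = {
    "1":  ("co_format_flags_static", 4),
    "00": ("co_format_irregular", 13),
    "01": ("co_format_flags_set", 4),
}


def identify_format(compressed: str):
    """Return (format name, remaining bits) for one compressed header."""
    for length in (1, 2):          # try the shortest discriminator first
        entry = FORMATS.get(compressed[:length])
        if entry is not None:
            name, payload_bits = entry
            payload = compressed[length:]
            if len(payload) != payload_bits:
                raise ValueError("wrong length for " + name)
            return name, payload
    raise ValueError("no discriminator matches")


for header in ("000100010001000", "10100", "11011", "011110"):
    print(header, identify_format(header))
```

Trying the one-bit discriminator first is safe only because no
discriminator is a prefix of another.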
A.7 Default encoding
There is some redundancy in the notation used so far. For a number
of fields, the same encoding method is used several times in
different formats, but the field encoding is redefined explicitly
each time. If the encoding for any of these fields changed in the
future (e.g. if the reserved flag took on some new role), then every
packet format would have to be modified to reflect this change.
This problem can be avoided by specifying default encoding methods
for these fields. Doing so also leads to a more concisely notated
profile:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  default_methods =
  {
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    sequence_no   ::= lsb(2,-3);
    reserved_flag ::= uncompressed_value(1,0);
  };
  co_format_irregular = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        flow_id,        %[ 4 ]
                        sequence_no,    %[ 4 ]
                        abc_flag_bits   %[ 3 ]
  {
    discriminator ::= '00';
    flow_id       ::= irregular(4); % overrides default
    sequence_no   ::= irregular(4); % overrides default
    abc_flag_bits ::= irregular(3);
  };
  co_format_flags_set = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        sequence_no     %[ 2 ]
  {
    discriminator ::= '01';
    abc_flag_bits ::= uncompressed_value(3,7);
  };
  co_format_flags_static = discriminator,  %[ 1 ]
                           type,           %[ 2 ]
                           sequence_no     %[ 2 ]
  {
    discriminator ::= '1';
    abc_flag_bits ::= static;
  };
};
The above profile behaves in exactly the same way as the one notated
previously, since it has the same meaning. Note that the purposes
behind the different formats become clearer with the default encoding
methods factored out; all that remains are the encodings which are
specific to each format. Note also that default encoding methods
which compress down to zero bits have become completely implicit.
For example, the compressed formats mention "version_no" neither in
their field order lists (no need, it's zero bits long) nor in their
field encodings lists (no need, it's specified in the default encoding
methods).
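The way a format-specific encoding takes precedence over a default
can be pictured as a dictionary merge. The Python sketch below is
only an analogy for how the notation is read, with the encoding
methods held as plain strings; the names DEFAULT_METHODS,
FORMAT_SPECIFIC and effective_methods are assumptions of the sketch.

```python
DEFAULT_METHODS = {
    "version_no":    "uncompressed_value(2,1)",
    "type":          "irregular(2)",
    "flow_id":       "static",
    "sequence_no":   "lsb(2,-3)",
    "reserved_flag": "uncompressed_value(1,0)",
}

FORMAT_SPECIFIC = {
    "co_format_irregular": {
        "discriminator": "'00'",
        "flow_id":       "irregular(4)",   # overrides default
        "sequence_no":   "irregular(4)",   # overrides default
        "abc_flag_bits": "irregular(3)",
    },
    "co_format_flags_set": {
        "discriminator": "'01'",
        "abc_flag_bits": "uncompressed_value(3,7)",
    },
    "co_format_flags_static": {
        "discriminator": "'1'",
        "abc_flag_bits": "static",
    },
}


def effective_methods(format_name: str) -> dict:
    """Defaults first, then the format's own encodings on top."""
    return {**DEFAULT_METHODS, **FORMAT_SPECIFIC[format_name]}


for field, method in effective_methods("co_format_irregular").items():
    print(field, "::=", method)
```

The merged result for each format is the same set of encodings that
the previous section wrote out in full.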
A.8 Control Fields
One inefficiency in the compression scheme we have produced thus far
is that it uses two bits to provide the LSB encoded sequence number
with robustness for the loss of just one packet. In theory only one
bit should be needed. The root of the problem is the unusual
sequence number that the protocol uses - it counts up in increments
of three. In order to encode it at maximum efficiency we need to
translate this into a field that increments by one each time. We do
this using a control field.
Control fields are extra data that are communicated in the compressed
packet, which are not direct encodings of fields in the uncompressed
header. They can be used to communicate extra information in the
compressed packet, which allows other fields to be compressed more
efficiently.
The control field which we introduce scales the sequence number down
by a factor of three. Instead of encoding the original sequence
number in the compressed packet, we encode the scaled sequence
number, allowing us to have robustness to the loss of one packet by
using just one bit of LSB encoding:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  control_fields = scaled_seq_no
  {
    % need modulo maths to calculate scaling correctly,
    % due to 4 bit wrap around
    let(scaled_seq_no:uncomp_value
        == ((mod(15 - sequence_no:uncomp_value, 3) * 16
            + sequence_no:uncomp_value) / 3));
  };
  default_methods =
  {
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    reserved_flag ::= uncompressed_value(1,0);
    scaled_seq_no ::= lsb(1,-1);
  };
  co_format_irregular = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        flow_id,        %[ 4 ]
                        scaled_seq_no,  %[ 4 ]
                        abc_flag_bits   %[ 3 ]
  {
    discriminator ::= '00';
    flow_id       ::= irregular(4); % overrides default
    scaled_seq_no ::= irregular(4); % overrides default
    abc_flag_bits ::= irregular(3);
  };
  co_format_flags_set = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        scaled_seq_no   %[ 1 ]
  {
    discriminator ::= '01';
    abc_flag_bits ::= uncompressed_value(3,7);
  };
  co_format_flags_static = discriminator,  %[ 1 ]
                           type,           %[ 2 ]
                           scaled_seq_no   %[ 1 ]
  {
    discriminator ::= '1';
    abc_flag_bits ::= static;
  };
};
Here is some example output:
Uncompressed header: 0101000100010000
Compressed header: 000100011011000
Uncompressed header: 0101000101000000
Compressed header: 1010 ; 000100011100000
Uncompressed header: 0110000101110000
Compressed header: 1101 ; 001000011101000
Uncompressed header: 0111000110101110
Compressed header: 01110 ; 001100011110111
In this form, we see that this gives us a saving of a further bit in
most packets, reducing the size of the most common packet by 20%.
Assuming the bulk of a flow is made up of "co_format_flags_static"
headers, the mean size of the headers in a compressed flow is now
just over a quarter of their size in an uncompressed flow.
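The scaling formula is easier to trust once it has been evaluated.
The Python sketch below computes the control field for the sequence
numbers in the example flow, then walks a full cycle of the 4-bit
sequence number to show the scaled value advancing by one (modulo 16)
at every step of three. The function name mirrors the field name and
is otherwise an assumption of the sketch.

```python
def scaled_seq_no(sequence_no: int) -> int:
    """(mod(15 - sequence_no, 3) * 16 + sequence_no) / 3, 4-bit input."""
    if not 0 <= sequence_no <= 15:
        raise ValueError("sequence_no is a 4-bit field")
    return (((15 - sequence_no) % 3) * 16 + sequence_no) // 3


# Sequence numbers 1, 4, 7 and 10 from the example headers.
print([scaled_seq_no(s) for s in (1, 4, 7, 10)])

# Follow the sequence number through its wrap-around.
seq = 1
for _ in range(16):
    nxt = (seq + 3) % 16
    step = (scaled_seq_no(nxt) - scaled_seq_no(seq)) % 16
    print(seq, "->", nxt, "scaled step", step)
    seq = nxt
```

The modulo term selects how many multiples of 16 must be added so
that the division by three is exact, which is what keeps the scaled
value continuous across the wrap-around.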
A.9 Use Of "let" Statements As Conditionals
Earlier, we created a new packet format, "co_format_flags_set", to
handle packets with all three of the flag bits set. As it happens,
these three flags are always all set for "type 3" packets, and are
never all set for other types of packet (a "type 3" packet is one
where the type field is set to three).
This allows extra efficiency in encoding such packets. We know the
type is three, so we don't need to encode the type field in the
compressed header. The type field was previously encoded as
"irregular(2)" which is two bits long. Removing this reduces the
compressed size of the "co_format_flags_set" header from five bits to
three, making it the smallest packet format in the structure.
In order to notate that the "flags set" format should only be used
for "type 3" headers, and the "flags static" format only when the
type isn't three, it is necessary to state these conditions inside
each format. This can be done with a "let" statement:
eg_header ===
{
  uc_format = version_no,     %[ 2 ]
              type,           %[ 2 ]
              flow_id,        %[ 4 ]
              sequence_no,    %[ 4 ]
              abc_flag_bits,  %[ 3 ]
              reserved_flag;  %[ 1 ]
  control_fields = scaled_seq_no  %[ 4 ]
  {
    % need modulo maths to calculate scaling correctly,
    % due to 4 bit wrap around
    scaled_seq_no ::= uncompressed_value
      (4, (mod(15 - sequence_no:uncomp_value, 3) * 16
           + sequence_no:uncomp_value) / 3);
  };
  default_methods =
  {
    version_no    ::= uncompressed_value(2,1);
    type          ::= irregular(2);
    flow_id       ::= static;
    reserved_flag ::= uncompressed_value(1,0);
    scaled_seq_no ::= lsb(1,-1);
  };
  co_format_irregular = discriminator,  %[ 2 ]
                        type,           %[ 2 ]
                        flow_id,        %[ 4 ]
                        scaled_seq_no,  %[ 4 ]
                        abc_flag_bits   %[ 3 ]
  {
    discriminator ::= '00';
    flow_id       ::= irregular(4);
    scaled_seq_no ::= irregular(4);
    abc_flag_bits ::= irregular(3);
  };
  co_format_flags_set = discriminator,  %[ 2 ]
                        scaled_seq_no   %[ 1 ]
  {
    let(type:uncomp_value == 3); % redundant condition
    discriminator ::= '01';
    type          ::= uncompressed_value(2,3);
    abc_flag_bits ::= uncompressed_value(3,7);
  };
  co_format_flags_static = discriminator,  %[ 1 ]
                           type,           %[ 2 ]
                           scaled_seq_no   %[ 1 ]
  {
    let(type:uncomp_value != 3);
    discriminator ::= '1';
    abc_flag_bits ::= static;
  };
};
The two "let" statements in the latter two formats act as "guards".
Guards prevent packet formats from being used under the wrong
circumstances. In fact the "let" statement in "co_format_flags_set"
is redundant. The condition it guards for is already enforced by the
new encoding method used for the "type" field. The encoding method
"uncompressed_value(2,3)" binds the "uncomp_value" attribute to
three. This is exactly what the "let" statement does, so it can be
removed without any change in meaning. The "uncompressed_value"
encoding method on the other hand is not redundant. It specifies
other bindings on the type field in addition to the one which the
"let" statement specifies. Therefore it would not be possible to
remove the encoding method and leave just the "let" statement.
Note that a guard is solely preventative. A guard can never force a
packet format to be chosen by the compressor. A format can only be
guaranteed to be chosen in a given situation if there are no other
formats which can be used instead. This is demonstrated in the
example output below. The compressor can still choose the irregular
format if it wishes:
Uncompressed header: 0101000100010000
Compressed header: 000100011011000
Uncompressed header: 0101000101000000
Compressed header: 1010 ; 000100011100000
Uncompressed header: 0110000101110000
Compressed header: 1101 ; 001000011101000
Uncompressed header: 0111000110101110
Compressed header: 010 ; 001100011110111
This saves just two extra bits in the example flow, having reduced
the size of the "flags set" format by 40%.
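The effect of the guards can be seen by enumerating, for each header,
the formats whose conditions hold. The Python sketch below does this
for the final notation and reproduces the example output. It assumes
that lsb(1,-1) covers the values one and two above the previous
scaled sequence number (modulo 16); the function names are
assumptions of the sketch, not part of ROHC-FN.

```python
from typing import Optional

WIDTHS = [("version_no", 2), ("type", 2), ("flow_id", 4),
          ("sequence_no", 4), ("abc_flag_bits", 3), ("reserved_flag", 1)]


def fields(header: str) -> dict:
    """Integer field values of the 16-bit header, plus the control field."""
    out, pos = {}, 0
    for name, width in WIDTHS:
        out[name] = int(header[pos:pos + width], 2)
        pos += width
    seq = out["sequence_no"]
    out["scaled_seq_no"] = (((15 - seq) % 3) * 16 + seq) // 3
    return out


def encodings(header: str, previous: Optional[str]) -> list:
    """Every compressed form the three formats permit, shortest first."""
    cur = fields(header)
    if cur["version_no"] != 1 or cur["reserved_flag"] != 0:
        return []
    found = []
    if previous is not None:
        prev = fields(previous)
        step = (cur["scaled_seq_no"] - prev["scaled_seq_no"]) % 16
        # static flow_id and lsb(1,-1) must both be satisfiable
        context_ok = cur["flow_id"] == prev["flow_id"] and step in (1, 2)
        lsb = format(cur["scaled_seq_no"] % 2, "b")
        # co_format_flags_set: binding type to 3 acts as the guard
        if context_ok and cur["type"] == 3 and cur["abc_flag_bits"] == 7:
            found.append("01" + lsb)
        # co_format_flags_static: guarded by type != 3
        if (context_ok and cur["type"] != 3
                and cur["abc_flag_bits"] == prev["abc_flag_bits"]):
            found.append("1" + format(cur["type"], "02b") + lsb)
    # co_format_irregular has no guard, so it is always a candidate
    found.append("00" + format(cur["type"], "02b")
                 + format(cur["flow_id"], "04b")
                 + format(cur["scaled_seq_no"], "04b")
                 + format(cur["abc_flag_bits"], "03b"))
    return found


flow = ["0101000100010000", "0101000101000000",
        "0110000101110000", "0111000110101110"]
previous = None
for header in flow:
    print(header, " ; ".join(encodings(header, previous)))
    previous = header
```

Each guard only removes a format from the list of candidates. The
irregular format remains in every list, so the compressor is never
forced to pick a guarded format.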
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.