SIPCLF                                                      G. Salgueiro
Internet-Draft                                             Cisco Systems
Intended status: Standards Track                              V. Gurbani
Expires: March 16, 2012                        Bell Labs, Alcatel-Lucent
                                                             A. B. Roach
                                                                 Tekelec
                                                      September 13, 2011


Format for the Session Initiation Protocol (SIP) Common Log Format (CLF)
                      draft-ietf-sipclf-format-02

Abstract

   The SIPCLF Workgroup has defined a common log format framework for
   Session Initiation Protocol (SIP) servers.  This common log format
   mimics the wildly successful event logging mechanism found in well-
   known web servers like Apache and web proxies like Squid.  This
   document proposes an indexed text encoding format for the SIP Common
   Log Format (CLF) that retains the key advantages of a text-based
   format, while significantly increasing processing performance over a
   purely text-based implementation.  This file format adheres to the
   SIP CLF data model and provides an effective encoding scheme for all
   mandatory and optional fields that appear in a SIP CLF record.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 16, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal



Salgueiro, et al.        Expires March 16, 2012                 [Page 1]


Internet-Draft             Format for SIP CLF             September 2011


   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Document Conventions . . . . . . . . . . . . . . . . . . . . .  4
   4.  Format . . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.1.  Index Pointers . . . . . . . . . . . . . . . . . . . . . .  8
     4.2.  Mandatory Fields . . . . . . . . . . . . . . . . . . . . . 10
     4.3.  Optional Fields  . . . . . . . . . . . . . . . . . . . . . 13
       4.3.1.  Pre-Defined Optional Fields  . . . . . . . . . . . . . 14
       4.3.2.  Vendor-Specific Optional Fields  . . . . . . . . . . . 16
   5.  Example SIP CLF Record . . . . . . . . . . . . . . . . . . . . 18
   6.  Text Tool Considerations . . . . . . . . . . . . . . . . . . . 20
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 21
   8.  Operational Guidance . . . . . . . . . . . . . . . . . . . . . 21
   9.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 21
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 22
     11.2. Informative References . . . . . . . . . . . . . . . . . . 22
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23


1.  Introduction

   The extensive list of benefits and the widespread adoption of the
   Apache Common Log Format (CLF) have prompted the development of a
   functionally equivalent event logging mechanism for the Session
   Initiation Protocol (SIP) [RFC3261].  Implementing a logging scheme
   for SIP is a considerable challenge, in part because the behavior of
   a SIP entity is more complex than that of an HTTP entity.
   Additionally, the purely text-based HTTP Common Log Format has
   shortcomings that need to be addressed in order to allow for
   real-time inspection of SIP log files.  Experience with the Apache
   Common Log Format has shown that dealing with large quantities of
   log data can be very processor intensive, as doing so necessarily
   requires reading and parsing every byte in the log file(s) of
   interest.

   An implementation-independent framework for the SIP CLF has been
   defined in [I-D.ietf-sipclf-problem-statement].  This memo describes
   an indexed text file format for logging SIP messages received and
   sent by SIP clients, servers, and proxies that adheres to the data
   model presented in Section 8 of [I-D.ietf-sipclf-problem-statement].
   This document defines a format that is no more difficult for logging
   entities to generate, while being radically faster to process.  In
   particular, the format is optimized both for rapidly scanning through
   log records and for quickly locating commonly accessed data fields.

   Further, the format proposed by this document retains the key
   advantage of being human readable and able to be processed using the
   various Unix text processing tools, such as sed, awk, perl, cut, and
   grep.


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   [RFC3261] defines additional terms used in this document that are
   specific to the SIP domain such as "proxy"; "registrar"; "redirect
   server"; "user agent server" or "UAS"; "user agent client" or "UAC";
   "back-to-back user agent" or "B2BUA"; "dialog"; "transaction";
   "server transaction".

   This document uses the term "SIP Server" that is defined to include
   the following SIP entities: user agent server, registrar, redirect
   server, a SIP proxy in the role of user agent server, and a B2BUA in
   the role of a user agent server.

   The reader is expected to be familiar with the terminology and
   concepts defined in [I-D.ietf-sipclf-problem-statement].


3.  Document Conventions

   This document defines the logging syntax for the SIP CLF.  This
   syntax is demonstrated through the use of various examples.  The
   formatting described here does not permit these examples to be
   unambiguously rendered due to the constraints imposed by the
   formatting rules for Internet-Drafts.  To avoid ambiguity and to meet
   the Internet-Draft layout requirements this document uses the
   <allOneLine/> markup convention established in [RFC4475].

   For the sake of clarity and completeness, the entire text defining
   this markup convention from Section 2.1 of [RFC4475] is quoted below:

      Several of these examples contain unfolded lines longer than 72
      characters.  These are captured between <allOneLine/> tags.  The
      single unfolded line is reconstructed by directly concatenating
      all lines appearing between the tags (discarding any line feeds or
      carriage returns).  There will be no whitespace at the end of
      lines.  Any whitespace appearing at a fold-point will appear at
      the beginning of a line.

      The following represent the same string of bits:

         Header-name: first value, reallylongsecondvalue, third value

         <allOneLine>
         Header-name: first value,
          reallylongsecondvalue
         , third value
         </allOneLine>

         <allOneLine>
         Header-name: first value,
          reallylong
         second
         value,
          third value
         </allOneLine>

      Note that this is NOT SIP header-line folding, where different
      strings of bits have equivalent meaning.
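
   As a non-normative illustration, the unfolding rule quoted above
   could be sketched in Python as follows.  The function name
   unfold_all_one_line is hypothetical, and the sketch assumes that the
   common left margin of the block is document layout rather than part
   of the folded line.

```python
import textwrap


def unfold_all_one_line(block: str) -> str:
    """Rebuild the single line folded between <allOneLine> tags."""
    # Assumption: indentation shared by every line is page layout only,
    # so it is stripped before the lines are concatenated.
    lines = textwrap.dedent(block).splitlines()
    inside = False
    parts = []
    for line in lines:
        if line.strip() == "<allOneLine>":
            inside = True
        elif line.strip() == "</allOneLine>":
            break
        elif inside:
            # Line feeds are discarded; whitespace at a fold-point sits
            # at the start of the next line and is therefore preserved.
            parts.append(line)
    return "".join(parts)
```

   Applied to either folded example above, this sketch yields the same
   unfolded Header-name line.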




   The IP addresses used in the examples in this document adhere to the
   best practices outlined in [RFC5735] and correspond to the
   documentation address block 192.0.2.0/24 (TEST-NET-1) as described in
   [RFC5737].


4.  Format

   The Common Log Format for the Session Initiation Protocol
   [I-D.ietf-sipclf-problem-statement] defines a data model to which
   this logging format adheres.  Each SIP CLF record MUST consist
   of all the mandatory data model elements outlined in Section 8.1 of
   [I-D.ietf-sipclf-problem-statement].

   All SIP CLF records MUST have the following format:


        0          7 8        15 16       23 24         31
        +-----------+-----------+-----------+-----------+
        |  Version  |           Record Length           | 0 - 3
        +-----------+-----------+-----------+-----------+
        |       Record Length (cont)        |    0x2C   | 4 - 7
        +-----------+-----------+-----------+-----------+
        |            Flags Field            |    0x2C   | 8 - 11
        +-----------+-----------+-----------+-----------+
        |              CSeq Pointer (Hex)               | 12 - 15
        +-----------+-----------+-----------+-----------+
        |      Response Status-Code Pointer (Hex)       | 16 - 19
        +-----------+-----------+-----------+-----------+
        |              R-URI Pointer (Hex)              | 20 - 23
        +-----------+-----------+-----------+-----------+
        |   Destination IP address:port Pointer (Hex)   | 24 - 27
        +-----------+-----------+-----------+-----------+
        |     Source IP address:port Pointer (Hex)      | 28 - 31
        +-----------+-----------+-----------+-----------+
        |             To URI Pointer (Hex)              | 32 - 35
        +-----------+-----------+-----------+-----------+
        |             To Tag Pointer (Hex)              | 36 - 39
        +-----------+-----------+-----------+-----------+
        |            From URI Pointer (Hex)             | 40 - 43
        +-----------+-----------+-----------+-----------+
        |            From Tag Pointer (Hex)             | 44 - 47
        +-----------+-----------+-----------+-----------+
        |             Call-Id Pointer (Hex)             | 48 - 51
        +-----------+-----------+-----------+-----------+
        |           Server-Txn Pointer (Hex)            | 52 - 55
        +-----------+-----------+-----------+-----------+
        |           Client-Txn Pointer (Hex)            | 56 - 59
        +-----------+-----------+-----------+-----------+
        |            TLV Start Pointer (Hex)            | 60 - 63
        +-----------+-----------+-----------+-----------+
        |    0x0A   |                                   | 64 - 67
        +-----------+                                   +
        |                   Timestamp                   | 68 - 71
        +                                   +-----------+
        |                                   |    0x2E   | 72 - 75
        +-----------+-----------+-----------+-----------+
        |         Fractional Seconds        |    0x09   | 76 - 79
        +-----------+-----------+-----------+-----------+
        |                                               |
        |                                               |
        |      Mandatory Fields (variable length)       |
        |                                               |
        |                                               |
        +-----------+-----------+-----------+-----------+\
        |    0x09   |             Tag (Hex)             | \
        +-----------+-----------+-----------+-----------+  \   Repeated
        | Tag (cont)|    0x2C   |     Length (Hex)      |   \  as many
        +-----------+-----------+-----------+-----------+    > times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/
        |    0x09   |                                   |\
        +-----------+                                   | \
        |          Vendor-ID (variable length)          |  \
        +           +-----------+-----------+-----------+   \  Repeated
        |           |    0x2C   |     Length (Hex)      |    \ as many
        +-----------+-----------+-----------+-----------+    / times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/
        |    0x0A   |
        +-----------+


                      Figure 1: SIP Common Log Format

   The format presented in Figure 1 is for a single SIP CLF log entry.
   While there is no actual subdivision in practice, this format can be
   logically subdivided into the following three distinct components:

      1.  Index Pointers - The first 64 bytes of this format.  This
      portion is primarily composed of a list of pointers that indicate
      the beginning of both the variable length mandatory and optional
      fields that are logged as part of this record.  These pointers are
      implemented as a mechanism to improve processing of these records
      and to allow a reader to expeditiously skip right to the desired
      field without unnecessarily going through the entire record.  This
      logical subdivision within the SIP CLF format will be referenced
      in this document with the <IndexPointers> tag.

      2.  Mandatory Fields - The next logical grouping in this format is
      a tab delimited listing of the mandatory fields as described in
      Section 8.1 of [I-D.ietf-sipclf-problem-statement] and in the
      order listed in <IndexPointers>.  This logical subdivision within
      the SIP CLF format will be referenced in this document with the
      <MandatoryFields> tag.

      3.  Optional Fields - The last logical component MAY be present as
      it is an OPTIONAL extension to the SIP CLF format.  Its purpose is
      to provide flexibility to the developer of this SIP CLF to log any
      desired fields not included in <MandatoryFields>.  This includes
      SIP bodies and any vendor-specific extensions.  This logical
      subdivision within the SIP CLF format will be referenced in this
      document with the <OptionalFields> tag.

   This logical structure of the SIP CLF record format can be
   graphically represented as shown in Figure 2 below:


                                 <IndexPointers>
                                 <MandatoryFields>
                                 <OptionalFields>


             Figure 2: Logical Structure of the SIP CLF Record

   Note that Figure 1 and Figure 2 (plus the terminating line-feed) are
   different but functionally equivalent representations of the same
   format.

   In the following sections, "hexadecimal encoded" means that the
   value is written out as a human-readable base-16 number using the
   ASCII characters 0x30 through 0x39 ('0' through '9') and 0x41 through
   0x46 ('A' through 'F').  Similarly, "decimal encoded" means that the
   value is written out as a human-readable base-10 number using the
   ASCII characters 0x30 through 0x39 ('0' through '9').  In both
   encodings, numbers always take up the number of bytes indicated, and
   are padded
   on the left with ASCII '0' characters to fill the entire space.
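
   As a non-normative illustration, the two fixed-width encodings
   described above could be produced as in the following Python sketch.
   The function names hex_field and dec_field are hypothetical and are
   not defined by this document.

```python
def hex_field(value: int, width: int) -> str:
    """Upper-case base-16 text, left-padded with '0' to `width` bytes."""
    if value < 0 or value >= 16 ** width:
        raise ValueError("value does not fit in %d hex digits" % width)
    return format(value, "0{}X".format(width))


def dec_field(value: int, width: int) -> str:
    """Base-10 text, left-padded with '0' to `width` bytes."""
    if value < 0 or value >= 10 ** width:
        raise ValueError("value does not fit in %d decimal digits" % width)
    return format(value, "0{}d".format(width))
```

   For example, hex_field(255, 4) gives "00FF" and dec_field(7, 3)
   gives "007".  Rejecting values that do not fit is a choice made in
   this sketch; the text above only requires that the field be filled.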

4.1.  Index Pointers

   The <IndexPointers> portion of the SIP CLF record (shown in Figure 3)
   is a 64-byte header that indicates meta-data about the record.


            0          7 8        15 16       23 24         31
            +-----------+-----------+-----------+-----------+
            |  Version  |           Record Length           | 0 - 3
            +-----------+-----------+-----------+-----------+
            |       Record Length (cont)        |    0x2C   | 4 - 7
            +-----------+-----------+-----------+-----------+
            |            Flags Field            |    0x2C   | 8 - 11
            +-----------+-----------+-----------+-----------+
            |              CSeq Pointer (Hex)               | 12 - 15
            +-----------+-----------+-----------+-----------+
            |      Response Status-Code Pointer (Hex)       | 16 - 19
            +-----------+-----------+-----------+-----------+
            |              R-URI Pointer (Hex)              | 20 - 23
            +-----------+-----------+-----------+-----------+
            |   Destination IP address:port Pointer (Hex)   | 24 - 27
            +-----------+-----------+-----------+-----------+
            |     Source IP address:port Pointer (Hex)      | 28 - 31
            +-----------+-----------+-----------+-----------+
            |             To URI Pointer (Hex)              | 32 - 35
            +-----------+-----------+-----------+-----------+
            |             To Tag Pointer (Hex)              | 36 - 39
            +-----------+-----------+-----------+-----------+
            |            From URI Pointer (Hex)             | 40 - 43
            +-----------+-----------+-----------+-----------+
            |            From Tag Pointer (Hex)             | 44 - 47
            +-----------+-----------+-----------+-----------+
            |             Call-Id Pointer (Hex)             | 48 - 51
            +-----------+-----------+-----------+-----------+
            |           Server-Txn Pointer (Hex)            | 52 - 55
            +-----------+-----------+-----------+-----------+
            |           Client-Txn Pointer (Hex)            | 56 - 59
            +-----------+-----------+-----------+-----------+
            |            TLV Start Pointer (Hex)            | 60 - 63
            +-----------+-----------+-----------+-----------+


                         Figure 3: Index Pointers

   The fields that make up <IndexPointers> are described below:

   Version (1 byte):  0x41 for this document; hexadecimal encoded.

   Record Length (6 bytes):  Hexadecimal encoded total length of this
      log record, including "Version", "Record Length", "Flags" fields
      and terminating line-feed.

   Flags Field (3 bytes):

      byte 1 -   Request/Response flag

         R = request
         r = response

      byte 2 -   Retransmission flag

         o = original transmission
         d = duplicate transmission
         s = server is stateless [i.e., retransmissions are not
         detected]

      byte 3 -   Sent/Received flag

         u = received UDP message
         t = received TCP message
         l = received TLS message
         U = sent UDP message
         T = sent TCP message
         L = sent TLS message
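
   As a non-normative illustration, a reader could decode the 3-byte
   Flags Field with lookup tables such as those in the following Python
   sketch.  The function name decode_flags and the dictionary keys are
   hypothetical; only the flag characters come from the list above.

```python
REQUEST_RESPONSE = {"R": "request", "r": "response"}
RETRANSMISSION = {"o": "original", "d": "duplicate", "s": "stateless"}
SENT_RECEIVED = {
    "u": ("received", "UDP"), "t": ("received", "TCP"),
    "l": ("received", "TLS"), "U": ("sent", "UDP"),
    "T": ("sent", "TCP"), "L": ("sent", "TLS"),
}


def decode_flags(flags: str) -> dict:
    """Map the three flag characters to descriptive values."""
    if len(flags) != 3:
        raise ValueError("Flags Field must be exactly 3 bytes")
    # An unknown flag character raises KeyError in this sketch.
    direction, transport = SENT_RECEIVED[flags[2]]
    return {
        "kind": REQUEST_RESPONSE[flags[0]],
        "retransmission": RETRANSMISSION[flags[1]],
        "direction": direction,
        "transport": transport,
    }
```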

   Bytes 12 through 59 contain hexadecimal encoded pointers that point
   to the starting location of each of the variable-length mandatory
   fields.  Note that there are no delimiters between these pointer
   values -- together with the "TLV Start Pointer" in bytes 60 through
   63, they are packed together as a single, 52-character hexadecimal
   encoded string.  The mandatory field pointers indicate absolute byte
   values within the record, and MUST be >=80.  They point to the start
   of the corresponding value within the <MandatoryFields> portion.  A
   description of each of the mandatory fields that these pointer values
   point to can be found in Section 4.2.

   TLV Start Pointer:  This final pointer indicates the location within
      the SIP CLF record where the OPTIONAL Tag/Length/Value (TLV)
      groups of <OptionalFields> begin, if present.  The "TLV Start
      Pointer" points to the ASCII tab (0x09) character for the first
      entry in the <OptionalFields> portion.  If the OPTIONAL TLV groups
      are not implemented, then the "TLV Start Pointer" field MUST be
      set to zero (0x0000).
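
   As a non-normative illustration, the following Python sketch decodes
   the 64-byte <IndexPointers> header using the byte offsets of Figure 3
   and then jumps directly to one field.  All names (POINTER_NAMES,
   parse_index_pointers, read_field) are hypothetical.  The sketch also
   assumes that a mandatory field ends at the next tab or line-feed
   character.

```python
POINTER_NAMES = [
    "cseq", "status_code", "r_uri", "dst", "src", "to_uri", "to_tag",
    "from_uri", "from_tag", "call_id", "server_txn", "client_txn",
    "tlv_start",
]


def parse_index_pointers(record: bytes) -> dict:
    """Decode the 64-byte <IndexPointers> header of one record."""
    if len(record) < 64:
        raise ValueError("record shorter than the 64-byte header")
    head = record[:64].decode("ascii")
    if head[7] != "," or head[11] != ",":
        raise ValueError("missing 0x2C delimiter at byte 7 or 11")
    # Thirteen 4-character hexadecimal pointers occupy bytes 12 - 63.
    pointers = {
        name: int(head[12 + 4 * i:16 + 4 * i], 16)
        for i, name in enumerate(POINTER_NAMES)
    }
    return {
        "version": head[0],
        "record_length": int(head[1:7], 16),
        "flags": head[8:11],
        "pointers": pointers,
    }


def read_field(record: bytes, start: int) -> bytes:
    """Return the field that begins at absolute byte offset `start`."""
    # Assumption: the field runs to the next tab (0x09) or LF (0x0A).
    end = start
    while end < len(record) and record[end] not in (0x09, 0x0A):
        end += 1
    return record[start:end]
```

   A "tlv_start" value of zero in the result corresponds to a record
   that carries no <OptionalFields> portion.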



4.2.  Mandatory Fields

   The <MandatoryFields> portion of the SIP CLF record is shown below:


            0          7 8        15 16       23 24         31
            +-----------+-----------+-----------+-----------+
            |    0x0A   |                                   | 64 - 67
            +-----------+                                   +
            |                   Timestamp                   | 68 - 71
            +                                   +-----------+
            |                                   |    0x2E   | 72 - 75
            +-----------+-----------+-----------+-----------+
            |         Fractional Seconds        |    0x09   | 76 - 79
            +-----------+-----------+-----------+-----------+
            |                                               |
            |                                               |
            |      Mandatory Fields (variable length)       |
            |                                               |
            |                                               |
            +-----------+-----------+-----------+-----------+


                        Figure 4: Mandatory Fields

   Following the pointers in <IndexPointers>, two fixed-length fields
   are encoded to specify the exact time of the log entry.  As before,
   all fields are completely filled, pre-pending values with '0'
   characters as necessary.

   Timestamp (10 bytes):  Date and time of the request or response
      represented as the number of seconds since the Unix epoch (i.e.
      seconds since midnight, January 1st, 1970, GMT).  Decimal encoded.

   Fractional Seconds (3 bytes):  Fractional seconds portion of the
      Timestamp field to millisecond accuracy.  Decimal encoded.
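
   As a non-normative illustration, the two time fields and the 0x2E
   separator between them (see Figure 4) could be produced as in the
   following Python sketch.  The function name encode_timestamp is
   hypothetical, and taking the time as an integer count of
   milliseconds since the Unix epoch is a choice made in this sketch to
   avoid floating-point rounding.

```python
def encode_timestamp(epoch_millis: int) -> str:
    """Return 'Timestamp' + '.' + 'Fractional Seconds' (14 bytes)."""
    if epoch_millis < 0:
        raise ValueError("time before the Unix epoch cannot be encoded")
    seconds, millis = divmod(epoch_millis, 1000)
    if seconds > 9999999999:
        raise ValueError("Timestamp does not fit in 10 decimal digits")
    # 10 decimal digits of seconds, '.', 3 decimal digits of milliseconds.
    return "{:010d}.{:03d}".format(seconds, millis)
```

   For example, encode_timestamp(1328821153010) gives "1328821153.010".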

   After the "Timestamp" and "Fractional Seconds" fields are the actual
   values for the mandatory fields specified in Section 8.1 of
   [I-D.ietf-sipclf-problem-statement], which are described below:

   CSeq:  The Command Sequence header field, including the CSeq number
      and method name.

   Response Status-Code:  Set to the value of the SIP response status
      code for responses.  Set to a single ASCII dash (0x2D) for
      requests.

   R-URI:  The Request-URI in the start line (mandatory in request),
      including any URI parameters.

   Destination IP address:port  The IP address of the downstream server,
      including the port number.  The port number MUST be separated from
      the IP address by a single ':'.

   Source IP address:port  The IP address of the upstream client,
      including the port number over which the SIP message was received.
      The port number MUST be separated from the IP address by a single
      ':'.

   To URI:  Value of the URI in the To header field.

   To Tag:  Value of the tag parameter (if present) in the To header
      field.

   From URI:  Value of the URI in the From header field.

   From Tag:  Value of the tag parameter in the From header field.

   Whilst one may question the value of the From URI in light of
   [RFC4474], the From URI, nonetheless, imparts some information.  For
   one, the From tag is important and, in the case of a REGISTER
   request, the From URI can provide information on whether this was a
   third-party registration or a first-party one.

   Call-Id:  The value of the Call-ID header field.

   Server-Txn:  Server transaction identification code - the transaction
      identifier associated with the server transaction.
      Implementations can reuse the server transaction identifier (the
      topmost branch-id of the incoming request, with or without the
      magic cookie), or they could generate a unique identification
      string for a server transaction (this identifier needs to be
      locally unique to the server only.)  This identifier is used to
      correlate ACKs and CANCELs to an INVITE transaction; it is also
      used to aid in forking.  (See Section 9.4 of
      [I-D.ietf-sipclf-problem-statement] for usage.)

   Client-Txn:  Client transaction identification code - this field is
      used to associate client transactions with a server transaction
      for forking proxies or B2BUAs.  Upon forking, implementations can
      reuse the value they inserted into the topmost Via header's branch
      parameter, or they can generate a unique identification string for
      the client transaction.  (See Section 9.4 of
      [I-D.ietf-sipclf-problem-statement] for usage.)

   This data MUST appear in the order listed in <IndexPointers>, and
   each field MUST be present.  Fields are separated by a single ASCII
   tab character (0x09).  Any tab characters present in the data to be
   written will be replaced by an ASCII space character (0x20) prior to
   being logged.
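
   As a non-normative illustration, the following Python sketch joins
   the twelve mandatory field values with tabs, replaces embedded tabs
   with spaces, and computes the absolute pointer for each field.  The
   names FIRST_FIELD_OFFSET and layout_mandatory_fields are
   hypothetical.  The starting offset of 80 follows from the fixed-
   length portion shown in Figure 4, and measuring field lengths in
   UTF-8 bytes is an assumption of this sketch.

```python
FIRST_FIELD_OFFSET = 80  # first byte after the fixed-length portion


def layout_mandatory_fields(values):
    """Return (tab-joined text, list of absolute byte pointers)."""
    if len(values) != 12:
        raise ValueError("exactly 12 mandatory fields are expected")
    # Embedded tabs would be mistaken for separators, so they become
    # spaces before the value is logged.
    cleaned = [value.replace("\t", " ") for value in values]
    pointers = []
    offset = FIRST_FIELD_OFFSET
    for value in cleaned:
        pointers.append(offset)
        # Advance past the field bytes and the separator that follows.
        offset += len(value.encode("utf-8")) + 1
    return "\t".join(cleaned), pointers
```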

   Table 1 of Section 8.2 of [I-D.ietf-sipclf-problem-statement]
   summarizes how the mandatory fields are logged by the various SIP
   entities.  This illustrates the fact that there are instances when a
   given mandatory field is not applicable for logging in the SIP CLF
   because it does not make sense based on the role the entity is
   playing in the SIP ecosystem.  In such circumstances, if a given
   mandatory field is not present then that empty field MUST be encoded
   as a single horizontal dash ("-").

   In the event that a field fails to parse, it MUST be encoded as a
   single question mark ("?").  If these characters are part of a
   sequence of other characters, then there is no ambiguity.  If the
   field being logged contains only one character, and that character is
   the literal "-", the implementation SHOULD insert an escaped %2D for
   that field in the SIP CLF record.  Similarly, if the field contains
   only one character, and that character is the literal "?", the
   implementation SHOULD insert an escaped %3F for that field in the SIP
   CLF record.
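
   As a non-normative illustration, the rules above for absent fields,
   unparseable fields, and literal "-" or "?" values could be sketched
   in Python as follows.  The function name encode_field and its
   parameters are hypothetical, and treating an empty string the same
   as an absent field is an assumption of this sketch.

```python
def encode_field(value, parse_failed=False):
    """Return the text to log for one mandatory field."""
    if parse_failed:
        return "?"      # field was present but could not be parsed
    if value is None or value == "":
        return "-"      # field not applicable or not present
    if value == "-":
        return "%2D"    # literal dash, escaped to avoid ambiguity
    if value == "?":
        return "%3F"    # literal question mark, escaped likewise
    return value
```

   Longer values that merely contain "-" or "?" pass through unchanged,
   since in that case there is no ambiguity.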


4.3.  Optional Fields

   The <OptionalFields> portion of the SIP CLF record is shown below:


        0          7 8        15 16       23 24         31
        +-----------+-----------+-----------+-----------+\
        |    0x09   |             Tag (Hex)             | \
        +-----------+-----------+-----------+-----------+  \   Repeated
        | Tag (cont)|    0x2C   |     Length (Hex)      |   \  as many
        +-----------+-----------+-----------+-----------+    > times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/
        |    0x09   |                                   |\
        +-----------+                                   | \
        |          Vendor-ID (variable length)          |  \
        +           +-----------+-----------+-----------+   \  Repeated
        |           |    0x2C   |     Length (Hex)      |    \ as many
        +-----------+-----------+-----------+-----------+    / times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/


                         Figure 5: Optional Fields

   Optional fields are those SIP message elements that are not a part of
   the mandatory fields list detailed in Section 8.1 of
   [I-D.ietf-sipclf-problem-statement].  After the <MandatoryFields>
   section, there are two OPTIONAL Tag/Length/Value groups (shown in
   Figure 5) that appear zero or more times.  These two TLV groups
   provide extensibility to the SIP CLF.  They allow SIP CLF
   implementers the flexibility to extend the logging capability of the
   indexed-ASCII representation beyond just the mandatory log elements
   described in Section 8.1 of [I-D.ietf-sipclf-problem-statement].  The
   location of the start of <OptionalFields> within the SIP CLF record
   is indicated by the "TLV Start Pointer" field in <IndexPointers>.

   There are two possible methods to log optional fields.  One is
   through a pre-defined list of optional elements presented in
   Section 4.3.1 of this document.  All other optional fields that do
   not appear in the list of pre-defined optional fields MUST be logged
   using the vendor-specific extension mechanism outlined in
   Section 4.3.2.
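
   As a non-normative illustration, one pre-defined Tag/Length/Value
   group laid out as in Figure 5 could be serialized as in the
   following Python sketch.  The function name encode_tlv is
   hypothetical.  The 4-byte width of the "Length" field is read from
   the figure and is an assumption of this sketch.

```python
def encode_tlv(tag: int, value: bytes) -> bytes:
    """Serialize one group as 0x09, Tag, 0x2C, Length, 0x2C, Value."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("tag does not fit in 4 hex digits")
    if len(value) > 0xFFFF:
        raise ValueError("value does not fit in a 4-hex-digit length")
    # Tag and Length are upper-case hexadecimal, left-padded with '0'.
    prefix = "\t{:04X},{:04X},".format(tag, len(value))
    return prefix.encode("ascii") + value
```

   For example, encode_tlv(0x0000, b"<sip:alice@example.com>") produces
   a group whose prefix is a tab followed by "0000,0017,".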




4.3.1.  Pre-Defined Optional Fields

   The pre-defined optional fields portion of <OptionalFields> is shown
   below:


        0          7 8        15 16       23 24         31
        +-----------+-----------+-----------+-----------+\
        |    0x09   |             Tag (Hex)             | \
        +-----------+-----------+-----------+-----------+  \   Repeated
        | Tag (cont)|    0x2C   |     Length (Hex)      |   \  as many
        +-----------+-----------+-----------+-----------+    > times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/


                   Figure 6: Pre-Defined Optional Fields

   Logging any of the pre-defined optional SIP elements below MUST be
   done according to the TLV format shown in Figure 6.  The fields used
   to log these pre-defined optional fields are defined below:

   Tag Field (4 bytes):  Indicates the type of value coded by this TLV;
      hexadecimal encoded.  Currently defined tags are:

      0x0000 - Contact Header
         Contains the entire value of the Contact header field

      0x0001 - Remote Host
         The DNS name of the IP address from which the message was
         received (if "Sent/Received flag" is set to "u,t,l").  The DNS
         name of the IP address to which the message is being sent (if
         "Sent/Received flag" is set to "U,T,L")

      0x0002 - Authenticated User
         Logs the user name by which the user has been authenticated

      0x0003 - Complete SIP Message
         Contains the complete SIP message.  Can be repeated multiple times
         to accommodate SIP messages that exceed 65535 bytes in length.

      0x0004 - SIP Message Body
         Logs SIP message bodies with the following body types:

         (1)  Session Description Protocol (SDP) [RFC4566] (Content-
              Type: application/sdp)

         (2)  Extensible Markup Language (XML) [W3C.REC-xml-20081126]
              payloads (Content-Type: application/*+xml)

         (3)  binary (Content-Type: application/{isup,qsig})

         (4)  miscellaneous text content (Content-Type: message/sipfrag,
              message/http, text/plain, ...)

         In this TLV (with Tag=0x0004), the associated "Value" field is
         populated with the Content-Type followed by the SIP message
         body, separated by linear white space (LWS).  In this manner,
         all four body types are self-described using a single tag
         rather than a separate tag for each body type.  The
         corresponding "Length" field covers the embedded Content-Type,
         the LWS separator between the MIME type and the body content,
         and the SIP message body itself.  Note that binary bodies have
         to be byte-encoded to render them in the ASCII file.

         An example of an SDP body to be logged as an optional field:


           v=0
           o=alice 2890844526 2890844526 IN IP4 host.example.com
           s=-
           c=IN IP4 host.example.com
           t=0 0
           m=audio 49170 RTP/AVP 0 8 97


         This body has a Content-Type of application/sdp and a length of
         123 bytes, including all the line-feeds.  When logging this
         body the "Value" field is composed of the Content-Type and the
         body separated by a LWS, which gives it a combined length of
         139 (0x8B).  The TLV used to log this SIP body as an optional
         field would look like:


           <allOneLine>
           0004,008B,application/sdp v=0\r\no=alice 2890844526
           2890844526 IN IP4 host.example.com\r\ns=-\r\n
           c=IN IP4 host.example.com\r\nt=0 0\r\n
           m=audio 49170 RTP/AVP 0 8 97\r\n
           </allOneLine>


         Note that the octets in the "Value" field are all logically on
         one line and the line-feeds are escaped using \r\n to delimit
         the lines.

         TODO: Is it necessary that we document an escape mechanism for
         line-feeds for both logging bodies and complete SIP messages?
         If we agree on \r\n we need to think about how to represent
         \r\n in a text-based body.

   Length Field (4 bytes):  Indicates the length of the value coded in
      this TLV, hexadecimal encoded.  This length does NOT include the
      TLV header.

   Value Field (0 to 65535 bytes):  Contains the actual value of this
      TLV.  As with the mandatory fields, ASCII Tab characters (0x09)
      are replaced with ASCII space characters (0x20).
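
   The TLV layout in Figure 6 and the SDP example given for tag 0x0004
   can be sketched in Python as follows.  This is a minimal illustration
   rather than a reference encoder.  The function names are
   hypothetical.  The Length is assumed to be counted in bytes over the
   Value with single-byte line feeds, which is what reproduces the 123
   and 139 figures of the example.  No line-feed escaping is attempted,
   because that question is still open in this document.

```python
def encode_predefined_tlv(tag: int, value: str) -> str:
    """Return one optional-field entry: TAB, tag, ',', length, ',', value."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("tag must fit in four hexadecimal digits")
    # Tabs delimit fields, so tabs inside the value become spaces.
    value = value.replace("\t", " ")
    # Assumption: Length counts the bytes of the Value only, not the header.
    length = len(value.encode("utf-8"))
    if length > 0xFFFF:
        raise ValueError("value exceeds 65535 bytes")
    return f"\t{tag:04X},{length:04X},{value}"


def body_value(content_type: str, body: str) -> str:
    """Build the Value for tag 0x0004: Content-Type, one space (LWS), body."""
    return f"{content_type} {body}"


# The SDP body from the example, with single-byte line feeds (assumption).
sdp = (
    "v=0\n"
    "o=alice 2890844526 2890844526 IN IP4 host.example.com\n"
    "s=-\n"
    "c=IN IP4 host.example.com\n"
    "t=0 0\n"
    "m=audio 49170 RTP/AVP 0 8 97\n"
)

tlv = encode_predefined_tlv(0x0004, body_value("application/sdp", sdp))
print(len(sdp))        # 123
print(repr(tlv[:11]))  # '\t0004,008B,'
```

   The header printed by the sketch matches the "0004,008B," prefix of
   the example; a real logger would additionally have to render the
   line feeds so that the record stays on one line.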

4.3.2.  Vendor-Specific Optional Fields

   The vendor-specific optional fields portion of <OptionalFields> is
   shown below:


        0          7 8        15 16       23 24         31
        +-----------+-----------+-----------+-----------+
        |    0x09   |                                   |\
        +-----------+                                   | \
        |          Vendor-ID (variable length)          |  \
        +           +-----------+-----------+-----------+   \  Repeated
        |           |    0x2C   |     Length (Hex)      |    \ as many
        +-----------+-----------+-----------+-----------+    / times as
        |     Length (cont)     |    0x2C   |           |   /  necessary
        +-----------+-----------+-----------+           +  /
        |            Value (variable length)            | /
        +-----------+-----------+-----------+-----------+/


                 Figure 7: Vendor-Specific Optional Fields

   The pre-defined list of optionally logged fields is a small set of
   the most useful and commonly logged SIP elements that fall outside
   the range of the mandatory fields presented in Section 8.1 of
   [I-D.ietf-sipclf-problem-statement].  To make the SIP CLF fully
   extensible and customizable by implementers, the notion of
   vendor-specific optional fields is introduced.  This mechanism
   extends the logging capabilities of SIP CLF to include any element of
   a SIP message that a vendor deems necessary.  This vendor-specific
   extension to the SIP CLF has a TLV-like syntax and intentionally
   mimics the general format described in Section 4.3.1 for the pre-
   defined optional fields.

   Vendor-ID (0 to 65535 bytes):  The Vendor-ID serves a purpose similar
      to that of the "Tag" field defined in Section 4.3.1: it is a
      unique identifier for the optional field being logged.  The
      optional fields logged via vendor-specific extensions MUST NOT be
      any of the pre-defined optional fields detailed in Section 4.3.1.

      The format of the Vendor-ID is similar to the second format
      detailed in Section 6.3.2 of the Syslog protocol [RFC5424] for
      SD-ID names.  The syntax for the Vendor-ID is
      name@<private enterprise number>, e.g., "ourVendorID@32473".  The
      formatting rules defining the Vendor-ID are quoted almost verbatim
      from Section 6.3.2 of [RFC5424]:

         The format of the part preceding the at-sign is not specified;
         however, these names MUST be printable US-ASCII strings, and
         MUST NOT contain an at-sign ('@', ABNF %d64), an equal-sign
         ('=', ABNF %d61), a closing brace (']', ABNF %d93), a quote-
         character ('"', ABNF %d34), whitespace, or control characters.
         The part following the at-sign MUST be a private enterprise
         number as specified in Section 7.2.2 of [RFC5424].  Please note
         that throughout this document the value of 32473 is used for
         all private enterprise numbers.  This value has been reserved
         by IANA to be used as an example number in documentation
         according to [RFC5612].

      Implementers of the Vendor-ID will need to use their own private
      enterprise number from the complete current list of private
      enterprise numbers [PEN] maintained by IANA.  Usage of the
      Vendor-ID allows vendor-specific customization of the SIP CLF
      beyond those pre-defined optional fields defined in Section 4.3.1.

   Length Field (4 bytes):  Indicates the length of only the "Value"
      field in this vendor-specific extension, hexadecimal encoded.

   Value Field (0 to 65535 bytes):  Contains the actual value of this
      vendor-specific optional field.  As with the mandatory fields,
      ASCII Tab characters (0x09) are replaced with ASCII space
      characters (0x20).
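
   The Vendor-ID rules and the layout in Figure 7 can be sketched in
   Python as follows.  This is an illustration under stated assumptions,
   not a normative validator: the function names are hypothetical, the
   name part is assumed to be non-empty, and the enterprise number is
   assumed to be a decimal number optionally followed by dotted
   sub-identifiers (e.g., 32473.1) in the style of RFC 5424.

```python
import re

# Assumption: decimal enterprise number with optional dotted sub-identifiers.
_ENTERPRISE_NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)*")
# Characters the quoted rules forbid in the part before the at-sign.
_FORBIDDEN = set('@=]"')


def is_valid_vendor_id(vendor_id: str) -> bool:
    """Check a Vendor-ID of the form name@<private enterprise number>."""
    name, at_sign, number = vendor_id.partition("@")
    if not at_sign or not name:
        return False
    for char in name:
        # Printable US-ASCII without whitespace or control characters.
        if not 33 <= ord(char) <= 126 or char in _FORBIDDEN:
            return False
    return _ENTERPRISE_NUMBER.fullmatch(number) is not None


def encode_vendor_tlv(vendor_id: str, value: str) -> str:
    """Return one entry: TAB, Vendor-ID, ',', length, ',', value."""
    if not is_valid_vendor_id(vendor_id):
        raise ValueError("malformed Vendor-ID")
    # Tabs delimit fields, so tabs inside the value become spaces.
    value = value.replace("\t", " ")
    length = len(value.encode("utf-8"))  # Length covers the Value only
    if length > 0xFFFF:
        raise ValueError("value exceeds 65535 bytes")
    return f"\t{vendor_id},{length:04X},{value}"


print(repr(encode_vendor_tlv("ourVendorID@32473", "hello\tworld")))
# '\tourVendorID@32473,000B,hello world'
```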


5.  Example SIP CLF Record

   The following SIP message is an INVITE request sent by a SIP client:


       INVITE sip:192.0.2.10 SIP/2.0
       To: <sip:192.0.2.10>
       Call-ID: DL70dff590c1-1079051554@example.com
       <allOneLine>
       From: "Alice" <sip:1001@example.com:5060>;
       tag=DL88360fa5fc;epid=0x34619b0
       </allOneLine>
       CSeq: 1 INVITE
       Max-Forwards: 70
       <allOneLine>
       Via: SIP/2.0/TCP 192.0.2.200:5060;
       branch=z9hG4bK-1f6be070c4-DL
       </allOneLine>
       Contact: "1001" <sip:1001@192.0.2.200:5060>
       <allOneLine>
       Allow: INVITE,CANCEL,ACK,OPTIONS,INFO,SUBSCRIBE,NOTIFY,BYE,
       MESSAGE,UPDATE,REFER
       </allOneLine>
       Supported: replaces,norefersub
       User-Agent: Some Vendor
       Content-Type: application/sdp
       Content-Length: 418

       v=0
       o=1001 1456139204 0 IN IP4 192.0.2.200
       s=-
       c=IN IP4 192.0.2.200
       b=AS:2048
       t=0 0
       m=audio 13756 RTP/AVP 0 101
       a=rtpmap:0 PCMU/8000
       a=rtpmap:101 telephone-event/8000
       a=fmtp:101 0-16
       a=x-mpdp:192.0.2.200:13756
       m=video 13758 RTP/AVP 96
       a=rtpmap:96 H264/90000
       <allOneLine>
       a=fmtp:96 profile-level-id=420015; max-mbps=47520; max-fs=1584;
       max-dpb=7680
       </allOneLine>
       a=x-mpdp:192.0.2.200:13758


   Shown below is approximately how this message would appear as a
   single record in a SIP CLF log file if encoded according to the
   syntax described in this document.  Due to Internet-Draft formatting
   conventions, this log entry has been split into seven lines instead
   of the two lines that actually appear in a log file, and the tab
   characters have been padded out using spaces to simulate their
   appearance in a text terminal.


       <allOneLine>
       A0000FC,Rou,
       0051005A005C006B007B008D009C009E00B800C500E900F30000
       </allOneLine>
       <allOneLine>
       0000000000.010  1 INVITE        -       sip:192.0.2.10
       192.0.2.10:5060 192.0.2.200:56485       sip:192.0.2.10
       -       sip:1001@example.com:5060       DL88360fa5fc
       DL70dff590c1-1079051554@example.com     server-tx       client-tx
       </allOneLine>


   A Base64-encoded version of this log entry (without the changes
   required to format it for an Internet-Draft) is shown below:


       begin-base64 644 clf_record
       QTAwMDBGQyxSb3UsMDA1MTAwNUEwMDVDMDA2QjAwN0IwMDhEMDA5QzAwOUUwMEI4
       MDBDNTAwRTkwMEYzMDAwMAowMDAwMDAwMDAwLjAxMAkxIElOVklURQktCXNpcDox
       OTIuMC4yLjEwCTE5Mi4wLjIuMTA6NTA2MAkxOTIuMC4yLjIwMDo1NjQ4NQlzaXA6
       MTkyLjAuMi4xMAktCXNpcDoxMDAxQGV4YW1wbGUuY29tOjUwNjAJREw4ODM2MGZh
       NWZjCURMNzBkZmY1OTBjMS0xMDc5MDUxNTU0QGV4YW1wbGUuY29tCXNlcnZlci10
       eAljbGllbnQtdHgK
       ====
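
   The Base64 block above can be decoded and inspected with a short
   Python sketch.  It assumes that the six hexadecimal digits following
   the leading "A" of the index line carry the record length, as laid
   out for the index line earlier in this document; the variable names
   are illustrative only.

```python
import base64

# The six payload lines of the Base64 block above.
ENCODED = """\
QTAwMDBGQyxSb3UsMDA1MTAwNUEwMDVDMDA2QjAwN0IwMDhEMDA5QzAwOUUwMEI4
MDBDNTAwRTkwMEYzMDAwMAowMDAwMDAwMDAwLjAxMAkxIElOVklURQktCXNpcDox
OTIuMC4yLjEwCTE5Mi4wLjIuMTA6NTA2MAkxOTIuMC4yLjIwMDo1NjQ4NQlzaXA6
MTkyLjAuMi4xMAktCXNpcDoxMDAxQGV4YW1wbGUuY29tOjUwNjAJREw4ODM2MGZh
NWZjCURMNzBkZmY1OTBjMS0xMDc5MDUxNTU0QGV4YW1wbGUuY29tCXNlcnZlci10
eAljbGllbnQtdHgK
"""

# b64decode skips the embedded newlines in its default (non-strict) mode.
record = base64.b64decode(ENCODED)

# The record is two lines: the index line and the field line.
index_line, field_line = record.decode("ascii").split("\n")[:2]

# Assumption: characters 1..6 of the index line are the hex record length.
declared_length = int(index_line[1:7], 16)
fields = field_line.split("\t")

print(declared_length, len(record))  # 252 252
print(len(fields))                   # 13
print(fields[-2:])                   # ['server-tx', 'client-tx']
```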




6.  Text Tool Considerations

   This format has been designed to allow text tools to easily process
   logs without needing to understand the indexing format.  Index lines
   may be rapidly discarded by checking the first character of the line:
   index lines will always start with an alphabetical character, while
   field lines will start with a numerical character.

   Within a field line, script tools can quickly split fields at the tab
   characters.  The first 12 fields are positional, and the meaning of
   any subsequent fields can be determined by checking the first four
   characters of the field.  Alternatively, these non-positional fields
   can be located using a regular expression.  For example, the "Contact
   value" in a request can be found by searching for the Perl regex
   /\t0000,....,([^\t]*)/.
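
   The processing steps described in this section can be sketched in
   Python as follows.  The helper names are hypothetical, and the sample
   field line is an abbreviated, made-up record carrying only three
   positional fields and one Contact TLV; the regular expression is the
   one given above.

```python
import re

# Same pattern as the Perl regex above; tag 0000 is the Contact header.
CONTACT_TLV = re.compile(r"\t0000,....,([^\t]*)")


def split_log(text):
    """Separate index lines from field lines by their first character."""
    index_lines, field_lines = [], []
    for line in text.split("\n"):
        if not line:
            continue
        if line[0].isalpha():
            index_lines.append(line)  # index lines start with a letter
        else:
            field_lines.append(line)  # field lines start with a digit
    return index_lines, field_lines


def contact_value(field_line):
    """Return the logged Contact value, or None if the TLV is absent."""
    match = CONTACT_TLV.search(field_line)
    return match.group(1) if match else None


# Hypothetical, abbreviated field line followed by a Contact TLV.
contact = '"1001" <sip:1001@192.0.2.200:5060>'
sample = f"0000000000.010\t1 INVITE\t-\t0000,{len(contact):04X},{contact}"

print(contact_value(sample))  # "1001" <sip:1001@192.0.2.200:5060>
```

   A tool that only needs the field data can call split_log() and ignore
   the returned index lines entirely.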

   Note also that requests can be distinguished from responses by
   checking the third positional field: for requests, it will always be
   set to "-", as in the example record in Section 5; any other value
   indicates a response.


7.  Security Considerations

   This document does not introduce any new security considerations
   beyond those discussed in [I-D.ietf-sipclf-problem-statement].


8.  Operational Guidance

   SIP CLF log files can consume a substantial amount of disk space,
   depending on the traffic volume at a processing entity and the amount
   of information being logged.  As such, any enterprise using SIP CLF
   should establish operational procedures for file rollovers as
   appropriate to the needs of the organization.

   Listing such operational guidelines is out of scope for this
   document.


9.  IANA Considerations

   This document requires no action from IANA.


10.  Acknowledgements

   The authors of this document would like to acknowledge and thank
   Peter Musgrave for his support, guidance, and continued invaluable
   feedback.

   This work benefited from the discussions and invaluable input of the
   various members of the SIPCLF working group.  These include Brian
   Trammell, Eric Burger, Cullen Jennings, Benoit Claise, Saverio
   Niccolini, and Dan Burnett.  Special thanks to Hadriel Kaplan, Chris
   Lonvick, Paul E. Jones, and John Elwell for their constructive
   comments, suggestions, and reviews that were critical to the
   formulation and refinement of this draft.

   Thanks to Anders Nygren for his early implementation, insight, and
   reviews of the SIP CLF format.


11.  References

11.1.  Normative References

   [I-D.ietf-sipclf-problem-statement]
              Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O.
              Festor, "The Common Log Format (CLF) for the Session
              Initiation Protocol (SIP)",
              draft-ietf-sipclf-problem-statement-07 (work in progress),
              June 2011.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424, March 2009.

11.2.  Informative References

   [PEN]      IANA, "Private Enterprise Numbers", 2009,
              <http://www.iana.org/assignments/enterprise-numbers>.

   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.

   [RFC4475]  Sparks, R., Hawrylyshen, A., Johnston, A., Rosenberg, J.,
              and H. Schulzrinne, "Session Initiation Protocol (SIP)
              Torture Test Messages", RFC 4475, May 2006.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5612]  Eronen, P. and D. Harrington, "Enterprise Number for
              Documentation Use", RFC 5612, August 2009.

   [RFC5735]  Cotton, M. and L. Vegoda, "Special Use IPv4 Addresses",
              BCP 153, RFC 5735, January 2010.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010.

   [W3C.REC-xml-20081126]
              Maler, E., Yergeau, F., Paoli, J., Sperberg-McQueen, C.,
              and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth
              Edition)", World Wide Web Consortium Recommendation REC-
              xml-20081126, November 2008,
              <http://www.w3.org/TR/2008/REC-xml-20081126>.


Authors' Addresses

   Gonzalo Salgueiro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, NC  27709
   US

   Email: gsalguei@cisco.com


   Vijay Gurbani
   Bell Labs, Alcatel-Lucent
   1960 Lucent Lane
   Rm 9C-533
   Naperville, IL  60563
   US

   Email: vkg@bell-labs.com


   Adam Roach
   Tekelec
   17210 Campbell Rd.
   Suite 250
   Dallas, TX  75252
   US

   Email: adam@nostrum.com