TCP Maintenance and Minor                                  M. Kuehlewind
Extensions (tcpm)                                University of Stuttgart
Internet-Draft                                     R. Scheffenegger, Ed.
Intended status: Standards Track                            NetApp, Inc.
Expires: September 8, 2011                                 March 7, 2011


        Additional negotiation in the TCP Timestamp Option field
                        during the TCP handshake
           draft-scheffenegger-tcpm-timestamp-negotiation-00

Abstract

   RFC 1323 defines the TSecr field of a SYN packet to be not valid and
   thus this field will always be zero.  This documents specifies the
   use of this field to signal and negotiate additional information
   about the content of the TSopt field as well as the behavior of the
   receiver.  If the receiver understands this extension, it will use
   the TSecr field of the SYN/ACK to reply.  Otherwise the receiver will
   ignore the TSecr field and set a timestamp in the TSecr field as
   specified in RFC 1323.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 8, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents



Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 1]


Internet-Draft            Timestamp Negotiation               March 2011


   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . 3
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . . . 4
   2.  Overview  . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
   3.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 5
   4.  Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
     4.1.  Capability Flags  . . . . . . . . . . . . . . . . . . . . . 5
   5.  Discussion  . . . . . . . . . . . . . . . . . . . . . . . . . . 8
   6.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 8
   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 9
   8.  Security Considerations . . . . . . . . . . . . . . . . . . . . 9
   9.  References  . . . . . . . . . . . . . . . . . . . . . . . . . . 9
     9.1.  Normative References  . . . . . . . . . . . . . . . . . . . 9
     9.2.  Informative References  . . . . . . . . . . . . . . . . . . 9
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . . 9




























Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 2]


Internet-Draft            Timestamp Negotiation               March 2011


1.  Introduction

   The TCP Timestamps Option (TSopt) provides timestamp echoing for
   Round-trip Time (RTT) measurments.  TSopt is widely deployed and
   activated by default in many systems.  RFC 1323 [RFC1323] specifies
   TSopt the following way:

         Kind: 8

         Length: 10 bytes

         +-------+-------+---------------------+---------------------+
         |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
         +-------+-------+---------------------+---------------------+
             1       1              4                     4

                               RFC1323 TSopt

      "The Timestamps option carries two four-byte timestamp fields.
      The Timestamp Value field (TSval) contains the current value of
      the timestamp clock of the TCP sending the option.

      The Timestamp Echo Reply field (TSecr) is only valid if the ACK
      bit is set in the TCP header; if it is valid, it echos a timestamp
      value that was sent by the remote TCP in the TSval field of a
      Timestamps option.  When TSecr is not valid, its value must be
      zero.  The TSecr value will generally be from the most recent
      Timestamp option that was received; however, there are exceptions
      that are explained below.

      A TCP may send the Timestamps option (TSopt) in an initial SYN
      segment (i.e., segment containing a SYN bit and no ACK bit), and
      may send a TSopt in other segments only if it received a TSopt in
      the initial SYN segment for the connection."

   The comparison of the timestamp in the TSecr field to the current
   time gives an estimation of the RTT.  RFC 1323 [RFC1323] specifies
   various cases when more than one timestamp is available to echo.  The
   proposed solution might not always be the best choice, e.g. when the
   TCP Selective Acknowledgment Option (SACK) is used.  Moreover, more
   and more use cases arise where one-way delay (OWD) measurements are
   needed.  These mechanism misuse usually the TSopt to estimated the
   variation in OWD.  To enable such mechanisms the TSecr field in the
   TCP SYN packet could be used for additional negotiation.







Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 3]


Internet-Draft            Timestamp Negotiation               March 2011


1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].














































Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 4]


Internet-Draft            Timestamp Negotiation               March 2011


2.  Overview


3.  Definitions

   The reader is expected to be familiar with the definitions given in
   [RFC1323].


4.  Signaling

   During the initial TCP three-way handshake, timestamp options are
   negotiated using the TSecr field.  A compliant TCP receiver will XOR
   the flags with the received TSval, when responding with the SYN+ACK.
   Timestamp Options MAY only be present when the SYN bit is set.

4.1.  Capability Flags

   In order to signal the supported capabilities, the TSecr is
   overloaded with the following flags and fields during the three-way
   handshake.  If optional capabilities such as tcp clock range are
   presented, minimal state will be required in the host to decode the
   returned Flags xor'ed with the TSval.

        Kind: 8

        Length: 10 bytes

        +-------+-------+---------------------+---------------------+
        |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
        +-------+-------+---------------------+---------------------+
            1       1              4          |           4         |
                                             /                      |
        .-----------------------------------'                       |
       /                                                             \
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |E|R|R|B|M|     | EXP12hi |  FRAC12hi   | EXP12lo |  FRAC12lo   |
      |X|E|N|I|I| MSK +-----------------------+-----------------------+
      |O|S|G|A|R|     |      RES      |S|  EXP16  |      FRAC16       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                          timestamp option flags








Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 5]


Internet-Draft            Timestamp Negotiation               March 2011


   EXO - Extended Options
       Indicated that the sender supports extended timestamp options as
       defined by this document, and MUST be set ("1") by compliant
       implementations.

   RES - Reserved
       Reserved for future use.  MUST not be set ("0").  If a timestamp
       option is received with this bit set, the receiver MUST ignore
       the extended options field and react as if the Flags were not set
       (compatibility mode).

   RNG - Range negotiation
       Indicated that the sender is capable of adjusting the timestamp
       clock rate within the bounds of the two 12 bit fields (see ).
       Only the active sender of a TCP session is allowed to offer a
       range, while the receiver MAY choose a rate within these bounds.

   BIA - Exponent Bias
       When set, the 16 and 12 bit floating point exponents are
       presented with a bias of 21 instead of 15.  This allows
       negotiation of extremely fine-grained timestamp clock
       resolutions, for example in hardware implementations and high
       speed (>10 Gigabit/s) environments.  See section for more
       details.

   MIR - Always Mirror Timestamp
       To disambiguate segements and aid timing calculations even during
       loss episodes, the timestamp will always be mirrored regardless
       of the state of the receiver.  A sender SHOULD use this option
       only in conjunction with Selective Acknowledgements (SACK
       [RFC2018]).

   MSK - Mask Timestamps
       If the timestamp is used for congestion control purposes, an
       incentive exists for malicious receivers to reflect tampered
       timestamps.  A sender MAY choose to protect timestamps from such
       modifications by including a fingerprint (secure hash of some
       kind) in some of the least significant bits.  However, doing so
       would prevent a receiver from using the timestamp for other
       purposes.  The MASK field indicates how many least significant
       nibbles should be excluded by the receiver, when processing a
       timestamp.  Note that this does not impact the reflected
       timestamp in any way - TSecr will always be equal to a
       appropriate TSval.  Another use case would be when the sender
       does not support a timestamp clock which can guarantee unique
       timestamps for retransmitted segments.  For unambigously
       identifying regular from retransmitted segments, the timestamp
       must be unique for otherwise identical segments.  Reserving the



Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 6]


Internet-Draft            Timestamp Negotiation               March 2011


       lowest nibble for this purpose allows senders with slow running
       timestamp clocks to make use of this feature.

   S - binary16 Sign
       This is the sign bit of the IEEE 754-2008 binary16 floating point
       representation of the timestamp clock.  Timestamp clocks MUST be
       positive, thus this bit MUST be zero.

   EXP16 - binary16 Exponent
       The exponent component of a binary16 floating point number
       indicating the timestamp clock.  When BIA is not set, the
       exponent bias is 15 (identical to the binary16 definition in IEEE
       754-2008).  If OFF is set, the exponent bias is 21, allowing
       faster timestamp clock rates.  Subnormal numbers (lower
       precision), where the exponent is zero, extend the range to 2^-24
       and 2^-30 respectively.  Infinity and NaN (all exponent bits set)
       MUST NOT be invalid, and a timestamp option with NaN/Infinity
       SHOULD be ignored.

   FRAC16 - binary16 Fraction
       The fraction component of a binary16 floating point number
       indicating the timestamp clock.  The clock rate is measured in
       seconds between ticks.  The least significant bit corresponds
       therefore to a time interval of 59.6 ns with the default bias of
       15, and 0.931 ns with bias set to 21.  The longest time interval
       would be 65504 sec with default bias, and 511.75 sec with bias
       set to 21.

   EXP12hi  and

   EXP12lo - binary12 Exponent
       The exponent component of a truncated, 12 bit floating point
       number indicating the possible timestamp clock ranges.  Only the
       host initiating a TCP session MAY offer a timestamp clock range,
       while the receiver SHOULD select a timestamp clock within these
       bounds.  If the receiver can not adjust it's timestamp clock to
       match the range, it MAY use a timestamp clock rate outside these
       bounds.  If the receiver indicated a timestamp clock rate within
       the indicated bounds, the sender MUST set it's timestamp clock
       rate to the negotiated rate.  If the receiver uses a timestamp
       clock rate outside the indicated bounds, it MUST NOT use
       timestamps where knowledge of the timestamp clock rate is
       required (ie. congesion control).  The exponent bias is 15 when
       BIA is not set, and 21 otherwise.







Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 7]


Internet-Draft            Timestamp Negotiation               March 2011


   FRAC12hi  and

   FRAC12lo - binary12 Fraction
       The fraction component of a 12 bit floating point number.
       Subnormal numbers are allowed, while Inifinity/NaN MUST NOT be
       used.  Timestamp options with Infinity/NaN values SHOULD be
       ignored.  The smallest representable value is 238 ns with default
       bias, and 3.73 ns with bias set to 21, while the largest values
       would be virtually identical to the 16 bit floating point values
       (65024 and 508 sec).


5.  Discussion

   One-way delay (variation) based congestion controls would benefit
   from knowing the clock resolution on both sides.

   RTT variance during loss episodes is not deeply researched.  Current
   heuristics (RFC1122, RFC1323, Karn's algorithm, RFC2988) explicitly
   exclude (and prevent) the use of RTT samples when loss occurs.
   However, solving the retransmission ambiguity problem - and the
   related reliable ACK delivery problem - may allow the refinement of
   these algorithms further, as well as enabling new research to
   distinguish between corruption loss (without RTT / one-way delay
   impact) and congestion loss (with RTT / one-way delay impact).
   Research into this field appears to be a rather neglected, especially
   when it comes to large scale, public internet investigations.  Due to
   the very nature of this, passive investigations without signals
   contained within the headers are only of limited use in empirical
   research.

   Retransmission ambiguity detection during loss recovery would allow
   an additional level of loss recovery control without reverting to
   timer-based methods.  As with the deployment of SACK, separating
   "what" to send from "when" to send it could be driven one step
   further.  In particular, less conservative loss recovery schemes
   which do not trade principles of packet conservation against
   timeliness, require a reliable way of prompt and best possible
   feedback from the receiver about any delivered segment and their
   ordering.  SACK alone goes quite a long way, but using Timestamp
   information in addition could remove any ambiguity.  However, the
   current specs in RFC1323 make that use impossible, thus a modified
   signaling (receiver behavior) is a necessity.


6.  Acknowledgements

   The authors would like to thank Dragana Damjanovic for some initial



Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 8]


Internet-Draft            Timestamp Negotiation               March 2011


   thoughts around Timestamps and their extended potential use.


7.  IANA Considerations

   This memo includes no request to IANA.


8.  Security Considerations

   The algorithm presented in this paper shares security considerations
   with [RFC1323].

   Some implementations address the vulerabilities of [RFC1323], by
   dedicating a few low-order bits of the timestamp fields for use with
   a (secure) hash, that protects against malicious tweaking of TSecr
   values.  A Flag-field has been provided to transparently notify the
   receiver about that use of low-order bits, so that they can be
   excluded in one-way delay calculations.


9.  References

9.1.  Normative References

   [RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
              for High Performance", RFC 1323, May 1992.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [Chirp]    Kuehlewind, M. and B. Briscoe, "Chirping for Congestion
              Control -  Implementation Feasibility", Nov 2010, <http://
              bobbriscoe.net/projects/netsvc_i-f/chirp_pfldnet10.pdf>.

   [I-D.ietf-tcpm-tcp-security]
              Gont, F., "Security Assessment of the Transmission Control
              Protocol (TCP)", draft-ietf-tcpm-tcp-security-02 (work in
              progress), January 2011.







Kuehlewind & Scheffenegger  Expires September 8, 2011           [Page 9]


Internet-Draft            Timestamp Negotiation               March 2011


Authors' Addresses

   Mirja Kuehlewind
   University of Stuttgart
   Pfaffenwaldring 47
   Stuttgart  70569
   Germany

   Email: mirja.kuehlewind@ikr.uni-stuttgart.de


   Richard Scheffenegger (editor)
   NetApp, Inc.
   Am Euro Platz 2
   Vienna,   1120
   Austria

   Phone: +43 1 3676811 3146
   Email: rs@netapp.com
































Kuehlewind & Scheffenegger  Expires September 8, 2011          [Page 10]