Network Working Group                                            X. Xiao
Internet Draft                                                 E. Crabbe
                                                               A. Hannan
                                                   Frontier Globalcenter
                                                               V. Paxson
                                                              ACIRI/ICSI
Expiration Date: March 2000                               September 1999

               TCP Processing of the IP Precedence Field
                      <draft-xiao-tcp-prec-00.txt>

1. Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This document will expire in March 2000.

2. Abstract

   This draft describes a potential conflict between TCP [RFC793] and
   DiffServ [RFC2475] on the use of the three leftmost bits in the TOS
   octet of an IPv4 header [RFC791] or the Traffic Class octet of an
   IPv6 header [RFC2460]. In a network that contains DiffServ-capable
   nodes, such a conflict can cause failures in establishing TCP
   connections or can cause some established TCP connections to be reset
   undesirably. This draft proposes a modification to TCP for resolving
   the conflict.

Xiao et al                                                      [Page 1]


ID                  TCP and the IP Precedence Field       September 1999

3. Introduction

   In TCP, each connection has state associated with it.  This state is
   reflected by a set of variables stored in the Transmission Control
   Block (TCB) at each end.  These variables include, among others, the
   local and remote socket numbers, the precedence of the connection,
   and the security level and compartment.  Both ends must agree on the
   setting of the precedence and security parameters in order to
   establish a connection and keep it open.

   There is no field in the TCP header that indicates the precedence of
   a segment. Instead, the precedence field in the header of the IP
   packet is used as the indication.  The security level and compartment
   are likewise carried in the IP header, but as IP options rather than
   a fixed header field.  Because of this difference, the problem with
   precedence discussed in this memo does not apply to them.

   TCP requires that the precedence (and security parameters) of a
   connection remain unchanged during the lifetime of the connection.
   Therefore, for an established TCP connection, the receipt of a
   segment whose precedence differs from that of the connection
   indicates an error, and the connection must be reset [RFC793, page
   37].

   With the advent of DiffServ, intermediate nodes may modify the
   Differentiated Services Codepoint (DSCP) [RFC2474] of the IP header
   to indicate the desired Per-hop Behavior (PHB) [RFC2475, RFC2597,
   RFC2598]. The DSCP includes the three bits formerly known as the
   precedence field.  Because precedence-aware endpoints will consider
   any modification to those three bits illegal, such modifications may
   cause failures in establishing connections, or may cause established
   connections to be reset.

4. Terminology

   Segment: the unit of data that TCP sends to IP

   Precedence Field: the three leftmost bits in the TOS octet of an IPv4
   header or the Traffic Class octet of an IPv6 header. Note that in
   DiffServ, these three bits may or may not be used to denote the
   precedence of the IP packet.

   TOS Field: bits 3-6 in the TOS octet of an IPv4 header [RFC1349].

   MBZ Field: Must Be Zero; bit 7 in the TOS octet of an IPv4 header.

   The structure of the TOS octet is depicted below:

                   0     1     2     3     4     5     6     7
                +-----+-----+-----+-----+-----+-----+-----+-----+
                |   PRECEDENCE    |          TOS          | MBZ |
                +-----+-----+-----+-----+-----+-----+-----+-----+

   Traffic Class Octet: the TOS octet's counterpart in IPv6

   DS Field: the TOS octet of an IPv4 header, or the Traffic Class octet
   of an IPv6 header, is renamed the Differentiated Services (DS) Field
   by DiffServ.

   The structure of the DS field is depicted below:

                  0   1   2   3   4   5   6   7
                +---+---+---+---+---+---+---+---+
                |         DSCP          |  CU   |
                +---+---+---+---+---+---+---+---+

   DSCP: Differentiated Services Codepoint, the leftmost 6 bits in the
   DS field.

   CU:   currently unused.

   Per-hop Behavior (PHB): a description of the externally observable
   forwarding treatment applied at a differentiated services-compliant
   node to a behavior aggregate.
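
   The two layouts defined in this section describe the same octet.  The
   following minimal sketch, which is illustrative only and not part of
   any specification, shows how one octet value splits under each
   layout.  The function names and the example value 0xB8 are
   assumptions chosen for illustration.

```python
def split_tos_octet(octet: int) -> dict:
    """Split an octet using the TOS octet layout (PRECEDENCE, TOS, MBZ)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return {
        "precedence": (octet >> 5) & 0x07,  # bits 0-2, the leftmost three
        "tos": (octet >> 1) & 0x0F,         # bits 3-6
        "mbz": octet & 0x01,                # bit 7
    }


def split_ds_field(octet: int) -> dict:
    """Split the same octet using the DS field layout (DSCP, CU)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return {
        "dscp": (octet >> 2) & 0x3F,  # bits 0-5, the leftmost six
        "cu": octet & 0x03,           # bits 6-7
    }


# Example value chosen for illustration: binary 1011 1000.
octet = 0xB8
print(split_tos_octet(octet))  # {'precedence': 5, 'tos': 12, 'mbz': 0}
print(split_ds_field(octet))   # {'dscp': 46, 'cu': 0}
```

   Because the DSCP overlaps the precedence field, any change to the
   upper three bits of the DSCP is visible to a precedence-aware TCP as
   a change of precedence.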

5. Problem Description

   The manipulation of the DSCP to achieve the desired PHB by DiffServ-
   capable nodes may conflict with TCP's use of the precedence field.
   This conflict can potentially cause problems for TCP implementations
   that conform to RFC 793.  First, page 36 of RFC 793 states:

        If the connection is in any non-synchronized state (LISTEN,
        SYN-SENT, SYN-RECEIVED), and the incoming segment acknowledges
        something not yet sent (the segment carries an unacceptable
        ACK), or if an incoming segment has a security level or compart-
        ment which does not exactly match the level and compartment
        requested for the connection, a reset is sent. If our SYN has
        not been acknowledged and the precedence level of the incoming
        segment is higher than the precedence level requested then
        either raise the local precedence level (if allowed by the user
        and the system) or send a reset; or if the precedence level of
        the incoming segment is lower than the precedence level
        requested then continue as if the precedence matched exactly (if
        the remote TCP cannot raise the precedence level to match ours
        this will be detected in the next segment it sends, and the con-
        nection will be terminated then). If our SYN has been ack-
        nowledged (perhaps in this incoming segment) the precedence
        level of the incoming segment must match the local precedence
        level exactly, if it does not a reset must be sent.

   This leads to Problem #1:  Consider a precedence-aware TCP module.
   If, during TCP's synchronization process, intermediate nodes modify
   the precedence fields of the SYN and/or ACK packets so that the
   received ACK packet carries a precedence different from the one
   picked by this TCP module, the TCP connection cannot be established,
   even if both modules actually agree on an identical precedence for
   the connection.
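
   The precedence rules quoted above for non-synchronized states can be
   sketched as follows.  This is a simplified model and not an
   implementation of RFC 793: the names Action and check_unsynchronized
   and the may_raise flag are hypothetical, and the security and
   acknowledgment checks in the quoted text are omitted.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    RAISE_LOCAL = "raise local precedence"
    RESET = "send reset"


def check_unsynchronized(local_prec: int, incoming_prec: int,
                         syn_acked: bool, may_raise: bool = False) -> Action:
    """Model only the precedence rules for non-synchronized states."""
    if syn_acked:
        # Our SYN has been acknowledged: precedence must match exactly.
        if incoming_prec == local_prec:
            return Action.CONTINUE
        return Action.RESET
    if incoming_prec > local_prec:
        # Raise the local level if the user and system allow it.
        return Action.RAISE_LOCAL if may_raise else Action.RESET
    # Lower or equal: continue as if the precedence matched exactly.
    return Action.CONTINUE


# Problem #1: both ends picked precedence 0, but an intermediate node
# rewrote the segment that acknowledges our SYN so that it arrives with
# precedence 5.
result = check_unsynchronized(local_prec=0, incoming_prec=5, syn_acked=True)
print(result.name)  # RESET
```

   In this model the connection attempt fails even though both ends
   chose the same precedence, which is the situation described in
   Problem #1.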

   Then, on page 37, RFC 793 states:

        If the connection is in a synchronized state (ESTABLISHED, FIN-
        WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT),
        ...  [unrelated statements snipped] If an incoming segment has a
        security level, or compartment, or precedence which does not
        exactly match the level, and compartment, and precedence
        requested for the connection, a reset is sent and connection
        goes to the CLOSED state.

   This leads to Problem #2:  For a precedence-aware TCP module, if the
   precedence field of a received segment from an established TCP con-
   nection has been changed en route by the intermediate nodes so as to
   be different from the precedence specified during the connection
   setup, the TCP connection will be reset.
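
   Problem #2 can be illustrated with a minimal sketch.  The remark
   function stands in for a hypothetical DiffServ node that overwrites
   the DSCP, and rfc793_synchronized_reset models only the precedence
   part of the rule quoted above.  Both names, and the DSCP value 10,
   are assumptions made for this example.

```python
def precedence(octet: int) -> int:
    """Return the three leftmost bits of the TOS / DS octet."""
    return (octet >> 5) & 0x07


def remark(octet: int, new_dscp: int) -> int:
    """Hypothetical DiffServ node: overwrite the DSCP, keep the CU bits."""
    return ((new_dscp & 0x3F) << 2) | (octet & 0x03)


def rfc793_synchronized_reset(conn_prec: int, received_octet: int) -> bool:
    """True if the precedence rule for synchronized states calls for a reset."""
    return precedence(received_octet) != conn_prec


sent_octet = 0x00                      # sender uses precedence 0
received_octet = remark(sent_octet, 10)  # node sets DSCP 001010

print(hex(received_octet))                           # 0x28
print(precedence(received_octet))                    # 1
print(rfc793_synchronized_reset(0, received_octet))  # True
```

   The sender never changed its precedence, yet the receiver observes
   precedence 1 and, under the quoted rule, resets the connection.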

   Problems #1 and #2 each have a mirror-image problem, in which a TCP
   connection that must be reset according to RFC 793 is not reset.

   Problem #3:  A TCP connection may be established between two TCP
   modules that pick different precedences, because the precedence
   fields of the SYN and ACK packets are modified by intermediate nodes,
   resulting in both modules believing that they agree on the
   precedence of the connection.

   Problem #4:  A TCP connection has been established normally by two
   TCP modules that pick the same precedence. But in the middle of the
   data transmission, one of the TCP modules changes the precedence of
   its segments. According to RFC 793, the TCP connection must be reset.
   In a DiffServ-capable environment, if the precedence of the segments
   is altered by intermediate nodes such that it retains the expected
   value when arriving at the other TCP module, the connection will not
   be reset.
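
   The masking effect behind Problems #3 and #4 can be sketched in the
   same style.  The forward function models a hypothetical path that
   either leaves the octet alone or remarks its DSCP.  All names and
   values are assumptions made for illustration.

```python
from typing import Optional


def precedence(octet: int) -> int:
    """Return the three leftmost bits of the TOS / DS octet."""
    return (octet >> 5) & 0x07


def forward(octet: int, remark_dscp: Optional[int] = None) -> int:
    """Hypothetical path: optionally overwrite the DSCP en route."""
    if remark_dscp is None:
        return octet
    return ((remark_dscp & 0x3F) << 2) | (octet & 0x03)


def receiver_resets(expected_prec: int, received_octet: int) -> bool:
    """Precedence part of the RFC 793 rule for synchronized states."""
    return precedence(received_octet) != expected_prec


# Problem #4: the connection was set up with precedence 0, then the
# sender starts using precedence 3 (octet 0x60).
sent_octet = 3 << 5

# Without remarking, the receiver sees the change and resets.
print(receiver_resets(0, forward(sent_octet)))                 # True

# If a node remarks the DSCP to 0, the change is hidden.
print(receiver_resets(0, forward(sent_octet, remark_dscp=0)))  # False
```

   In the second case the reset that RFC 793 requires never happens,
   because the receiver cannot observe the sender's change.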

6. Proposed Modification to TCP

   The proposed modification to TCP is that TCP must ignore the pre-
   cedence of all received segments. More specifically:

   (1) In TCP's synchronization process, the TCP modules at both ends
   must ignore the precedence fields of the SYN and ACK packets. The TCP
   connection will be established if all the conditions specified by RFC
   793, other than those concerning precedence, are satisfied.

   (2) After a connection is established, each end sends segments with
   its desired precedence. The precedences of the two ends may be the
   same or may differ (because precedence is ignored during connection
   setup), and the precedence fields may also be changed by intermediate
   nodes.  In either case, the receiving end ignores the precedence
   field, and the TCP connection will not be reset.

   Problems #1 and #2 are solved by this proposed modification. Problems
   #3 and #4 become non-issues because TCP must ignore the precedence.
   In a DiffServ-capable environment, the two cases described in prob-
   lems #3 and #4 should be allowed.
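
   The difference between the RFC 793 check and the proposed check can
   be sketched as follows.  This is a simplified model under stated
   assumptions: the Segment type and both function names are
   hypothetical, and the security level and compartment comparison is
   reduced to a single boolean that the proposal is assumed to leave
   unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    security_matches: bool  # security level and compartment both match
    precedence: int         # three leftmost bits of the received octet


def must_reset_rfc793(conn_prec: int, seg: Segment) -> bool:
    """RFC 793 rule: reset on a security or precedence mismatch."""
    return (not seg.security_matches) or seg.precedence != conn_prec


def must_reset_proposed(conn_prec: int, seg: Segment) -> bool:
    """Proposed rule: the received precedence is ignored entirely."""
    return not seg.security_matches


# A segment whose precedence was changed en route from 0 to 5.
remarked = Segment(security_matches=True, precedence=5)

print(must_reset_rfc793(0, remarked))    # True
print(must_reset_proposed(0, remarked))  # False
```

   The conn_prec argument is kept in must_reset_proposed only so that
   the two functions have the same signature; the proposed rule does
   not consult it.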

   The proposed modification to TCP is consistent with TCP's design
   philosophy.  Page 36 of RFC 793 states:

        As a general rule, reset (RST) must be sent whenever a segment
        arrives which apparently is not intended for the current connec-
        tion.  A reset must not be sent if it is not clear that this is
        the case.

   With the deployment of DiffServ, the precedence field can be modified
   by intermediate network nodes. A change in the precedence of a
   received segment does not necessarily indicate that the segment is
   not intended for the connection. Therefore, a RST must not be sent
   based solely on the change of the precedence of a received segment,
   no matter whether the TCP connection is in a non-synchronized state
   or synchronized state.

7. Security Considerations

   The RST generation rules given on page 37 of RFC 793 appear to indi-
   cate that the reception of any segment with an incorrect precedence
   field terminates a connection.  If this is true regardless of the
   correctness of the sequence numbers in the segment's header, then the
   RFC 793 rules present a serious denial-of-service threat, as all an
   attacker must do to terminate a connection is guess the port numbers
   and then send two segments with different precedence values; one of
   them is certain to terminate the connection.  Accordingly, the change
   to TCP processing proposed in this memo would yield a significant
   gain in terms of TCP resilience.
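
   The claim that one of two forged segments is certain to terminate
   the connection can be checked exhaustively over the eight possible
   precedence values.  The sketch below assumes, as the text above
   does, that the precedence check is applied regardless of sequence
   numbers.  The function names and the forged values 0 and 1 are
   chosen for illustration.

```python
def resets_under_rfc793(conn_prec: int, seg_prec: int) -> bool:
    """Precedence part of the RFC 793 rule for synchronized states."""
    return seg_prec != conn_prec


def resets_under_proposal(conn_prec: int, seg_prec: int) -> bool:
    """Proposed rule: precedence alone never triggers a reset."""
    return False


def attack_succeeds(conn_prec: int, forged=(0, 1),
                    rule=resets_under_rfc793) -> bool:
    """True if at least one forged segment causes a reset."""
    return any(rule(conn_prec, p) for p in forged)


# Two different forged values cannot both equal the connection's
# precedence, so the attack works for every possible setting.
print(all(attack_succeeds(p) for p in range(8)))  # True

# Under the proposed rule the same segments never cause a reset.
print(any(attack_succeeds(p, rule=resets_under_proposal)
          for p in range(8)))                     # False
```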

   On the other hand, the stricter processing rules of RFC 793 in prin-
   ciple make TCP spoofing attacks more difficult, as the attacker must
   not only guess the victim TCP's initial sequence number, but also its
   precedence setting.

   Finally, the security issues of each PHB group are addressed in the
   PHB group's specification [RFC2597, RFC2598].

8. References

   [RFC791]
        Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.

   [RFC793]
        Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
        September 1981.

   [RFC1349]
        Almquist, P., "Type of Service in the Internet Protocol Suite",
        RFC 1349, July 1992.

   [RFC2460]
        Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
        Specification", RFC 2460, December 1998.

   [RFC2474]
        Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of
        the Differentiated Services Field (DS Field) in the IPv4 and
        IPv6 Headers", RFC 2474, December 1998.

   [RFC2475]
        Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z. and W.
        Weiss, "An Architecture for Differentiated Services", RFC 2475,
        December 1998.

   [RFC2597]
        Heinanen, J., Baker, F., Weiss, W. and J. Wroclawski, "Assured
        Forwarding PHB Group", RFC 2597, June 1999.

   [RFC2598]
        Jacobson, V., Nichols, K. and K. Poduri, "An Expedited
        Forwarding PHB", RFC 2598, June 1999.

9. Authors' Addresses

   Xipeng Xiao <xipeng@globalcenter.net>
   Frontier Globalcenter
   141 Caspian Court
   Sunnyvale, CA 94087
   USA
   Phone: +1 408-543-4801

   Edward Crabbe <ecrabbe@globalcenter.net>
   Frontier Globalcenter
   141 Caspian Court
   Sunnyvale, CA 94087
   USA
   Phone: +1 408-543-4827

   Alan Hannan <alan@globalcenter.net>
   Frontier Globalcenter
   141 Caspian Court
   Sunnyvale, CA 94087
   USA
   Phone: +1 408-543-4891

   Vern Paxson <vern@aciri.org>
   ACIRI/ICSI
   1947 Center Street
   Suite 600
   Berkeley, CA 94704-1198
   USA
   Phone: +1 510-642-4274 x302
