TCPM WG                                                        J. Touch
Internet Draft                                                  USC/ISI
Intended status: Informational                                 M. Welzl
Expires: April 2017                                            S. Islam
                                                     University of Oslo
                                                                 J. You
                                                                 Huawei
                                                       October 28, 2016



                     TCP Control Block Interdependence
                      draft-touch-tcpm-2140bis-01.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html




Touch, et al.           Expires April 28, 2017                 [Page 1]


Internet-Draft    TCP Control Block Interdependence        October 2016


   This Internet-Draft will expire on April 28, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   This memo describes interdependent TCP control blocks, where part of
   the TCP state is shared among similar concurrent or consecutive
   connections. TCP state includes a combination of parameters, such as
   connection state, current round-trip time estimates, congestion
   control information, and process information. Most of this state is
   maintained on a per-connection basis in the TCP Control Block (TCB),
   but implementations can (and do) share certain TCB information
   across connections to the same host. Such sharing is intended to
   improve overall transient transport performance, while maintaining
   backward-compatibility with existing implementations. The sharing
   described herein is limited to only the TCB initialization and so
   has no effect on the long-term behavior of TCP after a connection
   has been established.

Table of Contents


   1. Introduction...................................................3
   2. Conventions used in this document..............................3
   3. Terminology....................................................4
   4. The TCP Control Block (TCB)....................................4
   5. TCB Interdependence............................................5
   6. An Example of Temporal Sharing.................................5
   7. An Example of Ensemble Sharing.................................7
   8. Compatibility Issues...........................................9
   9. Implications..................................................11
   10. Implementation Observations..................................12
   11. Security Considerations......................................13
   12. IANA Considerations..........................................14
   13. References...................................................15
      13.1. Normative References....................................15
      13.2. Informative References..................................15
   14. Acknowledgments..............................................17

1. Introduction

   TCP is a connection-oriented reliable transport protocol layered
   over IP [RFC793]. Each TCP connection maintains state, usually in a
   data structure called the TCP Control Block (TCB). The TCB contains
   information about the connection state, its associated local
   process, and feedback parameters about the connection's transmission
   properties. As originally specified and usually implemented, most
   TCB information is maintained on a per-connection basis. Some
   implementations can (and now do) share certain TCB information
   across connections to the same host. Such sharing is intended to
   lead to better overall transient performance, especially for
   numerous short-lived and simultaneous connections, as often used in
   the World-Wide Web [Be94], [Br02].

   This document discusses TCB state sharing that affects only the TCB
   initialization, and so has no effect on the long-term behavior of
   TCP after a connection has been established. Path information shared
   across SYN destination port numbers assumes that TCP segments having
   the same host-pair experience the same path properties, irrespective
   of TCP port numbers. The observations about TCB sharing in this
   document apply similarly to any protocol with congestion state,
   including SCTP [RFC4960] and DCCP [RFC4340], as well as for
   individual subflows in Multipath TCP [RFC6824].



2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying significance described in RFC 2119.

   In this document, the characters ">>" preceding one or more indented
   lines indicate a statement using the key words listed above. This
   convention aids reviewers in quickly identifying or finding the
   portions of this RFC covered by these keywords.





3. Terminology

   Host - a source or sink of TCP segments associated with a single IP
   address

   Host-pair - a pair of hosts and their corresponding IP addresses

   Path - an Internet path between the IP addresses of two hosts

4. The TCP Control Block (TCB)

   A TCB describes the data associated with each connection, i.e., with
   each association of a pair of applications across the network. The
   TCB contains at least the following information [RFC793]:

        Local process state
            pointers to send and receive buffers
            pointers to retransmission queue and current segment
            pointers to Internet Protocol (IP) PCB
        Per-connection shared state
            macro-state
                connection state
                timers
                flags
                local and remote host numbers and ports
                TCP option state
            micro-state
                send and receive window state (size*, current number)
                round-trip time and variance
                cong. window size (snd_cwnd)*
                cong. window size threshold (ssthresh)*
                path maximum transmission unit (PMTU)*
                max window size seen*
                MSS#
                round-trip time and variance#

   The per-connection information is shown as split into macro-state
   and micro-state, terminology borrowed from [Co91]. Macro-state
   describes the finite state machine; we include the endpoint numbers
   and components (timers, flags) used to help maintain that state.
   Macro-state describes the protocol for establishing and maintaining
   shared state about the connection. Micro-state describes the
   protocol after a connection has been established, to maintain the
   reliability and congestion control of the data transferred in the
   connection.




   We further distinguish two other classes of shared micro-state that
   are associated more with host-pairs than with application pairs. One
   class is clearly host-pair dependent (#, e.g., MSS, RTT), and the
   other is host-pair dependent in its aggregate (*, e.g., congestion
   window information, current window sizes, etc.).
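
   The classification above can be sketched as a data structure. The
   following Python sketch is illustrative only: the field names, the
   default values, and the shareable_fields() helper are assumptions
   made for this example and are not taken from [RFC793] or from any
   implementation. Only a subset of the TCB fields is shown.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional

PER_CONNECTION = "per-connection"
HOST_PAIR = "#"            # clearly host-pair dependent
HOST_PAIR_AGGREGATE = "*"  # host-pair dependent in its aggregate


def _f(default, sharing):
    """Declare a dataclass field tagged with its sharing class."""
    return field(default=default, metadata={"sharing": sharing})


@dataclass
class TCB:
    # Macro-state: finite state machine and endpoint identification.
    state: str = _f("CLOSED", PER_CONNECTION)
    local_port: int = _f(0, PER_CONNECTION)
    remote_port: int = _f(0, PER_CONNECTION)
    # Micro-state: reliability and congestion control.
    mss: Optional[int] = _f(None, HOST_PAIR)
    rtt: Optional[float] = _f(None, HOST_PAIR)
    rttvar: Optional[float] = _f(None, HOST_PAIR)
    snd_cwnd: Optional[int] = _f(None, HOST_PAIR_AGGREGATE)
    ssthresh: Optional[int] = _f(None, HOST_PAIR_AGGREGATE)
    pmtu: Optional[int] = _f(None, HOST_PAIR_AGGREGATE)


def shareable_fields(sharing: str) -> List[str]:
    """Return the names of the TCB fields in the given sharing class."""
    return [f.name for f in fields(TCB) if f.metadata["sharing"] == sharing]
```

   Tagging each field with its class keeps the decision about what
   may be copied (#) and what has to be treated as an aggregate (*)
   next to the field definition itself.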

5. TCB Interdependence

   There are two cases of TCB interdependence. Temporal sharing occurs
   when the TCB of an earlier (now CLOSED) connection to a host is used
   to initialize some parameters of a new connection to that same host,
   i.e., in sequence. Ensemble sharing occurs when a currently active
   connection to a host is used to initialize another (concurrent)
   connection to that host.

6. An Example of Temporal Sharing

   The TCB data cache is accessed in two ways: it is read to initialize
   new TCBs and written when more current per-host state is available.
   New TCBs are initialized using context from past connections as
   follows:

             TEMPORAL SHARING - TCB Initialization

             Cached TCB           New TCB
             ----------------------------------------
             old_PMTU             old_PMTU

             old_MSS              old_MSS

             old_RTT              old_RTT

             old_RTTvar           old_RTTvar

             old_option           (option specific)

             old_ssthresh         old_ssthresh

             old_snd_cwnd         old_snd_cwnd

   Most cached TCB values are updated when a connection closes. Two
   exceptions are PMTU, which is updated after Path MTU Discovery
   [RFC4821], and MSS, which is updated whenever the MSS option is
   received in a TCP header.
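
   The initialization table above amounts to a copy from the cache
   into the new TCB. A minimal Python sketch follows, assuming a cache
   keyed by remote IP address; the SharedState type, the
   init_new_tcb() function, and the use of caller-supplied defaults on
   a cache miss are hypothetical and chosen for illustration. Option
   state is omitted because its handling is option specific.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass
class SharedState:
    pmtu: int
    mss: int
    rtt: float
    rttvar: float
    ssthresh: int
    snd_cwnd: int


def init_new_tcb(cache: Dict[str, SharedState], remote_ip: str,
                 defaults: SharedState) -> SharedState:
    """Copy cached values into a new TCB, or use defaults on a miss."""
    old = cache.get(remote_ip)
    # replace() with no changes returns a copy, so the new connection
    # cannot modify the cache entry by accident.
    return replace(old if old is not None else defaults)
```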





   Sharing MSS information affects only data in the SYN of the next
   connection, because MSS information is typically included in most
   TCP segments.

   [TBD - complete this section with details for TFO and other options
   whose state may, must, or must not be shared] The way in which other
   TCP option state can be shared depends on the details of that
   option. E.g., TFO state includes the TCP Fast Open Cookie [RFC7413]
   or, in case TFO fails, a negative TCP Fast Open response (from
   [RFC7413]: "The client MUST cache negative responses from the server
   in order to avoid potential connection failures. Negative responses
   include the server not acknowledging the data in the SYN, ICMP error
   messages, and (most importantly) no response (SYN-ACK) from the
   server at all, i.e., connection timeout."). TFOinfo is cached when a
   connection is established.

   Other TCP option state might not be as readily cached. E.g., TCP-AO
   [RFC5925] success or failure between a host pair for a single SYN
   destination port might be usefully cached. TCP-AO success or failure
   to other SYN destination ports on that host pair is never useful to
   cache because TCP-AO security parameters can vary per service.

                TEMPORAL SHARING - Cache Updates

   Cached TCB   Current TCB     when?   New Cached TCB
   ----------------------------------------------------------------
   old_PMTU     curr_PMTU       PMTUD   curr_PMTU

   old_MSS      curr_MSS        MSSopt  curr_MSS

   old_RTT      curr_RTT        CLOSE   merge(curr,old)

   old_RTTvar   curr_RTTvar     CLOSE   merge(curr,old)

   old_option   curr option     ESTAB   (depends on option)

   old_ssthresh curr_ssthresh   CLOSE   merge(curr,old)

   old_snd_cwnd curr_snd_cwnd   CLOSE   merge(curr,old)

   Caching PMTU and MSS is trivial; reported values are cached, and the
   most recent values are used. The cache is updated when the MSS
   option is received or after PMTUD (i.e., when an ICMPv4
   Fragmentation Needed [RFC1191] or ICMPv6 Packet Too Big message is
   received [RFC1981] or the equivalent is inferred, e.g., as from
   PLPMTUD [RFC4821]), respectively, so the cache always has the most
   recent values from any connection. For MSS, the cache is consulted
   only at connection establishment and not otherwise updated, which
   means that MSS options do not affect current connections. The
   default MSS is never saved; only reported MSS values update the
   cache, so an explicit override is required to reduce the MSS. Other
   options are copied or merged depending on the details of each
   option. E.g., TFO state is updated when a connection is established
   and read before establishing a new connection.
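
   The PMTU and MSS handling described above can be sketched as two
   event handlers and one read-only lookup. The PathCache class and
   its method names are hypothetical; the sketch assumes that the
   caller supplies the default MSS, which is returned on a miss but
   never written to the cache.

```python
from typing import Dict


class PathCache:
    """Per-host cache of the most recently reported PMTU and MSS."""

    def __init__(self) -> None:
        self.pmtu: Dict[str, int] = {}
        self.mss: Dict[str, int] = {}

    def on_pmtud(self, host: str, pmtu: int) -> None:
        # Called after PMTUD or PLPMTUD yields a value for this host.
        self.pmtu[host] = pmtu

    def on_mss_option(self, host: str, mss: int) -> None:
        # Called when an MSS option is received from this host.
        self.mss[host] = mss

    def mss_for_new_connection(self, host: str, default_mss: int) -> int:
        # Read-only lookup: the default is returned but never stored.
        return self.mss.get(host, default_mss)
```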

   RTT values are updated by a more complicated mechanism
   [RFC1644][Ja86]. Dynamic RTT estimation requires a sequence of RTT
   measurements. As a result, the cached RTT (and its variance) is an
   average of its previous value with the contents of the currently
   active TCB for that host, when a TCB is closed. RTT values are
   updated only when a connection is closed. The method for merging old
   and current values needs to attempt to reduce the transient for new
   connections. [THESE MERGE FUNCTIONS NEED TO BE SPECIFIED,
   considering e.g. [DM16] - TBD].

   The updates for RTT, RTTvar and ssthresh rely on existing
   information, i.e., old values. Should no such values exist, the
   current values are cached instead.
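
   Because the merge functions are not yet specified, the following
   Python sketch shows only one possible choice: an equal-weight
   average of the old and current values, with the current value
   cached as-is when no old value exists. The equal weighting and the
   names merge() and on_close() are assumptions made for illustration.

```python
from typing import Dict, Optional


def merge(curr: float, old: Optional[float]) -> float:
    """Average the current and cached values; cache curr if none cached."""
    if old is None:
        return curr
    return (curr + old) / 2


def on_close(cached: Dict[str, float], closing: Dict[str, float]) -> None:
    """Update the cached RTT, RTTvar, and ssthresh when a TCB closes."""
    for name in ("rtt", "rttvar", "ssthresh"):
        cached[name] = merge(closing[name], cached.get(name))
```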



7. An Example of Ensemble Sharing

   Sharing cached TCB data across concurrent connections requires
   attention to the aggregate nature of some of the shared state. For
   example, although MSS and RTT values can be shared by copying, it
   may not be appropriate to copy congestion window or ssthresh
   information (see Section 8 for a discussion of congestion window or
   ssthresh sharing).

               ENSEMBLE SHARING - TCB Initialization

               Cached TCB           New TCB
               ----------------------------------
               old_PMTU             old_PMTU

               old_MSS              old_MSS

               old_RTT              old_RTT

               old_RTTvar           old_RTTvar

               old_option           (option specific)



                   ENSEMBLE SHARING - Cache Updates

      Cached TCB  Current TCB  when?          New Cached TCB
      -----------------------------------------------------------
      old_PMTU    curr_PMTU    PMTUD/PLPMTUD  curr_PMTU

      old_MSS     curr_MSS     MSSopt         curr_MSS

      old_RTT     curr_RTT     update         rtt_update(old,curr)

      old_RTTvar  curr_RTTvar  update         rtt_update(old,curr)

      old_option  curr option  (depends)      (option specific)


   For ensemble sharing, TCB information should be cached as early as
   possible, sometimes before a connection is closed. Otherwise,
   opening multiple concurrent connections may not result in TCB data
   sharing if no connection closes before others open. The amount of
   work involved in updating the aggregate average should be minimized,
   but the resulting value should be equivalent to having all values
   measured within a single connection. The function "rtt_update" in
   the ensemble sharing table indicates this operation, which occurs
   whenever the RTT would have been updated in the individual TCP
   connection. As a result, the cache contains the shared RTT
   variables, which no longer need to reside in the TCB [Ja86].
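
   One way to obtain a result equivalent to measuring all samples
   within a single connection is to keep one estimator per host in the
   cache and feed it every sample from every connection. The Python
   sketch below assumes the commonly used smoothing gains of 1/8 and
   1/4 and initializes the variance to half of the first sample; these
   constants and the SharedRtt class are assumptions of this example,
   not requirements of this document.

```python
from typing import Optional


class SharedRtt:
    """RTT estimator shared by all connections to one host."""

    ALPHA = 1 / 8  # gain for the smoothed RTT (assumed)
    BETA = 1 / 4   # gain for the RTT variance (assumed)

    def __init__(self) -> None:
        self.rtt: Optional[float] = None
        self.rttvar: Optional[float] = None

    def rtt_update(self, sample: float) -> None:
        """Fold one RTT sample, from any connection, into the cache."""
        if self.rtt is None:
            self.rtt = sample
            self.rttvar = sample / 2
            return
        # Update the variance first, using the previous smoothed RTT.
        self.rttvar = ((1 - self.BETA) * self.rttvar
                       + self.BETA * abs(self.rtt - sample))
        self.rtt = (1 - self.ALPHA) * self.rtt + self.ALPHA * sample
```

   Each update costs a constant amount of work, and the order in which
   connections contribute samples is the order in which the samples
   are measured, as it would be within a single connection.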

   Congestion window size and ssthresh aggregation are more complicated
   in the concurrent case. When there is an ensemble of connections, we
   need to decide how that ensemble would have shared these variables,
   in order to derive initial values for new TCBs.

   Any assumption of this sharing can be incorrect, including this one,
   because identical endpoint address pairs may not share network
   paths. In current implementations, new congestion windows are set at
   an initial value of 4-10 segments [RFC3390][RFC6928], so that the
   sum of the current windows is increased for any new connection. This
   can have detrimental consequences where several connections share a
   highly congested link.

   There are several ways to initialize the congestion window in a new
   TCB among an ensemble of current connections to a host, as shown
   below. Current TCP implementations initialize it to four segments as
   standard [RFC3390] and 10 segments experimentally [RFC6928], and
   T/TCP hinted that it should be initialized to the old window size
   [RFC1644]. In the former cases, the assumption is that new
   connections should behave as conservatively as possible. In the
   latter T/TCP case, no accommodation is made for concurrent aggregate
   behavior.

   In either case, the sum of window sizes can increase, rather than
   remain constant. A different approach is to give each pending
   connection its "fair share" of the available congestion window, and
   let the connections balance from there. The assumption we make here
   is that new connections are implicit requests for an equal share of
   available link bandwidth, which should be granted at the expense of
   current connections. [TBD - a new method for safe congestion sharing
   will be described]

8. Compatibility Issues

   For the congestion and current window information, the initial
   values computed by TCB interdependence may not be consistent with
   the long-term aggregate behavior of a set of concurrent connections
   between the same endpoints. Under conventional TCP congestion
   control, if a single existing connection has converged to a
   congestion window of 40 segments and two newly joining concurrent
   connections assume initial windows of 10 segments [RFC6928], the
   current connection's window does not decrease to accommodate this
   additional load, and the connections can mutually interfere. One
   example of this is seen on low-bandwidth, high-delay links, where
   concurrent connections supporting Web traffic can collide because
   their initial windows were too large, even when set at one segment.

   [TBD - this paragraph needs to be revised based on new
   recommendations] Under TCB interdependence, all three connections
   could change to use a congestion window of 12 (rounded down to an
   even number from 13.33, i.e., 40/3). This would include both
   increasing the initial window of the new connections (vs. current
   recommendations [RFC6928]) and decreasing the congestion window of
   the current connection (from 40 down to 12). This gives the new
   connections a larger initial window than allowed by [RFC6928], but
   maintains the aggregate. Depending on whether the previous
   connections were in steady-state, this can result in more bursty
   behavior, e.g., when previous connections are idle and new
   connections commence with a large amount of available data to
   transmit. Additionally, reducing the congestion window of an
   existing connection needs to account for the number of packets that
   are already in flight.
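
   The arithmetic of the example above (40 segments divided among
   three connections, rounded down to an even number) can be written
   out as follows. The function name fair_share_cwnd() is hypothetical,
   and the sketch deliberately ignores packets already in flight.

```python
def fair_share_cwnd(aggregate_cwnd: int, connections: int) -> int:
    """Equal share of an aggregate window, rounded down to an even value."""
    share = aggregate_cwnd // connections  # 40 // 3 == 13
    return share - (share % 2)             # 13 rounds down to 12
```

   With these inputs the three windows sum to 36 segments, so the
   aggregate does not exceed the 40 segments to which the existing
   connection had converged.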

   Because this proposal attempts to anticipate the aggregate
   steady-state values of TCB state among a group or over time, it
   should avoid the transient effects of new connections. In addition,
   because it considers the ensemble and temporal properties of those
   aggregates, it should also prevent the transients of short-lived or
   multiple concurrent connections from adversely affecting the overall
   network performance. Analysis and experiments to validate these
   assumptions are ongoing. For example, [Ph12] recommends caching
   ssthresh for temporal sharing only when flows are long. Sharing
   ssthresh between short flows can deteriorate the overall performance
   of individual connections [Ph12, Nd16], although this may benefit
   overall network performance.  [TBD - the details of this issue need
   to be summarized and clarified herein].

   [TBD - placeholder for corresponding RTT discussion]

   Due to mechanisms like ECMP and LAG [RFC7424], TCP connections
   sharing the same host-pair may not always share the same path. This
   does not matter for host-specific information such as RWIN and TCP
   option state, such as TFOinfo. When TCB information is shared across
   different SYN destination ports, path-related information can be
   incorrect; however, the impact of this error is potentially
   diminished if (as discussed here) TCB sharing affects only the
   transient event of a connection start or if TCB information is
   shared only within connections to the same SYN destination port. In
   case of Temporal Sharing, TCB information could also become invalid
   over time. Because this is similar to the case when a connection
   becomes idle, mechanisms that address idle TCP connections (e.g.,
   [RFC7661]) could also be applied to TCB cache management.
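
   As one illustration of cache management for temporal sharing, the
   Python sketch below discards entries that are older than a
   configured lifetime. It does not implement the mechanism of
   [RFC7661]; the AgingTcbCache class, the max_age parameter, and the
   choice to drop stale entries entirely (rather than decay them) are
   assumptions made for this example. Timestamps are passed in
   explicitly so that the behavior does not depend on a system clock.

```python
from typing import Any, Dict, Optional, Tuple


class AgingTcbCache:
    """TCB cache whose entries expire after max_age seconds."""

    def __init__(self, max_age: float) -> None:
        self.max_age = max_age
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def store(self, host: str, state: Any, now: float) -> None:
        self._entries[host] = (now, state)

    def lookup(self, host: str, now: float) -> Optional[Any]:
        entry = self._entries.get(host)
        if entry is None:
            return None
        stored_at, state = entry
        if now - stored_at > self.max_age:
            # Too old to trust: remove it and fall back to defaults.
            del self._entries[host]
            return None
        return state
```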

   There may be additional considerations to the way in which TCB
   interdependence rebalances congestion feedback among the current
   connections, e.g., it may be appropriate to consider the impact of a
   connection being in Fast Recovery [RFC5861] or some other similar
   unusual feedback state, e.g., as inhibiting or affecting the
   calculations described herein.

   TCP is sometimes used in situations where packets of the same host-
   pair always take the same path. Because ECMP and LAG examine TCP
   port numbers, they may not be supported when TCP segments are
   encapsulated, encrypted, or altered - for example, some Virtual
   Private Networks (VPNs) are known to use proprietary UDP
   encapsulation methods. Similarly, they cannot operate when the TCP
   header is encrypted, e.g., when using IPsec ESP. TCB interdependence
   among the entire set sharing the same endpoint IP addresses should
   work without problems under these circumstances. Moreover, measures
   to increase the probability that connections use the same path could
   be applied: e.g., the connections could be given the same IPv6 flow
   label. TCB interdependence can also be extended to sets of host IP
   address pairs that share the same network path conditions, such as
   when a group of addresses is on the same LAN (see Section 9).


9. Implications

   There are several implications to incorporating TCB interdependence
   in TCP implementations. First, it may reduce the need for
   application-layer multiplexing for performance enhancement
   [RFC7231]. Protocols like HTTP/2 [RFC7540] avoid connection
   reestablishment costs by serializing or multiplexing a set of
   per-host connections across a single TCP connection. This avoids
   TCP's per-connection OPEN handshake and also avoids recomputing MSS,
   RTT, and congestion windows. By avoiding the so-called "slow-start
   restart," performance can be optimized. TCB interdependence can
   provide the "slow-start restart avoidance" of multiplexing, without
   requiring a multiplexing mechanism at the application layer.

   TCB interdependence pushes some of the TCP implementation from the
   traditional transport layer (in the ISO model), to the network
   layer. This acknowledges that some state is in fact per-host-pair or
   can be per-path as indicated solely by that host-pair. Transport
   protocols typically manage per-application-pair associations (per
   stream), and network protocols manage per-host-pair and path
   associations (routing). Round-trip time, MSS, and congestion
   information could be more appropriately handled in a network-layer
   fashion, aggregated among concurrent connections, and shared across
   connection instances [RFC3124].

   An earlier version of RTT sharing suggested implementing RTT state
   at the IP layer, rather than at the TCP layer [Ja86]. Our
   observations are for sharing state among TCP connections, which
   avoids some of the difficulties in an IP-layer solution. One such
   problem is determining the associated prior outgoing packet for an
   incoming packet, to infer RTT from the exchange. Because RTTs are
   still determined inside the TCP layer, this is simpler than at the
   IP layer. This is a case where information should be computed at the
   transport layer, but could be shared at the network layer.

   Per-host-pair associations are not the limit of these techniques. It
   is possible that TCBs could be similarly shared between hosts on a
   subnet or within a cluster, because the predominant path can be
   subnet-subnet, rather than host-host. Additionally, TCB
   interdependence can be applied to any protocol with congestion
   state, including SCTP [RFC4960] and DCCP [RFC4340], as well as for
   individual subflows in Multipath TCP [RFC6824].

   There may be other information that can be shared between concurrent
   connections. For example, knowing that another connection has just
   tried to expand its window size and failed, a connection may choose
   not to attempt the same for some period. The idea is that existing
   TCP implementations infer the behavior of all competing connections,
   including those within the same host or subnet. One possible
   optimization is to make that implicit feedback explicit, via
   extended information associated with the endpoint IP address and its
   TCP implementation, rather than per-connection state in the TCB.

   Like its initial version in 1997, this document's approach to TCB
   interdependence focuses on sharing a set of TCBs by updating the TCB
   state to reduce the impact of transients when connections begin or
   end. Other mechanisms have since been proposed to continuously share
   information between all ongoing communication (including
   connectionless protocols), updating the congestion state during any
   congestion-related event (e.g., timeout, loss confirmation, etc.)
   [RFC3124]. By dealing exclusively with transients, TCB
   interdependence is more likely to exhibit the same behavior as
   unmodified, independent TCP connections.

10. Implementation Observations

   The observation that some TCB state is host-pair specific rather
   than application-pair dependent is not new and is a common
   engineering decision in layered protocol implementations. A
   discussion of sharing RTT information among protocols layered over
   IP, including UDP and TCP, occurred in [Ja86]. Although now
   deprecated, T/TCP was the first to propose using caches in order to
   maintain TCB states (see Appendix A for more information).

   The table below describes the current implementation status for some
   TCB information in Linux kernel version 4.6, FreeBSD 10 and Windows
   (as of October 2016).

      TCB data     Status
      -----------------------------------------------------------
      old_MSS      Cached and shared in Linux

      old_RTT      Cached and shared in FreeBSD

      old_RTTvar   Cached and shared in FreeBSD

      old_PMTU     Cached and shared in FreeBSD and Windows

      old_TFOinfo  Cached and shared in Linux and Windows

      old_snd_cwnd Not shared

      old_ssthresh Cached and shared in FreeBSD and Linux:
                   FreeBSD: arithmetic mean of ssthresh and
                   previous value if a previous value exists;
                   Linux: depending on state,
                   max(cwnd/2, ssthresh) in most cases
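
   The two ssthresh rules summarized in the table can be written out
   as follows. This Python sketch only paraphrases the table; it is
   not derived from the source code of either operating system, the
   function names are hypothetical, and integer division is an
   assumption made for this example.

```python
from typing import Optional


def freebsd_style_ssthresh(ssthresh: int, cached: Optional[int]) -> int:
    """Arithmetic mean of ssthresh and the previous value, if any."""
    if cached is None:
        return ssthresh
    return (ssthresh + cached) // 2


def linux_style_ssthresh(cwnd: int, ssthresh: int) -> int:
    """The 'most cases' rule from the table: max(cwnd/2, ssthresh)."""
    return max(cwnd // 2, ssthresh)
```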

11. Security Considerations

   These suggested implementation enhancements do not have additional
   ramifications for explicit attacks. These enhancements may be
   susceptible to denial-of-service attacks if not otherwise secured.
   For example, an application can open a connection and set its window
   size to zero, denying service to any other subsequent connection
   between those hosts.

   TCB sharing may be susceptible to denial-of-service attacks,
   wherever the TCB is shared, between connections in a single host, or
   between hosts if TCB sharing is implemented within a subnet (see
   Section 9). Some shared TCB parameters are used only to
   create new TCBs, others are shared among the TCBs of ongoing
   connections. New connections can join the ongoing set, e.g., to
   optimize send window size among a set of connections to the same
   host.

   Attacks on parameters used only for initialization affect only the
   transient performance of a TCP connection. For short connections,
   the performance ramification can approach that of a
   denial-of-service attack. E.g., if an application changes its TCB to
   have a false and small window size, subsequent connections would
   experience performance degradation until their window grew
   appropriately.

   The solution is to limit the effect of compromised TCB values. TCBs
   are compromised when they are modified directly by an application or
   transmitted between hosts via unauthenticated means (e.g., by using
   a dirty flag). TCBs that are not compromised by application
   modification do not have any unique security ramifications. Note
   that the proposed parameters for TCB sharing are not currently
   modifiable by an application.

   All shared TCBs MUST be validated against default minimum parameters
   before being used for new connections. This validation would not impact
   performance, because it occurs only at TCB initialization. This
   limits the effect of attacks on new connections to reducing the
   benefit of TCB sharing, resulting in the current default TCP
   performance. For ongoing connections, the effect of incoming packets
   on shared information should be both limited and validated against
   constraints before use. This is a beneficial precaution for existing
   TCP implementations as well.
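
   The validation step above can be sketched as a clamp applied once,
   when a new TCB is initialized from shared state. This is a minimal
   illustration, not a prescribed algorithm: the SharedTcb type, its
   field names, and the numeric floors (a 1460-byte MSS, a 4380-byte
   default initial window, a two-segment ssthresh floor) are all
   assumptions made for the example.

```python
from dataclasses import dataclass

# All constants below are assumptions chosen for illustration.
SMSS = 1460                 # assumed sender MSS, in bytes
DEFAULT_CWND = 4380         # assumed default initial window, in bytes
MIN_SSTHRESH = 2 * SMSS     # assumed floor for the slow-start threshold


@dataclass(frozen=True)
class SharedTcb:
    """Hypothetical subset of cached TCB state (sizes in bytes)."""
    cwnd: int
    ssthresh: int


def validate_for_new_connection(cached: SharedTcb) -> SharedTcb:
    """Raise any cached value that falls below its default minimum.

    A falsely small cached value is replaced by the default, so the
    worst case for a new connection is ordinary default behavior.
    """
    return SharedTcb(
        cwnd=max(cached.cwnd, DEFAULT_CWND),
        ssthresh=max(cached.ssthresh, MIN_SSTHRESH),
    )
```

   With these assumed floors, a poisoned entry such as
   SharedTcb(cwnd=1, ssthresh=1) is raised to the defaults, while an
   entry already above the floors passes through unchanged. An
   implementation might also cap values from above; the text here only
   calls for a minimum.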

   TCBs modified by an application SHOULD NOT be shared, unless the new
   connection sharing the compromised information has been given
   explicit permission to use such information by the connection API.
   No mechanism for that indication currently exists, but it could be
   supported by an augmented API. This sharing restriction SHOULD be
   implemented in both the host and the subnet. Sharing on a subnet
   SHOULD utilize authentication to prevent undetected tampering of
   shared TCB parameters. These restrictions limit the security impact
   of modified TCBs both for connection initialization and for ongoing
   connections.
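
   The two restrictions above can be sketched together: a per-entry
   flag that blocks sharing of application-modified values unless the
   caller opts in, and a keyed integrity check for entries exchanged
   within a subnet. Everything named here is hypothetical: CacheEntry,
   the app_modified flag (one possible reading of a "dirty flag"), the
   allow_modified argument (standing in for the augmented API, which
   does not exist today), and the JSON encoding. HMAC-SHA256 is used
   only as one example of authentication; key distribution is out of
   scope.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CacheEntry:
    """Hypothetical shared-TCB cache entry; names are illustrative."""
    params: Dict[str, int]       # shared performance parameters
    app_modified: bool = False   # set when an application changed params


def params_for_new_connection(entry: CacheEntry,
                              allow_modified: bool = False
                              ) -> Optional[Dict[str, int]]:
    """Return shared parameters, or None if defaults must be used."""
    if entry.app_modified and not allow_modified:
        return None
    return dict(entry.params)


def seal(entry: CacheEntry, key: bytes) -> Tuple[bytes, bytes]:
    """Serialize an entry and compute an HMAC-SHA256 tag over it."""
    payload = json.dumps(
        {"params": entry.params, "app_modified": entry.app_modified},
        sort_keys=True,
    ).encode()
    return payload, hmac.new(key, payload, hashlib.sha256).digest()


def unseal(payload: bytes, tag: bytes, key: bytes) -> Optional[CacheEntry]:
    """Return the entry only if the tag verifies; otherwise None."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    fields = json.loads(payload)
    return CacheEntry(fields["params"], fields["app_modified"])
```

   In this sketch the modification flag travels inside the
   authenticated payload, so a receiving host can apply the same
   sharing restriction as the host that created the entry.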

   Finally, shared values MUST be limited to performance factors only.
   Other information, such as TCP sequence numbers, is already known to
   compromise security when shared.
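
   One way to enforce this limit is an explicit allow-list applied
   whenever state is copied out of a TCB. The sketch below assumes a
   TCB represented as a simple mapping; the field names and the
   contents of the allow-list are illustrative, not a definitive
   enumeration of shareable state.

```python
from typing import Dict, Union

Number = Union[int, float]

# Assumed allow-list of performance-only fields (illustrative names).
SHAREABLE_FIELDS = frozenset({"srtt", "rttvar", "cwnd", "ssthresh", "mss"})


def shareable_view(tcb: Dict[str, Number]) -> Dict[str, Number]:
    """Copy only allow-listed performance fields out of a TCB mapping.

    Anything not on the list, such as sequence numbers, is dropped.
    """
    return {name: value for name, value in tcb.items()
            if name in SHAREABLE_FIELDS}
```

   Using an allow-list rather than a deny-list means that a field
   added to the TCB later is not shared unless it is listed
   explicitly.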

12. IANA Considerations

   There are no IANA implications or requests in this document.

   This section should be removed upon final publication as an RFC.

13. References

13.1. Normative References

   [RFC793]  Postel, J., "Transmission Control Protocol," STD 7,
             RFC 793, Sept. 1981.

   [RFC1191] Mogul, J., Deering, S., "Path MTU Discovery," RFC 1191,
             Nov. 1990.

   [RFC1981] McCann, J., Deering, S., Mogul, J., "Path MTU Discovery
             for IP version 6," RFC 1981, Aug. 1996.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU
             Discovery," RFC 4821, Mar. 2007.

   [RFC7413] Cheng, Y., Chu, J., Radhakrishnan, S., Jain, A., "TCP Fast
             Open", RFC 7413, Dec. 2014.

13.2. Informative References

   [Br02]    Brownlee, N. and K. Claffy, "Understanding Internet
             Traffic Streams: Dragonflies and Tortoises", IEEE
             Communications Magazine, pp. 110-117, 2002.

   [Be94]    Berners-Lee, T., et al., "The World-Wide Web,"
             Communications of the ACM, V37, Aug. 1994, pp. 76-82.

   [Br94]    Braden, B., "T/TCP -- Transaction TCP: Source Changes for
             Sun OS 4.1.3," Release 1.0, USC/ISI, September 14, 1994.

   [Co91]    Comer, D., Stevens, D., Internetworking with TCP/IP, V2,
             Prentice-Hall, NJ, 1991.

   [FreeBSD] FreeBSD source code, Release 2.10, http://www.freebsd.org/

   [Ja86]    Jacobson, V., (mail to public list "tcp-ip", no archive
             found), 1986.

   [Nd16]    Dukkipati, N., Cheng, Y., Vahdat, A., "Research Impacting
             the Practice of Congestion Control," ACM SIGCOMM CCR
             (editorial).

   [DM16]    Matz, D., "Optimize TCP's Minimum Retransmission Timeout
             for Low Latency Environments", Master's thesis, Technical
             University Munich, 2016.

   [Ph12]    Hurtig, P., Brunstrom, A., "Enhanced metric caching for
             short TCP flows," 2012 IEEE International Conference on
             Communications (ICC), Ottawa, ON, 2012, pp. 1209-1213.

   [RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions
             Functional Specification," RFC 1644, July 1994.

   [RFC1379] Braden, R., "Extending TCP for Transactions -- Concepts,"
             RFC 1379, November 1992.

   [RFC3390] Allman, M., Floyd, S., Partridge, C., "Increasing TCP's
             Initial Window," RFC 3390, Oct. 2002.

   [RFC7231] Fielding, R., Reschke, J., Eds., "Hypertext Transfer
             Protocol (HTTP/1.1): Semantics and Content," RFC 7231,
             June 2014.

   [RFC3124] Balakrishnan, H., Seshan, S., "The Congestion Manager,"
             RFC 3124, June 2001.

   [RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion
             Control Protocol (DCCP)," RFC 4340, Mar. 2006.

   [RFC4960] Stewart, R., (Ed.), "Stream Control Transmission
             Protocol," RFC 4960, Sept. 2007.

   [RFC5861] Allman, M., Paxson, V., Blanton, E., "TCP Congestion
             Control," RFC 5681, Sept. 2009.

   [RFC5925] Touch, J., Mankin, A., Bonica, R., "The TCP Authentication
             Option," RFC 5925, June 2010.

   [RFC6824] Ford, A., Raiciu, C., Handley, M., Bonaventure, O., "TCP
             Extensions for Multipath Operation with Multiple
             Addresses," RFC 6824, Jan. 2013.

   [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., Mathis, M., "Increasing
             TCP's Initial Window," RFC 6928, Apr. 2013.

   [RFC7424] Krishnan, R., Yong, L., Ghanwani, A., So, N., Khasnabish,
             B., "Mechanisms for Optimizing Link Aggregation Group
             (LAG) and Equal-Cost Multipath (ECMP) Component Link
             Utilization in Networks", RFC 7424, Jan. 2015.

   [RFC7540] Belshe, M., Peon, R., Thomson, M., "Hypertext Transfer
             Protocol Version 2 (HTTP/2)", RFC 7540, May 2015.

   [RFC7661] Fairhurst, G., Sathiaseelan, A., Secchi, R., "Updating TCP
             to Support Rate-Limited Traffic", RFC 7661, Oct. 2015.

14. Acknowledgments

   The authors would like to thank Praveen Balasubramanian for
   information regarding TCB sharing in Windows, and Yuchung Cheng and
   Michael Scharf for comments on earlier versions of the draft. This
   work has received funding from a collaborative research project
   between the University of Oslo and Huawei Technologies Co., Ltd.,
   and is partly supported by USC/ISI's Postel Center.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Joe Touch
   USC/ISI
   4676 Admiralty Way
   Marina del Rey, CA 90292-6695
   USA

   Phone: +1 (310) 448-9151
   Email: touch@isi.edu


   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 85 24 20
   Email: michawe@ifi.uio.no


   Safiqul Islam
   University of Oslo
   PO Box 1080 Blindern
   Oslo  N-0316
   Norway

   Phone: +47 22 84 08 37
   Email: safiquli@ifi.uio.no


   Jianjie You
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing  210012
   China

   Email: youjianjie@huawei.com


15. Appendix A: TCB sharing history

   T/TCP proposed using caches to maintain TCB information across
   instances (temporal sharing), e.g., smoothed RTT, RTT variance,
   congestion avoidance threshold, and MSS [RFC1644]. These values were
   in addition to connection counts used by T/TCP to accelerate data
   delivery prior to the full three-way handshake during an OPEN. The
   goal was to aggregate TCB components where they reflect one
   association - that of the host-pair, rather than artificially
   separating those components by connection.

   At least one T/TCP implementation saved the MSS and aggregated the
   RTT parameters across multiple connections but, consistent with the
   original specification in [RFC1379], did not cache the congestion
   window information [Br94]. Some T/TCP implementations immediately
   updated MSS when the TCP MSS header option was received [Br94],
   although this was not addressed specifically in the concepts or
   functional specification [RFC1379][RFC1644]. In later T/TCP
   implementations, RTT values were updated only after a CLOSE, which
   does not benefit concurrent sessions.

   Temporal sharing of cached TCB data was originally implemented in
   the SunOS 4.1.3 T/TCP extensions [Br94] and the FreeBSD port of the
   same [FreeBSD]. As mentioned before, only the MSS and RTT parameters
   were cached, as originally specified in [RFC1379]. Later discussion
   of T/TCP suggested including congestion control parameters in this
   cache [RFC1644].


