TCP Maintenance and Minor                                M. Jethanandani
Extensions                                                 Cisco Systems
Internet-Draft                                                M. Bashyam
Intended status: Informational                      Ocarina Systems, Inc
Expires: August 11, 2007                                February 7, 2007



                    draft-mahesh-persist-timeout-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 11, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This informational document describes how a TCP connection can get
   stuck in persist state, and the implications for the system if there
   is no mechanism to time out this state.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].



Jethanandani & Bashyam   Expires August 11, 2007                [Page 1]


Internet-Draft  Improving TCP robustness in persist state  February 2007


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . 3
   2.  Solution  . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
   3.  Role of Application . . . . . . . . . . . . . . . . . . . . . . 5
   4.  IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6
   5.  Security Considerations . . . . . . . . . . . . . . . . . . . . 6
   6.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . 6
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . . . 6
     7.1.  Normative References  . . . . . . . . . . . . . . . . . . . 6
     7.2.  Informative References  . . . . . . . . . . . . . . . . . . 6
   Appendix A.  An Appendix  . . . . . . . . . . . . . . . . . . . . . 7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . . 7
   Intellectual Property and Copyright Statements  . . . . . . . . . . 8



































1.  Introduction

   RFC 1122 [RFC1122], Section 4.2.2.17, page 92, states: "A TCP MAY
   keep its offered receive window closed indefinitely.  As long as the
   receiving TCP continues to send acknowledgments in response to the
   probe segments, the sending TCP MUST allow the connection to stay
   open."

   The RFC goes on to say that it is important to remember that ACK
   (acknowledgment) segments that contain no data are not reliably
   transmitted by TCP.  Zero window probing MUST therefore be supported,
   to prevent a connection from hanging forever if an ACK segment that
   re-opens the window is lost.

   While it is clear why the sender needs to continue to probe the
   receiver, it is not clear why this process needs to continue
   indefinitely, particularly if the receiver reliably responds with an
   ACK and a window of zero.

   The particular situation we ran into involved a gaming client that
   received regular updates on the state of the game from the server.
   At some point the client paused the game, which effectively told the
   application to stop reading data from the TCP connection.
   HTTP-based Web conferencing is another example of such a setup.

   When the client stops reading data, the server continues to send
   data until the advertised window drops to zero, at which point the
   connection enters persist state.  Since the server still has
   buffered data for the client, it continues to probe the receiver.
   However, it is not clear what the sender is supposed to do if the
   receiver never exits this state.

   If the sender is servicing several such clients, the effect compounds
   to the point where the system runs out of buffers and/or connection
   resources.  The situation therefore lends itself to a DoS attack,
   especially because legitimate connections get dropped or start
   seeing degraded service.
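
   As a rough illustration of how the effect compounds, the sketch
   below computes how many stalled connections it takes to pin an
   entire send-buffer pool.  The function name and the figures used (a
   256 MB pool and 64 KB of unsent data per connection) are assumptions
   chosen for the example, not measurements from any system.

```python
def stalled_connections_to_exhaust(pool_bytes: int, per_conn_bytes: int) -> int:
    """Smallest number of stalled connections whose queued data fills the pool.

    Assumes (hypothetically) that every stalled connection pins the same
    amount of unsent data and that nothing else uses the pool.
    """
    if pool_bytes <= 0 or per_conn_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point.
    return -(-pool_bytes // per_conn_bytes)


# Assumed figures: 256 MB buffer pool, 64 KB pinned per stalled connection.
print(stalled_connections_to_exhaust(256 * 1024 * 1024, 64 * 1024))  # 4096
```

   Under these assumed figures, a few thousand idle but unresponsive
   clients are enough to starve legitimate connections of buffers.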

   It is quite possible for the receiving end to put the connection
   into persist state by advertising a zero window and then to answer
   every subsequent window probe with another zero window.  This can
   leave the sender holding on to a large number of buffers and the
   data in them.

   The problem applies to TCP and to TCP-derived transport protocols
   such as SCTP.







2.  Solution

   The current behavior of a connection in persist state SHALL remain
   the default behavior.  We propose an option that places an upper
   bound on the persist state, expressed either as an absolute time
   limit or as a set number of retries.

   To enable an upper bound on the persist state, the administrator MAY
   configure an option.  The option SHOULD be configured as a time or as
   a number of retries.  If both options are configured, whichever limit
   is reached first takes effect.
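
   The interaction between the two options can be sketched as a single
   predicate.  This is only an illustration: the function and parameter
   names are hypothetical, None is used here to mean "not configured",
   and the exact comparisons (strictly greater than for time, reached
   for retries) are assumptions about boundary handling.

```python
from typing import Optional


def persist_limit_reached(
    elapsed_seconds: float,
    retries_made: int,
    expiry_time: Optional[float] = None,
    expiry_retries: Optional[int] = None,
) -> bool:
    """Return True if the connection should be reset.

    `None` means the corresponding option is not configured.  With both
    unset the function never returns True, which models the default
    behavior of staying in persist state indefinitely.
    """
    time_hit = expiry_time is not None and elapsed_seconds > expiry_time
    retry_hit = expiry_retries is not None and retries_made >= expiry_retries
    # Whichever limit is reached first takes effect.
    return time_hit or retry_hit
```

   With neither option configured the predicate is always false, so the
   existing behavior is preserved unless an administrator opts in.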

   If the configured option is a time, it specifies how long the
   connection is allowed to stay in persist state.  This configured
   option is called persist-state-expiry-time.  When the connection
   enters persist state, i.e. when the receiver advertises a window of
   zero, the current time is saved in the connection entry.  This saved
   value is called persist-entry-time.  Thereafter, every time the
   persist timer expires (before it is set again), or when an ACK is
   received that continues to advertise a zero window, a check is made
   that the difference between the current time and persist-entry-time
   is not more than persist-state-expiry-time.  If it is more, the
   connection is reset and the connection resources are reclaimed by
   TCP.  If at any time after the connection has entered persist state,
   and before the connection is reset, the receiver advertises a
   non-zero window, persist-entry-time is cleared.
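
   A minimal sketch of the time-based check follows.  It is not taken
   from any TCP implementation: the class, its method names, the reset
   flag, and the caller-supplied clock value `now` are all assumptions
   made for illustration.  Only persist-state-expiry-time and
   persist-entry-time correspond to names used in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistTimeBound:
    """Hypothetical per-connection state for the time-based bound."""

    persist_state_expiry_time: float            # configured limit, seconds
    persist_entry_time: Optional[float] = None  # None: not in persist state
    reset: bool = False                         # True once aborted

    def on_window_update(self, window: int, now: float) -> None:
        """Call for every ACK; `window` is the advertised receive window."""
        if self.reset:
            return
        if window > 0:
            # Receiver re-opened the window: clear persist-entry-time.
            self.persist_entry_time = None
        elif self.persist_entry_time is None:
            # First zero-window advertisement: record when we entered.
            self.persist_entry_time = now
        else:
            self._check_expiry(now)

    def on_persist_timer(self, now: float) -> None:
        """Call when the persist timer fires, before it is re-armed."""
        if not self.reset and self.persist_entry_time is not None:
            self._check_expiry(now)

    def _check_expiry(self, now: float) -> None:
        if now - self.persist_entry_time > self.persist_state_expiry_time:
            self.reset = True  # a real stack would send RST and free buffers
            self.persist_entry_time = None
```

   Passing the clock in as an argument keeps the sketch deterministic;
   a real stack would read its own tick counter instead.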

   If the configured option is a number of retries, it specifies how
   many retries are made before the connection is aborted.  This
   configured option is called persist-state-expiry-retries.  When the
   connection enters persist state, i.e. when the receiver advertises a
   window of zero, the retry count held in the connection entry, called
   persist-state-retry-count, is cleared.  Thereafter, every time the
   persist timer expires (before it is set again), or when an ACK is
   received that continues to advertise a zero window, a check is made
   that persist-state-retry-count has not reached
   persist-state-expiry-retries.  If it has, the connection is reset
   and the connection resources are reclaimed by TCP.  As long as the
   limit has not been reached, persist-state-retry-count is incremented
   by one on each retry.  If at any time after the connection has
   entered persist state, and before the connection is reset, the
   receiver advertises a non-zero window, persist-state-retry-count is
   cleared.  The persist-state-expiry-retries option is coarser grained
   than the persist-state-expiry-time option.
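
   The retry-based check can be sketched in the same way.  The class
   and method names below are hypothetical.  The sketch also makes two
   interpretive assumptions: the count is incremented once per probe
   sent on persist timer expiry, and the connection is reset as soon as
   the count has reached persist-state-expiry-retries and the receiver
   is still advertising a zero window (or the timer fires again).

```python
from dataclasses import dataclass


@dataclass
class PersistRetryBound:
    """Hypothetical per-connection state for the retry-based bound."""

    persist_state_expiry_retries: int   # configured limit on probes
    persist_state_retry_count: int = 0  # probes sent in this persist episode
    in_persist: bool = False
    reset: bool = False

    def _limit_reached(self) -> bool:
        return self.persist_state_retry_count >= self.persist_state_expiry_retries

    def on_window_update(self, window: int) -> None:
        """Call for every ACK; `window` is the advertised receive window."""
        if self.reset:
            return
        if window > 0:
            # Window re-opened: leave persist state and clear the count.
            self.in_persist = False
            self.persist_state_retry_count = 0
        elif not self.in_persist:
            # First zero-window advertisement: start a new episode.
            self.in_persist = True
            self.persist_state_retry_count = 0
        elif self._limit_reached():
            self.reset = True  # a real stack would abort and free buffers

    def on_persist_timer(self) -> bool:
        """Call when the persist timer fires; True means send a probe."""
        if self.reset or not self.in_persist:
            return False
        if self._limit_reached():
            self.reset = True
            return False
        self.persist_state_retry_count += 1
        return True
```

   Because the persist timer backs off between probes, the wall-clock
   time covered by a given retry limit depends on the timer values in
   use, which is one way to see why this option is the coarser of the
   two.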







3.  Role of Application

   To understand whether the application can play a role in solving
   this problem, one needs to understand the current behavior of
   applications vis-a-vis TCP.

   Applications today do not know whether a connection is stuck in
   persist state.  In most cases the application is not even aware of
   why TCP is not sending any more data.  It cannot distinguish between
   packets being dropped because of network issues and the send window
   not advancing because the other end has closed the window.  Keeping
   the application apprised of what is causing the problem only takes
   care of that particular connection and that particular application.
   It does not take care of all applications and all connections that
   might be in persist state.

   TCP in most cases will not signal that a connection is blocked.  This
   is particularly true if there are buffers available or the
   application has no more data to send.  If the application were to
   poll TCP for the information, it is not clear how often it would
   need to poll.  As described before, TCP may stop sending data for
   several reasons, and in most cases the polling will show that the
   connection may not even be in persist state.

   It is quite possible that the application encountering the problem
   has not implemented a way to detect the condition and close the
   connection.  Since the impact of a connection in persist state is
   system wide, every application has to implement the option for the
   solution to be effective.  Even one application that has not
   implemented the option can cause the entire system to be impacted.
   It is also not feasible to get every application to implement
   detection of persist state and to turn on the option.

   It is also possible for an application to write data and exit before
   the data is sent.  An example of such an application is an HTTP
   server.  When an HTTP server receives an HTTP request such as a GET,
   the server responds with data and goes ahead and closes the socket
   even before TCP has finished sending all the data.  In that case,
   TCP has no application that it can inform to take action on a
   connection stuck in persist state.

   There are cases where the system is application agnostic.  A classic
   example is a TCP proxy.  In that case, there is no end application
   that can be informed of the state of the connection and take action.

   Resources like TCP buffers are system-wide resources and are not tied
   to any particular application.  TCP needs to be able to monitor
   buffer usage on a per-connection basis in order to detect and drop
   packets on connections that are taking up a lot of buffers.  TCP
   cannot rely on an application to perform the task of looking at
   buffers system wide.

   Therefore we believe applications have at best a limited role to play
   in solving this problem.

   TCP already keeps track of connections in persist state.  It is in a
   central position to look at this state system wide.  The advantage of
   doing this in TCP is that, once enabled, the entire system, including
   all the applications, benefits.  Moreover, resources like buffers,
   which are system wide, can be monitored by TCP to determine when to
   reset a connection and reclaim the resources.  The code change
   required to time-bound the persist state is minimal and easy to
   implement.
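
   To illustrate what system-wide monitoring by TCP could look like,
   the sketch below selects persist-state connections to reset when
   buffer usage exceeds a threshold.  This document does not define
   such a policy: the Conn record, the pick_resets function, the
   high-water fraction, and the largest-first ordering are all
   assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conn:
    """Hypothetical view of one connection as seen by the stack."""

    name: str
    in_persist: bool
    buffered_bytes: int


def pick_resets(conns: List[Conn], pool_bytes: int, high_water: float) -> List[str]:
    """Name the persist-state connections to reset, largest first, until
    total buffered data is at or below high_water * pool_bytes."""
    used = sum(c.buffered_bytes for c in conns)
    limit = high_water * pool_bytes
    stalled = sorted(
        (c for c in conns if c.in_persist),
        key=lambda c: c.buffered_bytes,
        reverse=True,
    )
    victims = []
    for c in stalled:
        if used <= limit:
            break
        victims.append(c.name)
        used -= c.buffered_bytes  # resetting reclaims this connection's buffers
    return victims
```

   Connections that are not in persist state are never selected, so
   only receivers that are holding their window closed are affected.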


4.  IANA Considerations

   This document makes no request of IANA.


5.  Security Considerations

   This document discusses one security consideration: the possible
   Denial of Service attack discussed in Section 1.


6.  Acknowledgements

   Thanks to Anantha Ramiah for providing feedback on this draft.


7.  References

7.1.  Normative References

   [RFC1122]  Braden, R., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122, October 1989.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2.  Informative References








Appendix A.  An Appendix


Authors' Addresses

   Mahesh Jethanandani
   Cisco Systems
   170 West Tasman Drive
   San Jose, California  95134
   USA

   Phone: +1-408-527-8230
   Fax:   +1-408-527-0147
   Email: mahesh@cisco.com
   URI:   www.cisco.com


   Murali Bashyam
   Ocarina Systems, Inc
   Fremont, CA
   USA

   Phone:
   Fax:
   Email: mbashyam@ocarinatech.com
   URI:



























Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).




