draft-scharf-mptcp-api-01

Internet Engineering Task Force                                M. Scharf
Internet-Draft                                  Alcatel-Lucent Bell Labs
Intended status: Informational                                   A. Ford
Expires: September 9, 2010                           Roke Manor Research
                                                           March 8, 2010


               MPTCP Application Interface Considerations
                       draft-scharf-mptcp-api-01

Abstract

   Multipath TCP (MPTCP) adds the capability of using multiple paths to
   a regular TCP session.  Even though it is designed to be fully
   backward compatible with applications, the data transport differs
   from that of regular TCP, and there are several additional degrees of
   freedom that applications may wish to exploit.  This document
   summarizes the impact that MPTCP may have on applications, such as
   changes in performance.  Furthermore, it describes an optional
   extended application interface that provides access to multipath
   information and enables control of some aspects of the MPTCP
   implementation's behaviour.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 9, 2010.

Copyright Notice




Scharf & Ford           Expires September 9, 2010               [Page 1]


Internet-Draft                  MPTCP API                     March 2010


   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Comparison of MPTCP and Regular TCP  . . . . . . . . . . . . .  5
     3.1.  Performance Impact . . . . . . . . . . . . . . . . . . . .  5
       3.1.1.  Throughput . . . . . . . . . . . . . . . . . . . . . .  5
       3.1.2.  Delay  . . . . . . . . . . . . . . . . . . . . . . . .  6
       3.1.3.  Resilience . . . . . . . . . . . . . . . . . . . . . .  6
     3.2.  Potential Problems . . . . . . . . . . . . . . . . . . . .  6
       3.2.1.  Impact of Middleboxes  . . . . . . . . . . . . . . . .  7
       3.2.2.  Outdated Implicit Assumptions  . . . . . . . . . . . .  7
       3.2.3.  Security Implications  . . . . . . . . . . . . . . . .  7
   4.  Operation of MPTCP with Legacy Applications  . . . . . . . . .  7
     4.1.  Overview of the MPTCP Network Stack  . . . . . . . . . . .  7
     4.2.  Usage of Addresses Inside Applications . . . . . . . . . .  8
     4.3.  Usage of Existing Socket Options . . . . . . . . . . . . .  9
     4.4.  Default Enabling of MPTCP  . . . . . . . . . . . . . . . . 10
     4.5.  Known Remaining Issues with Legacy Applications  . . . . . 10
   5.  Minimal API Enhancements for MPTCP-aware Applications  . . . . 11
     5.1.  Indicating MPTCP Awareness . . . . . . . . . . . . . . . . 11
     5.2.  Modified Address Handling  . . . . . . . . . . . . . . . . 11
     5.3.  Usage of a New Address Family  . . . . . . . . . . . . . . 11
   6.  Extended MPTCP API . . . . . . . . . . . . . . . . . . . . . . 11
     6.1.  MPTCP Usage Scenarios and Application Requirements . . . . 11
     6.2.  Requirements on API Extensions . . . . . . . . . . . . . . 13
     6.3.  Design Considerations  . . . . . . . . . . . . . . . . . . 15
     6.4.  Overview of Sockets Interface Extensions . . . . . . . . . 15
     6.5.  Detailed Description . . . . . . . . . . . . . . . . . . . 16
       6.5.1.  TCP_MP_ENABLE  . . . . . . . . . . . . . . . . . . . . 16
       6.5.2.  TCP_MP_SUBFLOWS  . . . . . . . . . . . . . . . . . . . 16
       6.5.3.  TCP_MP_PROFILE . . . . . . . . . . . . . . . . . . . . 16
     6.6.  Usage examples . . . . . . . . . . . . . . . . . . . . . . 17
     6.7.  Interactions and Incompatibilities with other
           Multihoming Solutions  . . . . . . . . . . . . . . . . . . 17
     6.8.  Other Advice to Application Developers . . . . . . . . . . 17
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 17
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 17
   9.  Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 18
   10. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 18
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 18
     11.2. Informative References . . . . . . . . . . . . . . . . . . 19
   Appendix A.  Change History of the Document  . . . . . . . . . . . 19

1.  Introduction

   Multipath TCP (MPTCP) adds the capability of using multiple paths to
   a regular TCP session [1].  The motivations for this extension
   include increasing throughput, overall resource utilisation, and
   resilience to network failure, and these motivations are discussed,
   along with high-level design decisions, as part of the MPTCP
   architecture [4].  MPTCP [5] offers the same reliable, in-order,
   byte-stream transport as TCP, and is designed to be backward-
   compatible with both applications and the network layer.  It requires
   support inside the network stack of both endpoints.  This document
   presents the impacts that MPTCP may have on applications, such as
   performance changes compared to regular TCP.  Furthermore, it
   specifies an extended Application Programming Interface (API)
   describing how applications can exploit additional features of
   multipath transport.  MPTCP is designed to be usable without any
   application changes.  The specified API is an optional extension that
   provides access to multipath information and enables control of some
   aspects of the MPTCP implementation's behaviour, for example
   switching on or off the automatic use of MPTCP.

   The de facto standard API for TCP/IP applications is the "sockets"
   interface.  This document defines experimental MPTCP-specific
   extensions, in particular additional socket options.  It is up to the
   applications, high-level programming languages, or libraries to
   decide whether to use these optional extensions.  For instance, an
   application may want to turn on or off the MPTCP mechanism for
   certain data transfers, or provide some guidance concerning its usage
   (and thus the service the application receives).  The syntax and
   semantics of the specification are in line with the POSIX standard
   [8] as much as possible.

   Some network stack implementations, especially on mobile devices,
   have centralized connection managers or other higher-level APIs to
   solve multi-interface issues, as surveyed in [14].  Their interaction
   with MPTCP is outside the scope of this note.

   There are also various related extensions of the sockets interface:
   [11] specifies sockets API extensions for a multihoming shim layer.
   The API enables interactions between applications and the multihoming
   shim layer for advanced locator management and for access to
   information about failure detection and path exploration.  Other
   experimental extensions to the sockets API are defined for the Host
   Identity Protocol (HIP) [12] to manage the bindings of identifiers
   and locators.  Other related API extensions exist for IPv6
   [10] and SCTP [13].  There can be interactions or incompatibilities
   of these APIs with MPTCP, which are discussed later in this document.

   The target readers of this document are application programmers who
   develop application software that may benefit significantly from
   MPTCP.  This document also provides the necessary information for
   developers of MPTCP to implement the API in a TCP/IP network stack.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [3].

   This document uses the terminology introduced in [5].

3.  Comparison of MPTCP and Regular TCP

   This section discusses the impact that the use of MPTCP will have on
   applications, in comparison to what may be expected from the use of
   regular TCP.

3.1.  Performance Impact

   One of the key goals of adding multipath capability to TCP is to
   improve the performance of a transport connection by load
   distribution over separate subflows across potentially disjoint
   paths.  Furthermore, it is an explicit goal of MPTCP that it should
   not provide a worse performing connection than would have existed
   through the use of legacy, single-path TCP.  A corresponding
   congestion control algorithm is described in [7].  The following
   sections summarize the performance impact of MPTCP as seen by an
   application.

3.1.1.  Throughput

   The most obvious performance improvement that will be gained with the
   use of MPTCP is an increase in throughput, since MPTCP will pool more
   than one path (where available) between two endpoints.  This will
   provide greater bandwidth for an application.  If there are shared
   bottlenecks between the flows, then the congestion control algorithms
   will ensure that load is evenly spread amongst regular and multipath
   TCP sessions, so that no end user receives worse performance than
   single-path TCP.

   Furthermore, this means that an MPTCP session could achieve
   throughput that is greater than the capacity of a single interface on
   the device.  If any applications make assumptions about interfaces
   due to throughput (or vice versa), they must take this into account.

   The transport of MPTCP signaling information results in a small
   overhead.  If multiple subflows share the same bottleneck, this
   overhead slightly reduces the capacity that is available for data
   transport.  Yet, this potential reduction of throughput will be
   negligible in many usage scenarios, and the protocol contains
   optimisations in its design so that this overhead is minimal.

3.1.2.  Delay

   If the delays on the constituent subflows of an MPTCP connection
   differ, the jitter perceivable to an application may appear higher as
   the data is striped across the subflows.  Although MPTCP will ensure
   in-order delivery to the application, the application must be able to
   cope with the data delivery being burstier than may be usual with
   single-path TCP.  Since burstiness is commonplace on the Internet
   today, it is unlikely that applications will suffer from such an
   impact on the traffic profile, but application authors may wish to
   consider this in future development.
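
   The burstiness described above can be made concrete with a small
   model.  The following C sketch is illustrative only: the function
   name mp_delivery_times(), the round-robin striping, and all numbers
   are assumptions made for this example and are not taken from the
   MPTCP protocol specification.

```c
#include <stddef.h>

/* Toy model of in-order delivery over several subflows.
 * Segment k is sent at time k * gap_ms on subflow k % n_sub and
 * arrives after that subflow's one-way delay.  It is handed to the
 * application only once all earlier segments have arrived as well.
 * All times are in milliseconds. */
void mp_delivery_times(const unsigned *delay_ms, size_t n_sub,
                       unsigned gap_ms, unsigned *deliver_ms,
                       size_t n_seg)
{
    unsigned latest = 0;   /* latest arrival seen so far */

    if (n_sub == 0)
        return;

    for (size_t k = 0; k < n_seg; k++) {
        unsigned arrival = (unsigned)k * gap_ms + delay_ms[k % n_sub];
        if (arrival > latest)
            latest = arrival;
        deliver_ms[k] = latest;   /* in-order delivery time */
    }
}
```

   With one-way delays of 10 ms and 50 ms and one segment sent every
   10 ms, the first four segments are delivered at 10, 60, 60, and
   80 ms.  The third segment arrives early on the fast subflow but has
   to wait for the second one, so two segments reach the application in
   a single burst.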

   In addition, applications that make round trip time (RTT) estimates
   at the application level may have some issues.  Whilst the average
   delay calculated will be accurate, whether this is useful for an
   application will depend on what it requires this information for.  If
   a new application wishes to derive such information, it should
   consider how multiple subflows may affect its measurements, and thus
   how it may wish to respond.  In such a case, an application may wish
   to express its scheduling preferences, as described later in this
   document.

3.1.3.  Resilience

   The use of multiple subflows simultaneously means that, if one should
   fail, all traffic will move to the remaining subflow(s), and
   additionally any lost packets can be retransmitted on these subflows.

   Subflow failure may be caused by issues within the network, which an
   application would be unaware of, or interface failure on the node.
   An application may, under certain circumstances, be in a position to
   be aware of such failure (e.g. by radio signal strength, or simply an
   interface enabled flag), and so must not make assumptions about an
   MPTCP flow's stability based on this.  MPTCP will never override an
   application's request for a given interface, however, so the cases
   where this issue may be applicable are limited.

3.2.  Potential Problems

3.2.1.  Impact of Middleboxes

   MPTCP has been designed in order to pass through the majority of
   middleboxes, for example through its ability to open subflows in
   either direction, and through its use of a data-level sequence
   number.

   Nevertheless some middleboxes may still refuse to pass MPTCP messages
   due to the presence of TCP options.  If this is the case, MPTCP
   should fall back to regular TCP.  Although this will not create a
   problem for the application (its communication will be set up either
   way), there may be additional (and indeed, user-perceivable) delay
   while the first handshake fails.

   Empirical evidence suggests that new TCP options can successfully be
   used on most paths in the Internet.  But they can also have other
   unexpected implications.  For instance, intrusion detection systems
   could be triggered.  Full analysis of MPTCP's impact on such
   middleboxes is for further study.

3.2.2.  Outdated Implicit Assumptions

   MPTCP overcomes the one-to-one mapping of the socket interface to a
   flow through the network.  As a result, applications cannot
   implicitly rely on this one-to-one mapping any more.  Applications
   that require the transport along a single path can disable the use of
   MPTCP as described later in this document.  Examples include
   monitoring tools that want to measure the available bandwidth on a
   path, or routing protocols such as BGP that require the use of a
   specific link.

3.2.3.  Security Implications

   The support for multiple IP addresses within one MPTCP connection can
   result in additional security vulnerabilities, such as possibilities
   for attackers to hijack connections.  The protocol design of MPTCP
   minimizes this risk.  An attacker on one of the paths can cause harm,
   but this is hardly an additional security risk compared to single-
   path TCP, which is vulnerable to man-in-the-middle attacks, too.  A
   detailed threat analysis of MPTCP is published in [6].

4.  Operation of MPTCP with Legacy Applications

4.1.  Overview of the MPTCP Network Stack

   MPTCP is an extension of TCP, but it is designed to be backward
   compatible for legacy applications.  TCP interacts with other parts
   of the network stack by different interfaces.  The de facto standard
   API between TCP and applications is the sockets interface.  The
   position of MPTCP in the protocol stack is illustrated in Figure 1.

                     +-------------------------------+
                     |          Application          |
                     +-------------------------------+
                             ^                 |
                  ~~~~~~~~~~~|~Socket Interface|~~~~~~~~~~~
                             |                 v
                     +-------------------------------+
                     |             MPTCP             |
                     + - - - - - - - + - - - - - - - +
                     | Subflow (TCP) | Subflow (TCP) |
                     +-------------------------------+
                     |       IP      |      IP       |
                     +-------------------------------+

                      Figure 1: MPTCP protocol stack

   In general, MPTCP can affect all interfaces that rely on the coupling
   of a TCP connection to a single IP address and TCP port pair, to one
   sockets endpoint, to one network interface, or to a given path
   through the network.

   This means that there are two classes of applications:

   o  Legacy applications: These applications use the existing API
      towards TCP without any changes.  This is the default case.

   o  MPTCP-aware applications: These applications indicate support for
      an enhanced MPTCP interface.

   The following subsections discuss the extent to which MPTCP affects
   legacy applications using the existing sockets API.

4.2.  Usage of Addresses Inside Applications

   The existing sockets API implies that applications deal with data
   structures that store, amongst others, the IP addresses and TCP port
   numbers of a TCP connection.  A design objective of MPTCP is that
   legacy applications can continue to use the established sockets API
   without any changes.  However, in MPTCP there is a one-to-many
   mapping between the socket endpoint and the subflows.  This has
   several subtle implications for legacy applications using sockets API
   functions.

   During binding, an application can either select a specific address,
   or bind to INADDR_ANY.  Furthermore, the SO_BINDTODEVICE socket
   option can be used to bind to a specific interface.  If an
   application uses a specific address, or sets the SO_BINDTODEVICE
   socket option to bind to a specific interface, then MPTCP MUST
   respect this and not interfere in the application's choices.  If an
   application binds to INADDR_ANY, it is assumed that the application
   does not care which addresses to use locally.  In this case, a local
   policy MAY allow MPTCP to automatically set up multiple subflows on
   such a connection.  The extended sockets API will allow applications
   to express specific preferences in an MPTCP-compatible way (e.g. bind
   to a subset of interfaces only).
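
   The binding rule above can be read as a small decision function
   inside the stack.  The following C sketch is illustrative only: the
   function name mp_multipath_allowed() and its parameters are
   hypothetical and are not defined by this document or by any MPTCP
   implementation.  For brevity it covers IPv4 only.

```c
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if the stack may set up additional
 * subflows for a socket, 0 if the explicit choice made by the
 * application has to be respected. */
int mp_multipath_allowed(const struct sockaddr_in *local,
                         int bound_to_device,
                         int policy_allows_mptcp)
{
    /* SO_BINDTODEVICE was used: MPTCP must not interfere. */
    if (bound_to_device)
        return 0;

    /* A specific local address was selected: respect it. */
    if (local->sin_addr.s_addr != htonl(INADDR_ANY))
        return 0;

    /* Bound to INADDR_ANY: the decision is left to local policy. */
    return policy_allows_mptcp ? 1 : 0;
}
```

   Only the last case, a socket bound to INADDR_ANY on a host whose
   local policy permits it, leads to the automatic use of multiple
   subflows.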

   Applications can use the getpeername() or getsockname() functions in
   order to retrieve the IP address of the peer or of the local socket.
   These functions can be used for various purposes, including security
   mechanisms, geo-location, or interface checks.  The socket API was
   designed with an assumption that a socket is using just one address,
   and since this address is visible to the application, the application
   may assume that the information provided by the functions is the same
   during the lifetime of a connection.  However, in MPTCP, unlike in
   TCP, there is a one-to-many mapping of a connection to subflows, and
   subflows can be added and removed while the connection continues to
   exist.  Therefore, MPTCP cannot expose addresses by getpeername() or
   getsockname() that are both valid and constant during the
   connection's lifetime.

   This problem is addressed as follows: If these functions are called
   by a legacy application, the MPTCP stack MUST always return the
   addresses of the first subflow of an MPTCP connection, in all
   circumstances, even if that particular subflow is no longer in use.
   As these addresses may no longer be valid once the first subflow is
   closed, the MPTCP stack MAY close the whole MPTCP connection if the
   first subflow is closed (fate sharing).  Whether to close the whole
   MPTCP connection by default SHOULD be controlled by a local policy.
   Further experiments are needed to investigate its implications.
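
   The behaviour towards legacy applications can be sketched with a
   simplified model of the connection state.  The structures and the
   functions mp_legacy_getsockname() and mp_close_subflow() below are
   hypothetical and serve only to illustrate the rule; they are not
   part of the sockets API or of this specification.

```c
#include <stddef.h>
#include <netinet/in.h>

#define MP_MAX_SUBFLOWS 8   /* arbitrary limit for this sketch */

/* Hypothetical, simplified per-connection state. */
struct mp_subflow {
    struct sockaddr_in local;   /* local address of this subflow */
    int open;                   /* 1 while the subflow is usable */
};

struct mp_conn {
    struct mp_subflow sf[MP_MAX_SUBFLOWS];
    size_t n;            /* number of subflows created so far      */
    int fate_sharing;    /* local policy: close all with the first */
    int closed;          /* 1 once the whole connection is closed  */
};

/* What a legacy application sees: always the address of the first
 * subflow, even if that subflow is no longer in use. */
int mp_legacy_getsockname(const struct mp_conn *c,
                          struct sockaddr_in *out)
{
    if (c == NULL || out == NULL || c->n == 0)
        return -1;
    *out = c->sf[0].local;
    return 0;
}

/* Close subflow i.  With fate sharing enabled, closing the first
 * subflow closes the whole connection. */
void mp_close_subflow(struct mp_conn *c, size_t i)
{
    if (c == NULL || i >= c->n)
        return;
    c->sf[i].open = 0;
    if (i == 0 && c->fate_sharing)
        c->closed = 1;
}
```

   Without fate sharing, the address reported to the application stays
   constant but may no longer correspond to a subflow in use.  With
   fate sharing, it stays valid at the cost of losing the connection
   together with the first subflow.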

   Instead of getpeername() or getsockname(), MPTCP-aware applications
   can use new API calls, documented later, in order to retrieve the
   full list of address pairs for the subflows in use.

4.3.  Usage of Existing Socket Options

   The existing sockets API includes options that modify the behavior of
   sockets and their underlying communications protocols.  Various
   socket options exist at the socket, TCP, and IP levels.  The value of
   an option can usually be set by the setsockopt() system function.
   The getsockopt() function gets information.  In general, the existing
   sockets interface functions cannot configure each MPTCP subflow
   individually.  In order to be backward compatible, existing APIs
   therefore should apply to all subflows within one connection, as far
   as possible.

   One commonly used TCP socket option (TCP_NODELAY) disables the Nagle
   algorithm as described in [2].  This option is also specified in the
   POSIX standard [8].  Applications can use this option in combination
   with MPTCP in exactly the same way.  It then disables the Nagle
   algorithm for the MPTCP connection, i.e., all subflows.
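
   For reference, the existing call sequence that an application uses
   today looks as follows.  The helper names disable_nagle() and
   nagle_disabled() are chosen for this example only; setsockopt(),
   getsockopt(), and TCP_NODELAY are the standard sockets interface.
   No MPTCP-specific code is needed on the application side.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Disable the Nagle algorithm on a TCP socket.  With MPTCP, the same
 * call is expected to apply to the whole connection, i.e., to all
 * subflows.  Returns 0 on success, -1 on error (errno is set). */
int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```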

   TODO: Setting this option could also trigger a different path
   scheduler algorithm - specifically, that which is designed for
   latency-sensitive traffic, as described in a later section.

   Applications can also explicitly configure send and receive buffer
   sizes by the sockets API (SO_SNDBUF, SO_RCVBUF).  These socket
   options can also be used in combination with MPTCP and then affect
   the buffer size of the MPTCP connection.  However, when defining
   buffer sizes, application programmers should take into account that
   the transport over several subflows requires a certain amount of
   buffer for resequencing.  Therefore, it does not make sense to use
   MPTCP in combination with very small receive buffers.  Small send
   buffers may prevent MPTCP from efficiently scheduling data over
   different subflows.  It may be appropriate for an MPTCP
   implementation to set a lower bound for such buffers, or
   alternatively treat a small buffer size request as an implicit
   request not to use MPTCP.
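
   The two alternatives mentioned above, a lower bound or an implicit
   request not to use MPTCP, could be captured by a policy function
   along the following lines.  This is a sketch under stated
   assumptions: the names mp_check_bufsize() and enum mp_buf_action, as
   well as the idea of a single configurable lower bound, are
   hypothetical and not mandated by this document.

```c
#include <stddef.h>

/* Possible outcomes when an application sets SO_RCVBUF or SO_SNDBUF. */
enum mp_buf_action {
    MP_BUF_ACCEPT,      /* request is large enough, use it as is */
    MP_BUF_RAISE,       /* raise the request to the lower bound  */
    MP_BUF_DISABLE_MP   /* treat it as a request for regular TCP */
};

/* 'lower_bound' is an assumed, implementation-chosen minimum in
 * bytes.  'small_means_no_mptcp' selects between the two
 * alternatives discussed in the text.  On return, *effective holds
 * the buffer size to apply. */
enum mp_buf_action mp_check_bufsize(size_t requested,
                                    size_t lower_bound,
                                    int small_means_no_mptcp,
                                    size_t *effective)
{
    if (requested >= lower_bound) {
        *effective = requested;
        return MP_BUF_ACCEPT;
    }
    if (small_means_no_mptcp) {
        *effective = requested;   /* honour it, on a single path */
        return MP_BUF_DISABLE_MP;
    }
    *effective = lower_bound;
    return MP_BUF_RAISE;
}
```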

   Some network stacks also provide other implementation-specific socket
   options or interfaces that affect TCP's behavior.  If a network stack
   supports MPTCP, it must be ensured that these options do not
   interfere.

4.4.  Default Enabling of MPTCP

   It is up to a local policy at the end system whether a network stack
   should automatically enable MPTCP for sockets even if there is no
   explicit sign of MPTCP awareness of the corresponding application.
   Such a choice may be under the control of the user through system
   preferences.

4.5.  Known Remaining Issues with Legacy Applications

   TODO: Future experiments will show whether legacy applications could
   break despite the backward-compatible API of MPTCP.

5.  Minimal API Enhancements for MPTCP-aware Applications

5.1.  Indicating MPTCP Awareness

   While applications can use MPTCP with the unmodified sockets API, a
   clean interface requires small semantic changes compared to the
   existing sockets API.  Even if these changes do not affect most
   applications, they are only enabled if an application explicitly
   signals that it supports multipath transport and the enhanced
   interface, in order to maintain backward compatibility with legacy
   applications.  An application can explicitly indicate multipath
   capability by setting the TCP_MP_ENABLE option described below.
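
   From the application's point of view, signalling MPTCP awareness
   could look like the following sketch.  Note the assumptions: this
   document names the TCP_MP_ENABLE option but assigns no numeric
   value, so the constant below is a placeholder, and the option is
   assumed here to take a boolean integer.  The helper name
   request_mptcp() is chosen for this example only.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Placeholder value for illustration; no value is assigned by this
 * document and no existing stack is known to define this name. */
#ifndef TCP_MP_ENABLE
#define TCP_MP_ENABLE 0x4d50
#endif

/* Signal MPTCP awareness on a socket.  Returns 1 if the stack
 * accepted the option and 0 if it did not (for example, because the
 * stack has no MPTCP support).  In the latter case the application
 * simply continues with regular TCP semantics. */
int request_mptcp(int fd)
{
    int on = 1;   /* assumption: the option takes a boolean int */

    if (setsockopt(fd, IPPROTO_TCP, TCP_MP_ENABLE,
                   &on, sizeof(on)) == 0)
        return 1;
    return 0;
}
```

   Treating a rejected option as "continue with regular TCP" keeps the
   application usable on stacks without MPTCP support.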

5.2.  Modified Address Handling

   The main change of the sockets API for MPTCP-aware applications is as
   follows: If a socket is MPTCP-aware and thus does not use the
   backward-compatibility mode, the functions getpeername() and
   getsockname() SHOULD fail with a new error code EMULTIPATH.  Due to
   their ambiguity, an MPTCP-aware application should not use these two
   functions.  Instead, the information about the addresses in use can
   be accessed by the extended sockets API, if needed.
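
   The modified semantics can be illustrated with a small model of the
   stack's behaviour.  The numeric value of EMULTIPATH, the structure
   mp_sock, and the function mp_getsockname() are assumptions made for
   this sketch; this document introduces the error code by name only.

```c
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>

/* Placeholder value chosen for this sketch only. */
#ifndef EMULTIPATH
#define EMULTIPATH 200
#endif

/* Hypothetical socket state kept by the stack. */
struct mp_sock {
    int mptcp_aware;                  /* set via TCP_MP_ENABLE      */
    struct sockaddr_in first_local;   /* address of the 1st subflow */
};

/* Model of getsockname(): a legacy socket gets the address of the
 * first subflow; an MPTCP-aware socket gets -1 and
 * errno = EMULTIPATH. */
int mp_getsockname(const struct mp_sock *s, struct sockaddr_in *out)
{
    if (s->mptcp_aware) {
        errno = EMULTIPATH;
        return -1;
    }
    *out = s->first_local;
    return 0;
}
```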

5.3.  Usage of a New Address Family

   As an alternative to setting a socket option, an application can also
   use a new, separate address family called AF_MULTIPATH [9].  This
   separate address family can be used to exchange multiple addresses
   between an application and the standard sockets API, and additionally
   acts as an explicit indication that an application is MPTCP-aware,
   i.e., that it can deal with the semantic changes of the sockets API,
   in particular concerning getpeername() and getsockname().  The usage
   of AF_MULTIPATH is also more flexible with respect to multipath
   transport over IPv4, IPv6, or both in parallel [9].

6.  Extended MPTCP API

6.1.  MPTCP Usage Scenarios and Application Requirements

   Applications that use TCP may have different requirements on the
   transport layer.  While developers have become used to the
   characteristics of regular TCP, new opportunities created by MPTCP
   could allow the service provided to be optimised further.  An
   extended API enables MPTCP-aware applications to specify preferences
   and control certain aspects of the behavior, in addition to the
   simple controls already discussed, such as switching on or off the
   automatic use of MPTCP.

   An application that wishes to transmit bulk data will want MPTCP to
   provide a high throughput service immediately, through creating and
   maximising utilisation of all available subflows.  This is the
   default MPTCP use case.

   But at the other extreme, there are applications that are highly
   interactive, but require only a small amount of throughput, and these
   are optimally served by low latency and jitter stability.  In such a
   situation, it would be preferable for the traffic to use only the
   lowest latency subflow (assuming it has sufficient capacity), with
   one or two additional subflows for resilience and recovery purposes.

   The choice between these two options affects the scheduler in terms
   of whether traffic should be, by default, sent on one subflow or
   spread across all subflows.  Even if the total bandwidth required is
   less than that available on an individual path, it is desirable to
   spread this load to reduce stress on potential bottlenecks, which is
   why spreading should be the default.  It is recognised, however,
   that this may not benefit all applications that require
   latency/jitter stability, so the other (single path) option is
   provided.

   In the case of the latter option, however, a further question arises:
   should additional subflows be used whenever the primary subflow is
   overloaded, or only when the primary path fails (hot-standby)?  In
   other words, is latency stability or bandwidth more important to the
   application?

   We therefore divide this option into two: Firstly, there is the
   single path which can overflow into an additional subflow; and
   secondly there is single-path with hot-standby, whereby an
   application may want an alternative backup subflow in order to
   improve resilience.  If data delivery on the first subflow fails,
   the data transport could immediately be continued on the second
   subflow, which is otherwise idle.

   In summary, there are three different "application profiles"
   concerning the use of MPTCP:

   1.  Bulk data transport

   2.  Latency-sensitive transport (with overflow)

   3.  Latency-sensitive transport (hot-standby)

   These different application profiles affect both the management of
   subflows, i.e., the decisions about when and to which addresses to
   set up additional subflows, and the assignment of data (including
   retransmissions) to the existing subflows.  In both cases different
   policies can exist.
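
   One possible reading of the three profiles, as far as the assignment
   of data to subflows is concerned, is sketched below.  The enum
   constants and the function mp_may_use_secondary() are hypothetical
   names introduced for this example; actual schedulers may apply
   different or more refined policies.

```c
/* The three application profiles named in the text. */
enum mp_profile {
    MP_PROFILE_BULK,           /* 1. bulk data transport              */
    MP_PROFILE_LAT_OVERFLOW,   /* 2. latency-sensitive, with overflow */
    MP_PROFILE_LAT_STANDBY     /* 3. latency-sensitive, hot-standby   */
};

/* Hypothetical scheduler hook: may new data be sent on a secondary
 * (non-primary) subflow right now?  Returns 1 for yes, 0 for no. */
int mp_may_use_secondary(enum mp_profile p,
                         int primary_failed,
                         int primary_overloaded)
{
    switch (p) {
    case MP_PROFILE_BULK:
        return 1;                 /* spread load over all subflows */
    case MP_PROFILE_LAT_OVERFLOW:
        return primary_failed || primary_overloaded;
    case MP_PROFILE_LAT_STANDBY:
        return primary_failed;    /* backup is idle otherwise */
    }
    return 0;
}
```

   The only difference between the two latency-sensitive profiles in
   this sketch is whether an overloaded, but still working, primary
   subflow is a reason to use another subflow.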

   These profiles have been defined to cover the common application use
   cases.  It is not possible to cover all application requirements,
   however, and as such applications may wish to have finer control over
   subflows and packet scheduling.  A set of requirements is listed
   below.

   Although it is intended that such functionality will be achieved
   through new MPTCP-specific options, it may also be possible to infer
   some application preferences from existing socket options, such as
   TCP_NODELAY.  Whether this would be reliable, and indeed appropriate,
   is for further study.

6.2.  Requirements on API Extensions

   Because of the importance of the sockets interface there are several
   fundamental design objectives for the interface between MPTCP and
   applications:

   o  Consistency with existing sockets APIs must be maintained as far
      as possible.  In order to support the large base of applications
      using the original API, a legacy application must be able to
      continue to use standard socket interface functions when run on a
      system supporting MPTCP.  Also, MPTCP-aware applications should be
      able to access the socket without any major changes.

   o  Sockets API extensions must be minimized and independent of an
      implementation.

   o  The interface should handle both IPv4 and IPv6.

   The following is a list of specific requirements from applications:

   TODO: This list of requirements is preliminary and requires further
   discussion.  Some requirements have to be removed.

   REQ1:  Turn on/off MPTCP: An application should be able to request to
          turn on or turn off the usage of MPTCP.  This means that an
          application should be able to explicitly request the use of
          MPTCP if this is possible.  Applications should also be able
          to request not to enable MPTCP and to use regular TCP
          transport instead.  This can be implicit in many cases, e.g.,
          since MPTCP must be disabled when an application binds to a
          specific address, or may be enabled if an application uses
          AF_MULTIPATH.

   REQ2:  An application will want to be able to restrict MPTCP to
          binding to a given set of addresses or interfaces.

   REQ3:  An application should be able to know if multiple subflows are
          in use.

   REQ4:  An application should be able to enumerate all subflows in
          use, obtain information on the addresses used by a subflow,
          and obtain a subflow's usage (e.g., ratio of traffic sent via
          this subflow).

   REQ5:  An application should be able to extract a unique identifier
          for the connection (per endpoint), analogous to a port, i.e.,
          it should be able to retrieve MPTCP's connection identifier.
          (TODO)

   REQ6:  Set/get the application profile, as discussed in the previous
          section.

   The above requirements have fairly clear benefits for applications.
   Although some of them go beyond what regular TCP provides, they
   allow an application to make optimal use of the new features that
   MPTCP offers.
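
   REQ1 notes that binding to a specific address can implicitly disable
   MPTCP.  As a minimal sketch, an application or library could tell
   the two cases apart by testing whether a bind address is the
   wildcard address.  The helper name bind_address_is_wildcard() is
   hypothetical, and whether a stack actually applies this implicit
   rule is an assumption; only standard sockets definitions are used.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not defined by this document).
 * Returns true if 'sa' is the IPv4 or IPv6 wildcard address, i.e. the
 * application did not pin the socket to one specific local address. */
bool bind_address_is_wildcard(const struct sockaddr *sa)
{
    assert(sa != NULL);

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        return in4->sin_addr.s_addr == htonl(INADDR_ANY);
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        return IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr);
    }
    /* Unknown address family: conservatively treat it as specific. */
    return false;
}
```

   Under the stated assumption, a result of false would indicate that
   the application should expect regular TCP transport on this socket.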

   The following requirements are more specific and could mostly be
   implied by more generic options, such as the application profile
   selection.  They are included here as potential discussion points,
   since application developers may find them useful as more specific
   configuration options beyond being an implicit part of a profile
   selection.

   REQ7:   Constrain the maximum number of subflows to be used by an
           MPTCP connection.

   REQ8:   Request a change in scheduling between subflows.

   REQ9:   Request a change in the number of subflows in use, thus
           triggering removal or addition of subflows.  (A finer control
           granularity would be: Request the establishment of a new
           subflow to a provided destination, and request the
           termination of a specified, existing subflow.)

   REQ10:  Control automatic establishment/termination of subflows?
           There could be different configurations of the path manager,
           e.g., 'try ASAP', 'wait until there is a bunch of data',
           etc.  (Tied to application profile?)

   REQ11:  Set/get preferred subflows or subflow usage policies?  There
           could be different configurations of the multipath scheduler,
           e.g., 'all-or-nothing', 'overflow', etc.  (Again, tied to
           application profile?)

   REQ12:  Get/set redundancy, i.e., to send segments on more than one
           path in parallel.

   REQ13:  An application should be able to modify the MPTCP
           configuration while communication is ongoing, i.e., after
           establishment of the MPTCP connection.

6.3.  Design Considerations

   Multipath transport results in many degrees of freedom.  MPTCP
   manages the data transport over different subflows automatically.  By
   default, this is transparent to the application.  But applications
   can use the sockets API extensions defined in this section to
   interface with the MPTCP layer and to control important aspects of
   the MPTCP implementation's behaviour.  The API uses non-mandatory
   socket options and is designed to be as light-weight as possible.

   MPTCP mainly affects the sending of data.  Therefore, most of the new
   socket options must be set on the sender side of a data transfer in
   order to take effect.  Nevertheless, a receiver may also have
   preferences about data transfer choices, as it too may have
   performance requirements.  (TODO) It is for further study whether it
   is feasible for a receiving application to influence sending policy,
   and if so, how this could be implemented.

   As this document specifies sockets API extensions, it is written so
   that the syntax and semantics are in line with the POSIX standard [8]
   as much as possible.

6.4.  Overview of Sockets Interface Extensions

   The extended MPTCP API consists of several new socket options that
   are specific to MPTCP.  All of these socket options are defined at
   TCP level (IPPROTO_TCP).  They can be used with the getsockopt() or
   the setsockopt() system call.

   The new API functions can be classified into general configuration
   and more advanced configuration.  The new socket options for the
   general configuration of MPTCP are:

   o  TCP_MP_ENABLE: Enable/disable MPTCP

   o  TCP_MP_SUBFLOWS: Get the addresses currently used by the MPTCP
      subflows, optionally complemented by further information such as
      usage ratio

   o  TCP_MP_PROFILE: Get/set the MPTCP profile

   o  ...

   Table 1 shows a list of the socket options for the general
   configuration of MPTCP.  The first column gives the name of the
   option.  The second and third columns indicate whether the option can
   be handled by the getsockopt() system call and/or by the setsockopt()
   system call.  The fourth column lists the type of data structure
   specified along with the socket option.

                +-----------------+-----+-----+-----------+
                | Option name     | Get | Set | Data type |
                +-----------------+-----+-----+-----------+
                | TCP_MP_ENABLE   |  o  |  o  |    int    |
                | TCP_MP_SUBFLOWS |  o  |     |     *1    |
                | TCP_MP_PROFILE  |  o  |  o  |    int    |
                | ...             |     |     |           |
                +-----------------+-----+-----+-----------+

     *1: Data structure containing the addresses of each subflow, plus
                            further information

                     Table 1: Socket options for MPTCP

   TODO: More options may be added in a future version of this note.
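
   The data structure marked *1 in Table 1 is not yet specified.
   Purely as an illustration, the following sketch shows one possible
   per-subflow record and how the usage ratio mentioned for
   TCP_MP_SUBFLOWS could be derived from it.  The structure
   mp_subflow_info, its field names, the bytes_sent counter, and the
   function mp_subflow_usage_ratio() are all assumptions and are not
   defined by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical per-subflow record.  The document only states that the
 * structure contains "the addresses of each subflow, plus further
 * information"; every field below is an assumption. */
struct mp_subflow_info {
    struct sockaddr_storage local;   /* local address of the subflow  */
    struct sockaddr_storage remote;  /* remote address of the subflow */
    uint64_t bytes_sent;             /* assumed usage counter         */
};

/* Fraction (0.0 .. 1.0) of all sent bytes carried by subflow 'idx'.
 * Returns 0.0 when no data has been sent on any subflow yet. */
double mp_subflow_usage_ratio(const struct mp_subflow_info *flows,
                              size_t count, size_t idx)
{
    assert(flows != NULL && idx < count);

    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += flows[i].bytes_sent;

    if (total == 0)
        return 0.0;
    return (double)flows[idx].bytes_sent / (double)total;
}
```

   Using struct sockaddr_storage keeps the record independent of the
   address family, in line with the objective that the interface
   should handle both IPv4 and IPv6.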

6.5.  Detailed Description

6.5.1.  TCP_MP_ENABLE

   TODO: Description

6.5.2.  TCP_MP_SUBFLOWS

   TODO: Description

6.5.3.  TCP_MP_PROFILE

   TODO: Description

6.6.  Usage examples

   TODO: Example C code for one or more API functions
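
   As a non-authoritative sketch, the following C fragment shows how an
   MPTCP-aware application might set and query TCP_MP_ENABLE at
   IPPROTO_TCP level.  No option number has been assigned, so the value
   used below is a placeholder.  The function names and the
   mp_enable_result type are hypothetical.  Because the socket options
   are non-mandatory, the sketch treats ENOPROTOOPT as "not supported"
   so that the caller can continue with regular TCP.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Placeholder value: TCP_MP_ENABLE has no assigned number, so this
 * definition is an assumption for illustration only. */
#ifndef TCP_MP_ENABLE
#define TCP_MP_ENABLE 0x7f01
#endif

/* Hypothetical result type for the sketch below. */
enum mp_enable_result { MP_ENABLED, MP_UNSUPPORTED, MP_ERROR };

/* Request the use of MPTCP on socket 'fd'. */
enum mp_enable_result mptcp_try_enable(int fd)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_TCP, TCP_MP_ENABLE, &on, sizeof(on)) == 0)
        return MP_ENABLED;
    if (errno == ENOPROTOOPT)
        return MP_UNSUPPORTED;  /* stack does not know the option */
    return MP_ERROR;            /* e.g., EBADF or ENOTSOCK        */
}

/* Query the setting: 1 = enabled, 0 = disabled, -1 = error or
 * unsupported (errno is set by getsockopt()). */
int mptcp_is_enabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_TCP, TCP_MP_ENABLE, &value, &len) != 0)
        return -1;
    assert(len == sizeof(value));  /* Table 1 lists the type as int */
    return value != 0;
}
```

   One plausible call order is to invoke mptcp_try_enable() after
   socket() and before connect(); whether the option may also be
   changed later is left open, since the semantics of TCP_MP_ENABLE are
   not yet described.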

6.7.  Interactions and Incompatibilities with other Multihoming
      Solutions

   The use of MPTCP can interact with various related sockets API
   extensions.  Care should be taken not to confuse the MPTCP API with
   the overlapping features of the following APIs:

   o  SHIM API [11]: This API specifies sockets API extensions for the
      multihoming shim layer.

   o  HIP API [12]: The Host Identity Protocol (HIP) also results in a
      new API.

   The use of a multihoming shim layer conflicts with multipath
   transport such as MPTCP or SCTP [11].  In order to avoid any
   conflict, multiaddressed MPTCP SHOULD NOT be enabled if a network
   stack uses SHIM6 or HIP.  Furthermore, applications should not try to
   use both the MPTCP API and a multihoming shim layer API.  It is
   feasible, however, that some of the MPTCP functionality, such as
   congestion control, could be used in a SHIM6 or HIP environment.
   Such operation is outside the scope of this document.

6.8.  Other Advice to Application Developers

   o  Using the default MPTCP configuration: MPTCP is designed to be
      efficient and robust in the default configuration.  Application
      developers should not explicitly configure features unless this is
      really needed.

   o  Socket buffer dimensioning: Multipath transport requires larger
      buffers in the receiver for resequencing, as already explained.
      Applications should use reasonable buffer sizes (such as the
      operating system default values) in order to fully benefit from
      MPTCP.
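
   The buffer advice above can be sketched with the standard SO_RCVBUF
   socket option.  The helper names socket_rcvbuf_size() and
   choose_rcvbuf_size() are hypothetical, and the policy of never
   requesting less than the operating system default is one possible
   reading of this advice, not a requirement of this document.

```c
#include <assert.h>
#include <sys/socket.h>

/* Returns the current receive buffer size of 'fd' as reported by the
 * operating system, or -1 on error (errno is set by getsockopt()). */
int socket_rcvbuf_size(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) != 0)
        return -1;
    return size;
}

/* Assumed policy: never go below the operating system default, so
 * that resequencing across subflows is not starved of buffer space.
 * Returns the larger of the two values. */
int choose_rcvbuf_size(int os_default, int wanted)
{
    assert(os_default >= 0 && wanted >= 0);
    return wanted > os_default ? wanted : os_default;
}
```

   An application that sizes its buffers explicitly could read the
   default on a fresh socket with socket_rcvbuf_size() and pass the
   result of choose_rcvbuf_size() to setsockopt() with SO_RCVBUF.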

7.  Security Considerations

   Will be added in a later version of this document.

8.  IANA Considerations

   No IANA considerations.

9.  Conclusion

   This document discusses MPTCP's application implications and
   specifies an extended API.  From an architectural point of view,
   MPTCP offers additional degrees of freedom concerning the transport
   of data.  The extended sockets API allows MPTCP-aware applications to
   have additional control of some aspects of the MPTCP implementation's
   behaviour and to obtain information about its usage.  The new socket
   options for MPTCP can be used by getsockopt() and/or setsockopt()
   system calls.  At the same time, the existing sockets API continues
   to work for legacy applications.

10.  Acknowledgments

   The authors sincerely thank the following people for their helpful
   comments on the document: Costin Raiciu.

   Michael Scharf is supported by the German-Lab project
   (http://www.german-lab.de/) funded by the German Federal Ministry of
   Education and Research (BMBF).  Alan Ford is supported by Trilogy
   (http://www.trilogy-project.org/), a research project (ICT-216372)
   partially funded by the European Community under its Seventh
   Framework Program.  The views expressed here are those of the
   author(s) only.  The European Commission is not liable for any use
   that may be made of the information in this document.

11.  References

11.1.  Normative References

   [1]   Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
         September 1981.

   [2]   Braden, R., "Requirements for Internet Hosts - Communication
         Layers", STD 3, RFC 1122, October 1989.

   [3]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [4]   Ford, A., Raiciu, C., Barre, S., and J. Iyengar, "Architectural
         Guidelines for Multipath TCP Development",
         draft-ietf-mptcp-architecture-00 (work in progress),
         March 2010.

   [5]   Ford, A., Raiciu, C., and M. Handley, "TCP Extensions for
         Multipath Operation with Multiple Addresses",
         draft-ford-mptcp-multiaddressed-02 (work in progress),
         October 2009.

   [6]   Bagnulo, M., "Threat Analysis for Multi-addressed/Multi-path
         TCP", draft-ietf-mptcp-threat-00 (work in progress),
         February 2010.

   [7]   Raiciu, C., Handley, M., and D. Wischik, "Coupled Multipath-
         Aware Congestion Control", draft-raiciu-mptcp-congestion-00
         (work in progress), October 2009.

   [8]   "IEEE Std. 1003.1-2008 Standard for Information Technology --
         Portable Operating System Interface (POSIX)", Open Group
         Technical Standard: Base Specifications, Issue 7, 2008.

11.2.  Informative References

   [9]   Sarolahti, P., "Multi-address Interface in the Socket API",
         draft-sarolahti-mptcp-af-multipath-01 (work in progress),
         March 2010.

   [10]  Stevens, W., Thomas, M., Nordmark, E., and T. Jinmei, "Advanced
         Sockets Application Program Interface (API) for IPv6",
         RFC 3542, May 2003.

   [11]  Komu, M., Bagnulo, M., Slavov, K., and S. Sugimoto, "Socket
         Application Program Interface (API) for Multihoming Shim",
         draft-ietf-shim6-multihome-shim-api-13 (work in progress),
         February 2010.

   [12]  Komu, M. and T. Henderson, "Basic Socket Interface Extensions
         for Host Identity Protocol (HIP)", draft-ietf-hip-native-api-12
         (work in progress), January 2010.

   [13]  Stewart, R., Poon, K., Tuexen, M., Yasevich, V., and P. Lei,
         "Sockets API Extensions for Stream Control Transmission
         Protocol (SCTP)", draft-ietf-tsvwg-sctpsocket-21 (work in
         progress), February 2010.

   [14]  Wasserman, M., "Current Practices for Multiple Interface
         Hosts", draft-ietf-mif-current-practices-00 (work in progress),
         October 2009.

Appendix A.  Change History of the Document

   Changes compared to version 00:

   o  Distinction between legacy and MPTCP-aware applications

   o  Guidance concerning default enabling, reaction to the shutdown of
      the first sub-flow, etc.

   o  Reference to a potential use of AF_MULTIPATH

   o  Additional references to related work

Authors' Addresses

   Michael Scharf
   Alcatel-Lucent Bell Labs
   Lorenzstrasse 10
   70435 Stuttgart
   Germany

   EMail: michael.scharf@alcatel-lucent.com


   Alan Ford
   Roke Manor Research
   Old Salisbury Lane
   Romsey, Hampshire  SO51 0ZN
   UK

   Phone: +44 1794 833 465
   EMail: alan.ford@roke.co.uk
