Internet Engineering Task Force                   R. E. Gilligan (Sun)
INTERNET-DRAFT                                   S. Thomson (Bellcore)
                                                    J. Bound (Digital)
                                                       April 18, 1996

                Basic Socket Interface Extensions for IPv6
                    <draft-ietf-ipngwg-bsd-api-05.txt>

Abstract

   In order to implement the version 6 Internet Protocol (IPv6) [1] in
   an operating system based on Berkeley Unix (4.x BSD), changes must be
   made to the application program interface (API).  TCP/IP applications
   written for BSD-based operating systems have in the past enjoyed a
   high degree of portability because most of the systems derived from
   BSD provide the same API, known informally as "the socket interface".
   We would like the same portability with IPv6.  This memo presents a
   basic set of extensions to the BSD socket API to support IPv6.  The
   changes include a new data structure to carry IPv6 addresses, new
   address conversion functions, and some new setsockopt() options.  The
   extensions are designed to provide access to IPv6 features, while
   introducing a minimum of change into the system and providing
   complete compatibility for existing IPv4 applications.  Additional
   extensions for new IPv6 features may be added at a later time.

Status of this Memo

   This document is an Internet Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months.  This Internet Draft expires on October 18, 1996.  Internet
   Drafts may be updated, replaced, or obsoleted by other documents at
   any time.  It is not appropriate to use Internet Drafts as reference
   material or to cite them other than as a "working draft" or "work in
   progress."

   To learn the current status of any Internet-Draft, please check the
   1id-abstracts.txt listing contained in the Internet-Drafts Shadow
   Directories on ds.internic.net, nic.nordu.net, ftp.isi.edu, or
   munnari.oz.au.

   Distribution of this memo is unlimited.

1.  Introduction




draft-ietf-ipngwg-bsd-api-05.txt                                [Page 1]


INTERNET-DRAFT      IPv6 Socket Interface Extensions          April 1996


   While IPv4 addresses are 32 bits long, IPv6 nodes are identified by
   128-bit addresses.  The socket interface makes the size of an IP
   address quite visible to an application; virtually all TCP/IP
   applications for BSD-based systems have knowledge of the size of an
   IP address.  Those parts of the API that expose the addresses need to
   be extended to accommodate the larger IPv6 address size.  IPv6 also
   introduces new features, some of which must be made visible to
   applications via the API.  This paper defines a set of extensions to
   the socket interface to support the larger address size and new
   features of IPv6.

   This specification is preliminary.  These API extensions are expected
   to evolve as we gain more implementation experience.

2.  Design Considerations

   There are a number of important considerations in designing changes
   to this well-worn API:

    -  The extended API should provide both source and binary
       compatibility for programs written to the original API.  That is,
       existing program binaries should continue to operate when run on
       a system supporting the new API.  In addition, existing
       applications that are re-compiled and run on a system supporting
       the new API should continue to operate.  Simply put, the API
       changes for IPv6 should not break existing programs.

    -  The changes to the API should be as small as possible in order to
       simplify the task of converting existing IPv4 applications to
       IPv6.

    -  Where possible, applications should be able to use the extended
       API to interoperate with both IPv6 and IPv4 hosts.  Applications
       should not need to know which type of host they are communicating
       with.

    -  IPv6 addresses carried in data structures should be 64-bit
       aligned.  This is necessary in order to obtain optimum
       performance on 64-bit machine architectures.

   Because of the importance of providing IPv4 compatibility in the API,
   these extensions are explicitly designed to operate on machines that
   provide complete support for both IPv4 and IPv6.  A subset of this
   API could probably be designed for operation on systems that support
   only IPv6.  However, this is not addressed in this document.

2.1.  What Needs to be Changed




   The socket interface API consists of a few distinct components:

    -  Core socket functions.

    -  Address data structures.

    -  Name-to-address translation functions.

    -  Address conversion functions.

   The core socket functions -- those functions that deal with such
   things as setting up and tearing down TCP connections, and sending
   and receiving UDP packets -- were designed to be transport
   independent.  Where protocol addresses are passed as function
   arguments, they are carried via opaque pointers.  A protocol specific
   address data structure is defined for each protocol that the socket
   functions support.  Applications must cast these protocol specific
   address structures into the generic "sockaddr" data type when using
   the socket functions.  These functions need not change for IPv6, but
   a new IPv6 specific address data structure is needed.

   The "sockaddr_in" structure is the protocol specific data structure
   for IPv4.  This data structure actually includes 8 octets of unused
   space, and it is tempting to try to use this space to adapt the
   sockaddr_in structure to IPv6.  Unfortunately, the sockaddr_in
   structure is not large enough to hold the 16-octet IPv6 address as
   well as the other information (2-octet address family and 2-octet
   port number) that is needed.  So a new address data structure must be
   defined for IPv6.
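
   The size argument above can be checked mechanically.  The following
   is a minimal sketch, assuming a system whose <netinet/in.h> declares
   sockaddr_in in its traditional 16-octet form.  The names
   SIN6_MIN_OCTETS and sockaddr_in_can_hold_ipv6() are hypothetical and
   are not part of this API.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>

/* Minimum octets an IPv6 socket address must carry:
 * 2 (address family) + 2 (port number) + 16 (IPv6 address). */
enum { SIN6_MIN_OCTETS = 2 + 2 + 16 };

/* Hypothetical helper: returns 1 if sockaddr_in is large enough to
 * carry the IPv6 information listed above, 0 otherwise. */
int sockaddr_in_can_hold_ipv6(void)
{
        return sizeof(struct sockaddr_in) >= (size_t)SIN6_MIN_OCTETS;
}
```

   On systems where sockaddr_in occupies 16 octets the helper returns
   0, which is the reason a separate structure is required.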

   The name-to-address translation functions in the socket interface are
   gethostbyname() and gethostbyaddr().  Gethostbyname() does not
   provide enough flexibility to accommodate protocols other than IPv4.
   POSIX, in its 1003.1g draft specification, has proposed a new hostname
   to address translation function which is protocol independent.  This
   function can be used with IPv6, so no new function is defined here.

   The address conversion functions -- inet_ntoa() and inet_addr() --
   convert IPv4 addresses between binary and printable form.  These
   functions are quite specific to 32-bit IPv4 addresses.  We have
   designed two analogous functions which convert both IPv4 and IPv6
   addresses, and carry an address type parameter so that they can be
   extended to other protocol families as well.
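
   The idea of a conversion function that carries an address type
   parameter can be illustrated with the inet_pton() and inet_ntop()
   functions found on current POSIX systems.  This is only a sketch:
   it is an assumption that those functions are available, and no claim
   is made here that they match the functions specified later in this
   memo.  The helper name roundtrip_address() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: convert the printable address "text" of
 * family "af" to binary form and back to printable form in "out".
 * Returns 0 on success, -1 on failure. */
int roundtrip_address(int af, const char *text, char *out, socklen_t outlen)
{
        struct in6_addr binary;   /* large enough for IPv4 and IPv6 */

        if (inet_pton(af, text, &binary) != 1)
                return -1;        /* not a valid address for this family */
        if (inet_ntop(af, &binary, out, outlen) == NULL)
                return -1;        /* output buffer too small */
        return 0;
}
```

   The same helper handles both families; only the address type
   argument changes between the IPv4 and IPv6 cases.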

   Finally, a few miscellaneous features are needed to support IPv6.  A
   new interface is needed in order to support the IPv6 flow label and
   priority header fields.  New interfaces are needed in order to
   receive IPv6 multicast packets and control the sending of multicast
   packets.

   The socket interface may be further extended in the future to provide
   access to other IPv6 features.  These extensions will be made in
   separate documents.

3.  Socket Interface

   This section specifies the socket interface changes for IPv6.

   The data types of the structure elements given in the following
   section are intended to be examples, not absolute requirements.
   System implementations may use other types if they are appropriate.
   In some cases, such as when a field of a data structure holds a
   protocol value, the structure field must be of some minimum size.
   These size requirements are noted in the text.  For example, since
   the UDP and TCP port values are 16-bit quantities, the sin6_port
   field must be at least a 16-bit data type.  The sin6_port field is
   specified as a u_int16m_t type, but an implementation may use any
   data type that is at least 16 bits long.

3.1.  New Address Family

   A new address family macro, named AF_INET6, is defined in
   <sys/socket.h>.  The AF_INET6 definition is used to distinguish
   between the original sockaddr_in address data structure, and the new
   sockaddr_in6 data structure.

   A new protocol family macro, named PF_INET6, is defined in
   <sys/socket.h>.  Like most of the other protocol family macros, this
   will usually be defined to have the same value as the corresponding
   address family macro:

       #define PF_INET6        AF_INET6

   The PF_INET6 macro is used in the first argument to the socket() function
   to indicate that an IPv6 socket is being created.

3.2. IPv6 Address Data Structure

   A new data structure to hold a single IPv6 address is defined as
   follows:

       struct in6_addr {
               u_char  s6_addr[16];      /* IPv6 address */
       };

   This data structure contains an array of sixteen 8-bit elements,
   which make up one 128-bit IPv6 address.  The IPv6 address is stored
   in network byte order.

   Applications obtain the declaration for this structure by including
   the system header file <netinet/in.h>.
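
   Because the address is stored in network byte order, s6_addr[0]
   holds the most significant octet.  The following sketch prints an
   address as eight 16-bit hexadecimal groups to make the ordering
   visible.  It assumes the in6_addr declaration with an s6_addr member
   shown above; the helper name format_in6_addr() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical helper: write "addr" as eight colon-separated 16-bit
 * hex groups, without "::" compression.  A 40-octet buffer always
 * suffices (8 groups of 4 digits, 7 colons, terminating NUL). */
void format_in6_addr(const struct in6_addr *addr, char *out, size_t outlen)
{
        size_t used = 0;
        int i;

        if (outlen > 0)
                out[0] = '\0';
        for (i = 0; i < 16 && used < outlen; i += 2) {
                /* s6_addr[i] is the high-order octet of each group. */
                unsigned int group = ((unsigned int) addr->s6_addr[i] << 8) |
                                     addr->s6_addr[i + 1];
                int n = snprintf(out + used, outlen - used, "%s%x",
                                 i ? ":" : "", group);
                if (n < 0)
                        break;
                used += (size_t) n;
        }
}
```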

3.3.  Socket Address Structure for 4.3 BSD-Based Systems

   In the socket interface, a different protocol-specific data structure
   is defined to carry the addresses for each protocol suite.
   Each protocol-specific data structure is designed so it can be cast
   into a protocol-independent data structure -- the "sockaddr"
   structure.  Each has a "family" field which overlays the "sa_family"
   of the sockaddr data structure.  This field can be used to identify
   the type of the data structure.

   The sockaddr_in structure is the protocol-specific address data
   structure for IPv4.  It is used to pass addresses between
   applications and the system in the socket functions. The following
   structure is defined to carry IPv6 addresses:

       struct sockaddr_in6 {
               u_int16m_t      sin6_family;    /* AF_INET6 */
               u_int16m_t      sin6_port;      /* Transport layer port # */
               u_int32m_t      sin6_flowinfo;  /* IPv6 flow information */
               struct in6_addr sin6_addr;      /* IPv6 address */
       };

   This structure is designed to be compatible with the sockaddr data
   structure used in the 4.3 BSD release.

   The sin6_family field is used to identify this as a sockaddr_in6
   structure.  This field is designed to overlay the sa_family field
   when the buffer is cast to a sockaddr data structure.  The value of
   this field must be AF_INET6.

   The sin6_port field is used to store the 16-bit UDP or TCP port
   number.  This field is used in the same way as the sin_port field of
   the sockaddr_in structure.  The port number is stored in network byte
   order.

   The sin6_flowinfo field is a 32-bit field that is used to store two
   pieces of information: the 24-bit IPv6 flow label and the 4-bit
   priority field.  The IPv6 flow label is represented as the low-order
   24 bits of the 32-bit field.  The priority is represented in the next
   4 bits above this.  The high-order 4 bits of this field are reserved.
   The sin6_flowinfo field is stored in network byte order.  The use of
   the flow label and priority fields is explained in section 3.8.
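
   The bit layout of sin6_flowinfo can be sketched as follows.  The
   mask names and helper functions below are hypothetical and operate
   on host-byte-order values, converting to and from network byte
   order explicitly; they are not the constants that this API provides
   in <netinet/in.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical host-byte-order masks mirroring the layout above:
 * low-order 24 bits flow label, next 4 bits priority, top 4 reserved. */
#define FLOWINFO_LABEL_MASK     UINT32_C(0x00ffffff)
#define FLOWINFO_PRIORITY_MASK  UINT32_C(0x0f000000)
#define FLOWINFO_PRIORITY_SHIFT 24

/* Build a sin6_flowinfo value in network byte order. */
uint32_t flowinfo_pack(unsigned int priority, uint32_t label)
{
        uint32_t host = (((uint32_t) priority << FLOWINFO_PRIORITY_SHIFT) &
                         FLOWINFO_PRIORITY_MASK) |
                        (label & FLOWINFO_LABEL_MASK);
        return htonl(host);
}

/* Extract the 24-bit flow label from a network-byte-order value. */
uint32_t flowinfo_label(uint32_t flowinfo)
{
        return ntohl(flowinfo) & FLOWINFO_LABEL_MASK;
}

/* Extract the 4-bit priority from a network-byte-order value. */
unsigned int flowinfo_priority(uint32_t flowinfo)
{
        return (unsigned int) ((ntohl(flowinfo) & FLOWINFO_PRIORITY_MASK) >>
                               FLOWINFO_PRIORITY_SHIFT);
}
```

   Out-of-range inputs are masked, so the reserved high-order 4 bits
   always remain zero.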



   The sin6_addr field is a single in6_addr structure (defined in the
   previous section).  This field holds one 128-bit IPv6 address.  The
   address is stored in network byte order.

   The ordering of elements in this structure is specifically designed
   so that the sin6_addr field will be aligned on a 64-bit boundary.
   This is done for optimum performance on 64-bit architectures.

   Applications obtain the declaration of the sockaddr_in6 structure by
   including the system header file <netinet/in.h>.
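
   The alignment and overlay properties described above can be
   verified with offsetof().  This is a sketch under the assumption
   that the system declares sockaddr_in6 with the members named in
   this section; current systems may add further members (for example
   sin6_scope_id) that this memo does not define.  Both helper names
   are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Returns 1 if sin6_addr starts on a 64-bit (8-octet) boundary
 * within struct sockaddr_in6, 0 otherwise. */
int sin6_addr_is_64bit_aligned(void)
{
        return offsetof(struct sockaddr_in6, sin6_addr) % 8 == 0;
}

/* Returns 1 if sin6_family occupies the same offset and size as
 * sa_family, so that it overlays sa_family after a cast. */
int sin6_family_overlays_sa_family(void)
{
        return offsetof(struct sockaddr_in6, sin6_family) ==
                       offsetof(struct sockaddr, sa_family) &&
               sizeof(((struct sockaddr_in6 *) 0)->sin6_family) ==
                       sizeof(((struct sockaddr *) 0)->sa_family);
}
```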

3.4. Socket Address Structure for 4.4 BSD-Based Systems

   The 4.4 BSD release includes a small but incompatible change to the
   socket interface.  The "sa_family" field of the sockaddr data
   structure was changed from a 16-bit value to an 8-bit value, and the
   space saved is used to hold a length field, named "sa_len".  The
   sockaddr_in6 data structure given in the previous section cannot be
   correctly cast into the newer sockaddr data structure.  For this
   reason, the following alternative IPv6 address data structure is
   provided for use on systems based on 4.4 BSD:

       #define SIN6_LEN

       struct sockaddr_in6 {
               u_char          sin6_len;       /* length of this struct */
               u_char          sin6_family;    /* AF_INET6 */
               u_int16m_t      sin6_port;      /* Transport layer port # */
               u_int32m_t      sin6_flowinfo;  /* IPv6 flow information */
               struct in6_addr sin6_addr;      /* IPv6 address */
       };

   The only differences between this data structure and the 4.3 BSD
   variant are the inclusion of the length field, and the change of the
   family field to an 8-bit data type.  The definitions of all the other
   fields are identical to the 4.3 BSD variant defined in the previous
   section.

   Systems that provide this version of the sockaddr_in6 data structure
   must also define the SIN6_LEN macro as a result of including the
   <netinet/in.h> header file.  This macro allows applications to
   determine whether they are being built on a system that supports the
   4.3 BSD or 4.4 BSD variant of the data structure.  Applications can
   be written to run on both systems by simply making their assignments
   to and uses of the sin6_len field conditional on the SIN6_LEN macro.
   For example, to fill in an IPv6 address structure in an application,
   one might write:




       struct sockaddr_in6 sin6;

       bzero((char *) &sin6, sizeof(struct sockaddr_in6));
       #ifdef SIN6_LEN
       sin6.sin6_len = sizeof(struct sockaddr_in6);
       #endif
       sin6.sin6_family = AF_INET6;
       sin6.sin6_port = htons(23);

   Note that the size of the sockaddr_in6 structure is larger than the
   size of the sockaddr structure.  Applications that use the
   sockaddr_in6 structure need to be aware that they cannot use
   sizeof(struct sockaddr) to allocate a buffer to hold a sockaddr_in6
   structure.  They should use sizeof(struct sockaddr_in6) instead.
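
   A buffer intended to receive an IPv6 peer address might be
   allocated as in the sketch below.  The helper name
   alloc_peer_buffer() is hypothetical, and the use of socklen_t for
   the length is an assumption based on current systems; this memo
   passes lengths as int.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: allocate a zeroed buffer large enough for a
 * sockaddr_in6 and report its size through "len".  Returns NULL if
 * the allocation fails.  The caller releases it with free(). */
struct sockaddr *alloc_peer_buffer(socklen_t *len)
{
        /* sizeof(struct sockaddr) would be too small here. */
        *len = sizeof(struct sockaddr_in6);
        return calloc(1, sizeof(struct sockaddr_in6));
}
```

   The returned pointer and length can be passed directly to functions
   such as accept() or recvfrom().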

3.5.  The Socket Functions

   Applications use the socket() function to create a socket descriptor
   that represents a communication endpoint.  The arguments to the
   socket() function tell the system which protocol to use, and what
   format address structure will be used in subsequent functions.  For
   example, to create an IPv4/TCP socket, applications make the call:

       s = socket (PF_INET, SOCK_STREAM, 0);

   To create an IPv4/UDP socket, applications make the call:

       s = socket (PF_INET, SOCK_DGRAM, 0);

   Applications may create IPv6/TCP and IPv6/UDP sockets by simply using
   the constant PF_INET6 instead of PF_INET in the first argument.  For
   example, to create an IPv6/TCP socket, applications make the call:

       s = socket (PF_INET6, SOCK_STREAM, 0);

   To create an IPv6/UDP socket, applications make the call:

       s = socket (PF_INET6, SOCK_DGRAM, 0);

   Once the application has created a PF_INET6 socket, it must use the
   sockaddr_in6 address structure when passing addresses in to the
   system.  The functions which the application uses to pass addresses
   into the system are:







       bind()
       connect()
       sendmsg()
       sendto()

   The system will use the sockaddr_in6 address structure to return
   addresses to applications that are using PF_INET6 sockets.  The
   functions that return an address from the system to an application
   are:

       accept()
       recvfrom()
       recvmsg()
       getpeername()
       getsockname()

   No changes to the syntax of the socket functions are needed to
   support IPv6, since all of the "address carrying" functions use
   an opaque address pointer, and carry an address length as a function
   argument.
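
   Because the address is passed as an opaque pointer together with
   its length, one call site can serve both address families.  The
   following is a minimal sketch; the helper name bind_wildcard_udp()
   is hypothetical, and the use of in6addr_any and socklen_t assumes a
   current system that provides them.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create a UDP socket of the given family and
 * bind it to the wildcard address with a system-selected port.
 * Returns 0 on success, -1 on failure or unsupported family. */
int bind_wildcard_udp(int family)
{
        struct sockaddr_in sin;
        struct sockaddr_in6 sin6;
        const struct sockaddr *sa;
        socklen_t salen;
        int s, rc;

        if (family == AF_INET) {
                memset(&sin, 0, sizeof(sin));
                sin.sin_family = AF_INET;
                sin.sin_addr.s_addr = htonl(INADDR_ANY);
                sa = (const struct sockaddr *) &sin;
                salen = sizeof(sin);
        } else if (family == AF_INET6) {
                memset(&sin6, 0, sizeof(sin6));
                sin6.sin6_family = AF_INET6;
                sin6.sin6_addr = in6addr_any;
                sa = (const struct sockaddr *) &sin6;
                salen = sizeof(sin6);
        } else {
                return -1;
        }

        s = socket(family, SOCK_DGRAM, 0);
        if (s == -1)
                return -1;
        rc = bind(s, sa, salen);  /* identical call for both families */
        close(s);
        return rc == 0 ? 0 : -1;
}
```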

3.6.  Compatibility with IPv4 Applications

   In order to support the large base of applications using the original
   API, system implementations must provide complete source and binary
   compatibility with the original API.  This means that systems must
   continue to support PF_INET sockets and the sockaddr_in address
   structure.  Applications must be able to create IPv4/TCP and IPv4/UDP
   sockets using the PF_INET constant in the socket() function, as
   described in the previous section.  Applications should be able to
   hold a combination of IPv4/TCP, IPv4/UDP, IPv6/TCP and IPv6/UDP
   sockets simultaneously within the same process.

   Applications using the original API should continue to operate as
   they did on systems supporting only IPv4.  That is, they should
   continue to interoperate with IPv4 nodes.  It is not clear, though,
   how, or even if, those IPv4 applications should interoperate with
   IPv6 nodes.  The open issues section (section 9) discusses some of
   the alternatives.

3.7.  Compatibility with IPv4 Nodes

   The API also provides a different type of compatibility: the ability
   for applications using the extended API to interoperate with IPv4
   nodes.  This feature uses the IPv4-mapped IPv6 address format defined
   in the IPv6 addressing architecture specification [2].  This address
   format allows the IPv4 address of an IPv4 node to be represented as
   an IPv6 address.  The IPv4 address is encoded into the low-order 32
   bits of the IPv6 address, and the high-order 96 bits hold the fixed
   prefix 0:0:0:0:0:FFFF.  IPv4-mapped addresses are written as follows:

       ::FFFF:<IPv4-address>

   Applications may use PF_INET6 sockets to open TCP connections to IPv4
   nodes, or send UDP packets to IPv4 nodes, by simply encoding the
   destination's IPv4 address as an IPv4-mapped IPv6 address, and
   passing that address, within a sockaddr_in6 structure, in the
   connect() or sendto() call.  When applications use PF_INET6 sockets
   to accept TCP connections from IPv4 nodes, or receive UDP packets
   from IPv4 nodes, the system returns the peer's address to the
   application in the accept(), recvfrom(), or getpeername() call using
   a sockaddr_in6 structure encoded this way.

   Few applications will likely need to know which type of node they are
   interoperating with.  However, for those applications that do need to
   know, the inet6_isipv4addr() function, defined in section 6.3, is
   provided.
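
   The IPv4-mapped encoding can be sketched as follows.  Both helper
   names are hypothetical.  In particular, is_ipv4_mapped() is only an
   illustration of the kind of test an application might perform; it
   is not the inet6_isipv4addr() function, whose definition is given
   in section 6.3.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: encode an IPv4 address (network byte order)
 * as the IPv4-mapped IPv6 address ::FFFF:<IPv4-address>. */
struct in6_addr ipv4_to_mapped(struct in_addr v4)
{
        struct in6_addr v6;

        memset(&v6, 0, sizeof(v6));             /* octets 0-9 are zero */
        v6.s6_addr[10] = 0xff;                  /* octets 10-11: FFFF  */
        v6.s6_addr[11] = 0xff;
        memcpy(&v6.s6_addr[12], &v4.s_addr, 4); /* low-order 32 bits   */
        return v6;
}

/* Hypothetical helper: returns 1 if the high-order 96 bits hold the
 * fixed prefix 0:0:0:0:0:FFFF, 0 otherwise. */
int is_ipv4_mapped(const struct in6_addr *v6)
{
        static const unsigned char prefix[12] =
                { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

        return memcmp(v6->s6_addr, prefix, sizeof(prefix)) == 0;
}
```

   The resulting in6_addr can be stored in the sin6_addr field of a
   sockaddr_in6 structure and passed to connect() or sendto().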

3.8.  Flow Information

   The IPv6 header has a 24-bit field to hold a "flow label", and a 4-
   bit field to hold a "priority" value.  Applications have control over
   what values for these fields are used in packets that they originate,
   and have access to the field values of packets that they receive.

   The sin6_flowinfo field of the sockaddr_in6 structure encodes two
   pieces of information: IPv6 flow label and IPv6 priority.
   Applications use this field to set the flow label and priority in
   IPv6 headers of packets they generate, and to retrieve the flow label
   and priority from the packets they receive.  The header fields of an
   actively opened TCP connection are set by assigning the
   sin6_flowinfo field of the destination sockaddr_in6 structure passed
   to the connect() function.  The same technique can be used with the
   sockaddr_in6 structure passed to the sendto() or sendmsg() function
   to set the flow label and priority fields of UDP packets.  Similarly,
   the flow label and priority values of received UDP packets and
   accepted TCP connections are reflected in the sin6_flowinfo field of
   the sockaddr_in6 structure returned to the application by the
   recvfrom(), recvmsg(), and accept() functions.  Finally, an
   application may specify the flow label and priority to use in
   transmitted packets of a passively accepted TCP connection by
   setting the sin6_flowinfo field of the address passed to the bind()
   function.

   Implementations provide two bitmask constant declarations to help
   applications select out the flow label and priority fields.  These
   constants are:

       IPV6_FLOWINFO_FLOWLABEL
       IPV6_FLOWINFO_PRIORITY

   These constants can be applied to the sin6_flowinfo field of
   addresses returned to the application, for example:

       struct sockaddr_in6 sin6;
        . . .
       recvfrom(s, buf, buflen, flags, (struct sockaddr *) &sin6, &fromlen);
        . . .
       received_flowlabel = sin6.sin6_flowinfo & IPV6_FLOWINFO_FLOWLABEL;
       received_priority = sin6.sin6_flowinfo & IPV6_FLOWINFO_PRIORITY;

   On the sending side, applications are responsible for selecting the
   flow label value.  The system provides constant declarations for the
   IPv6 priority values defined in the IPv6 specification [1].  These
   constants are:

       IPV6_PRIORITY_UNCHARACTERIZED
       IPV6_PRIORITY_FILLER
       IPV6_PRIORITY_UNATTENDED
       IPV6_PRIORITY_RESERVED1
       IPV6_PRIORITY_BULK
       IPV6_PRIORITY_RESERVED2
       IPV6_PRIORITY_INTERACTIVE
       IPV6_PRIORITY_CONTROL
       IPV6_PRIORITY_8
       IPV6_PRIORITY_9
       IPV6_PRIORITY_10
       IPV6_PRIORITY_11
       IPV6_PRIORITY_12
       IPV6_PRIORITY_13
       IPV6_PRIORITY_14
       IPV6_PRIORITY_15

   Applications can use these constants along with the flow label they
   selected to assign the sin6_flowinfo field, for example:

       struct sockaddr_in6 sin6;
        . . .
       send_flowlabel = . . . ;
        . . .
       sin6.sin6_flowinfo = IPV6_PRIORITY_UNATTENDED |
                            (IPV6_FLOWINFO_FLOWLABEL & send_flowlabel);




   The macro declarations for these constants are obtained by including
   the header file <netinet/in.h>.

3.9. Binding to System-Selected Address

   While the bind() function allows applications to select the source IP
   address of UDP packets and TCP connections, applications often wish
   to let the system select the source address for them.  In IPv4, this
   is done by specifying the IPv4 address represented by the symbolic
   constant INADDR_ANY in the bind() call, or simply by skipping the
   bind() entirely.

   Since the IPv6 address type is a structure (struct in6_addr), a
   symbolic constant can be used to initialize an IPv6 address variable,
   but can not be used in an assignment.  Therefore systems provide the
   IPv6 address value that can be used to instruct the system to select
   the source IPv6 address in two forms.

   The first version is a global variable named "in6addr_any" which is
   an in6_addr type structure.  The extern declaration for this variable
   is:

       extern const struct in6_addr in6addr_any;

   Applications use in6addr_any similarly to the way they use INADDR_ANY
   in IPv4.  For example, to bind a socket to port number 23, but let
   the system select the source address, an application could use the
   following code:

       struct sockaddr_in6 sin6;
        . . .
       sin6.sin6_family = AF_INET6;
       sin6.sin6_flowinfo = 0;
       sin6.sin6_port = htons(23);
       sin6.sin6_addr = in6addr_any;
        . . .
       if (bind(s, (struct sockaddr *) &sin6, sizeof(sin6)) == -1)
               . . .

   The other version is a symbolic constant named IN6ADDR_ANY_INIT.
   This constant can be used to initialize an in6_addr structure:

       struct in6_addr anyaddr = IN6ADDR_ANY_INIT;

   Note that this constant can be used ONLY at declaration time.  It can
   not be used to assign a previously declared in6_addr structure.  For
   example, the following code will not work:




       /* This is the WRONG way to assign an unspecified address */
       struct sockaddr_in6 sin6;
        . . .
       sin6.sin6_addr = IN6ADDR_ANY_INIT; /* Will NOT compile */


   The extern declaration for in6addr_any and the macro declaration for
   IN6ADDR_ANY_INIT are obtained by including <netinet/in.h>.
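
   The two forms can be used side by side, as in the sketch below: the
   macro initializes a variable at declaration time, and the global
   variable is used for assignment.  The helper names
   set_unspecified() and is_unspecified() are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Initialization at declaration time uses the macro form. */
static const struct in6_addr any_from_macro = IN6ADDR_ANY_INIT;

/* Assignment to an existing structure uses the global variable. */
void set_unspecified(struct sockaddr_in6 *sin6)
{
        sin6->sin6_addr = in6addr_any;
}

/* Returns 1 if "addr" equals the unspecified address, 0 otherwise. */
int is_unspecified(const struct in6_addr *addr)
{
        return memcmp(addr, &any_from_macro, sizeof(*addr)) == 0;
}
```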

3.10. Communicating with Local Services

   Applications may need to send UDP packets to, or originate TCP
   connections to, services residing on the local node.  In IPv4, they
   can do this by using the constant IPv4 address INADDR_LOOPBACK in
   their connect(), sendto(), or sendmsg() call.

   IPv6 also provides a loopback address which can be used to contact
   local TCP and UDP services.  Like the unspecified address, the IPv6
   loopback address is provided in two forms -- a global variable and a
   symbolic constant.

   The global variable is an in6_addr type structure named
   "in6addr_loopback".  The extern declaration for this variable is:

       extern const struct in6_addr in6addr_loopback;

   Applications use in6addr_loopback as they would use INADDR_LOOPBACK
   in IPv4 applications.  For example, to open a TCP connection to the
   local telnet server, an application could use the following code:

       struct sockaddr_in6 sin6;
        . . .
       sin6.sin6_family = AF_INET6;
       sin6.sin6_flowinfo = 0;
       sin6.sin6_port = htons(23);
       sin6.sin6_addr = in6addr_loopback;
        . . .
       if (connect(s, (struct sockaddr *) &sin6, sizeof(sin6)) == -1)
               . . .

   The symbolic constant is named IN6ADDR_LOOPBACK_INIT.  It can be used
   at declaration time ONLY; for example:

       struct in6_addr loopbackaddr = IN6ADDR_LOOPBACK_INIT;

   Like IN6ADDR_ANY_INIT, this constant can not be used in an assignment
   to a previously declared IPv6 address variable.



   The extern declaration for in6addr_loopback and the macro declaration
   for IN6ADDR_LOOPBACK_INIT are obtained by including <netinet/in.h>.

4. Socket Options

   A number of new socket options are defined for IPv6.  All of these
   new options are at the IPPROTO_IPV6 level.  That is, the "level"
   parameter in the getsockopt() and setsockopt() call is IPPROTO_IPV6
   when using these options.  The constant name prefix IPV6_ is used in
   all of the new socket options.  This serves to clearly identify these
   options as applying to IPv6.

   The macro declaration for IPPROTO_IPV6, the new IPv6 socket options,
   and related constants defined in this section are obtained by
   including the header file <netinet/in.h>.

4.1.  Changing Socket Type

   Unix allows open sockets to be passed between processes via the
   exec() call and other means.  It is a relatively common application
   practice to pass open sockets across exec() calls.  Thus it is
   possible for an application using the original API to pass an open
   PF_INET socket to an application that is expecting to receive a
   PF_INET6 socket.  Similarly, it is possible for an application using
   the extended API to pass an open PF_INET6 socket to an application
   using the original API, which would be equipped only to deal with
   PF_INET sockets.  Either of these cases could cause problems, because
   the application which is passed the open socket might not know how to
   decode the address structures returned in subsequent socket
   functions.

   To remedy this problem, a new setsockopt() option is defined that
   allows an application to "transform" a PF_INET6 socket into a PF_INET
   socket and vice-versa.

   An IPv6 application that is passed an open socket from an unknown
   process may use the IPV6_ADDRFORM setsockopt() option to "convert"
   the socket to PF_INET6.  Once that has been done, the system will
   return sockaddr_in6 address structures in subsequent socket
   functions.  Similarly, an IPv6 application that is about to pass an
   open PF_INET6 socket to a program that may not be IPv6 capable may
   "downgrade" the socket to PF_INET before calling exec().  After that,
   the system will return sockaddr_in address structures to the
   application that was exec()'ed.

   The IPV6_ADDRFORM option is valid at both the IPPROTO_IP and
   IPPROTO_IPV6 levels.  The only valid option values are PF_INET6 and
   PF_INET.  For example, to convert a PF_INET6 socket to PF_INET, a
   program would call:

       int addrform = PF_INET;

       if (setsockopt(s, IPPROTO_IPV6, IPV6_ADDRFORM, (char *) &addrform,
               sizeof(addrform)) == -1)
               perror("setsockopt IPV6_ADDRFORM");

   An application may use IPV6_ADDRFORM in the getsockopt() function to
   learn whether an open socket is a PF_INET or PF_INET6 socket.  For
   example:

       int addrform;
       size_t len = sizeof(int);

       if (getsockopt(s, IPPROTO_IPV6, IPV6_ADDRFORM, (char *) &addrform,
               &len) == -1)
               perror("getsockopt IPV6_ADDRFORM");
       else if (addrform == PF_INET)
               printf("This is an IPv4 socket.\n");
       else if (addrform == PF_INET6)
               printf("This is an IPv6 socket.\n");
       else
               printf("This system is broken.\n");
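
   On a system that does not provide IPV6_ADDRFORM, a program handed
   an open socket could instead inspect the family of the address that
   getsockname() returns.  This is a different technique from the one
   specified in this section, shown only as a sketch.  It assumes the
   sockaddr_storage structure and socklen_t type of current POSIX
   systems, neither of which is defined by this memo, and the helper
   name socket_address_family() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: return the address family (for example
 * AF_INET or AF_INET6) of the open socket "s", or -1 on error. */
int socket_address_family(int s)
{
        struct sockaddr_storage ss;   /* large enough for any family */
        socklen_t len = sizeof(ss);

        memset(&ss, 0, sizeof(ss));
        if (getsockname(s, (struct sockaddr *) &ss, &len) == -1)
                return -1;
        return ss.ss_family;
}
```

   Unlike IPV6_ADDRFORM, this only reports the format in use; it does
   not convert the socket from one address format to the other.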

4.2.  Unicast Hop Limit

   A new setsockopt() option is used to control the hop limit used in
   outgoing unicast IPv6 packets.  The name of this option is
   IPV6_UNICAST_HOPS, and it is used at the IPPROTO_IPV6 level.  The
   following example illustrates how it is used:

       int hoplimit = 10;

       if (setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS, (char *) &hoplimit,
               sizeof(hoplimit)) == -1)
               perror("setsockopt IPV6_UNICAST_HOPS");

   When the IPV6_UNICAST_HOPS option is set with setsockopt(), the
   option value given is used as the hop limit for all subsequent
   unicast packets sent via that socket.  If the option is not set, the
   system selects a default value.

   The IPV6_UNICAST_HOPS option may be used in the getsockopt() function
   to determine the hop limit value that the system will use for
   subsequent unicast packets sent via that socket.  For example:





       int hoplimit;
       int len = sizeof(hoplimit);

       if (getsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS, (char *) &hoplimit,
               &len) == -1)
               perror("getsockopt IPV6_UNICAST_HOPS");
       else
               printf("Using %d for hop limit.\n", hoplimit);

4.3.  Sending and Receiving Multicast Packets

   IPv6 applications may send UDP multicast packets by simply specifying
   an IPv6 multicast address in the address argument of the sendto()
   function.
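
   As an illustrative sketch (not part of this specification), the
   fragment below sends one UDP datagram to a multicast group using
   nothing but sendto().  The helper names make_dest() and
   send_to_group() are hypothetical.  The sketch assumes the
   inet_pton() found on current POSIX systems, which reports success
   by returning 1 rather than an address length.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Fill *dst with the IPv6 address given in text form and a UDP port.
 * Returns 0 on success, -1 if the text is not a valid IPv6 address.
 * Assumption: inet_pton() follows the POSIX convention (1 == success).
 */
int make_dest(const char *text, unsigned short port,
              struct sockaddr_in6 *dst)
{
    memset(dst, 0, sizeof(*dst));
    dst->sin6_family = AF_INET6;
    dst->sin6_port = htons(port);
    return inet_pton(AF_INET6, text, &dst->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Send one datagram to a multicast group.  No socket option is set;
 * the multicast destination address alone selects multicast delivery.
 */
ssize_t send_to_group(int s, const char *group, unsigned short port,
                      const void *buf, size_t len)
{
    struct sockaddr_in6 dst;

    if (make_dest(group, port, &dst) == -1)
        return -1;
    return sendto(s, buf, len, 0,
                  (const struct sockaddr *) &dst, sizeof(dst));
}
```

   The socket s is expected to be a PF_INET6 datagram socket that the
   caller has already opened.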

   A few setsockopt() options at the IPPROTO_IPV6 layer control some of
   the parameters of sending multicast packets.  Setting them is
   optional: applications may send multicast packets without using
   these options.  The setsockopt() options for controlling the sending
   of multicast packets are summarized below:

       IPV6_MULTICAST_IF

           Set the interface to use for outgoing multicast packets.  The
           argument is an IPv6 address of the interface to use.

           Argument type: struct in6_addr

       IPV6_MULTICAST_HOPS

           Set the hop limit to use for outgoing multicast packets.
           (Note a separate option - IPV6_UNICAST_HOPS - is provided to
           set the hop limit to use for outgoing unicast packets.)

           Argument type: unsigned int

       IPV6_MULTICAST_LOOP

           Controls whether outgoing multicast packets sent should be
           delivered back to the local application.  A toggle.  If the
           option is set to 1, multicast packets are looped back.  If it
           is set to 0, they are not.

           Argument type: unsigned int
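
   The hop limit and loopback options above might be set as in the
   following minimal sketch.  The helper set_mcast_send_params() is a
   hypothetical name, and the unsigned int argument type is taken from
   the summary above; a given system may expect a different type.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Set the multicast hop limit and loopback behavior on socket s.
 * Returns 0 if both options were set, -1 otherwise (errno is left
 * as set by setsockopt()).
 */
int set_mcast_send_params(int s, unsigned int hops, unsigned int loop)
{
    if (setsockopt(s, IPPROTO_IPV6, IPV6_MULTICAST_HOPS,
                   (char *) &hops, sizeof(hops)) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_IPV6, IPV6_MULTICAST_LOOP,
                   (char *) &loop, sizeof(loop)) == -1)
        return -1;
    return 0;
}
```

   For example, set_mcast_send_params(s, 5, 0) would limit outgoing
   multicast packets to five hops and turn off local loopback.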

   The reception of multicast packets is controlled by the two
   setsockopt() options summarized below:

       IPV6_ADD_MEMBERSHIP

           Join a multicast group.  Requests that multicast packets sent
           to a particular multicast address be delivered to this
           socket.  The argument is the IPv6 multicast address of the
           group to join.

           Argument type: struct ipv6_mreq

       IPV6_DROP_MEMBERSHIP

           Leave a multicast group.  Requests that multicast packets
           sent to a particular multicast address no longer be delivered
           to this socket.  The argument is the IPv6 multicast address
           of the group to leave.

           Argument type: struct ipv6_mreq

   The argument type of both of these options is the ipv6_mreq
   structure, which is defined as follows:

       struct ipv6_mreq {
               /* IPv6 multicast address of group */
               struct in6_addr ipv6mr_multiaddr;

               /* local IPv6 address of interface */
               struct in6_addr ipv6mr_interface;
       };
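
   The sketch below shows one way an application might populate such a
   request before joining a group.  It is only an illustration: the
   structure is redeclared under the hypothetical name draft_ipv6_mreq
   because a system's <netinet/in.h> may already declare struct
   ipv6_mreq, possibly with a different ipv6mr_interface member.  The
   helper fill_mreq() is likewise hypothetical, and inet_pton() is
   assumed to return 1 on success as on current POSIX systems.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Local copy of the request layout shown above (hypothetical name). */
struct draft_ipv6_mreq {
    struct in6_addr ipv6mr_multiaddr;  /* multicast address of group */
    struct in6_addr ipv6mr_interface;  /* local address of interface */
};

/*
 * Build a membership request from two textual addresses.
 * Returns 0 on success, -1 if either string fails to parse or the
 * group is not a multicast address.  *mreq is untouched on failure.
 */
int fill_mreq(struct draft_ipv6_mreq *mreq, const char *group,
              const char *ifaddr)
{
    struct draft_ipv6_mreq tmp;

    memset(&tmp, 0, sizeof(tmp));
    if (inet_pton(AF_INET6, group, &tmp.ipv6mr_multiaddr) != 1)
        return -1;
    if (inet_pton(AF_INET6, ifaddr, &tmp.ipv6mr_interface) != 1)
        return -1;
    /* IPv6 multicast addresses begin with the octet 0xFF. */
    if (tmp.ipv6mr_multiaddr.s6_addr[0] != 0xff)
        return -1;
    *mreq = tmp;
    return 0;
}
```

   On a system whose ipv6_mreq matches the definition above, the
   filled-in request would then be passed to setsockopt() at the
   IPPROTO_IPV6 level with IPV6_ADD_MEMBERSHIP to join the group, and
   with IPV6_DROP_MEMBERSHIP to leave it.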

5. Library Functions

   New library functions are needed to perform a variety of operations
   with IPv6 addresses.  Functions are needed to look up IPv6 addresses
   in the Domain Name System (DNS).  Both forward lookup (hostname to
   address translation) and reverse lookup (address to hostname
   translation) need to be supported.  Functions are also needed to
   convert IPv6 addresses between their binary and textual form.

5.1. Hostname to Address Translation

   A new hostname to address translation function is being defined by
   the Institute of Electrical and Electronics Engineers (IEEE) as part
   of the POSIX 1003.1g (Protocol Independent Interfaces) draft
   specification [4].  This function, named getaddrinfo(), has been
   designed to be protocol independent, so it can be used without change
   to look up IPv6 addresses.

   As discussed in the "Transition Mechanisms for IPv6 Hosts and
   Routers" specification [5], systems may provide the ability to
   transparently query for IPv4 address records when the application
   requests an IPv6 lookup.  The getaddrinfo() function can implement
   this by automatically performing a query for IPv4 records if its
   initial query for IPv6 records finds none.  Or it may elect to always
   query for both IPv6 and IPv4 records on all lookups.  (Many DNS
   implementations do not support querying for multiple record types in
   a single request, so the IPv6 and IPv4 lookups cannot be performed
   simultaneously.)  If IPv4 records are found, the addresses can be
   returned to the application as IPv4-mapped IPv6 addresses.  Systems
   that support transparent querying for IPv4 address records should
   provide a system-wide configuration switch to allow the system
   administrator to enable or disable that feature.
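
   The fallback behavior described above can be imitated at the
   application level, which may help make the intended result
   concrete.  The sketch below is an illustration under stated
   assumptions, not a description of how any particular getaddrinfo()
   is implemented: lookup_ipv6() and map_ipv4() are hypothetical
   names, and the getaddrinfo() signature used is the one found on
   current POSIX systems.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Convert an IPv4 address into the IPv4-mapped form ::FFFF:a.b.c.d. */
void map_ipv4(const struct in_addr *v4, struct in6_addr *v6)
{
    memset(v6, 0, sizeof(*v6));
    v6->s6_addr[10] = 0xff;
    v6->s6_addr[11] = 0xff;
    memcpy(&v6->s6_addr[12], v4, 4);
}

/*
 * Look up host and store its first address in *out as an IPv6
 * address.  IPv6 records are tried first; if that lookup fails, IPv4
 * records are tried and the result is returned in IPv4-mapped form.
 * flags is passed through as ai_flags.
 * Returns 0 on success, -1 otherwise.
 */
int lookup_ipv6(const char *host, int flags, struct in6_addr *out)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;
    if (getaddrinfo(host, NULL, &hints, &res) == 0) {
        *out = ((struct sockaddr_in6 *) res->ai_addr)->sin6_addr;
        freeaddrinfo(res);
        return 0;
    }

    hints.ai_family = AF_INET;          /* fall back to IPv4 records */
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;
    map_ipv4(&((struct sockaddr_in *) res->ai_addr)->sin_addr, out);
    freeaddrinfo(res);
    return 0;
}
```

   An application using such a helper sees only 16-octet addresses,
   whether the host has IPv6 records or only IPv4 records.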

5.2. Address to Hostname Translation

   The POSIX 1003.1g specification includes no function to perform a
   reverse DNS lookup (query based on IPv6 address).  Therefore, we have
   defined the following function:

       int getnameinfo(
               const struct sockaddr *sa,
               size_t addrlen,
               char *host,
               size_t hostlen,
               char *serv,
               size_t servlen);

   This function looks up an IP address and port number provided by the
   caller in the DNS and system-specific database, and returns text
   strings for both in buffers provided by the caller.  The first
   argument, sa, points to either a sockaddr_in structure (for IPv4) or
   a sockaddr_in6 structure (for IPv6) which holds the IP address and
   port number.  The addrlen argument gives the length of the
   sockaddr_in or sockaddr_in6 structure.  The function returns the
   hostname associated with the IP address in the buffer pointed to by
   the host argument.  The caller provides the size of this buffer via
   the hostlen argument.  The service name associated with the port
   number is returned in the buffer pointed to by serv, and the servlen
   argument gives the length of this buffer.  The caller may instruct
   the function not to return either string by providing a zero value
   for the hostlen or servlen arguments.  Otherwise, the caller must
   provide buffers large enough to hold the fully qualified domain
   hostname, and the full service name, including the terminating null
   character.  The function indicates successful completion by a zero
   return value; a non-zero return value indicates failure.

   Applications obtain the function prototype declaration for
   getnameinfo() by including the header file <netdb.h>.


5.3.  Address Conversion Functions

   BSD Unix provides two functions, inet_addr() and inet_ntoa(), to
   convert an IPv4 address between binary and text form.  IPv6
   applications need similar functions.  The following two functions
   convert both IPv6 and IPv4 addresses:

       ssize_t inet_pton(
               int af,
               const char *cp,
               void *ap);

   and

       char *inet_ntop(
               int af,
               const void *ap,
               size_t len,
               char *cp);

   The first function converts an address in its standard text
   presentation form into its numeric binary form.  The af argument
   specifies the family of the address.  Currently AF_INET and AF_INET6
   address families are supported.  The cp argument points to the string
   being passed in.  The ap argument points to a buffer into which the
   function stores the numeric address.  The address is returned in
   network byte order.  The inet_pton() function returns the length of
   the address in octets if the conversion succeeds, and -1 otherwise.
   The function does not modify the buffer pointed to by ap if the
   conversion fails.  The calling application must ensure that the
   buffer referred to by ap is large enough to hold the converted
   address.

   If the af argument is AF_INET, the function accepts a string in the
   standard IPv4 dotted decimal form:

       ddd.ddd.ddd.ddd

   where ddd is a one to three digit decimal number between 0 and 255.

   If the af argument is AF_INET6, then the function accepts a string in
   one of the standard IPv6 text forms defined in the addressing
   architecture specification [2].
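
   The return convention described above can be illustrated with a
   small wrapper.  This is a sketch under assumptions, not a reference
   implementation: pton_len() is a hypothetical name, and it is built
   on the inet_pton() of current POSIX systems, which returns 1 on
   success instead of the address length.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Text-to-binary conversion with the return convention described
 * above: the address length in octets on success, -1 on failure.
 * The buffer at ap is left untouched when the conversion fails.
 */
ssize_t pton_len(int af, const char *cp, void *ap)
{
    unsigned char tmp[sizeof(struct in6_addr)];
    size_t len;

    if (af == AF_INET)
        len = sizeof(struct in_addr);       /* 4 octets */
    else if (af == AF_INET6)
        len = sizeof(struct in6_addr);      /* 16 octets */
    else
        return -1;

    /* POSIX inet_pton() returns 1 only when cp is a valid address. */
    if (inet_pton(af, cp, tmp) != 1)
        return -1;
    memcpy(ap, tmp, len);
    return (ssize_t) len;
}
```

   Converting into a scratch buffer first is what guarantees that the
   caller's buffer is not modified on failure.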

   The second function converts a numeric address into a text string
   suitable for presentation.  The af argument specifies the family of
   the address.  This can be AF_INET or AF_INET6.  The ap argument
   points to a buffer holding an IPv4 address if the af argument is
   AF_INET, or an IPv6 address if the af argument is AF_INET6.  The len
   argument specifies the length in octets of the address pointed to by
   ap.  This must be 4 if af is AF_INET, or 16 if af is AF_INET6.  The
   cp argument points to a buffer that the function can use to store
   the text string.  If the cp argument is NULL, the function uses its
   own private static buffer.  If the application specifies a non-NULL
   cp argument, the buffer must be large enough to hold the text
   representation of the address, including the terminating null octet.
   For IPv6 addresses, the buffer must be at least 46 octets.  For IPv4
   addresses, the buffer must be at least 16 octets.  In order to allow
   applications to easily declare buffers of the proper size to store
   IPv4 and IPv6 addresses in string form, implementations should
   provide the following constants, made available to applications that
   include <netinet/in.h>:

       #define INET_ADDRSTRLEN         16
       #define INET6_ADDRSTRLEN        46

   The inet_ntop() function returns a pointer to the buffer containing
   the text string if the conversion succeeds, and NULL otherwise.  The
   function does not modify the storage pointed to by cp if the
   conversion fails.

   Applications obtain the prototype declarations for inet_ntop() and
   inet_pton() by including the header file <arpa/inet.h>.
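
   The argument order and buffer rules described above for the second
   function can be illustrated in the same way.  Again this is only a
   sketch: ntop_len() and NTOP_TEXT_MAX are hypothetical names, and the
   wrapper relies on the inet_ntop() of current POSIX systems, whose
   arguments are (af, src, dst, size) rather than the order shown
   above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#define NTOP_TEXT_MAX 46    /* same value as INET6_ADDRSTRLEN above */

/*
 * Binary-to-text conversion with the argument order described above.
 * len must match the family (4 or 16).  A NULL cp selects a private
 * static buffer, which is overwritten by the next such call.
 * Returns the text buffer on success, NULL on failure.
 */
char *ntop_len(int af, const void *ap, size_t len, char *cp)
{
    static char own[NTOP_TEXT_MAX];
    char tmp[NTOP_TEXT_MAX];

    if (!((af == AF_INET && len == 4) ||
          (af == AF_INET6 && len == 16)))
        return NULL;
    /* Convert into a scratch buffer so cp is untouched on failure. */
    if (inet_ntop(af, ap, tmp, sizeof(tmp)) == NULL)
        return NULL;
    if (cp == NULL)
        cp = own;
    strcpy(cp, tmp);    /* caller's buffer must hold 16 or 46 octets */
    return cp;
}
```

   Because the static buffer is shared, a caller that needs to keep
   the result would supply its own buffer instead of NULL.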

5.4. Embedded IPv4 Addresses

   The IPv4-mapped IPv6 address format is used to represent IPv4
   addresses as IPv6 addresses.  Most applications should be able to
   manipulate IPv6 addresses as opaque 16-octet quantities, without
   needing to know whether they represent IPv4 addresses.  However, a
   few applications may need to determine whether an IPv6 address is an
   IPv4-mapped address or not.  The following function is provided for
   those applications:

       int inet6_isipv4addr (const struct in6_addr *addr);

   The "addr" argument to this function points to a buffer holding an
   IPv6 address in network byte order.  The function returns true (non-
   zero) if that address is an IPv4-mapped address, and returns 0
   otherwise.

   This function could be used by server applications to determine
   whether the peer is an IPv4 node or an IPv6 node.  After accepting a
   TCP connection via accept(), or receiving a UDP packet via
   recvfrom(), the application can apply the inet6_isipv4addr() function
   to the returned address.

   Applications obtain the prototype for this function by including the
   header file <arpa/inet.h>.
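
   As a rough sketch of what such a test involves, the code below
   checks for the IPv4-mapped prefix directly and uses the result to
   classify a peer.  The names is_ipv4_mapped() and peer_ip_version()
   are hypothetical; the check relies on the IPv4-mapped layout from
   the addressing architecture specification [2]: 80 zero bits, 16 one
   bits, then the 32-bit IPv4 address.

```c
#include <netinet/in.h>
#include <string.h>

/*
 * Returns non-zero if addr is an IPv4-mapped IPv6 address, that is,
 * if its first 12 octets are ten zero octets followed by 0xFF 0xFF.
 */
int is_ipv4_mapped(const struct in6_addr *addr)
{
    static const unsigned char prefix[12] =
        { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

    return memcmp(addr->s6_addr, prefix, sizeof(prefix)) == 0;
}

/*
 * Classify the peer address filled in by accept() or recvfrom() on a
 * PF_INET6 socket: returns 4 for an IPv4 peer, 6 for an IPv6 peer.
 */
int peer_ip_version(const struct sockaddr_in6 *peer)
{
    return is_ipv4_mapped(&peer->sin6_addr) ? 4 : 6;
}
```

   A server would typically call peer_ip_version() on the address
   structure returned by accept() or recvfrom().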

6.  Security Considerations

   IPv6 provides a number of new security mechanisms, many of which need
   to be accessible to applications.  A companion document detailing the
   extensions to the socket interfaces to support IPv6 security is being
   written [3].  At some point in the future, that document and this one
   may be merged into a single API specification.

7. Change History

   Changes from the January 1996 Edition

    -  Eliminated source routing and interface identification features
       in order to simplify the spec.  API features to provide this
       functionality can be defined at a later time.

    -  Eliminated definitions of hostname2addr() and addr2hostname().
       Added reference to POSIX getaddrinfo() function to provide
       functionality previously provided by hostname2addr().  Added
       definition of getnameinfo() function to provide functionality of
       addr2hostname().

    -  Changed name of addr2ascii() and ascii2addr() functions to
       inet_ntop() and inet_pton() to be more consistent with BSD
       function naming conventions.

    -  Changed some type definitions to align with POSIX.


   Changes from the November 1995 Edition

    -  Added the symbolic constants IPV6ADDR_ANY_INIT and
       IPV6ADDR_LOOPBACK_INIT for applications to use for
       initializations.

    -  Eliminated restrictions on the value of ipv6addr_any.  Systems
       may now choose any value, including all-zeros.

    -  Added a mechanism for returning time to live with the address in
       the name-to-address translation functions.

    -  Added a mechanism for applications to specify the interface in
       the setsockopt() options to join and leave a multicast group.

   Changes from the July 1995 Edition

    -  Changed u_long and u_short types in structures to u_int32_t and
       u_int16_t for consistency and clarity.

    -  Added implementation-provided constants for IPv4 and IPv6 text
       address buffer length.

    -  Defined a set of constants for subfields of sin6_flowid and for
       priority values.

    -  Defined constants for getting and setting the source route flag.

    -  Defined where ANSI prototypes for hostname2addr(),
       addr2hostname(), addr2ascii(), ascii2addr(), and
       ipv6_isipv4addr() reside.

    -  Clarified the include file requirements.  Say that the structure
       definitions are defined as a result of including the header file
       <netinet/in.h>, not that the structures are necessarily defined
       there.

    -  Removed underscore chars from is_ipv4_addr() function name for
       BSD compatibility.

    -  Added inet6_ prefix to is_ipv4_addr() function name to avoid name
       space conflicts.

    -  Changed setsockopt option naming convention to use IPV6_ prefix
       instead of IP_ so that there is clearly no ambiguity with IPv4
       options.  Also, use level IPPROTO_IPV6 for these options.

    -  Made hostname2addr() and addr2hostname() functions thread-safe.

    -  Added support for sendmsg() and recvmsg() in source routing
       section.

    -  Changed in_addr6 to in6_addr for consistency.

    -  Re-structured document into sub-sections.

    -  Deleted the implementation experience section.  It was too wordy.

    -  Added argument types to multicast socket options.

    -  Added constant for largest source route array buffer.

    -  Added the freehostent() function.

    -  Added receiving interface determination and sending interface
       selection options.

    -  Added definitions of ipv6addr_any and ipv6addr_loopback.

    -  Added text making the lookup of IPv4 addresses by hostname2addr()
       optional.

   Changes from the June 1995 Edition

    -  Added capability for application to select loose or strict source
       routing.

   Changes from the March 1995 Edition

    -  Changed the definition of the ipv6_addr structure to be an array
       of sixteen chars instead of four longs.  This change is necessary
       to support machines which implement the socket interface, but do
       not have a 32-bit addressable word.  Virtually all machines which
       provide the socket interface do support an 8-bit addressable data
       type.

    -  Added a more detailed explanation that the data types defined in
       this document are not intended to be hard and fast
       requirements.  Systems may use other data types if they wish.

    -  Added a note flagging the fact that the sockaddr_in6 structure is
       not the same size as the sockaddr structure.

    -  Changed the sin6_flowlabel field to sin6_flowinfo to accommodate
       the addition of the priority field to the IPv6 header.

   Changes from the October 1994 Edition

    -  Added variant of sockaddr_in6 for 4.4 BSD-based systems (sa_len
       compatibility).

    -  Removed references to SIT transition specification, and added
       reference to addressing architecture document, for definition of
       IPv4-mapped addresses.

    -  Added a solution to the problem of the application not providing
       enough buffer space to hold a received source route.

    -  Moved discussion of IPv4 applications interoperating with IPv6
       nodes to open issues section.

    -  Added length parameter to addr2ascii() function to be consistent
       with addr2hostname().

    -  Changed IP_MULTICAST_TTL to IP_MULTICAST_HOPS to match IPv6
       terminology, and added IP_UNICAST_HOPS option to match
       IP_MULTICAST_HOPS.

    -  Removed specification of numeric values for AF_INET6,
       IP_ADDRFORM, and IP_RCVSRCRT, since they need not be the same on
       different implementations.

    -  Added a definition for the in_addr6 IPv6 address data structure.
       Added this so that applications could use sizeof(struct in_addr6)
       to get the size of an IPv6 address, and so that a structured type
       could be used in the is_ipv4_addr() function.

8. Open Issues

   A few open issues for IPv6 socket interface API specification remain,
   including:

    -  An API should be provided to allocate and free a flow label that
       meets the uniqueness and randomness requirements outlined in the
       IPv6 protocol spec.

    -  Should we add a timeout parameter to the hostname/address
       translation functions?  DNS lookups need to be given some finite
       timeout interval, so it might be nice to let the application
       specify that interval.

    -  Can the IPV6_ADDRFORM option really be implemented?

    -  Can existing IPv4 applications interoperate with IPv6 nodes?
       This issue is discussed in more detail in the following section.

8.1. IPv4 Applications Interoperating with IPv6 Nodes

   This problem primarily has to do with how IPv4 applications
   represent addresses of IPv6 nodes.  What address should be returned
   to the application when an IPv6/UDP packet is received, or an
   IPv6/TCP connection is accepted?  The peer's address could be any
   arbitrary 128-bit IPv6 address.  But the application is only equipped
   to deal with 32-bit IPv4 addresses encoded in sockaddr_in data
   structures.

   We have not discovered any solution that provides complete
   transparent interoperability with IPv6 nodes for applications using
   the original IPv4 API.  However, two techniques that partially solve
   the problem are:

    1) Prohibit communication between IPv4 applications and IPv6 nodes.
       Only UDP packets received from IPv4 nodes would be passed up to
       the application, and only TCP connections received from IPv4
       nodes would be accepted.  UDP packets from IPv6 nodes would be
       dropped, and TCP connections from IPv6 nodes would be refused.

    2) The system could generate a local 32-bit cookie to represent the
       full 128-bit IPv6 address, and pass this value to the
       application.  The system would maintain a mapping from cookie
       value into the 128-bit IPv6 address that it represents.  When the
       application passed a cookie back into the system (for example, in
       a sendto() or connect() call) the system would use the 128-bit
       IPv6 address that the cookie represents.

       The cookie would have to be chosen so as to be an invalid IPv4
       address (e.g. an address on net 127.0.0.0), and the system would
       have to make sure that these cookie values did not escape into
       the Internet as the source or destination addresses of IPv4
       packets.
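
   The second technique could be sketched as a small lookup table.
   Everything in this sketch is an assumption made for illustration:
   the names cookie_table, cookie_for_addr(), and addr_for_cookie(),
   the fixed table size, and the choice of 127.1.0.0 as the first
   cookie value (held here in host byte order).

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_BASE  0x7F010000u   /* 127.1.0.0, host order (assumed) */
#define COOKIE_SLOTS 256           /* table size (assumed) */

struct cookie_table {
    struct in6_addr addr[COOKIE_SLOTS];
    unsigned int used;             /* number of slots in use */
};

/*
 * Return the cookie standing for addr, allocating a new slot if the
 * address has not been seen before.  Returns 0 if the table is full.
 */
uint32_t cookie_for_addr(struct cookie_table *t,
                         const struct in6_addr *addr)
{
    unsigned int i;

    for (i = 0; i < t->used; i++)
        if (memcmp(&t->addr[i], addr, sizeof(*addr)) == 0)
            return COOKIE_BASE + i;
    if (t->used == COOKIE_SLOTS)
        return 0;
    t->addr[t->used] = *addr;
    return COOKIE_BASE + t->used++;
}

/*
 * Translate a cookie handed back by the application into the IPv6
 * address it represents.  Returns 0 on success, -1 if unknown.
 */
int addr_for_cookie(const struct cookie_table *t, uint32_t cookie,
                    struct in6_addr *out)
{
    uint32_t i = cookie - COOKIE_BASE;

    if (cookie < COOKIE_BASE || i >= t->used)
        return -1;
    *out = t->addr[i];
    return 0;
}
```

   The sketch never reclaims slots, so a real system would also need a
   policy for expiring mappings that are no longer in use.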

   Both of these techniques have drawbacks.  This is an area for further
   study.  System implementors may use one of these techniques or
   implement another solution.

Acknowledgments

   Thanks to the many people who made suggestions and provided feedback
   on the numerous revisions of this document, including: Werner
   Almesberger, Ran Atkinson, Fred Baker, Dave Borman, Andrew Cherenson,
   Alex Conta, Alan Cox, Steve Deering, Francis Dupont, Robert Elz, Marc
   Hasson, Tom Herbert, Christian Huitema, Wan-Yen Hsu, Alan Lloyd,
   Charles Lynn, Dan McDonald, Craig Metz, Erik Nordmark, Josh Osborne,
   Craig Partridge, Richard Stevens, Matt Thomas, Dean D. Throop, Glenn
   Trewitt, Paul Vixie, David Waitzman, and Carl Williams. The
   getnameinfo() function was based on the getinfobysockaddr() function
   defined by Keith Sklower.

   Ramesh Govindan made a number of contributions and co-authored an
   earlier version of this paper.

References

   [1] S. Deering, R. Hinden. "Internet Protocol, Version 6 (IPv6)
       Specification".  RFC 1883.  December 1995.

   [2] R. Hinden, S. Deering. "IP Version 6 Addressing Architecture".
       RFC 1884.  December 1995.

   [3] D. McDonald. "IPv6 Security API for BSD Sockets".  Internet
       Draft. January 1995.

   [4] IEEE, "Protocol Independent Interfaces", IEEE Std 1003.1g, DRAFT
       6.3.  November 1995.

   [5] R. Gilligan, E. Nordmark. "Transition Mechanisms for IPv6 Hosts
       and Routers".  RFC 1933.  April 1996.

Authors' Addresses

           Jim Bound
           Digital Equipment Corporation
           110 Spitbrook Road ZK3-3/U14
           Nashua, NH 03062-2698
           Phone: +1 603 881 0400
           Email: bound@zk3.dec.com


           Susan Thomson
           Bell Communications Research
           MRE 2P-343, 445 South Street
           Morristown, NJ 07960
           Telephone: +1 201 829 4514
           Email: set@thumper.bellcore.com


           Robert E. Gilligan
           Mailstop MPK 17-202
           Sun Microsystems, Inc.
           2550 Garcia Avenue
           Mountain View, CA 94043-1100
           Phone: +1 415 786 5151
           Email: gilligan@eng.sun.com