INTERNET DRAFT                                          Vivek Kashyap
<draft-ietf-ipoib-dhcp-over-infiniband-05.txt>                    IBM
Expiration Date: August 2003                            February 2003

                        DHCP over InfiniBand

Status of this memo

        This document is an Internet-Draft and is in full conformance
        with all provisions of Section 10 of RFC 2026.

        Internet-Drafts are working documents of the Internet
        Engineering Task Force (IETF), its areas, and its working
        groups. Note that other groups may also distribute working
        documents as Internet-Drafts.

        Internet-Drafts are draft documents valid for a maximum of six
        months and may be updated, replaced, or obsoleted by other
        documents at any time. It is inappropriate to use
        Internet-Drafts as reference material or to cite them other
        than as ``work in progress''.

        The list of current Internet-Drafts can be accessed at

        The list of Internet-Draft Shadow Directories can be accessed

        This memo provides information for the Internet community.
        This memo does not specify an Internet standard of any kind.
        Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.


Abstract

        An InfiniBand network uses a link-layer addressing scheme that
        is 20 octets long. This is larger than the 16 octets reserved
        for the hardware address in a DHCP/BOOTP message. This size
        mismatch imposes restrictions on the use of the DHCP message
        fields when used over an IP over InfiniBand (IPoIB) network.
        This document describes the use of DHCP message fields when
        implementing DHCP over IPoIB.

Kashyap                                                         [Page 1]

INTERNET-DRAFT            DHCP over InfiniBand             February 2003

1. Introduction

        The Dynamic Host Configuration Protocol (DHCP) provides a
        framework for passing configuration information to hosts on a
        TCP/IP network [RFC2131]. DHCP is based on the Bootstrap
        Protocol (BOOTP) [RFC951], adding the capability of automatic
        allocation of reusable network addresses and additional
        configuration options [RFC2131,RFC2132].

        The DHCP server receives a broadcast request from the DHCP
        client. The DHCP server uses the client interface's
        hardware address to unicast a reply back when the client
        doesn't yet have an IP address assigned to it. The 'chaddr'
        field in the DHCP message carries the client's hardware
        address.

        The 'chaddr' field is 16 octets in length. The IPoIB link-layer
        address is 20 octets in length. Therefore the IPoIB link-layer
        address will not fit in the 'chaddr' field, making it
        impossible for the DHCP server to unicast a reply back to the
        client.
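
        The size mismatch can be illustrated with a short Python
        sketch. The two constants mirror the lengths stated above; the
        helper name fits_in_chaddr and the sample addresses are
        illustrative assumptions, not part of any DHCP implementation.

```python
CHADDR_LEN = 16          # octets reserved for 'chaddr' in a DHCP/BOOTP message
IPOIB_LL_ADDR_LEN = 20   # octets in an IPoIB link-layer address


def fits_in_chaddr(hw_addr: bytes) -> bool:
    """Return True if the hardware address fits in the 'chaddr' field."""
    return len(hw_addr) <= CHADDR_LEN


# Placeholder (all-zero) addresses: only their lengths matter here.
ethernet_addr = bytes(6)                 # a 6-octet address fits
ipoib_addr = bytes(IPOIB_LL_ADDR_LEN)    # a 20-octet IPoIB address does not
```

        With these definitions, fits_in_chaddr(ethernet_addr) is true
        while fits_in_chaddr(ipoib_addr) is false, which is why the
        server cannot learn the IPoIB address from 'chaddr'.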

        To ensure interoperability the usage of the fields and the
        method for DHCP interaction must be clarified. This document
        describes the IPoIB specific usage of some fields of DHCP. See
        [RFC2131] for the mechanism of DHCP and the explanations of
        each field.

        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
        NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
        "OPTIONAL" in this document are to be interpreted as described
        in RFC 2119 [RFC2119].

2. The DHCP over IPoIB mechanism

        As noted above, the link-layer address is unavailable to
        the DHCP server because it is larger than the 'chaddr' field
        length. Therefore, a DHCP client MUST request that the server
        send a broadcast reply by setting the BROADCAST flag when
        IPoIB ARP is not possible, i.e. in situations where the client
        does not know its IP address.

        RFC1542 notes that the use of a broadcast reply is
        discouraged, but in the case of IPoIB it is a necessity.
        There is no option but to broadcast back to the client since
        it is not possible to reply to the client's unicast address.
        To desynchronise broadcasts at subnet startup, RFC2131 suggests
        that a client wait a random time (1 to 10 seconds) before
        initiating server discovery. The same delay will likewise
        spread out the DHCP server broadcast responses generated due
        to the use of the BROADCAST bit.
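
        A minimal sketch of the randomised startup delay follows. The
        1 to 10 second bounds come from the text above; the function
        name startup_delay and the choice of a uniform distribution
        are assumptions made for illustration only.

```python
import random


def startup_delay(low: float = 1.0, high: float = 10.0) -> float:
    """Pick a random delay, in seconds, to wait before server discovery."""
    # A uniform distribution is an assumption; any randomisation that
    # spreads clients across the interval serves the same purpose.
    return random.uniform(low, high)
```

        A client would sleep for startup_delay() seconds before
        sending its first DHCPDISCOVER, so that clients booting
        together neither broadcast nor trigger broadcast replies in
        the same instant.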

        The client hardware address, 'chaddr', is unique in the subnet
        and hence can be used to identify the client interface. But in
        the absence of a unique chaddr the client-identifier must be
        used.

        The DHCP protocol states that the 'client identifier' option
        may be used as the unique identifying value for the client.
        This value must be unique within the subnet the client is a
        member of.

        The client identifier option includes a type and identifier
        pair. The identifier included in the client-identifier option
        may consist of a hardware address or any other unique value
        such as the DNS name of the client. When a hardware address is
        used, the type field should be one of the ARP hardware types
        listed in [ARPPARAM].

2.1 IPoIB specific usage of DHCP message fields

        A DHCP client, when working over an IPoIB interface, MUST
        adhere to the following rules:

            'htype' (hardware address type) MUST be 32 [ARPPARAM].

            'hlen' (hardware address length) MUST be 0.

            'chaddr' (client hardware address) field MUST be zeroed.

            'client identifier' option MUST be used in DHCP messages.
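
        The field rules above can be sketched as follows. The code
        packs the 236-octet fixed portion of a DHCP message as laid
        out in [RFC2131]; the function name build_fixed_header and its
        parameters are hypothetical, and the magic cookie and options
        (including the mandatory 'client identifier') are deliberately
        left out.

```python
import struct

BOOTREQUEST = 1          # 'op' value for a client request
HTYPE_INFINIBAND = 32    # 'htype' required for IPoIB


def build_fixed_header(xid: int, flags: int = 0) -> bytes:
    """Pack the fixed DHCP fields for a client on an IPoIB interface."""
    return struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        BOOTREQUEST,         # op
        HTYPE_INFINIBAND,    # htype MUST be 32
        0,                   # hlen MUST be 0
        0,                   # hops
        xid,                 # transaction id
        0,                   # secs
        flags,               # flags
        bytes(4),            # ciaddr (zero: no address yet)
        bytes(4),            # yiaddr
        bytes(4),            # siaddr
        bytes(4),            # giaddr
        bytes(16),           # chaddr MUST be zeroed
        bytes(64),           # sname
        bytes(128),          # file
    )
```

        A receiver can check the same rules by inspecting octet 1
        ('htype'), octet 2 ('hlen') and octets 28 through 43
        ('chaddr') of the packed header.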

        According to RFC2132 the 'client identifier' option MAY
        consist of any data, but IPoIB clients SHOULD use the
        format discussed below for the client-identifier option.

        Note: This document does not preclude the use of other 'client
              identifier' types, such as a fully qualified domain
              name (FQDN) or the EUI-64 value associated with the
              interface.

2.1.1 Client-identifier values

        Every IPoIB interface is associated with an identifier
        referred to as the GID [IPoIB_ARCH]. A GID is formed by
        appending the port's EUI-64 identifier to the InfiniBand
        subnet prefix. An invariant GID is formed when the port's
        manufacturer-assigned EUI-64 value is used to form the GID. A
        port might have additional EUI-64 values assigned to it by the
        subnet manager (SM) [IBARCH]. Therefore a port can have
        multiple GIDs associated with it. A GID is unique in the
        InfiniBand fabric.
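
        GID formation can be sketched as a simple concatenation. The
        sketch assumes an 8-octet subnet prefix, which together with
        the 8-octet EUI-64 yields the 16-octet GID used in the option
        layouts of this document. The function name make_gid and both
        sample values are hypothetical.

```python
def make_gid(subnet_prefix: bytes, eui64: bytes) -> bytes:
    """Form a 16-octet GID by appending the EUI-64 to the subnet prefix."""
    if len(subnet_prefix) != 8:
        raise ValueError("subnet prefix must be 8 octets")
    if len(eui64) != 8:
        raise ValueError("EUI-64 must be 8 octets")
    return subnet_prefix + eui64


# Illustrative values only; neither is taken from a real port.
example_prefix = bytes.fromhex("fe80000000000000")
example_eui64 = bytes.fromhex("0002c90300001234")
example_gid = make_gid(example_prefix, example_eui64)
```

        A port holding several EUI-64 values would call make_gid once
        per value, giving one GID for each.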

        The GID is associated with a particular hardware port. The GID
        and a QPN define an IPoIB interface at the port [IPoIB_ENCAP].
        Therefore an implementation could associate multiple IPoIB
        interfaces with the same port by utilising a common GID but
        different QPNs. In such a case the GID is shared between
        multiple interfaces, and therefore, the 'client identifier'
        formed from just the GID is no longer unique in the IP
        subnet.

        This is not an issue if the interfaces sharing the GID are in
        different InfiniBand partitions, and thereby on different
        IPoIB links, since the 'client identifier' need only be unique
        within a subnet. However, if the GID is shared by interfaces
        within the same partition the implementation MUST ensure a
        unique client-identifier. For example, a unique
        client-identifier may be formed by including the QPN
        associated with the relevant IPoIB interface if the
        implementation is designed to keep this association constant
        across boots. Some other value unique to the implementation
        may also be used for the same purpose.

        If there is only one IPoIB interface associated with a
        particular GID within a partition, then use of the GID is
        sufficient.

        Since a port may be associated with multiple GIDs, multiple
        IPoIB interfaces may exist on the same port, each using a
        different GID from among the GIDs associated with the port. In
        such a case too the GID can form a unique 'client
        identifier'.

        Therefore, one of the following formats SHOULD be used for the
        client-identifier option.

        1. The QPN is used to distinguish between interfaces using the
           same GID.

    Code  Len   Type |<---------------- Client-Identifier -------------->|
   |  61 | 21  |  32 |      20 octets(link-layer address)                |

        2. Some other 'unique value' might be used in place of the QPN
           to distinguish between interfaces using the same GID. In
           this case a 'type' of 0 MUST be specified since the
           identifier is not of a hardware type listed in [ARPPARAM].

    Code  Len   Type |<---------------- Client-Identifier -------------->|
   |  61 | 21  |  00 | Unique-value(4 octets)|   16 octet GID            |

           But if the GID is not shared with another IPoIB interface
           then there is no need for another 'unique-value'. In such a
           case the GID suffices by itself.

    Code  Len   Type |<---------------- Client-Identifier -------------->|
   |  61 | 21  |  00 |      00 (4 octets)    |   16 octet GID            |
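
        The layouts above can be sketched as two small encoders. The
        option code, length and type values are taken from the
        diagrams. The split of the 20-octet link-layer address into
        one reserved octet, a 3-octet QPN and the 16-octet GID is an
        assumption based on [IPoIB_ENCAP], and the function names are
        hypothetical.

```python
OPTION_CLIENT_ID = 61    # DHCP option code for 'client identifier'
HTYPE_INFINIBAND = 32    # ARP hardware type used for IPoIB
GID_LEN = 16


def client_id_with_qpn(qpn: int, gid: bytes) -> bytes:
    """Format 1: type 32 followed by the 20-octet link-layer address."""
    if len(gid) != GID_LEN:
        raise ValueError("GID must be 16 octets")
    if not 0 <= qpn < (1 << 24):
        raise ValueError("QPN must fit in 24 bits")
    # Assumed layout: 1 reserved octet, 3-octet QPN, 16-octet GID.
    link_addr = b"\x00" + qpn.to_bytes(3, "big") + gid
    header = bytes([OPTION_CLIENT_ID, 1 + len(link_addr), HTYPE_INFINIBAND])
    return header + link_addr


def client_id_with_unique_value(gid: bytes, unique: bytes = bytes(4)) -> bytes:
    """Format 2: type 0, a 4-octet unique value, then the GID.

    The default all-zero unique value covers the case where the GID is
    not shared with another IPoIB interface.
    """
    if len(gid) != GID_LEN:
        raise ValueError("GID must be 16 octets")
    if len(unique) != 4:
        raise ValueError("unique value must be 4 octets")
    body = unique + gid
    return bytes([OPTION_CLIENT_ID, 1 + len(body), 0]) + body
```

        Both encoders return 23 octets: the code and length octets
        followed by the 21 octets counted in the 'Len' field.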

2.2 Use of the BROADCAST flag

        A DHCP client on IPoIB MUST set the BROADCAST flag in
        DHCPDISCOVER and DHCPREQUEST messages (and set 'ciaddr' to
        zero) to ensure that the server (or the relay agent)
        broadcasts its reply to the client.

        Note: As described in [RFC2131], 'ciaddr' MUST be filled in
              with the client's IP address during the BOUND, RENEWING
              or REBINDING state. In these states the BROADCAST flag
              MUST NOT be set; the DHCP server unicasts the DHCPACK
              message to the address in 'ciaddr', and the link address
              is resolved by IPoIB ARP.
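
        The flag rule can be sketched as a function of 'ciaddr'. The
        BROADCAST flag is the leftmost bit of the 16-bit 'flags' field
        [RFC2131]; the function name flags_for_ciaddr is hypothetical,
        and treating a zero 'ciaddr' as the sole trigger is a
        simplification of the client state machine.

```python
import ipaddress

BROADCAST_FLAG = 0x8000  # leftmost bit of the 16-bit 'flags' field


def flags_for_ciaddr(ciaddr: str) -> int:
    """Return the 'flags' value an IPoIB client should send."""
    if ipaddress.IPv4Address(ciaddr).is_unspecified:
        # No usable IP address yet: ask for a broadcast reply.
        return BROADCAST_FLAG
    # BOUND, RENEWING or REBINDING: 'ciaddr' is set, reply is unicast.
    return 0
```

        For example, flags_for_ciaddr("0.0.0.0") yields 0x8000, while
        a client renewing the documentation address 192.0.2.10 gets 0.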

3. Security Considerations

        RFC2131 describes the security considerations relevant to
        DHCP. This document does not introduce any new issues.

4. Acknowledgement

        This document borrows extensively from [RFC2855]. Roy Larsen
        pointed out the length discrepancy between the IPoIB link
        address and DHCP's chaddr field.


5. References

        [RFC2119]       Key words for use in RFCs to Indicate
                        Requirement Levels, S. Bradner

        [RFC2131]       Dynamic Host Configuration Protocol, R. Droms

        [RFC2132]       DHCP Options and BOOTP Vendor Extensions,
                        S. Alexander, R. Droms

        [RFC951]        Bootstrap Protocol, B. Croft, J. Gilmore

        [RFC1542]       Clarifications and Extensions for the
                        Bootstrap Protocol, W. Wimer


        [RFC2855]       DHCP for IEEE 1394, K. Fujisawa

        [IPoIB_ARCH]    draft-ietf-ipoib-architecture-01.txt, V. Kashyap

        [IPoIB_ENCAP]   draft-ietf-ipoib-ip-over-infiniband-01.txt,
                        V. Kashyap, H.K. Jerry Chu

        [IBARCH]        InfiniBand Architecture Specification, Volume 1.1

Author's Address

          Vivek Kashyap

          15450, SW Koll Parkway
          OR 97006

          Phone: +1 503 578 3422


Full Copyright Statement

          Copyright (C) The Internet Society (2000).  All Rights Reserved.

        This document and translations of it may be copied and
        furnished to others, and derivative works that comment on or
        otherwise explain it or assist in its implementation may be
        prepared, copied, published and distributed, in whole or in
        part, without restriction of any kind, provided that the above
        copyright notice and this paragraph are included on all such
        copies and derivative works. However, this document itself may
        not be modified in any way, such as by removing the copyright
        notice or references to the Internet Society or other Internet
        organizations, except as needed for the purpose of developing
        Internet standards in which case the procedures for copyrights
        defined in the Internet Standards process must be followed, or
        as required to translate it into languages other than
        English.

        The limited permissions granted above are perpetual and will
        not be revoked by the Internet Society or its successors or
        assigns.

        This document and the information contained herein is provided


Vivek Kashyap
Linux Technology Center, IBM