INTERNET DRAFT                                             Vivek Kashyap
<draft-ietf-ipoib-dhcp-over-infiniband-06.txt>                       IBM
Expiration Date: September 2004                               March 2004
DHCP over InfiniBand
Status of this memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This memo provides information for the Internet community. This
memo does not specify an Internet standard of any kind.
Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
An InfiniBand network uses a link-layer addressing scheme in which
addresses are 20 octets long. This is larger than the 16 octets
reserved for the hardware address in a DHCP/BOOTP message. This size
mismatch imposes restrictions on the use of the DHCP message fields
when DHCP is used over an IP over InfiniBand (IPoIB) network. This
document describes the use of DHCP message fields when implementing
DHCP over IPoIB.
Kashyap [Page 1]
INTERNET-DRAFT DHCP over InfiniBand March 2004
1. Introduction
The Dynamic Host Configuration Protocol (DHCP) provides a framework
for passing configuration information to hosts on a TCP/IP network
[RFC2131]. DHCP is based on the Bootstrap Protocol (BOOTP) [RFC951],
adding the capability of automatic allocation of reusable network
addresses and additional configuration options [RFC2131,RFC2132].
The DHCP server receives a broadcast request from the DHCP client.
The DHCP server uses the client interface's hardware-address to
unicast a reply back when the client doesn't yet have an IP address
assigned to it. The "chaddr" field in the DHCP message carries the
client's hardware address.
The "chaddr" field is 16 octets in length. The IPoIB link-layer
address is 20 octets in length. Therefore the IPoIB link-layer
address will not fit in the "chaddr" field, making it impossible for
the DHCP server to unicast a reply back to the client.
To ensure interoperability the usage of the fields and the method
for DHCP interaction must be clarified. This document describes the
IPoIB specific usage of some fields of DHCP. See [RFC2131] for the
mechanism of DHCP and the explanations of each field.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. The DHCP over IPoIB mechanism
As noted above, the link-layer address is unavailable to the DHCP
server because it is larger than the "chaddr" field.
Therefore, a DHCP client MUST request that the server send a
broadcast reply by setting the BROADCAST flag when IPoIB ARP is not
possible, i.e., in situations where the client does not know its IP
address.
RFC1542 notes that the use of a broadcast reply is discouraged, but
in the case of IPoIB it is a necessity. There is no option but to
broadcast back to the client, since it is not possible to reply to
the client's unicast address. To desynchronise broadcasts at subnet
startup, RFC2131 suggests that a client wait a random time (1 to 10
seconds) before initiating server discovery. The same delay will
equally spread out the DHCP server broadcast responses generated due
to the use of the BROADCAST bit.
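The startup delay described above can be sketched as follows. This is
only an illustration, not part of the protocol definition: the
function name is hypothetical, and a uniform distribution over the 1
to 10 second range is an assumption, since the text only asks for a
random time in that range.

```python
import random

MIN_DELAY_SECONDS = 1.0
MAX_DELAY_SECONDS = 10.0


def initial_discovery_delay(rng=random):
    """Return a random delay, in seconds, to wait before server discovery.

    Assumption: the delay is drawn uniformly from [1, 10] seconds.
    `rng` may be the `random` module or a `random.Random` instance.
    """
    return rng.uniform(MIN_DELAY_SECONDS, MAX_DELAY_SECONDS)
```

A client would sleep for the returned number of seconds before sending
its first DHCPDISCOVER.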
The client hardware address, "chaddr", is unique in the subnet and
hence can be used to identify the client interface. But in the
absence of a unique chaddr the client-identifier must be used.
The DHCP protocol states that the "client identifier" option may be
used as the unique identifying value for the client. This value
must be unique within the subnet the client is a member of.
The client identifier option includes a type and identifier pair.
The identifier included in the client-identifier option may consist
of a hardware address or any other unique value such as the DNS name
of the client. When a hardware address is used, the type field
should be one of the ARP hardware types listed in [ARPPARAM].
2.1 IPoIB specific usage of DHCP message fields
A DHCP client, when working over an IPoIB interface, MUST observe the
following rules:
"htype" (hardware address type) MUST be 32 [ARPPARAM].
"hlen" (hardware address length) MUST be 0.
"chaddr" (client hardware address) field MUST be zeroed.
"client identifier" option MUST be used in DHCP messages.
According to RFC2132 the "client identifier" option MAY consist of
any data, but IPoIB clients SHOULD use the format discussed below
for the client-identifier option.
Note: This document does not preclude the use of other "client
identifier" types, such as a fully qualified domain
name (FQDN) or the EUI-64 value associated with the
interface.
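The field rules above can be illustrated with a minimal sketch that
packs the fixed-format part of a DHCP message as laid out in
[RFC2131]. The function and constant names are hypothetical; the
"flags" value is left to the caller (see section 2.2), and the
"sname" and "file" fields are simply zeroed, which is an assumption
made to keep the example short.

```python
import struct

BOOTREQUEST = 1
HTYPE_INFINIBAND = 32  # "htype" value required for IPoIB
# op, htype, hlen, hops, xid, secs, flags,
# ciaddr, yiaddr, siaddr, giaddr, chaddr, sname, file
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s"


def build_ipoib_fixed_header(xid, flags, secs=0, ciaddr=bytes(4)):
    """Pack the 236-octet fixed header for an IPoIB client request."""
    return struct.pack(
        BOOTP_FORMAT,
        BOOTREQUEST,
        HTYPE_INFINIBAND,
        0,            # "hlen" MUST be 0
        0,            # hops
        xid,
        secs,
        flags,
        ciaddr,
        bytes(4),     # yiaddr
        bytes(4),     # siaddr
        bytes(4),     # giaddr
        bytes(16),    # "chaddr" MUST be zeroed
        bytes(64),    # sname (assumed unused here)
        bytes(128),   # file (assumed unused here)
    )
```

The magic cookie and the options, including the mandatory "client
identifier" option, would follow this header in a complete message.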
2.1.1 Client-identifier values
Every IPoIB interface is associated with an identifier referred to
as the GID [IPoIB_ARCH]. A GID is formed by appending the port's
EUI-64 identifier to the InfiniBand subnet prefix. An invariant GID
is formed when the port's manufacturer assigned EUI-64 value is used
to form the GID. A port might have additional EUI-64 values assigned
to it by the subnet manager (SM) [IBARCH]. Therefore a port can have
multiple GIDs associated with it. A GID is unique in the InfiniBand
fabric.
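As an illustration of the GID construction described above, the
following sketch concatenates a subnet prefix and an EUI-64. The
function name is hypothetical. The 8-octet prefix length is inferred
from the 16-octet GID and the 8-octet EUI-64, and the sample values in
the usage note are made up for the example.

```python
def make_gid(subnet_prefix: bytes, eui64: bytes) -> bytes:
    """Form a 16-octet GID by appending the EUI-64 to the subnet prefix."""
    if len(subnet_prefix) != 8:
        raise ValueError("subnet prefix must be 8 octets")
    if len(eui64) != 8:
        raise ValueError("EUI-64 must be 8 octets")
    return subnet_prefix + eui64
```

For example, make_gid(bytes.fromhex("fe80000000000000"),
bytes.fromhex("0002c90300001234")) yields a 16-octet value whose first
half is the prefix and whose second half is the port identifier.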
The GID is associated with a particular hardware port. The GID and a
QPN define an IPoIB interface at the port [IPoIB_ENCAP]. Therefore
an implementation could associate multiple IPoIB interfaces on the
same port by utilising a common GID but different QPNs. In such a
case the GID is shared between multiple interfaces, and therefore,
the "client identifier" formed from just the GID is no longer unique
in the IP subnet.
This is not an issue if the interfaces sharing the GID are in
different InfiniBand partitions, and thereby on different IPoIB
links, since the "client identifier" need only be unique within a
subnet. However, if the GID is shared by interfaces within the same
partition the implementation MUST ensure a unique client-identifier.
For example, a unique client-identifier may be formed by including
the QPN associated with the relevant IPoIB interface if the
implementation is designed to keep this association constant across
boots. Some other value unique to the implementation may also be
used for the same purpose.
If there is only one IPoIB interface associated with a particular
GID within a partition, then use of the GID is sufficient.
Since a port may be associated with multiple GIDs, multiple IPoIB
interfaces may exist on the same port, each using a different GID
from among the GIDs associated with the port. In such a case, too,
the GID can form a unique "client identifier".
Therefore, one of the following formats SHOULD be used for the
client-identifier option.
1. The QPN is used to distinguish between interfaces using the
same GID. The identifier is the 20-octet link-layer address.
Code Len Type |<---------------- Client-Identifier -------------->|
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
| 61 | 21 | 32 | 20 octets(link-layer address) |
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
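The first format can be encoded as in the sketch below. It is an
illustration only: the function name is hypothetical, and the caller
is assumed to supply the interface's 20-octet IPoIB link-layer address
obtained by some other means.

```python
OPTION_CLIENT_IDENTIFIER = 61
HTYPE_INFINIBAND = 32


def client_id_from_link_address(link_addr: bytes) -> bytes:
    """Encode option 61 with type 32 and the 20-octet link-layer address."""
    if len(link_addr) != 20:
        raise ValueError("IPoIB link-layer address must be 20 octets")
    body = bytes([HTYPE_INFINIBAND]) + link_addr
    # Len counts the type octet plus the identifier: 1 + 20 = 21.
    return bytes([OPTION_CLIENT_IDENTIFIER, len(body)]) + body
```

The encoded option is 23 octets long: the code, the length, the type
and the 20-octet address.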
2. A unique value, other than the QPN, may be used to distinguish
between interfaces using the same GID. In this case a "type"
of 0 MUST be specified, since the identifier is not a hardware
address of a type listed in [ARPPARAM] [RFC2132].
Code Len Type |<---------------- Client-Identifier -------------->|
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
| 61 | 21 | 00 | Unique-value(4 octets)| 16 octet GID |
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
But if the GID is not shared with another IPoIB interface then there
is no need for another "unique-value". In such a case the GID
suffices by itself, and the 4-octet unique-value field is set to zero.
Code Len Type |<---------------- Client-Identifier -------------->|
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
| 61 | 21 | 00 | 00 (4 octets) | 16 octet GID |
+-----+-----+-----+-----+-----+-----+-----+-------------------....----+
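The two type 0 layouts shown above differ only in the 4-octet
unique-value field, so a single helper can cover both. The following
is a sketch under that reading of the diagrams; the function and
parameter names are hypothetical.

```python
OPTION_CLIENT_IDENTIFIER = 61
TYPE_NOT_HARDWARE_ADDRESS = 0


def client_id_from_gid(gid: bytes, unique_value: bytes = bytes(4)) -> bytes:
    """Encode option 61 with type 0, a 4-octet unique value and the GID.

    Leave `unique_value` at its all-zero default when the GID is not
    shared with another IPoIB interface in the same partition.
    """
    if len(gid) != 16:
        raise ValueError("GID must be 16 octets")
    if len(unique_value) != 4:
        raise ValueError("unique value must be 4 octets")
    body = bytes([TYPE_NOT_HARDWARE_ADDRESS]) + unique_value + gid
    # Len counts type (1) + unique value (4) + GID (16) = 21.
    return bytes([OPTION_CLIENT_IDENTIFIER, len(body)]) + body
```

An implementation that shares a GID between interfaces in one
partition would pass its own stable 4-octet value as the second
argument.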
2.2 Use of the BROADCAST flag
A DHCP client on IPoIB MUST set the BROADCAST flag in DHCPDISCOVER
and DHCPREQUEST messages (and set "ciaddr" to zero) to ensure that
the server (or the relay agent) broadcasts its reply to the client.
Note: As described in [RFC2131], "ciaddr" MUST be filled in
with the client's IP address in the BOUND, RENEWING or
REBINDING state; in these states the BROADCAST flag MUST
NOT be set. The DHCP server then unicasts the DHCPACK
message to the address in "ciaddr". The link address
will be resolved by IPoIB ARP.
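The choice between the two cases can be sketched as a small helper
that returns the "flags" and "ciaddr" values for a given client state.
The state names are those used in [RFC2131]; the function name and the
representation of states as strings are assumptions made for the
example.

```python
BROADCAST_FLAG = 0x8000  # most significant bit of the 16-bit "flags" field
STATES_WITH_ADDRESS = {"BOUND", "RENEWING", "REBINDING"}


def flags_and_ciaddr(state: str, client_ip: bytes = bytes(4)):
    """Return (flags, ciaddr) for an IPoIB client in the given state."""
    if state in STATES_WITH_ADDRESS:
        if len(client_ip) != 4 or client_ip == bytes(4):
            raise ValueError("a configured IPv4 address is required")
        # The server can unicast to "ciaddr"; IPoIB ARP resolves the link address.
        return 0, client_ip
    # No usable IP address yet: ask for a broadcast reply.
    return BROADCAST_FLAG, bytes(4)
```

For instance, a client in the INIT state gets the BROADCAST flag and a
zero "ciaddr", while a client in the RENEWING state gets a zero flags
field and its own address.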
3. Security Considerations
RFC2131 describes the security considerations relevant to DHCP. This
document does not introduce any new issues.
4. Acknowledgement
This document borrows extensively from [RFC2855]. Roy Larsen
pointed out the length discrepancy between the IPoIB link address
and DHCP's chaddr field.
References
[RFC2119] Key words for use in RFCs to Indicate Requirement Levels,
S. Bradner
[RFC2131] Dynamic Host Configuration Protocol, R. Droms
[RFC2132] DHCP Options and BOOTP Vendor Extensions,
S. Alexander, R. Droms
[RFC951] Bootstrap Protocol, B. Croft, J. Gilmore
[RFC1542] Clarifications and Extensions for the Bootstrap Protocol,
W. Wimer
[ARPPARAM] http://www.iana.org/numbers.html
[RFC2855] DHCP for IEEE 1394, K. Fujisawa
[IPoIB_ARCH] draft-ietf-ipoib-architecture-03.txt, V. Kashyap
[IPoIB_ENCAP] draft-ietf-ipoib-ip-over-infiniband-06.txt,
H.K. Jerry Chu, V. Kashyap
[IBARCH] InfiniBand Architecture Specification, www.infinibandta.org
Author's Address
Vivek Kashyap
15350, SW Koll Parkway
Beaverton, OR 97006
USA
Phone: +1 503 578 3422
EMail: vivk@us.ibm.com
Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise
explain it or assist in its implementation may be prepared,
copied, published and distributed, in whole or in part, without
restriction of any kind, provided that the above copyright
notice and this paragraph are included on all such copies and
derivative works. However, this document itself may not be
modified in any way, such as by removing the copyright notice or
references to the Internet Society or other Internet
organizations, except as needed for the purpose of developing
Internet standards in which case the procedures for copyrights
defined in the Internet Standards process must be followed, or
as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not
be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided
on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE
OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.