DNSEXT Working Group                                Paul Vixie, ISC
<draft-ietf-dnsext-rfc2671bis-edns0-01.txt>          March 17, 2008

Intended Status: Standards Track
Obsoletes: 2671 (if approved)

              Revised extension mechanisms for DNS (EDNS0)

Status of this Memo
   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

Copyright Notice

   Copyright (C) The IETF Trust (2007).


   The Domain Name System's wire protocol includes a number of fixed
   fields whose range has been or soon will be exhausted and does not
   allow clients to advertise their capabilities to servers.  This
   document describes backward compatible mechanisms for allowing the
   protocol to grow.

Expires September 2008                                          [Page 1]

INTERNET-DRAFT                    EDNS0                       March 2008

1 - Introduction

1.1. DNS (see [RFC1035]) specifies a Message Format and within such
messages there are standard formats for encoding options, errors, and
name compression.  The maximum allowable size of a DNS Message is fixed.
Many of DNS's protocol limits are too small for uses which are or which
are desired to become common.  There is no way for implementations to
advertise their capabilities.

1.2. Unextended agents will not know how to interpret the protocol
extensions detailed here.  In practice, these clients will be upgraded
when they have need of a new feature, and only new features will make
use of the extensions.  Extended agents must be prepared for behaviour
of unextended clients in the face of new protocol elements, and fall
back gracefully to unextended DNS.  RFC 2671 originally has proposed
extensions to the basic DNS protocol to overcome these deficiencies.
This memo refines that specification and obsoletes RFC 2671.

1.3. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
document are to be interpreted as described in RFC 2119 [RFC2119].

2 - Affected Protocol Elements

2.1. The DNS Message Header's (see [RFC1035 4.1.1]) second full 16-bit
word is divided into a 4-bit OPCODE, a 4-bit RCODE, and a number of
1-bit flags.  The original reserved Z bits have been allocated to
various purposes, and most of the RCODE values are now in use.  More
flags and more possible RCODEs are needed.  The OPT pseudo-RR specified
in Section 4 contains subfields that carry a bit field extension of the
RCODE field and additional flag bits, respectively; for details see
Section 4.6 below.

2.2. The first two bits of a wire format domain label are used to denote
the type of the label.  [RFC1035 4.1.4] allocates two of the four
possible types and reserves the other two.  Proposals for use of the
remaining types far outnumber those available.  More label types were
needed, and an extension mechanism was proposed in RFC 2671 [RFC2671
Section 3].  Section 3 of this document reserves DNS labels with a first
octet in the range of 64-127 decimal (label type 01) for future
standardization of Extended DNS Labels.

Expires September 2008                                          [Page 2]

INTERNET-DRAFT                    EDNS0                       March 2008

2.3. DNS Messages are limited to 512 octets in size when sent over UDP.
While the minimum maximum reassembly buffer size still allows a limit of
512 octets of UDP payload, most of the hosts now connected to the
Internet are able to reassemble larger datagrams.  Some mechanism must
be created to allow requestors to advertise larger buffer sizes to
responders.  To this end, the OPT pseudo-RR specified in Section 4
contains a maximum payload size field; for details see Section 4.5

3 - Extended Label Types

The first octet in the on-the-wire representation of a DNS label
specifies the label type; the basic DNS specification [RFC1035]
dedicates the two most significant bits of that octet for this purpose.

This document reserves DNS label type 0b01 for use as an indication for
Extended Label Types.  A specific extended label type is selected by the
6 least significant bits of the first octet.  Thus, Extended Label Types
are indicated by the values 64-127 (0b01xxxxxx) in the first octet of
the label.

Allocations from this range are to be made for IETF documents fully
describing the syntax and semantics as well as the applicability of the
particular Extended Label Type.

This document does not describe any specific Extended Label Type.

4 - OPT pseudo-RR

4.1. One OPT pseudo-RR (RR type 41) MAY be added to the additional data
section of a request, and to responses to such requests.  An OPT is
called a pseudo-RR because it pertains to a particular transport level
message and not to any actual DNS data.  OPT RRs MUST NOT be cached,
forwarded, or stored in or loaded from master files.  The quantity of
OPT pseudo-RRs per message MUST be either zero or one, but not greater.

4.2. An OPT RR has a fixed part and a variable set of options expressed
as {attribute, value} pairs.  The fixed part holds some DNS meta data
and also a small collection of new protocol elements which we expect to
be so popular that it would be a waste of wire space to encode them as
{attribute, value} pairs.

Expires September 2008                                          [Page 3]

INTERNET-DRAFT                    EDNS0                       March 2008

4.3. The fixed part of an OPT RR is structured as follows:

Field Name   Field Type     Description
NAME         domain name    empty (root domain)
TYPE         u_int16_t      OPT (41)
CLASS        u_int16_t      sender's UDP payload size
TTL          u_int32_t      extended RCODE and flags
RDLEN        u_int16_t      describes RDATA
RDATA        octet stream   {attribute,value} pairs

4.4. The variable part of an OPT RR is encoded in its RDATA and is
structured as zero or more of the following:

      :          +0 (MSB)             :              +1 (LSB)         :
   0: |                          OPTION-CODE                          |
   2: |                         OPTION-LENGTH                         |
   4: |                                                               |
      /                          OPTION-DATA                          /
      /                                                               /

OPTION-CODE    (Assigned by IANA.)

OPTION-LENGTH  Size (in octets) of OPTION-DATA.


4.4.1. Order of appearance of option tuples is never relevant.  Any
option whose meaning is affected by other options is so affected no
matter which one comes first in the OPT RDATA.

4.4.2. Any OPTION-CODE values not understood by a responder or requestor
MUST be ignored.  So, specifications of such options might wish to
include some kind of signalled acknowledgement.  For example, an option
specification might say that if a responder sees option XYZ, it SHOULD
include option XYZ in its response.

Expires September 2008                                          [Page 4]

INTERNET-DRAFT                    EDNS0                       March 2008

4.5. The sender's UDP payload size (which OPT stores in the RR CLASS
field) is the number of octets of the largest UDP payload that can be
reassembled and delivered in the sender's network stack.  Note that path
MTU, with or without fragmentation, may be smaller than this.  Values
lower than 512 are undefined, and may be treated as format errors, or
may be treated as equal to 512, at the implementor's discretion.

4.5.1. Note that a 512-octet UDP payload requires a 576-octet IP
reassembly buffer.  Choosing 1280 on an Ethernet connected requestor
would be reasonable.  The consequence of choosing too large a value may
be an ICMP message from an intermediate gateway, or even a silent drop
of the response message.

4.5.2. Both requestors and responders are advised to take account of the
path's discovered MTU (if already known) when considering message sizes.

4.5.3. The requestor's maximum payload size can change over time, and
therefore MUST NOT be cached for use beyond the transaction in which it
is advertised.

4.5.4. The responder's maximum payload size can change over time, but
can be reasonably expected to remain constant between two sequential
transactions; for example, a meaningless QUERY to discover a responder's
maximum UDP payload size, followed immediately by an UPDATE which takes
advantage of this size.  (This is considered preferrable to the outright
use of TCP for oversized requests, if there is any reason to suspect
that the responder implements EDNS, and if a request will not fit in the
default 512 payload size limit.)

4.5.5. Due to transaction overhead, it is unwise to advertise an
architectural limit as a maximum UDP payload size.  Just because your
stack can reassemble 64KB datagrams, don't assume that you want to spend
more than about 4KB of state memory per ongoing transaction.

4.6. The extended RCODE and flags (which OPT stores in the RR TTL field)
are structured as follows:

      :          +0 (MSB)             :              +1 (LSB)         :
   0: |         EXTENDED-RCODE        |            VERSION            |
   2: | DO|                           Z                               |

Expires September 2008                                          [Page 5]

INTERNET-DRAFT                    EDNS0                       March 2008

EXTENDED-RCODE  Forms upper 8 bits of extended 12-bit RCODE.  Note that
                EXTENDED-RCODE value zero (0) indicates that an
                unextended RCODE is in use (values zero (0) through
                fifteen (15)).

VERSION         Indicates the implementation level of whoever sets it.
                Full conformance with this specification is indicated by
                version zero (0).  Requestors are encouraged to set this
                to the lowest implemented level capable of expressing a
                transaction, to minimize the responder and network load
                of discovering the greatest common implementation level
                between requestor and responder.  A requestor's version
                numbering strategy should ideally be a run time
                configuration option.

                If a responder does not implement the VERSION level of
                the request, then it answers with RCODE=BADVERS.  All
                responses MUST be limited in format to the VERSION level
                of the request, but the VERSION of each response MUST be
                the highest implementation level of the responder.  In
                this way a requestor will learn the implementation level
                of a responder as a side effect of every response,
                including error responses, including RCODE=BADVERS.

DO              DNSSEC OK bit [RFC3225].

Z               Set to zero by senders and ignored by receivers, unless
                modified in a subsequent specification [IANAFLAGS].

5 - Transport Considerations

5.1. The presence of an OPT pseudo-RR in a request is an indication that
the requestor fully implements the given version of EDNS, and can
correctly understand any response that conforms to that feature's

5.2. Lack of use of these features in a request is an indication that
the requestor does not implement any part of this specification and that
the responder SHOULD NOT use any protocol extension described here in
its response.

5.3. Responders who do not understand these protocol extensions are
expected to send a response with RCODE NOTIMPL, FORMERR, or SERVFAIL, or
to appear to "time out" due to inappropriate action by a "middle box"
such as a NAT, or to ignore extensions and respond only to unextended

Expires September 2008                                          [Page 6]

INTERNET-DRAFT                    EDNS0                       March 2008

protocol elements.  Therefore use of extensions SHOULD be "probed" such
that a responder who isn't known to support them be allowed a retry with
no extensions if it responds with such an RCODE, or does not respond.
If a responder's capability level is cached by a requestor, a new probe
SHOULD be sent periodically to test for changes to responder capability.

5.4. If EDNS is used in a request, and the response arrives with TC set
and with no EDNS OPT RR, a requestor should assume that truncation
prevented the OPT RR from being appended by the responder, and further,
that EDNS is not used in the response.  Correspondingly, an EDNS
responder who cannot fit all necessary elements (including an OPT RR)
into a response, should respond with a normal (unextended) DNS response,
possibly setting TC if the response will not fit in the unextended
response message's 512-octet size.

6 - Security Considerations

Requestor-side specification of the maximum buffer size may open a new
DNS denial of service attack if responders can be made to send messages
which are too large for intermediate gateways to forward, thus leading
to potential ICMP storms between gateways and responders.

7 - IANA Considerations

IANA has allocated RR type code 41 for OPT.

This document controls the following IANA sub-registries in registry

   "EDNS Extended Label Type"
   "EDNS Option Codes"
   "EDNS Version Numbers"
   "Domain System Response Code"

IANA is advised to re-parent these subregistries to this document.

This document assigns label type 0b01xxxxxx as "EDNS Extended Label
Type."  We request that IANA record this assignment.

This document assigns option code 65535 to "Reserved for future

This document assigns EDNS Extended RCODE "16" to "BADVERS".

Expires September 2008                                          [Page 7]

INTERNET-DRAFT                    EDNS0                       March 2008

IESG approval is required to create new entries in the EDNS Extended
Label Type or EDNS Version Number registries, while any published RFC
(including Informational, Experimental, or BCP) is grounds for
allocation of an EDNS Option Code.

8 - Acknowledgements

Paul Mockapetris, Mark Andrews, Robert Elz, Don Lewis, Bob Halley,
Donald Eastlake, Rob Austein, Matt Crawford, Randy Bush, Thomas Narten,
Alfred Hoenes and Markku Savela were each instrumental in creating and
refining this specification.

9 - References

[RFC1035]    P. Mockapetris, "Domain Names - Implementation and
             Specification," RFC 1035, USC/Information Sciences
             Institute, November 1987.

[RFC2119]    S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels," RFC 2119, Harvard University, March

[RFC2671]    P. Vixie, "Extension mechanisms for DNS (EDNS0)," RFC 2671,
             Internet Software Consortium, August 1999.

[RFC3225]    D. Conrad, "Indicating Resolver Support of DNSSEC," RFC
             3225, Nominum Inc., December 2001.

[IANAFLAGS]  IANA, "DNS Header Flags and EDNS Header Flags," web site
             http://www.iana.org/assignments/dns-header-flags, as of
             June 2005 or later.

10 - Author's Address

Paul Vixie
   Internet Systems Consortium
   950 Charter Street
   Redwood City, CA 94063
   +1 650 423 1301
   EMail: vixie@isc.org

Expires September 2008                                          [Page 8]

INTERNET-DRAFT                    EDNS0                       March 2008

Full Copyright Statement

Copyright (C) IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors retain
all their rights.

This document and the information contained herein are provided on an

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in this
document or the extent to which any license under such rights might or
might not be available; nor does it represent that it has made any
independent effort to identify any such rights.  Information on the
procedures with respect to rights in RFC documents can be found in BCP
78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an attempt
made to obtain a general license or permission for the use of such
proprietary rights by implementers or users of this specification can be
obtained from the IETF on-line IPR repository at

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights
that may cover technology that may be required to implement this
standard.  Please address the information to the IETF at


Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).

Expires September 2008                                          [Page 9]