DNSEXT Working Group                                Paul Vixie, ISC
   INTERNET-DRAFT <draft-ietf-dnsext-rfc2671bis-edns0-00.txt>
   December 27, 2007
                Revised extension mechanisms for DNS (EDNS0)
   Status of this Memo
      By submitting this Internet-Draft, each author represents that any
      applicable patent or other IPR claims of which he or she is aware
      have been or will be disclosed, and any of which he or she becomes
      aware will be disclosed, in accordance with Section 6 of BCP 79.
      Internet-Drafts are working documents of the Internet Engineering
      Task Force (IETF), its areas, and its working groups.  Note that
      other groups may also distribute working documents as Internet-
      Drafts.
      Internet-Drafts are draft documents valid for a maximum of six
      months and may be updated, replaced, or obsoleted by other
      documents at any time.  It is inappropriate to use Internet-Drafts
      as reference material or to cite them other than as "work in
      progress."
      The list of current Internet-Drafts can be accessed at
      The list of Internet-Draft Shadow Directories can be accessed at
   Copyright Notice
      Copyright (C) The IETF Trust (2007).
      The Domain Name System's wire protocol includes a number of fixed
      fields whose range has been or soon will be exhausted and does not
      allow clients to advertise their capabilities to servers.  This
      document describes backward compatible mechanisms for allowing the
      protocol to grow.
   Expires May 27, 2008                                          [Page 1]

   INTERNET-DRAFT                   EDNS0               December 27, 2007
   1 - Introduction
   1.1. DNS (see [RFC1035]) specifies a Message Format and within such
   messages there are standard formats for encoding options, errors, and
   name compression.  The maximum allowable size of a DNS Message is
   fixed.  Many of DNS's protocol limits are too small for uses that are
   common or are expected to become common.  There is no way for
   implementations to advertise their capabilities.
   1.2. Unextended agents will not know how to interpret the protocol
   extensions detailed here.  In practice, these clients will be upgraded
   when they have need of a new feature, and only new features will make
   use of the extensions.  Extended agents must be prepared for behaviour
   of unextended clients in the face of new protocol elements, and fall
   back gracefully to unextended DNS.
   1.3. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].
   2 - Affected Protocol Elements
   2.1. The DNS Message Header's (see [RFC1035 4.1.1]) second full 16-bit
   word is divided into a 4-bit OPCODE, a 4-bit RCODE, and a number of
   1-bit flags.  The original reserved Z bits have been allocated to
   various purposes, and most of the RCODE values are now in use.  More
   flags and more possible RCODEs are needed.
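   As an illustration only, the second 16-bit header word could be
   unpacked as sketched below.  The bit positions follow [RFC1035 4.1.1];
   the function name and dictionary keys are assumptions made for this
   sketch and are not defined by this document.

```python
def split_header_flags(word):
    """Split the second 16-bit DNS header word into its fields."""
    return {
        "qr":     (word >> 15) & 0x1,  # query (0) or response (1)
        "opcode": (word >> 11) & 0xF,  # 4-bit OPCODE
        "aa":     (word >> 10) & 0x1,
        "tc":     (word >> 9) & 0x1,
        "rd":     (word >> 8) & 0x1,
        "ra":     (word >> 7) & 0x1,
        "z":      (word >> 4) & 0x7,   # the three originally reserved bits
        "rcode":  word & 0xF,          # 4 bits, so only 16 values fit
    }
```

   The 4-bit width of "rcode" is the limit that the extended RCODE in
   section 4.6 works around.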
   2.2. The first two bits of a wire format domain label are used to
   denote the type of the label.  [RFC1035 4.1.4] allocates two of the
   four possible types and reserves the other two.  Proposals for use of
   the remaining types far outnumber those available.  More label types
   were needed, and an extension mechanism was proposed in RFC 2671
   [RFC2671 Section 3].
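   A minimal sketch of the two-bit label type test follows.  The function
   name and the returned strings are assumptions chosen for illustration;
   only the bit patterns come from [RFC1035 4.1.4] and [RFC2671 Section
   3].

```python
def label_type(first_octet):
    """Classify a wire-format label by the top two bits of its first octet."""
    top = (first_octet >> 6) & 0x3
    if top == 0b00:
        return "normal"       # low six bits give the label length
    if top == 0b11:
        return "compression"  # compression pointer, RFC 1035 4.1.4
    if top == 0b01:
        return "extended"     # reserved by RFC 2671 Section 3
    return "reserved"         # 0b10, not allocated by either document
```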
   2.3. DNS Messages are limited to 512 octets in size when sent over
   UDP.  While the minimum maximum reassembly buffer size still allows a
   limit of 512 octets of UDP payload, most of the hosts now connected to
   the Internet are able to reassemble larger datagrams.  Some mechanism
   must be created to allow requestors to advertise larger buffer sizes
   to responders.
   3 - Extended Label Types
   [RFC2671 Section 3] reserved label type "0 1" to indicate that an
   extended label type followed in the next octet, but gave inadequate
   guidance as to how EDNS, as a hop-by-hop signalling method, could be
   used to carry a new kind of DNS label.  Extended label types might be
   addressed in a future specification, perhaps requiring that the EDNS
   VERSION be incremented.
   4 - OPT pseudo-RR
   4.1. One OPT pseudo-RR (RR type 41) MAY be added to the additional
   data section of a request, and to responses to such requests.  An OPT
   is called a pseudo-RR because it pertains to a particular transport
   level message and not to any actual DNS data.  OPT RRs MUST NOT be
   cached, forwarded, or stored in or loaded from master files.  A
   message MUST NOT contain more than one OPT pseudo-RR.
   4.2. An OPT RR has a fixed part and a variable set of options
   expressed as {attribute, value} pairs.  The fixed part holds some DNS
   meta data and also a small collection of new protocol elements which
   we expect to be so popular that it would be a waste of wire space to
   encode them as {attribute, value} pairs.
   4.3. The fixed part of an OPT RR is structured as follows:
   Field Name   Field Type     Description
   ------------------------------------------------------
   NAME         domain name    empty (root domain)
   TYPE         u_int16_t      OPT
   CLASS        u_int16_t      sender's UDP payload size
   TTL          u_int32_t      extended RCODE and flags
   RDLEN        u_int16_t      length of RDATA, in octets
   RDATA        octet stream   {attribute,value} pairs
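   The fixed part described in 4.3 might be serialized as in the
   following sketch.  It assumes network byte order for all integer
   fields, as elsewhere in DNS; the names OPT_TYPE and build_opt_rr are
   hypothetical and exist only for this example.

```python
import struct

OPT_TYPE = 41  # RR type code allocated for OPT


def build_opt_rr(udp_payload_size, ttl_field=0, rdata=b""):
    """Encode an OPT pseudo-RR: NAME, TYPE, CLASS, TTL, RDLEN, RDATA."""
    name = b"\x00"  # empty (root) domain name
    fixed = struct.pack("!HHIH", OPT_TYPE, udp_payload_size,
                        ttl_field, len(rdata))
    return name + fixed + rdata
```

   With no options, the result is 11 octets long: one for the root name
   and ten for TYPE, CLASS, TTL, and RDLEN.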
   4.4. The variable part of an OPT RR is encoded in its RDATA and is
   structured as zero or more of the following:
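                    +0 (MSB)                            +1 (LSB)
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      0: |                          OPTION-CODE                          |
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      2: |                         OPTION-LENGTH                         |
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      4: |                                                               |
         /                          OPTION-DATA                          /
         /                                                               /
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+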
   OPTION-CODE    (Assigned by IANA.)
   OPTION-LENGTH  Size (in octets) of OPTION-DATA.
   OPTION-DATA    Varies per OPTION-CODE.
   4.4.1. Order of appearance of option tuples is never relevant.  Any
   option whose meaning is affected by other options is so affected no
   matter which one comes first in the OPT RDATA.
   4.4.2. Any OPTION-CODE values not understood by a responder or
   requestor MUST be ignored.  Specifications of such options might
   therefore wish to include some kind of signalled acknowledgement.  For
   example, an option specification might say that if a responder sees
   option XYZ, it SHOULD include option XYZ in its response.
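   The option encoding of 4.4 and the ignore rule of 4.4.2 could be
   sketched as follows.  The function names are hypothetical, and any
   option codes passed in are arbitrary placeholders, not IANA
   assignments.  How to treat a truncated option is not specified in this
   document; raising an error here is an assumption of the sketch.

```python
import struct


def encode_options(options):
    """Encode (code, data) tuples as OPTION-CODE, OPTION-LENGTH, OPTION-DATA."""
    return b"".join(struct.pack("!HH", code, len(data)) + data
                    for code, data in options)


def decode_options(rdata, understood):
    """Return {code: data} for understood codes; skip all other codes."""
    found, offset = {}, 0
    while offset < len(rdata):
        if offset + 4 > len(rdata):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        if offset + length > len(rdata):
            raise ValueError("truncated option data")
        if code in understood:
            found[code] = rdata[offset:offset + length]
        offset += length  # unknown options are stepped over, not rejected
    return found
```

   Because the result is a dictionary keyed by option code, it does not
   depend on the order in which options appear, in keeping with 4.4.1.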
   4.5. The sender's UDP payload size (which OPT stores in the RR CLASS
   field) is the number of octets of the largest UDP payload that can be
   reassembled and delivered in the sender's network stack.  Note that
   path MTU, with or without fragmentation, may be smaller than this.
   4.5.1. Note that a 512-octet UDP payload requires a 576-octet IP
   reassembly buffer.  Choosing 1280 on an Ethernet connected requestor
   would be reasonable.  The consequence of choosing too large a value
   may be an ICMP message from an intermediate gateway, or even a silent
   drop of the response message.
   4.5.2. Both requestors and responders are advised to take account of
   the path's discovered MTU (if already known) when considering message
   sizes.
   4.5.3. The requestor's maximum payload size can change over time, and
   therefore MUST NOT be cached for use beyond the transaction in which
   it is advertised.
   4.5.4. The responder's maximum payload size can change over time, but
   can be reasonably expected to remain constant between two sequential
   transactions; for example, a meaningless QUERY to discover a
   responder's maximum UDP payload size, followed immediately by an
   UPDATE which takes advantage of this size.  (This is considered
   preferable to the outright use of TCP for oversized requests, if
   there is any reason to suspect that the responder implements EDNS, and
   if a request will not fit in the default 512-octet payload size limit.)
   4.5.5. Due to transaction overhead, it is unwise to advertise an
   architectural limit as a maximum UDP payload size.  Just because your
   stack can reassemble 64KB datagrams, don't assume that you want to
   spend more than about 4KB of state memory per ongoing transaction.
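   One way a sender might pick the value to advertise is sketched below.
   The constant names, the 4096-octet cap (taken from the "about 4KB"
   remark in 4.5.5), and the choice never to advertise less than 512
   octets are assumptions of this sketch.  The path_mtu_payload argument
   is hypothetical: it stands for the largest UDP payload the caller
   believes fits the discovered path MTU, if one is known.

```python
MIN_UDP_PAYLOAD = 512  # unextended DNS limit over UDP
STATE_BUDGET = 4096    # illustrative per-transaction cap, about 4KB


def advertised_payload_size(stack_limit, path_mtu_payload=None):
    """Pick a UDP payload size to place in the OPT CLASS field."""
    size = min(stack_limit, STATE_BUDGET)
    if path_mtu_payload is not None:
        size = min(size, path_mtu_payload)  # honour a known path MTU
    return max(size, MIN_UDP_PAYLOAD)
```

   A stack able to reassemble 64KB datagrams would thus still advertise
   4096 octets here, not its architectural limit.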
   4.6. The extended RCODE and flags (which OPT stores in the RR TTL
   field) are structured as follows:
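                    +0 (MSB)                            +1 (LSB)
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      0: |         EXTENDED-RCODE        |            VERSION            |
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
      2: | DO|                           Z                               |
         +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+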
   EXTENDED-RCODE  Forms upper 8 bits of extended 12-bit RCODE.  Note
                   that EXTENDED-RCODE value "0" indicates that an
                   unextended RCODE is in use (values "0" through "15").
   VERSION         Indicates the implementation level of whoever sets it.
                   Full conformance with this specification is indicated
                   by version "0".  Requestors are encouraged to set
                   this to the lowest implemented level capable of
                   expressing a transaction, to minimize the responder
                   and network load of discovering the greatest common
                   implementation level between requestor and responder.
                   A requestor's version numbering strategy should
                   ideally be a run time configuration option.
                   If a responder does not implement the VERSION level of
                   the request, then it answers with RCODE=BADVERS.  All
                   responses MUST be limited in format to the VERSION
                   level of the request, but the VERSION of each response
                   MUST be the highest implementation level of the
                   responder.  In this way a requestor will learn the
                   implementation level of a responder as a side effect
                   of every response, including error responses such as
                   RCODE=BADVERS.
   DO              DNSSEC OK bit [RFC3225].
   Z               Set to zero by senders and ignored by receivers,
                   unless modified in a subsequent specification.
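   The layout in 4.6 can be exercised with the sketch below, which packs
   the 32-bit TTL field and combines the 4-bit header RCODE with the
   8-bit EXTENDED-RCODE into a 12-bit value.  The function names are
   assumptions made for illustration.  BADVERS is 16, as assigned in
   section 7.

```python
BADVERS = 16  # extended RCODE assigned by this document


def pack_ttl(extended_rcode, version=0, do=False):
    """Pack EXTENDED-RCODE, VERSION, DO and an all-zero Z into 32 bits."""
    flags = 0x8000 if do else 0  # DO is the top bit of the second word
    return ((extended_rcode & 0xFF) << 24) | ((version & 0xFF) << 16) | flags


def unpack_ttl(ttl):
    """Return (EXTENDED-RCODE, VERSION, DO); Z bits are ignored."""
    return (ttl >> 24) & 0xFF, (ttl >> 16) & 0xFF, bool(ttl & 0x8000)


def full_rcode(header_rcode, extended_rcode):
    """Combine the header RCODE (low 4 bits) with the upper 8 bits."""
    return (extended_rcode << 4) | (header_rcode & 0xF)


def split_rcode(rcode):
    """Split a 12-bit RCODE into (header RCODE, EXTENDED-RCODE)."""
    return rcode & 0xF, (rcode >> 4) & 0xFF
```

   Under this arithmetic BADVERS splits into a header RCODE of 0 and an
   EXTENDED-RCODE of 1, so a receiver that ignores the OPT RR cannot
   see it.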
   5 - Transport Considerations
   5.1. The presence of an OPT pseudo-RR in a request is an indication
   that the requestor fully implements the given version of EDNS, and can
   correctly understand any response that conforms to that feature's
   specification.
   5.2. Lack of use of these features in a request is an indication that
   the requestor does not implement any part of this specification and
   that the responder SHOULD NOT use any protocol extension described
   here in its response.
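   The responder-side rules in 4.6, 5.1, and 5.2 might be combined as in
   the following sketch.  The dictionary keys, the constant names, and
   the local 4096-octet limit are assumptions of the example.  Treating
   an advertised size below 512 as 512 is likewise an assumption and is
   not stated in this document.

```python
BADVERS = 16
RESPONDER_VERSION = 0     # highest EDNS version this sketch implements
RESPONDER_MAX_UDP = 4096  # illustrative local limit


def plan_response(request_opt):
    """request_opt is None, or a (udp_payload_size, version) tuple."""
    if request_opt is None:
        # No OPT in the request: answer with unextended DNS only.
        return {"use_opt": False, "max_udp": 512, "rcode": 0}
    payload_size, version = request_opt
    plan = {
        "use_opt": True,
        "version": RESPONDER_VERSION,  # always report our highest level
        "max_udp": min(max(payload_size, 512), RESPONDER_MAX_UDP),
        "rcode": 0,
    }
    if version > RESPONDER_VERSION:
        plan["rcode"] = BADVERS  # request level not implemented
    return plan
```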
   5.3. Responders who do not understand these protocol extensions are
   expected to send a response with RCODE NOTIMPL, FORMERR, or SERVFAIL,
   or to appear to "time out" due to inappropriate action by a "middle
   box" such as a NAT.  Therefore use of extensions SHOULD be "probed"
   such that a responder who isn't known to support them be allowed a
   retry with no extensions if it responds with such an RCODE, or does
   not respond.  If a responder's capability level is cached by a
   requestor, a new probe SHOULD be sent periodically to test for changes
   to responder capability.
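   A requestor-side probe with fallback, along the lines of 5.3, could
   look like the sketch below.  Everything named here is hypothetical:
   the send callable stands in for whatever transport the implementation
   uses and is assumed to return an RCODE, or None on timeout, and the
   one-hour re-probe interval is an arbitrary choice, since this document
   does not define "periodically".

```python
import time

FALLBACK_RCODES = {1, 2, 4}  # FORMERR, SERVFAIL, NOTIMPL


class EdnsProbeCache:
    """Remembers which responders failed an EDNS probe, and when."""

    def __init__(self, reprobe_after=3600.0, clock=time.monotonic):
        self.reprobe_after = reprobe_after
        self.clock = clock
        self.no_edns = {}  # server -> time the failed probe was recorded

    def should_try_edns(self, server):
        failed_at = self.no_edns.get(server)
        if failed_at is None:
            return True
        if self.clock() - failed_at >= self.reprobe_after:
            del self.no_edns[server]  # stale entry: probe again
            return True
        return False


def query_with_fallback(server, send, cache):
    """Return (rcode, used_edns); retry without EDNS if the probe fails."""
    if cache.should_try_edns(server):
        rcode = send(server, True)
        if rcode is not None and rcode not in FALLBACK_RCODES:
            return rcode, True
        cache.no_edns[server] = cache.clock()
    return send(server, False), False
```

   The injected clock is a design choice that keeps the cache testable;
   a production requestor would more likely key the cache on address and
   port and bound its size.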
   6 - Security Considerations
   Requestor-side specification of the maximum buffer size may open a new
   DNS denial of service attack if responders can be made to send
   messages which are too large for intermediate gateways to forward,
   thus leading to potential ICMP storms between gateways and responders.
   7 - IANA Considerations
   IANA has allocated RR type code 41 for OPT.
   This document controls the following IANA sub-registries in registry
      "EDNS Extended Label Type"
      "EDNS Option Codes"
      "EDNS Version Numbers"
      "Domain System Response Code"
   This document assigns label type 0b01xxxxxx as "EDNS Extended Label
   Type."  We request that IANA record this assignment.
   This document assigns option code 65535 to "Reserved for future
   expansion."
   This document assigns EDNS Extended RCODE "16" to "BADVERS".
   IESG approval is required to create new entries in the EDNS Extended
   Label Type or EDNS Version Number registries, while any published RFC
   (including Informational, Experimental, or BCP) is grounds for
   allocation of an EDNS Option Code.
   8 - Acknowledgements
   Paul Mockapetris, Mark Andrews, Robert Elz, Don Lewis, Bob Halley,
   Donald Eastlake, Rob Austein, Matt Crawford, Randy Bush, and Thomas
   Narten were each instrumental in creating and refining this
   specification.
   9 - References
   [RFC1035]    P. Mockapetris, ``Domain Names - Implementation and
                Specification,'' RFC 1035, USC/Information Sciences
                Institute, November 1987.
   [RFC2119]    S. Bradner, ``Key words for use in RFCs to Indicate
                Requirement Levels,'' RFC 2119, Harvard University, March
                1997.
   [RFC2671]    P. Vixie, ``Extension mechanisms for DNS (EDNS0),'' RFC
                2671, Internet Software Consortium, August 1999.
   [RFC3225]    D. Conrad, ``Indicating Resolver Support of DNSSEC,'' RFC
                3225, Nominum Inc., December 2001.
   [IANAFLAGS]  IANA, ``DNS Header Flags and EDNS Header Flags,'' web
                site http://www.iana.org/assignments/dns-header-flags, as
                of June 2005 or later.
   10 - Author's Address
   Paul Vixie
      Internet Systems Consortium
      950 Charter Street
      Redwood City, CA 94063
      +1 650 423 1301
   Full Copyright Statement
   Copyright (C) IETF Trust (2007).
   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
   This document and the information contained herein are provided on an
   Intellectual Property
   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be found
   in BCP 78 and BCP 79.
   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this specification
   can be obtained from the IETF on-line IPR repository at
   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).