[Search] [pdf|bibtex] [Tracker] [WG] [Email] [Diff1] [Diff2] [Nits]

Versions: 00 01 02 03 04 05 06 07 08 09 10 11 12                        
          13 14 15                                                      
Internet Engineering Task Force                                 P. Vixie
Internet-Draft                               Internet Systems Consortium
Intended status: Informational                                   A. Kato
Expires: June 20, 2008                      The University of Tokyo/WIDE
                                                       December 18, 2007

                   DNS Referral Response Size Issues

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on June 20, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).


   With a mandated default minimum maximum UDP message size of 512
   octets, the DNS protocol presents some special problems for zones
   wishing to expose a moderate or high number of authority servers (NS
   RRs).  This document explains the operational issues caused by, or
   related to this response size limit, and suggests ways to optimize
   the use of this limited space.  Guidance is offered to DNS server

Vixie & Kato              Expires June 20, 2008                 [Page 1]

Internet-Draft         DNS Referral Response Size          December 2007

   implementors and to DNS zone operators.

1.  Introduction

1.1.  Introduction and Overview

   The DNS standard limits UDP message size to 512 octets (see [RFC1035]
   4.2.1).  Even though this limitation was due to the required minimum
   IP reassembly limit for IPv4, it became a hard DNS protocol limit and
   is not implicitly relaxed by changes in a network layer protocol, for
   example to IPv6.

   The EDNS protocol extension starting with version 0 permits larger
   responses by mutual agreement of the requester and responder (see
   [RFC2671] 2.3, 4.5), and it is recommended to support EDNS.  The 512
   octets UDP message size limit will remain in practical effect until
   virtually all DNS servers and resolvers support EDNS.

   Since DNS responses include a copy of the request, the space
   available for response data is somewhat less than the full 512
   octets.  Negative responses are quite small, but for positive and
   referral responses, every octet must be carefully and sparingly
   allocated.  While the response size of positive responses is also a
   concern[RFC3226], this document specifically addresses referral
   response size.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Delegation Details

2.1.  Relevant Protocol Elements

   A delegation response will include the following elements:

       Header Section: fixed length (12 octets)
       Question Section: original query (name, class, type)
       Answer Section: empty, or a CNAME/DNAME chain
       Authority Section: NS RRset (nameserver names)
       Additional Section: A and AAAA RRsets (nameserver addresses)

   If the total size of the UDP response exceeds 512 octets or
   advertised size in EDNS, and if the data that does not fit was

Vixie & Kato              Expires June 20, 2008                 [Page 2]

Internet-Draft         DNS Referral Response Size          December 2007

   "required", then the TC bit will be set (indicating truncation).
   This will usually cause the requester to retry using TCP, depending
   on what information was desired and what information was omitted.
   For example, truncation in the authority section is of no interest to
   a stub resolver who only plans to consume the answer section.  If a
   retry using TCP is needed, the total cost of the transaction is much
   higher.  See [RFC1123] for details on the requirement that
   UDP be attempted before falling back to TCP.

   RRsets are never sent partially unless the TC bit is set to indicate
   truncation.  When the TC bit is set, the final apparent RRset in the
   final non-empty section must be considered "possibly damaged" (see
   [RFC1035] 6.2, [RFC2181] 9).

   With or without truncation, the glue present in the additional data
   section should be considered "possibly incomplete", and requesters
   should be prepared to re-query for any damaged or missing RRsets.
   Note that truncation of the additional data section might not be
   signaled via the TC bit since additional data is often optional (see
   discussion in [RFC4472] B).

   DNS label compression allows the component labels of a domain name to
   be instantiated exactly once per DNS message, and then referenced
   with a two-octet "pointer" from other locations in that same DNS
   message (see [RFC1035] 4.1.4).  If all nameserver names in a message
   share a common parent (for example, all ending in ".ROOT-
   SERVERS.NET"), then more space will be available for incompressible
   data (such as nameserver addresses).

   The query name can be as long as 255 octets of network data.  In this
   worst case scenario, the question section will be 259 octets in size,
   which would leave only 240 octets for the authority and additional
   sections (after deducting 12 octets for the fixed length header) in a

2.2.  Advice to Zone Owners

   Average and maximum question section sizes can be predicted by the
   zone owner, since they will know what names actually exist, and can
   measure which ones are queried for most often.  Note that if the zone
   contains any wildcards, it is possible for maximum length queries to
   require positive responses, but that it is reasonable to expect
   truncation and TCP retry in that case.  For cost and performance
   reasons, the majority of requests should be satisfied without
   truncation or TCP retry.

   Some queries to non-existing names can be large, but this is not a
   problem because negative responses need not contain any answer,

Vixie & Kato              Expires June 20, 2008                 [Page 3]

Internet-Draft         DNS Referral Response Size          December 2007

   authority or additional records.  See [RFC2308] 2.1 for more
   information about the format of negative responses.

   The minimum useful number of name servers is two, for redundancy (see
   [RFC1034] 4.1).  A zone's name servers should be reachable by all IP
   protocols versions (e.g., IPv4 and IPv6) in common use.  As long as
   the servers are well managed, the server serving IPv6 might be
   different from the server serving IPv4 sharing the same server name.
   It is important to ensure that a zone should have servers reachable
   by all IP protocol in common use (e.g., IPv4 and IPv6).

   The best case is no truncation at all.  This is because many
   requesters will retry using TCP immediately, or will automatically
   requery for RRsets that are possibly truncated, without considering
   whether the omitted data was actually necessary.

   Anycasting[RFC3258] is a useful tool for performance and reliability
   without increasing the size of referral response.

   While it is irrelevant to the response size issue, all zones have to
   be served via IPv4 as well to avoid name space

2.3.  Advice to Server Implementors

   Each NS RR for a zone will add 12 fixed octets (name, type, class,
   ttl, and rdlen) plus 2 to 255 variable octets (for the NSDNAME).
   Each A RR will require 16 octets, and each AAAA RR will require 28

   While DNS distinguishes between necessary and optional resource
   records, this distinction is according to protocol elements necessary
   to signify facts, and takes no official notice of protocol content
   necessary to ensure correct operation.  For example, a nameserver
   name that is in or below the zone cut being described by a delegation
   is "necessary content", since there is no way to reach that zone
   unless the parent zone's delegation includes "glue records"
   describing that name server's addresses.

   Recall that the TC bit is only set when the required RRset can not be
   included in its entirety (see [RFC2181] 9).  Even when some of the
   RRsets to be included in the additional section are not fit in the
   response size, the TC bit isn't set.  These RRsets may be important
   for a referral.  Some DNS implementations try to resolve these
   missing glue records separately which will introduce extra queries
   and extra time to resolve a given name.

   A delegation response should prioritize glue records as follows.

Vixie & Kato              Expires June 20, 2008                 [Page 4]

Internet-Draft         DNS Referral Response Size          December 2007

       All glue RRsets for one name server whose name is in or below the
       zone being delegated, or which has multiple address RRsets
       (currently A and AAAA), or preferably both;
       Alternate between adding all glue RRsets for any name servers
       whose names are in or below the zone being delegated, and all
       glue RRsets for any name servers who have multiple address RRsets
       (currently A and AAAA);
       All other glue RRsets, in any order.

   Whenever there are multiple candidates for a position in this
   priority scheme, one should be chosen on a round-robin or fully
   random basis.  The goal of this priority scheme is to offer
   "necessary" glue first, avoiding silent truncation for this glue if

   If any "necessary content" is silently truncated, then it is
   advisable that the TC bit be set in order to force a TCP retry,
   rather than have the zone be unreachable.  Note that a parent
   server's proper response to a query for in-child glue or below-child
   glue is a referral rather than an answer, and that this referral must
   be able to contain the in-child or below-child glue, and that in
   outlying cases, only EDNS or TCP will be large enough to contain that

3.  Analysis

   An instrumented protocol trace of a best case delegation response is
   shown in Figure 1.  Note that 13 servers are named, and 13 addresses
   are given.  This query was artificially designed to exactly reach the
   512 octets limit.

Vixie & Kato              Expires June 20, 2008                 [Page 5]

Internet-Draft         DNS Referral Response Size          December 2007

      ;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13
      ;;  [23456789.123456789.123456789.\
           123456789.123456789.123456789.com A IN]        ;; @80

      com.                 86400 NS  E.GTLD-SERVERS.NET.  ;; @112
      com.                 86400 NS  F.GTLD-SERVERS.NET.  ;; @128
      com.                 86400 NS  G.GTLD-SERVERS.NET.  ;; @144
      com.                 86400 NS  H.GTLD-SERVERS.NET.  ;; @160
      com.                 86400 NS  I.GTLD-SERVERS.NET.  ;; @176
      com.                 86400 NS  J.GTLD-SERVERS.NET.  ;; @192
      com.                 86400 NS  K.GTLD-SERVERS.NET.  ;; @208
      com.                 86400 NS  L.GTLD-SERVERS.NET.  ;; @224
      com.                 86400 NS  M.GTLD-SERVERS.NET.  ;; @240
      com.                 86400 NS  A.GTLD-SERVERS.NET.  ;; @256
      com.                 86400 NS  B.GTLD-SERVERS.NET.  ;; @272
      com.                 86400 NS  C.GTLD-SERVERS.NET.  ;; @288
      com.                 86400 NS  D.GTLD-SERVERS.NET.  ;; @304

      A.GTLD-SERVERS.NET.  86400 A           ;; @320
      B.GTLD-SERVERS.NET.  86400 A         ;; @336
      C.GTLD-SERVERS.NET.  86400 A         ;; @352
      D.GTLD-SERVERS.NET.  86400 A         ;; @368
      E.GTLD-SERVERS.NET.  86400 A         ;; @384
      F.GTLD-SERVERS.NET.  86400 A         ;; @400
      G.GTLD-SERVERS.NET.  86400 A         ;; @416
      H.GTLD-SERVERS.NET.  86400 A        ;; @432
      I.GTLD-SERVERS.NET.  86400 A        ;; @448
      J.GTLD-SERVERS.NET.  86400 A         ;; @464
      K.GTLD-SERVERS.NET.  86400 A        ;; @480
      L.GTLD-SERVERS.NET.  86400 A        ;; @496
      M.GTLD-SERVERS.NET.  86400 A         ;; @512

      ;; MSG SIZE  sent: 80  rcvd: 512

                                 Figure 1

   For longer query names, the number of address records supplied will
   be lower.  Furthermore, it is only by using a common parent name
   (which is "GTLD-SERVERS.NET" in this example) that all 13 addresses
   are able to fit, due to the use of DNS compression pointers in the
   last 12 occurrences of the parent domain name.  The following outputs
   shown in Figure 2 and Figure 3 from a response simulator in
   Appendix A written in perl[PERL] demonstrate these properties.

Vixie & Kato              Expires June 20, 2008                 [Page 6]

Internet-Draft         DNS Referral Response Size          December 2007

   % perl respsize.pl a.dns.br b.dns.br c.dns.br d.dns.br
   a.dns.br requires 10 bytes
   b.dns.br requires 4 bytes
   c.dns.br requires 4 bytes
   d.dns.br requires 4 bytes
   # of NS: 4
   For maximum size query (255 byte):
       only A is considered:        # of A is 4 (green)
       A and AAAA are considered:   # of A+AAAA is 3 (yellow)
       preferred-glue A is assumed: # of A is 4, # of AAAA is 3 (yellow)
   For average size query (64 byte):
       only A is considered:        # of A is 4 (green)
       A and AAAA are considered:   # of A+AAAA is 4 (green)
       preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green)

                                 Figure 2

   % perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int
   ns-ext.isc.org requires 16 bytes
   ns.psg.com requires 12 bytes
   ns.ripe.net requires 13 bytes
   ns.eu.int requires 11 bytes
   # of NS: 4
   For maximum size query (255 byte):
       only A is considered:        # of A is 4 (green)
       A and AAAA are considered:   # of A+AAAA is 3 (yellow)
       preferred-glue A is assumed: # of A is 4, # of AAAA is 2 (yellow)
   For average size query (64 byte):
       only A is considered:        # of A is 4 (green)
       A and AAAA are considered:   # of A+AAAA is 4 (green)
       preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green)

                                 Figure 3

   Here we use the term "green" if all address records could fit, or
   "yellow" if two or more could fit, or "orange" if only one could fit,
   or "red" if no address record could fit.  It's clear that without a
   common parent for nameserver names, much space would be lost.  For
   these examples we use an average/common name size of 15 octets,
   befitting our assumption of GTLD-SERVERS.NET as our common parent

   We're assuming a medium query name size of 64 since that is the
   typical size seen in trace data at the time of this writing.  If
   Internationalized Domain Name (IDN) or any other technology which
   results in larger query names be deployed significantly in advance of

Vixie & Kato              Expires June 20, 2008                 [Page 7]

Internet-Draft         DNS Referral Response Size          December 2007

   EDNS, then new measurements and new estimates will have to be made.

4.  Conclusions

   The current practice of giving all nameserver names a common parent
   (such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS
   responses and allows for more nameservers to be enumerated than would
   otherwise be possible, since the common parent domain name only
   appears once in a DNS message and is referred to via "compression
   pointers" thereafter.

   If all nameserver names for a zone share a common parent, then it is
   operationally advisable to make all servers for the zone thus served
   also be authoritative for the zone of that common parent.  For
   example, the root name servers (?.ROOT-SERVERS.NET) can answer
   authoritatively for the ROOT-SERVERS.NET zone.  This is to ensure
   that the zone's servers always have the zone's nameservers' glue
   available when delegating, and will be able to respond with answers
   rather than referrals if a requester who wants that glue comes back
   asking for it.  In this case the name server will likely be a
   "stealth server" -- authoritative but unadvertised in the glue zone's
   NS RRset.  See [RFC1996] 2 for more information about stealth

   Thirteen (13) is the effective maximum number of nameserver names
   usable with traditional (non-extended) DNS, assuming a common parent
   domain name, and given that implicit referral response truncation is
   undesirable in the average case.

   More than one address records in a protocol family per a server is
   inadvisable since the necessary glue RRsets (A or AAAA) are
   atomically indivisible, and will be larger than a single resource
   record.  Larger RRsets are more likely to lead to or encounter

   More than one address records across protocol families is less likely
   to lead to or encounter truncation, partly because multiprotocol
   clients, which are required to handle larger RRsets such as AAAA RRs,
   are more likely to speak EDNS which can use a larger UDP response
   size limit, and partly because the resource records (A and AAAA) are
   in different RRsets and are therefore divisible from each other.

   Name server names which are at or below the zone they serve are more
   sensitive to referral response truncation, and glue records for them
   should be considered "more important" than other glue records, in the
   assembly of referral responses.

Vixie & Kato              Expires June 20, 2008                 [Page 8]

Internet-Draft         DNS Referral Response Size          December 2007

   If a zone is served by thirteen (13) name servers having a common
   parent name (such as ?.ROOT-SERVERS.NET) and each such name server
   has a single address record in some protocol family (e.g., an A RR),
   then all thirteen name servers or any subset thereof could have
   address records in a second protocol family by adding a second
   address record (e.g., an AAAA RR) without reducing the reachability
   of the zone thus served.

5.  Security Considerations

   The recommendations contained in this document have no known security

6.  IANA Considerations

   This document does not call for changes or additions to any IANA

7.  Acknowledgement

   The authors thank Peter Koch, Rob Austein, Joe Abley, Mark Andrews,
   Kenji Rikitake, Stephane Bortzmeyerand, Olafur Gudmundsson, and
   Alfred Hines for their valuable comments and suggestions.

   This work was supported by the US National Science Foundation
   (research grant SCI-0427144) and DNS-OARC.

8.  Normative References

   [PERL]     Wall, L., Christiansen, T., and j. Orwant, "Programing
              Perl", July 2000.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application
              and Support", STD 3, RFC 1123, October 1989.

   [RFC1996]  Vixie, P., "A Mechanism for Prompt Notification of Zone
              Changes (DNS NOTIFY)", RFC 1996, August 1996.

Vixie & Kato              Expires June 20, 2008                 [Page 9]

Internet-Draft         DNS Referral Response Size          December 2007

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2181]  Elz, R. and R. Bush, "Clarifications to the DNS
              Specification", RFC 2181, July 1997.

   [RFC2308]  Andrews, M., "Negative Caching of DNS Queries (DNS
              NCACHE)", RFC 2308, March 1998.

   [RFC2671]  Vixie, P., "Extension Mechanisms for DNS (EDNS0)",
              RFC 2671, August 1999.

   [RFC3226]  Gudmundsson, O., "DNSSEC and IPv6 A6 aware server/resolver
              message size requirements", RFC 3226, December 2001.

   [RFC3258]  Hardie, T., "Distributing Authoritative Name Servers via
              Shared Unicast Addresses", RFC 3258, April 2002.

   [RFC3901]  Durand, A. and J. Ihren, "DNS IPv6 Transport Operational
              Guidelines", BCP 91, RFC 3901, September 2004.

   [RFC4472]  Durand, A., Ihren, J., and P. Savola, "Operational
              Considerations and Issues with IPv6 DNS", RFC 4472,
              April 2006.

Appendix A.  The response simulator program

   #    respsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ...
   #        if all queries are assumed to have a same zone suffix,
   #     such as "jp" in JP TLD servers, specify it in -z option
   use strict;
   use Getopt::Std;

   my ($sz_msg) = (512);
   my ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = (12, 2, 16, 28);
   my ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = (2, 2, 4, 2);
   my (%namedb, $name, $nssect, %opts, $optz);
   my $n_ns = 0;

   getopt('z', %opts);
   if (defined($opts{'z'})) {
       server_name_len($opts{'z'}); # just register it

Vixie & Kato              Expires June 20, 2008                [Page 10]

Internet-Draft         DNS Referral Response Size          December 2007


   foreach $name (@ARGV) {
       my $len;
       $len = server_name_len($name);
       print "$name requires $len bytes\n";
       $nssect += $sz_ptr + $sz_type + $sz_class + $sz_ttl
               +  $sz_rdlen + $len;
   print "# of NS: $n_ns\n";
   arsect(255, $nssect, $n_ns, "maximum");
   arsect(64, $nssect, $n_ns, "average");

   sub server_name_len {
       my ($name) = @_;
       my (@labels, $len, $n, $suffix);

       $name =~ tr/A-Z/a-z/;
       @labels = split(/\./, $name);
       $len = length(join('.', @labels)) + 2;
       for ($n = 0; $#labels >= 0; $n++, shift @labels) {
           $suffix = join('.', @labels);
           return length($name) - length($suffix) + $sz_ptr
               if (defined($namedb{$suffix}));
           $namedb{$suffix} = 1;
       return $len;

   sub arsect {
       my ($sz_query, $nssect, $n_ns, $cond) = @_;
       my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect);
       $ansect = $sz_query + $sz_type + $sz_class;
       $space = $sz_msg - $sz_header - $ansect - $nssect;
       $n_a = atmost(int($space / $sz_rr_a), $n_ns);
       $n_a_aaaa = atmost(int($space
                              / ($sz_rr_a + $sz_rr_aaaa)), $n_ns);
       $n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns)
                              / $sz_rr_aaaa), $n_ns);
       printf "For %s size query (%d byte):\n", $cond, $sz_query;
       printf "    only A is considered:        ";
       printf "# of A is %d (%s)\n", $n_a, &judge($n_a, $n_ns);
       printf "    A and AAAA are considered:   ";
       printf "# of A+AAAA is %d (%s)\n",
              $n_a_aaaa, &judge($n_a_aaaa, $n_ns);
       printf "    preferred-glue A is assumed: ";
       printf "# of A is %d, # of AAAA is %d (%s)\n",

Vixie & Kato              Expires June 20, 2008                [Page 11]

Internet-Draft         DNS Referral Response Size          December 2007

           $n_a, $n_p_aaaa, &judge($n_p_aaaa, $n_ns);

   sub judge {
       my ($n, $n_ns) = @_;
       return "green" if ($n >= $n_ns);
       return "yellow" if ($n >= 2);
       return "orange" if ($n == 1);
       return "red";

   sub atmost {
       my ($a, $b) = @_;
       return 0 if ($a < 0);
       return $b if ($a > $b);
       return $a;

Authors' Addresses

   Paul Vixie
   Internet Systems Consortium
   950 Charter Street
   Redwood City, CA  94063

   Phone: +1 650 423 1300
   Email: paul@vix.com

   Akira Kato
   The University of Tokyo/WIDE Project
   Information Technology Center, 2-11-16 Yayoi
   Bunkyo, Tokyo  113-8658

   Phone: +81 3 5841 2750
   Email: kato@wide.ad.jp

Vixie & Kato              Expires June 20, 2008                [Page 12]

Internet-Draft         DNS Referral Response Size          December 2007

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at


   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

Vixie & Kato              Expires June 20, 2008                [Page 13]