DNSOP Working Group                               Paul Vixie, ISC (Ed.)
   INTERNET-DRAFT                                         Akira Kato, WIDE
   <draft-ietf-dnsop-respsize-00.txt>                           June, 2003
   
                           DNS Response Size Issues
   
   Status of this Memo
   
      This document is an Internet-Draft and is in full conformance with
      all provisions of Section 10 of RFC2026.
   
      Internet-Drafts are working documents of the Internet Engineering
      Task Force (IETF), its areas, and its working groups.  Note that
      other groups may also distribute working documents as Internet-
      Drafts.
   
      Internet-Drafts are draft documents valid for a maximum of six months
      and may be updated, replaced, or obsoleted by other documents at any
      time.  It is inappropriate to use Internet-Drafts as reference
      material or to cite them other than as "work in progress."
   
      The list of current Internet-Drafts can be accessed at
      http://www.ietf.org/ietf/1id-abstracts.txt
   
      The list of Internet-Draft Shadow Directories can be accessed at
      http://www.ietf.org/shadow.html.
   
   Copyright Notice
   
      Copyright (C) The Internet Society (2003).  All Rights Reserved.
   
   Abstract
   
      With a mandated default minimum maximum message size of 512 octets,
      the DNS protocol presents some special problems for zones wishing to
      expose a moderate or high number of authority servers (NS RRs).  This
      document explains the operational issues caused by, or related to
      this response size limit.
   
   1 - Introduction and Overview
   
   1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512
   octets.  Even though this limitation was due to the required minimum UDP
   reassembly limit for IPv4, it is a hard DNS protocol limit and is not
   implicitly relaxed by changes in transport, for example to IPv6.
   
   
   
   Expires December 2003                                           [Page 1]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   1.2. The EDNS0 standard (see [RFC2671 2.3, 4.5]) permits larger
   responses by mutual agreement of the requestor and responder.  However,
   deployment of EDNS0 cannot be expected to reach every Internet resolver
   in the short or medium term.  The 512 octet message size limit remains
   in practical effect at this time.
   
   1.3. Since DNS responses include a copy of the request, the space
   available for response data is somewhat less than the full 512 octets.
   For negative or positive responses, there is rarely a space constraint.
   For positive and delegation responses, though, every octet must be
   carefully and sparingly allocated.  This document specifically addresses
   delegation response sizes.
   
   2 - Delegation Details
   
   2.1. A delegation response will include the following elements:
   
      Header Section: fixed length (12 octets)
      Question Section: original query (name, class, type)
      Answer Section: (empty)
      Authority Section: NS RRset (nameserver names)
      Additional Section: A and AAAA RRsets (nameserver addresses)
   
   2.2. If the total response size would exceed 512 octets, and if the data
   that would not fit was in the question, answer, or authority section,
   then the TC bit will be set (indicating truncation) which may cause the
   requestor to retry using TCP, depending on what information was present
   and what was omitted.  If a retry using TCP is needed, the total cost of
   the transaction is much higher.
   
   2.3. RRsets are never sent partially, so if truncation occurs, entire
   RRsets are omitted.  Note that the authority section consists of a
   single RRset.  It is absolutely essential that truncation not occur in
   the authority section.
   
   2.4. DNS label compression allows a domain name to be instantiated only
   once per DNS message, and then referenced with a two-octet "pointer"
   from other locations in that same DNS message.  If all nameserver names
   in a message are similar (for example, all ending in ".ROOT-
   SERVERS.NET"), then more space will be available for uncompressable data
   (such as nameserver addresses).
   
   2.5. The query name can be as long as 255 characters of presentation
   data, which can be up to 256 octets of network data.  In this worst case
   scenario, the question section will be 260 octets in size, which would
   
   
   
   Expires December 2003                                           [Page 2]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   leave only 240 octets for the authority and additional sections (after
   deducting 12 octets for the fixed length header.)
   
   2.6. Average and maximum question section sizes can be predicted by the
   zone owner, since they will know what names actually exist, and can
   measure which ones are queried for most often.  For cost and performance
   reasons, the majority of requests should be satisfied without truncation
   or TCP retry.
   
   2.7. Requestors who deliberately send large queries to force truncation
   are only increasing their own costs, and cannot effectively attack the
   resources of an authority server since the requestor would have to retry
   using TCP to complete the attack.  An attack that always used TCP would
   have a lower cost.
   
   2.8. The minimum useful glue is two address records.  (With only one
   address, the probability that it would refer to an unreachable server is
   too high.)  Truncation which occurs after two address records have been
   added to the additional data section is therefore less operationally
   significant than truncation which occurs earlier.
   
   2.9. The best case is no truncation.  (This is because many requestors
   will retry using TCP by reflex, without considering whether the omitted
   data was actually necessary.)
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   Expires December 2003                                           [Page 3]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   3 - Analysis
   
   3.1. An instrumented protocol trace of a best case delegation response
   follows.  Note that 13 servers are named, and 13 addresses are given.
   This query was artificially designed to exactly reach the 512 octet
   limit.
   
      ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 13
      ;; QUERY SECTION:
      ;;  [23456789.123456789.123456789.\
           123456789.123456789.123456789.com A IN]        ;; @80
   
      ;; AUTHORITY SECTION:
      com.                 86400 NS  E.GTLD-SERVERS.NET.  ;; @112
      com.                 86400 NS  F.GTLD-SERVERS.NET.  ;; @128
      com.                 86400 NS  G.GTLD-SERVERS.NET.  ;; @144
      com.                 86400 NS  H.GTLD-SERVERS.NET.  ;; @160
      com.                 86400 NS  I.GTLD-SERVERS.NET.  ;; @176
      com.                 86400 NS  J.GTLD-SERVERS.NET.  ;; @192
      com.                 86400 NS  K.GTLD-SERVERS.NET.  ;; @208
      com.                 86400 NS  L.GTLD-SERVERS.NET.  ;; @224
      com.                 86400 NS  M.GTLD-SERVERS.NET.  ;; @240
      com.                 86400 NS  A.GTLD-SERVERS.NET.  ;; @256
      com.                 86400 NS  B.GTLD-SERVERS.NET.  ;; @272
      com.                 86400 NS  C.GTLD-SERVERS.NET.  ;; @288
      com.                 86400 NS  D.GTLD-SERVERS.NET.  ;; @304
   
      ;; ADDITIONAL SECTION:
      A.GTLD-SERVERS.NET.  86400 A   192.5.6.30           ;; @320
      B.GTLD-SERVERS.NET.  86400 A   192.33.14.30         ;; @336
      C.GTLD-SERVERS.NET.  86400 A   192.26.92.30         ;; @352
      D.GTLD-SERVERS.NET.  86400 A   192.31.80.30         ;; @368
      E.GTLD-SERVERS.NET.  86400 A   192.12.94.30         ;; @384
      F.GTLD-SERVERS.NET.  86400 A   192.35.51.30         ;; @400
      G.GTLD-SERVERS.NET.  86400 A   192.42.93.30         ;; @416
      H.GTLD-SERVERS.NET.  86400 A   192.54.112.30        ;; @432
      I.GTLD-SERVERS.NET.  86400 A   192.43.172.30        ;; @448
      J.GTLD-SERVERS.NET.  86400 A   192.48.79.30         ;; @464
      K.GTLD-SERVERS.NET.  86400 A   192.52.178.30        ;; @480
      L.GTLD-SERVERS.NET.  86400 A   192.41.162.30        ;; @496
      M.GTLD-SERVERS.NET.  86400 A   192.55.83.30         ;; @512
   
      ;; MSG SIZE  sent: 80  rcvd: 512
   
   
   
   
   
   Expires December 2003                                           [Page 4]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   3.2. For longer query names, the number of address records supplied will
   be lower.  Furthermore, it is only by using a common parent name (which
   is GTLD-SERVERS.NET in this example) that all 13 addresses are able to
   fit.  The following output from a response simulator demonstrates these
   properties:
   
      % perl respsize.pl 13 13 0
        common name, average case: msg:303    glue#13 (green)
        common name,   worst case: msg:495    glue# 1 (red)
      uncommon name, average case: msg:457    glue# 3 (orange)
      uncommon name,   worst case: msg:649(*) glue# 0 (red)
      % perl respsize.pl 13 13 2
        common name, average case: msg:303    glue#11 (orange)
        common name,   worst case: msg:495    glue# 1 (red)
      uncommon name, average case: msg:457    glue# 2 (orange)
      uncommon name,   worst case: msg:649(*) glue# 0 (red)
   
   (Note: The response simulator's source code is contained in the
   appendix.)
   
   Here we use the term "green" if all address records could fit, or
   "orange" if two or more could fit, or "red" if fewer than two could fit.
   It's clear that without a common parent for nameserver names, much space
   would be lost.
   
   4 - Further Work
   
   4.1. Traces from one or more root name servers and at least a dozen
   diverse TLD name servers should be analyzed to measure the actual
   minimum, maximum, average, and standard deviation in query name sizes.
   
   4.2. Current delegation response sizes from the root server system for
   all TLDs should be measured in light of the known query name sizes found
   to be in use.
   
   4.3. A policy should be created for the addition of AAAA RR's for
   existing name servers, in both TLD delegations under the root zone, and
   SLD delegations under interested TLDs.
   
   4.4. Participants in the Internationalized Domain Names (IDN) effort
   should take careful note of the performance effects of larger query
   names on root name server system delegation sizes.
   
   
   
   
   
   
   Expires December 2003                                           [Page 5]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   5 - Source Code
   
   #!/usr/bin/perl -w
   
   $asize = 2+2+2+4+2+4;
   $aaaasize = 2+2+2+4+2+16;
   ($nns, $na, $naaaa) = @ARGV;
   test("common", "average", common_name_average($nns), $na, $naaaa);
   test("common", "worst", common_name_worst($nns), $na, $naaaa);
   test("uncommon", "average", uncommon_name_average($nns), $na, $naaaa);
   test("uncommon", "worst", uncommon_name_worst($nns), $na, $naaaa);
   exit 0;
   
   sub test { my ($namekind, $casekind, $msg, $na, $naaaa) = @_;
        my $nglue = numglue($msg, $na, $naaaa);
        printf "%8s name, %7s case: msg:%3d%s glue#%2d (%s)\n",
             $namekind, $casekind,
             $msg, ($msg > 512) ? "(*)" : "   ",
             $nglue, ($nglue == $na + $naaaa) ? "green"
                  : ($nglue >= 2) ? "orange"
                       : "red";
   }
   
   sub pnum { my ($num, $tot) = @_;
        return sprintf "%3d%s",
   }
   
   sub numglue { my ($msg, $na, $naaaa) = @_;
        my $space = ($msg > 512) ? 0 : (512 - $msg);
        my $num = 0;
   
        while ($space && ($na || $naaaa )) {
             if ($na) {
                  if ($space >= $asize) {
                       $space -= $asize;
                       $num++;
                  }
                  $na--;
             }
             if ($naaaa) {
                  if ($space >= $aaaasize) {
                       $space -= $aaaasize;
                       $num++;
                  }
                  $naaaa--;
   
   
   
   Expires December 2003                                           [Page 6]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
             }
        }
        return $num;
   }
   
   sub msgsize { my ($qname, $nns, $nsns) = @_;
        return    12 +                            # header
                $qname+2+2 +                    # query
                0 +                             # answer
                $nns * (4+2+2+4+2+$nsns);       # authority
   }
   
   sub average_case { my ($nns, $nsns) = @_;
        return msgsize(64, $nns, $nsns);
   }
   
   sub worst_case { my ($nns, $nsns) = @_;
        return msgsize(256, $nns, $nsns);
   }
   
   sub common_name_average { my ($nns) = @_;
        return 15 + average_case($nns, 2);
   }
   
   sub common_name_worst { my ($nns) = @_;
        return 15 + worst_case($nns, 2);
   }
   
   sub uncommon_name_average { my ($nns) = @_;
        return average_case($nns, 15);
   }
   
   sub uncommon_name_worst { my ($nns) = @_;
        return worst_case($nns, 15);
   }
   
   
   
   
   
   
   
   
   
   
   
   
   
   Expires December 2003                                           [Page 7]


   INTERNET-DRAFT                  June 2003                       RESPSIZE
   
   
   5 - Author's Address
   
   Paul Vixie
      950 Charter Street
      Redwood City, CA 94063
      +1 650 779 7000
      paul@vix.com
   
   Akira Kato
      University of Tokyo, Information Technology Center
      2-11-16 Yayoi Bunkyo
      Tokyo 113-8658, JAPAN
      +81 3 5841 2750
      kato@wide.ad.jp
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   Expires December 2003                                           [Page 8]