Network Working Group                                           J. Myers
Internet Draft                                           Carnegie Mellon
Document: draft-myers-mail-largesite-00.txt                December 1996


             DNS MX Record Deployment for Large Mail Sites

Status of this Memo

   This document is an Internet Draft.  Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months.  Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   ``working draft'' or ``work in progress''.

   To learn the current status of any Internet-Draft, please check the
   1id-abstracts.txt listing contained in the Internet-Drafts Shadow
   Directories on ds.internic.net, nic.nordu.net, ftp.isi.edu, or
   munnari.oz.au.

   A revised version of this draft document will be submitted to the RFC
   editor as a Best Current Practice for the Internet Community.
   Discussion and suggestions for improvement are requested.  This
   document will expire six months from date of issue.  Distribution of
   this draft is unlimited.

Abstract

   Domains which receive a large number of incoming mail messages need
   to distribute incoming requests over a large number of SMTP servers.
   The most obvious method for advertising DNS records for these SMTP
   servers leads to undesirable behavior when certain network failures
   occur.  For example, on XXXX, 1996 [XXXREF] a single routing
   failure at a large site led to the catastrophic failure of a large
   portion of mail systems on the Internet.  This document gives
   recommendations for how large sites should advertise DNS records for
   their SMTP servers to avoid these problems and for how SMTP client
   implementations should insulate themselves against large sites which
   do not follow these recommendations.







J. Myers                                                        [Page i]


Internet DRAFT            MX Record Deployment          16 December 1996


1.   The wrong way for large sites to advertise SMTP servers

   The most obvious method for a large site to advertise DNS records for
   its SMTP servers is to advertise some number of equal-weighted MX
   records for the domain, where each host name in an MX record has
   multiple A records.  Each of the multiple A records contains the IP
   address of one of the SMTP servers.

   For example, a site might have the following MX records:

   bigsite.com.        30491   MX      10 a.mx.bigsite.com.
   bigsite.com.        30491   MX      10 b.mx.bigsite.com.
   bigsite.com.        30491   MX      10 c.mx.bigsite.com.
   bigsite.com.        30491   MX      10 d.mx.bigsite.com.
   bigsite.com.        30491   MX      10 e.mx.bigsite.com.
   bigsite.com.        30491   MX      10 f.mx.bigsite.com.
   bigsite.com.        30491   MX      10 g.mx.bigsite.com.
   bigsite.com.        30491   MX      10 h.mx.bigsite.com.
   bigsite.com.        30491   MX      10 i.mx.bigsite.com.

   Each host name mentioned in one of the above MX records could have
   A records like:

   a.mx.bigsite.com.   5806    A       10.2.94.14
   a.mx.bigsite.com.   5806    A       10.2.94.35
   a.mx.bigsite.com.   5806    A       10.2.94.36
   a.mx.bigsite.com.   5806    A       10.2.94.30
   a.mx.bigsite.com.   5806    A       10.2.94.11

   b.mx.bigsite.com.   5779    A       10.2.94.34
   b.mx.bigsite.com.   5779    A       10.2.94.37
   b.mx.bigsite.com.   5779    A       10.2.94.47
   b.mx.bigsite.com.   5779    A       10.2.94.33
   b.mx.bigsite.com.   5779    A       10.2.94.13

   (and so on.)

   Large sites use MX records containing host names which have multiple
   A records, instead of simply using one MX record for each SMTP
   server, because the resulting set of MX records would be too large
   for a DNS server to return in a single UDP packet.

   This method of advertising DNS records for SMTP servers leads to
   extremely undesirable behavior of SMTP clients in the face of certain
   network problems.  The fail-over semantics of multiple A records for
   a host name are the exact opposite of what is desired for mail
   delivery.  If an SMTP client gets a "connection refused" error or 4xx
   greeting when attempting to contact the host b.mx.bigsite.com at
   address 10.2.94.34, it knows that the "multihomed host"
   b.mx.bigsite.com is not available and may fail over to the next MX
   record.  In this case, the IP addresses for the other A records are
   not attempted.

   If, on the other hand, the SMTP client receives no response to
   connection attempts to 10.2.94.34, it will time out and then attempt
   to connect to 10.2.94.37 and so on.  If there is a routing failure
   which causes all packets either into or out of the big site's network
   to be dropped, the SMTP client will in this example have to time out
   a total of 45 times.  At two minutes per timeout, this means the SMTP
   client must wait a total of an hour and a half before it can give up
   the delivery attempt.
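
   The arithmetic behind the 90 minute figure can be sketched as
   follows.  The function name is hypothetical; the inputs (nine MX
   records, five A records each, two minutes per timeout) are the ones
   used in the example above.

```python
# Sketch: time one delivery attempt takes when every connection
# attempt silently times out.  worst_case_minutes is a hypothetical
# name used only for this illustration.

def worst_case_minutes(mx_records, a_records_per_mx, timeout_minutes):
    """Return total minutes spent if every address times out."""
    attempts = mx_records * a_records_per_mx
    return attempts * timeout_minutes


# 9 MX records, 5 A records each, 2 minutes per timeout.
print(worst_case_minutes(9, 5, 2))
```

   With these inputs the client makes 45 attempts and spends 90 minutes
   on a single delivery attempt.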

2.   Recommendations for large sites

   In order to reduce the impact of network failures on SMTP clients,
   large sites should advertise their SMTP servers as follows:

   There SHOULD be from three to six MX records for each receiving mail
   domain.  There SHOULD be at most two different weights used amongst
   the MX records for a domain, in order to allow SMTP clients to
   choose randomly among equal-weighted records per [HOST-REQ]
   5.3.4(1).  If all of the SMTP servers are identically configured, it
   is preferred that all of the MX records have equal weights.  A site
   MUST NOT have more than ten MX records for a given domain.
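
   As an illustration of why equal weights spread load, the following
   is a minimal sketch of how a client might order MX records: lowest
   weight first, with records of equal weight shuffled.  The function
   name and the (weight, host) tuple layout are assumptions made for
   this example, not taken from any particular SMTP implementation.

```python
import random
from itertools import groupby


def order_mx(records, rng=random):
    """Order (weight, host) pairs by weight, lowest first, shuffling
    hosts that share a weight.  rng is any object with shuffle()."""
    ordered = []
    for _, group in groupby(sorted(records), key=lambda r: r[0]):
        tier = list(group)
        rng.shuffle(tier)      # random choice among equal weights
        ordered.extend(tier)
    return ordered


records = [
    (10, "a.mx.bigsite.com."),
    (10, "b.mx.bigsite.com."),
    (10, "c.mx.bigsite.com."),
]
print(order_mx(records))
```

   With many distinct weights there is little left to shuffle, which is
   why at most two weights are suggested.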

   Each host name in an MX record SHOULD have exactly one A record.  If
   one or more MX records have multiple A records, each A record after
   the first counts as an additional MX record with respect to the
   limits stated in the previous paragraph.  (Put another way, the
   limits in the previous paragraph apply to the total number of A
   records which are pointed to by the set of MX records for a domain.)

   A site MUST NOT have more than five A records for a host name in an
   MX record.  Older resolver libraries do not deal correctly with six
   or more A records per host name.
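
   The limits in this section can be checked mechanically.  The sketch
   below is one possible reading of them: it counts every A record
   reachable through the MX set against the three-to-six and ten record
   limits, and checks the per-host limits separately.  The helper name
   and its input format are hypothetical.

```python
# Sketch only: check_mx_layout is a hypothetical helper, not part of
# any DNS library.  Limits are those stated in section 2.

def check_mx_layout(mx_to_addresses):
    """Map of MX host name -> list of A record addresses.
    Returns a list of problems; an empty list means no limit is hit."""
    problems = []
    total = sum(len(a) for a in mx_to_addresses.values())
    if total > 10:
        problems.append("MUST NOT: %d addresses in total (limit 10)"
                        % total)
    elif not 3 <= total <= 6:
        problems.append("SHOULD: %d addresses in total (3 to 6)"
                        % total)
    for host, addrs in sorted(mx_to_addresses.items()):
        if len(addrs) > 5:
            problems.append("MUST NOT: %s has %d A records (limit 5)"
                            % (host, len(addrs)))
        elif len(addrs) != 1:
            problems.append("SHOULD: %s has %d A records (want 1)"
                            % (host, len(addrs)))
    return problems


# The layout from section 1: nine MX hosts with five addresses each.
# The addresses are placeholders, not the ones listed in the text.
section1 = {
    "%s.mx.bigsite.com." % h: ["10.2.94.%d" % n for n in range(1, 6)]
    for h in "abcdefghi"
}
for line in check_mx_layout(section1):
    print(line)
```

   For the section 1 layout this reports the 45 address total as a MUST
   NOT violation and flags each of the nine host names for having more
   than one A record.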

   There are two suggestions for load-balancing amongst a pool of SMTP
   servers larger than the number of MX records.  Neither alternative is
   preferred.

2.1. First alternative

   Each host name pointed to by an MX record is in a zone served by a
   load-balancing name server.  For each appropriate query, this load-
   balancing name server returns a single A record containing an address
   selected from a pool of SMTP servers.  To comply with the
   requirements of the DNS protocol, each time the name server is
   queried for the SOA record for the zone, the record it returns MUST
   contain a different serial number.

   The TTL for the returned A records SHOULD be non-zero.  Returning
   zero TTLs is likely to cause the load-balancing name server to be
   inundated with queries.  For sites large enough to need more than
   six SMTP servers, it is unlikely to be a problem when a name server
   for some other domain caches one of these A records.  There will be
   enough different sending sites using different caching name servers
   to keep the load from getting too uneven.  Even when a large sending
   site caches a set of A records, the multiple equal-weighted MX
   records will cause some distribution of load across the set of SMTP
   servers cached on the SMTP client's name server.
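
   The behavior described in this alternative might be modelled as
   below.  This is a sketch under stated assumptions, not a description
   of lbnamed or of any real name server: the class name, the round-
   robin selection policy, the 300 second TTL, and the starting serial
   number are all chosen for illustration.  A real server would likely
   select by measured load and would have to handle serial number
   wraparound.

```python
import itertools


class LoadBalancedZone:
    """Hands out one pool address per A query and a different SOA
    serial number on every SOA query.  Hypothetical model only."""

    def __init__(self, pool, ttl=300, first_serial=1):
        if ttl <= 0:
            raise ValueError("TTL SHOULD be non-zero")
        self._pool = itertools.cycle(pool)   # assumed: round-robin
        self._ttl = ttl
        self._serial = itertools.count(first_serial)

    def answer_a(self, name):
        """Return a single (name, ttl, type, address) answer."""
        return (name, self._ttl, "A", next(self._pool))

    def soa_serial(self):
        """Return a serial number that differs on every query."""
        return next(self._serial)


zone = LoadBalancedZone(["10.2.94.14", "10.2.94.35", "10.2.94.36"])
print(zone.answer_a("a.mx.bigsite.com."))
print(zone.answer_a("a.mx.bigsite.com."))
print(zone.soa_serial(), zone.soa_serial())
```

   Each A query sees exactly one address, so a client that times out on
   it moves on to the next MX record instead of walking a list of
   addresses for the same host name.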

2.2. Second alternative

   Each host name pointed to by an MX record has an A record with the
   address of a proxy host or Network Address Translation router.  Each
   proxy host or NAT router then proxies/routes each SMTP connection to
   an SMTP server selected from the pool.

3.   Recommendations for SMTP client implementations

   The second paragraph of section 5.3.1.1 of [HOST-REQ] states:

               The sender MUST delay retrying a particular destination
               after one attempt has failed.  In general, the retry
               interval SHOULD be at least 30 minutes; however, more
               sophisticated and variable strategies will be beneficial
               when the sender-SMTP can determine the reason for non-
               delivery.

   Implementing this directive is extremely important in order to avoid
   tying up all resources attempting to deliver multiple messages to a
   domain with unresponsive SMTP servers.  Implementing this directive
   is not, however, sufficient to avoid the problem mentioned in section
   1 of this document, since a delivery attempt time on the order of 90
   minutes will completely overshadow a retry interval on the order of
   30 minutes.

   [DNS-MX] permits SMTP clients to have a fixed limit on the number of
   MX records that are tried. [HOST-REQ] 5.3.4 extends this to permit a
   configurable limit on the number of alternate addresses that can be
   tried, further stating a host SHOULD try at least two addresses.

   SMTP client implementations need to limit the amount of time they
   spend attempting to connect to a server for a given mail domain.
   This document gives two suggestions for doing this; an SMTP client
   SHOULD implement one of the following two alternatives.  The second
   alternative is preferred.

3.1. First alternative

   SMTP implementations following this alternative have a configurable
   limit on the maximum number of connection attempts they make across
   all MX records for a domain.  This limit SHOULD be at least two and
   SHOULD have a default value no larger than 6.

   At two minutes per timeout, this translates to a maximum of about 12
   minutes per delivery attempt to a domain.
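
   A minimal sketch of this alternative follows.  The function name and
   the try_connect callable are hypothetical; the address list is
   assumed to be already ordered by MX weight.  The sketch does not
   model skipping the remaining A records of a host after a
   "connection refused" error.

```python
def deliver_with_attempt_limit(addresses, try_connect, max_attempts=6):
    """Try addresses in order, giving up after max_attempts attempts.

    addresses:   list of addresses, already ordered by MX weight.
    try_connect: hypothetical callable, returns True on success.
    Returns the address that accepted the connection, or None.
    """
    if max_attempts < 2:
        raise ValueError("limit SHOULD be at least two")
    for address in addresses[:max_attempts]:
        if try_connect(address):
            return address
    return None


# Simulate the section 1 failure: 45 addresses, none of which answer.
tried = []


def never_answers(address):
    tried.append(address)
    return False


addresses = ["10.2.94.%d" % n for n in range(1, 46)]
print(deliver_with_attempt_limit(addresses, never_answers))
print(len(tried), "attempts,", len(tried) * 2, "minutes at most")
```

   Against the 45 unreachable addresses of section 1, the client stops
   after six attempts instead of trying all of them.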

3.2. Second alternative

   SMTP implementations following this alternative have a configurable
   limit on the total amount of time they spend attempting to make a
   connection to some MX server for a domain.  After this time limit
   expires, the SMTP client will attempt no new connections during that
   delivery attempt.  The default value for this limit SHOULD be no
   more than 10 minutes.
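
   One way to implement such a time limit is sketched below.  The
   function name, the try_connect callable, and the injectable clock
   are assumptions made for this example.  A connection attempt already
   in progress when the limit expires is allowed to finish; only new
   attempts are suppressed.

```python
import time


def deliver_with_time_limit(addresses, try_connect, limit_seconds=600,
                            clock=time.monotonic):
    """Start no new connection once limit_seconds have elapsed.

    try_connect: hypothetical callable, returns True on success.
    clock:       returns seconds; injectable so it can be simulated.
    Returns the address that accepted the connection, or None.
    """
    deadline = clock() + limit_seconds
    for address in addresses:
        if clock() >= deadline:
            break              # limit expired: no new connections
        if try_connect(address):
            return address
    return None


# Simulated clock: every attempt times out after two minutes.
now = [0.0]
tried = []


def times_out(address):
    tried.append(address)
    now[0] += 120
    return False


addresses = ["10.2.94.%d" % n for n in range(1, 46)]
print(deliver_with_time_limit(addresses, times_out,
                              clock=lambda: now[0]))
print(len(tried), "attempts in", now[0] / 60, "minutes")
```

   A wall-clock limit bounds the delivery attempt regardless of how
   many addresses the remote site advertises or how long each timeout
   takes.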

4.   Security Considerations

   Section 3 gives recommendations for SMTP client implementations to
   insulate themselves against catastrophic denial of service due to a
   known problem caused by a remote site having a combination of poor
   DNS configuration and a particular type of network failure.

   Otherwise, there are no known security considerations with this memo.

5.   Author's Address

   John G. Myers
   Carnegie-Mellon University
   5000 Forbes Ave.
   Pittsburgh, PA 15213-3890

   Email: jgm+@cmu.edu

Appendix A: lbnamed

   XXX-todo

Appendix B: References

   [XXXREF] some news article on the AOL failure

   [DNS-MX] Partridge, C., "Mail routing and the domain system", RFC 974

   [HOST-REQ] Braden, R., "Requirements for Internet hosts - application
   and support", RFC 1123

                            Table of Contents



Status of this Memo
Abstract
1.   The wrong way for large sites to advertise SMTP servers
2.   Recommendations for large sites
2.1. First alternative
2.2. Second alternative
3.   Recommendations for SMTP client implementations
3.1. First alternative
3.2. Second alternative
4.   Security Considerations
5.   Author's Address
Appendix A: lbnamed
Appendix B: References