Building Directories from DNS: Experiences from WWWSeeker
RFC 2517

Document Type RFC - Informational (February 1999; No errata)
Was draft-rfced-info-moats (individual)
Last updated 2013-03-02
Network Working Group                                       R. Moats
Request for Comments: 2517                                  R. Huber
Category: Informational                                         AT&T
                                                       February 1999

       Building Directories from DNS: Experiences from WWWSeeker

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

Abstract

   There has been much discussion and several documents written about
   the need for an Internet Directory.  Recently, this discussion has
   focused on ways to discover an organization's domain name without
   relying on use of DNS as a directory service.  This memo discusses
   lessons that were learned during InterNIC Directory and Database
   Services' development and operation of WWWSeeker, an application that
   finds a web site given information about the name and location of an
   organization.  The back end database that drives this application was
   built from information obtained from domain registries via WHOIS and
   other protocols.  We present this information to help future
   implementors avoid some of the blind alleys that we have already
   explored.  This work builds on the Netfind system that was created by
   Mike Schwartz and his team at the University of Colorado at Boulder
   [1].

1. Introduction

   Over time, there have been several RFCs [2, 3, 4] about approaches
   for providing Internet Directories.  Many of the earlier documents
   discussed white pages directories that supply mappings from a
   person's name to their telephone number, email address, etc.

   More recently, there has been discussion of directories that map from
   a company name to a domain name or web site.  Many people are using
   DNS as a directory today to find this type of information about a
   given company.  Typically when DNS is used, users guess the domain
   name of the company they are looking for and then prepend "www.".
   This makes it highly desirable for a company to have an easily
   guessable name.

   There are two major problems here.  As the number of assigned names
   increases, it becomes more difficult to get an easily guessable name.
   Also, the TLD must be guessed as well as the name.  While many users
   just guess ".COM" as the "default" TLD today, there are many two-
   letter country code top-level domains in current use as well as other
   gTLDs (.NET, .ORG, and possibly .EDU) with the prospect of additional
   gTLDs in the future.  As the number of TLDs in general use increases,
   guessing gets more difficult.
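
   The growth of the guess space described above can be illustrated
   with a minimal sketch.  The organization name variants and the
   function name candidate_hosts are hypothetical and chosen only for
   illustration; the TLD list is the set of gTLDs named in the text.

```python
from itertools import product


def candidate_hosts(name_guesses, tlds):
    """Return every "www.<name>.<tld>" host a user might have to try.

    Both arguments are plain lists of strings; the result is the full
    cross product, so its size is len(name_guesses) * len(tlds).
    """
    return [
        f"www.{name}.{tld}".lower()
        for name, tld in product(name_guesses, tlds)
    ]


if __name__ == "__main__":
    # Hypothetical guesses for a single company name.
    names = ["acme", "acmecorp", "acme-inc"]
    gtlds = ["com", "net", "org", "edu"]
    hosts = candidate_hosts(names, gtlds)
    print(len(hosts), "candidate host names, for example", hosts[0])
```

   Each additional TLD in general use multiplies the number of
   candidates by the number of plausible name variants, which is why
   guessing scales poorly.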

   Between July 1996 and our shutdown in March 1998, the InterNIC
   Directory and Database Services project maintained the Netfind search
   engine [1] and the associated database that maps organization
   information to domain names. This database thus acted as the type of
   Internet directory that associates company names with domain names.
   We also built WWWSeeker, a system that used the Netfind database to
   find web sites associated with a given organization.  The experience
   gained from maintaining and growing this database provides valuable
   insight into the issues of providing a directory service.  We present
   it here to allow future implementors to avoid some of the blind
   alleys that we have already explored.

2. Directory Population

2.1 What to do?

   There are two issues in populating a directory: finding all the
   domain names (building the skeleton) and associating those domains
   with entities (adding the meat).  These two issues are discussed
   below.
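
   As a rough sketch of this two-step split, the following assumes a
   simple in-memory directory keyed by domain name.  The names
   DirectoryEntry, build_skeleton, and add_meat, and the record layout
   (domain, organization, locality), are assumptions made for
   illustration and do not describe the actual Netfind database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    domain: str
    organization: Optional[str] = None  # filled in by add_meat
    locality: Optional[str] = None      # filled in by add_meat


def build_skeleton(domains):
    """Step 1: create one empty entry per known domain name."""
    return {d.lower(): DirectoryEntry(domain=d.lower()) for d in domains}


def add_meat(directory, records):
    """Step 2: attach organization data to entries already in the skeleton.

    `records` is an iterable of (domain, organization, locality) tuples.
    Returns the domains that appear in `records` but not in the skeleton.
    """
    unknown = []
    for domain, organization, locality in records:
        entry = directory.get(domain.lower())
        if entry is None:
            unknown.append(domain.lower())
            continue
        entry.organization = organization
        entry.locality = locality
    return unknown
```

   Keeping the two steps separate means the list of domains can be
   refreshed without discarding organization data that has already
   been collected.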

2.2 Building the skeleton

   In "building the skeleton", it is popular to suggest using a variant
   of a "tree walk" to determine the domains that need to be added to
   the directory.  Our experience is that this is neither a reasonable
   nor an efficient proposal for maintaining such a directory.  Except
   for some infrequent and long-standing DNS surveys [5], DNS "tree
   walks" tend to be discouraged by the Internet community, especially
   given that the frequency of DNS changes would require a new tree walk
   monthly (if not more often).  Instead, our experience has shown that
   data on allocated DNS domains can usually be retrieved in bulk
   fashion with FTP, HTTP, or Gopher (we have used each of these for
   particular TLDs).  This has the added advantage of both "building the