ASID Working Group                                       Martin Hamilton
INTERNET-DRAFT                                   Loughborough University
                                                               July 1996


                       WHOIS++ URL Specification
               Filename: draft-ietf-asid-whois-url-00.txt


Status of This Memo

      This document is an Internet-Draft.  Internet-Drafts are working
      documents of the Internet Engineering Task Force (IETF), its
      areas, and its working groups.  Note that other groups may also
      distribute working documents as Internet-Drafts.

      Internet-Drafts are draft documents valid for a maximum of six
      months and may be updated, replaced, or obsoleted by other
      documents at any time.  It is inappropriate to use Internet-Drafts
      as reference material or to cite them other than as ``work in
      progress.''

      To learn the current status of any Internet-Draft, please check
      the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
      Shadow Directories on ds.internic.net (US East Coast),
      nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
      munnari.oz.au (Pacific Rim).

      Distribution of this memo is unlimited.  Editorial comments should
      be sent directly to the author.  Technical discussion will take
      place on the IETF ASID mailing list - ietf-asid@umich.edu.

      This Internet Draft expires January 25th, 1997.

Abstract

   This document defines a new Uniform Resource Locator (URL) scheme
   "whois++", which provides a convention within the URL framework for
   referring to WHOIS++ servers and the data held within them.  It does
   not specify a standard.

1. Overview of the WHOIS++ protocol

   RFC 1835 [1] defines a simple Internet directory protocol known as
   WHOIS++.  In order that WHOIS++ may be used within the Uniform
   Resource Locator (URL) framework defined by RFC 1738 [2], a URL
   scheme definition for WHOIS++ is necessary.  This document specifies
   a URL scheme "whois++", for use with the WHOIS++ protocol.



   WHOIS++ is a text-based protocol after the fashion of many popular
   Internet application protocols, such as SMTP [3] and FTP [4].
   Although the protocol is TCP based, WHOIS++ is effectively
   stateless: no state information is preserved across requests, there
   is no concept of a session per se since each request/response pair
   is self-contained, and there is no "login" phase.

   WHOIS++ transactions normally consist of a single request from the
   client and response from the server, followed by the TCP connection
   between the two being torn down.  Use of the "hold" constraint in the
   WHOIS++ request makes it possible for the client to indicate that it
   would like to keep the TCP connection open for more than one request/
   response pair, but whether this is actually done is at the discretion
   of the server.
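
   The single request/response transaction described above might be
   sketched as follows.  This is only an illustration: the function
   names, the 10 second timeout and the use of UTF-8 on the wire are
   assumptions of the sketch, not part of the protocol description.

```python
import socket

WHOISPP_PORT = 63  # default WHOIS++ port


def frame_request(request: str) -> bytes:
    """Terminate a one-line request with CR LF.

    UTF-8 is an assumption of this sketch.
    """
    return request.encode("utf-8") + b"\r\n"


def exchange(sock: socket.socket, request: str) -> str:
    """Send one request, then read until the server closes the connection."""
    sock.sendall(frame_request(request))
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # empty read: the peer has torn the connection down
            break
        chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def whoispp_query(host: str, request: str, port: int = WHOISPP_PORT,
                  timeout: float = 10.0) -> str:
    """Open a TCP connection, perform a single transaction, and close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, request)
```

   Because "exchange" reads until the connection is closed, it suits
   only the simple case; a client which sends the "hold" constraint
   would need to detect the end of each response in some other way.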

2. WHOIS++ URL specification

   The following information is necessary for a WHOIS++ client to
   formulate and deliver a request:

     o the domain name or IP address of the server to contact
     o the port number of the server (63 by default)
     o the request itself - normally a single line of text

   This is a good match with the generic Uniform Resource Locator (URL)
   scheme specified in RFC 1738.  So, a URL of the following form would
   seem to be appropriate:

     whois++://host[:port][/<request-specification>]

   Using the BNF grammar defined in RFC 1738, this could be written as:

     whoisppurl   = "whois++://" hostport [ "/" whoisppsrch ]

   where

     whoisppsrch  = *uchar

   The definitions for hostport and uchar are imported from RFC 1738:

     hostport     = host [ ":" port ]
     uchar        = unreserved | escape

   These in turn depend upon the following:

     unreserved   = alpha | digit | safe | extra
     alpha        = lowalpha | hialpha
     digit        = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                    "8" | "9"
     safe         = "$" | "-" | "_" | "." | "+"
     extra        = "!" | "*" | "'" | "(" | ")" | ","
     lowalpha     = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                    "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                    "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                    "y" | "z"
     hialpha      = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                    "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                    "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"

     escape       = "%" hex hex
     hex          = digit | "A" | "B" | "C" | "D" | "E" | "F" |
                    "a" | "b" | "c" | "d" | "e" | "f"

   BNF for the WHOIS++ request format is defined in Appendix F of RFC
   1835.  A request can contain characters which may confuse software
   that deals with whois++ URLs, notably spaces and non-ASCII
   characters such as those encoded using the UTF-8 variant of Unicode
   [5,6].  Hence, the usual rules about hex-escaping illegal and
   reserved characters should apply, which is why the WHOIS++ request
   is defined above as a sequence of "uchar".  Note that the default
   WHOIS++ port of 63 should be used if the port number component of
   the "hostport" construction is left out.
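
   As an illustration, the grammar above can be turned into a small
   parser.  The sketch below is not a reference implementation: the
   names "parse_whoispp_url" and "WhoisppUrl" are invented for the
   example, and the host is matched with a simplified pattern rather
   than the full RFC 1738 production.  Read literally, "*uchar" does
   not admit "=", ":" or ";" unless they are hex-escaped, so the
   sketch offers a "lenient" flag (an assumption of the example) which
   also accepts those three characters unescaped.

```python
import re
from typing import NamedTuple, Optional

WHOISPP_PORT = 63  # used when the URL carries no port

_UNRESERVED = r"A-Za-z0-9$\-_.+!*'(),"  # alpha | digit | safe | extra
_LENIENT_EXTRA = "=:;"                  # assumption: tolerated when lenient


def _compile(extra: str = ""):
    # uchar = unreserved | escape
    uchar = r"(?:[" + _UNRESERVED + extra + r"]|%[0-9A-Fa-f]{2})"
    return re.compile(
        r"whois\+\+://(?P<host>[A-Za-z0-9.\-]+)"  # simplified host pattern
        r"(?::(?P<port>[0-9]+))?"
        r"(?:/(?P<search>" + uchar + r"*))?"
    )


_STRICT = _compile()
_LENIENT = _compile(_LENIENT_EXTRA)


class WhoisppUrl(NamedTuple):
    host: str
    port: int
    search: Optional[str]  # still hex-escaped; None if there is no "/"


def parse_whoispp_url(url: str, lenient: bool = False) -> WhoisppUrl:
    """Split a whois++ URL into host, port and request specification."""
    match = (_LENIENT if lenient else _STRICT).fullmatch(url)
    if match is None:
        raise ValueError("not a valid whois++ URL: %r" % (url,))
    port = match.group("port")
    return WhoisppUrl(match.group("host"),
                      int(port) if port else WHOISPP_PORT,
                      match.group("search"))
```

   For example, "parse_whoispp_url" applied to a URL with no port
   component yields port 63, in line with the default described above.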

   Global constraints such as authentication information, language and
   character set preferences may be expressed as part of the WHOIS++
   request.  Consequently it is not thought necessary to specify them
   separately in a mechanism such as the "user@host" construction
   defined for the FTP URL.

   Most WHOIS++ requests can be expected to consist of a single line of
   text, followed by carriage return and line feed characters.  It
   should, however, be noted that it may be necessary to encode multi-
   line requests within WHOIS++ URLs.  Software which implements whois++
   URLs should either be capable of handling this, or fail gracefully.
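
   One way of meeting this requirement is sketched below.  The
   function name "request_lines" and its "allow_multiline" flag are
   assumptions of the example; software which cannot handle multi-line
   requests can leave the flag unset and report the resulting error to
   the user rather than sending a truncated request.

```python
from typing import List
from urllib.parse import unquote


def request_lines(search: str, allow_multiline: bool = False) -> List[str]:
    """Decode the request part of a whois++ URL into its lines.

    CR LF, bare CR and bare LF are all treated as line terminators.
    """
    text = unquote(search)
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    while lines and lines[-1] == "":  # drop empty trailing lines
        lines.pop()
    if len(lines) > 1 and not allow_multiline:
        # Fail gracefully: refuse the URL instead of sending part of it.
        raise ValueError("multi-line WHOIS++ requests are not supported")
    return lines
```

   Here a line break inside the URL would appear as "%0D%0A", which
   "unquote" turns back into carriage return and line feed before the
   request is split.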

3. Examples

   The whois++ URL scheme defined above should make it possible to write
   URLs for any of the following:

     (a) a reference to a particular WHOIS++ server, without implying
           that a search should be done
     (b) a "canned" search of a particular server
     (c) individual objects within a server

   Case (a) simply requires that the host and optionally the port number
   be specified, e.g.



     whois++://acm.org/

   or

     whois++://acm.org:63/

   When given a whois++ URL of this format, implementations may choose
   to present the user with a search form or dialogue, contact the
   server for information about which WHOIS++ options it supports, and
   so on.  The WHOIS++ default port 63 should be used if the port number
   is not specified.

   Case (b) requires a search specification to be present, e.g.

     whois++://acm.org/name=phil%20and%20name=zimmerman

   This may be sent verbatim to the server once the hex-escaped
   characters in the URL have been decoded, e.g.

     name=phil and name=zimmerman
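
   The conversion in both directions might be sketched as follows.
   The function names are invented for the example, and the use of
   UTF-8 for non-ASCII characters is an assumption.  By default the
   encoder escapes everything outside the "unreserved" set; the "keep"
   parameter (also an assumption) lists further ASCII characters to
   leave as they are, such as the "=" which appears unescaped in the
   URL above.

```python
import string
from urllib.parse import unquote

# alpha | digit | safe | extra, as in the grammar in section 2
_UNRESERVED = frozenset(string.ascii_letters + string.digits + "$-_.+!*'(),")


def escape_request(request: str, keep: str = "",
                   encoding: str = "utf-8") -> str:
    """Hex-escape a WHOIS++ request for use in a whois++ URL."""
    allowed = _UNRESERVED | frozenset(keep)
    out = []
    for byte in request.encode(encoding):
        char = chr(byte)
        if byte < 128 and char in allowed:
            out.append(char)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


def unescape_request(search: str, encoding: str = "utf-8") -> str:
    """Convert hex escapes in the URL back into the original request."""
    return unquote(search, encoding=encoding, errors="strict")
```

   With these definitions, unescaping the request part of the URL
   above gives back the request text shown, and escaping that text
   with keep="=" reproduces the URL form.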

   Case (c) is effectively an instance of (b).  This may be implemented
   as a search where the request consists of the WHOIS++ "handle" of the
   requested object, e.g.

     whois++://acm.org/handle=number6
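
   A client might distinguish the three cases along the following
   lines.  This is a sketch only: the function name "classify", the
   labels it returns and the rule used to recognise a handle lookup (a
   request consisting solely of "handle=" and a value) are assumptions
   made for the example.

```python
from urllib.parse import unquote

_PREFIX = "whois++://"


def classify(url: str) -> str:
    """Return "server", "search" or "object" for cases (a), (b), (c)."""
    if not url.startswith(_PREFIX):
        raise ValueError("not a whois++ URL: %r" % (url,))
    _hostport, _slash, search = url[len(_PREFIX):].partition("/")
    request = unquote(search)
    if request == "":
        return "server"   # case (a): no request specification
    if request.lower().startswith("handle=") and " " not in request:
        return "object"   # case (c): lookup of a single handle
    return "search"       # case (b): a "canned" search
```

   Since case (c) is just a search on the handle, a client which has
   no special treatment for individual objects can handle "object"
   exactly as it handles "search".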

4. Global constraints

   Although there are no global constraints specified in these last two
   URLs, the WHOIS++ client may choose to add global constraints of its
   own, e.g.  use of the "hold" constraint to request that the
   connection be held open for a further request.

   If, in addition, global constraints are part of the URL, this can
   easily be recognised by the presence of a colon ":" immediately after
   the slash "/" which separates the host and port information from the
   search specifier, e.g.

     whois++://acm.org/:authenticate=password;name=foo;password=bar

   At the implementor's discretion, the client may choose to include
   these global constraints in any queries which it sends to this
   server, e.g. if this URL were used in a search for "zimmerman", the
   request passed to the server might be

     zimmerman




   or

     zimmerman:authenticate=password;name=foo;password=bar

   or "zimmerman", followed by some combination of the global
   constraints specified in the URL and other global constraints
   introduced by the WHOIS++ client.
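
   The handling of global constraints described in this section might
   be sketched as follows.  The function names are invented for the
   example, and the policy of appending the client's own constraints
   after those taken from the URL is an assumption; the request syntax
   follows the examples above.

```python
from typing import Iterable, List


def url_global_constraints(request: str) -> List[str]:
    """Extract global constraints from a decoded request specification.

    A leading ":" marks a request made up of global constraints only.
    """
    if not request.startswith(":"):
        return []
    return [item for item in request[1:].split(";") if item]


def build_request(term: str,
                  url_constraints: Iterable[str] = (),
                  client_constraints: Iterable[str] = ()) -> str:
    """Combine a search term with constraints from the URL and client."""
    constraints = list(url_constraints)
    for item in client_constraints:
        if item not in constraints:  # avoid sending a constraint twice
            constraints.append(item)
    if not constraints:
        return term
    return term + ":" + ";".join(constraints)
```

   Passing no constraints at all gives the bare "zimmerman" request,
   while passing the constraints from the URL gives the second form
   shown above; a client wishing to keep the connection open could add
   "hold" through "client_constraints".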

5. Compatibility with WHOIS and RWhois

   The three protocols in the WHOIS family, NICNAME/WHOIS [7], WHOIS++,
   and RWhois [8], are not particularly similar.  WHOIS++ and RWhois use
   different request and response formats, and have different well-known
   port numbers.  WHOIS responses are assumed to be plain text and human
   readable.

   Consequently, this document has not attempted to define a single URL
   scheme for use with all three protocols.

6. World-Wide Web integration

   These whois++ URLs may be used as hyperlinks in HTML [9] documents,
   though it should be noted that the relative URL syntax defined in RFC
   1808 [10] is not appropriate for use in these links.  This is because
   WHOIS++ requests do not map conveniently onto the generic resource
   locator syntax used for relative URLs - the syntactic conventions
   used in writing a WHOIS++ request are very different from those of
   the generic resource locator.

   The WHOIS++ protocol and the whois++ URL lend themselves to
   implementation via a proxy HTTP [11] gateway, since the information
   necessary to contact the server and deliver the request is embedded
   within the URL itself.  A simple proxy gateway has been implemented
   which takes an HTTP "GET" request containing a whois++ URL, carries
   out a WHOIS++ transaction and returns the results formatted as HTML.
   This will probably be the preferred approach to providing WHOIS++
   support by proxy for some time, since there is no Internet Media
   Type (also known as a MIME content type) registered for WHOIS++
   responses as yet.  The proxy server implementation can be found at

     <URL:http://www.roads.lut.ac.uk/pickup/>

   It does not appear to be appropriate to use any HTTP methods other
   than "GET" with whois++ URLs, and there does not appear to be any
   value in using whois++ URLs in HTML forms.
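
   Two of the steps a gateway of this kind has to perform are sketched
   below: extracting the whois++ URL from a proxy-style "GET" request
   line, and wrapping the server's response in HTML.  This is not the
   implementation referred to above; the function names and the page
   layout are assumptions made purely for illustration.

```python
import html


def whoispp_url_from_request_line(request_line: str) -> str:
    """Extract the whois++ URL from e.g. "GET whois++://host/req HTTP/1.0"."""
    parts = request_line.split()
    if len(parts) not in (2, 3) or parts[0] != "GET":
        raise ValueError("only GET is supported for whois++ URLs")
    if not parts[1].startswith("whois++://"):
        raise ValueError("not a whois++ URL: %r" % (parts[1],))
    return parts[1]


def render_results(url: str, response_text: str) -> str:
    """Wrap a WHOIS++ response in a minimal HTML page.

    Both the URL and the response are escaped so that markup returned
    by the server is displayed rather than interpreted.
    """
    title = html.escape(url)
    body = html.escape(response_text)
    return ("<html><head><title>" + title + "</title></head>"
            "<body><h1>" + title + "</h1><pre>" + body + "</pre></body></html>")
```

   Between these two steps the gateway would carry out the WHOIS++
   transaction itself, using the host, port and request taken from the
   URL.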

   The appearance of the "+" character in the protocol scheme component
   of a URL is legal, according to RFC 1738.  The author has lingering
   doubts about the ability of all software which processes URLs, for
   example in parsing HTML documents, to cope with this character.  No
   evidence has been found to back these doubts up, however.

7. Security Considerations

   Client software should check both the contents of the whois++ URL and
   the results returned from WHOIS++ search requests for any unsafe
   characters and character strings.

   It is possible to embed requests for other protocols within this URL
   format.  This is an approach which may be used to defeat security
   schemes, spoof protocols, and so on.  Implementors should consider
   requiring user confirmation when requests are directed to reserved
   ports (i.e.  those less than 1024) other than 63 and 43, or well-
   known ports in the unreserved range.
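
   The port check suggested above might look like the following
   sketch.  The function name "needs_confirmation" is invented for the
   example, and since this document does not list the well-known ports
   in the unreserved range, they are left to the caller to supply via
   the hypothetical "other_well_known" parameter.

```python
from typing import AbstractSet

EXPECTED_PORTS = frozenset({63, 43})  # the two ports exempted above
RESERVED_LIMIT = 1024                 # ports below this are reserved


def needs_confirmation(port: int,
                       other_well_known: AbstractSet[int] = frozenset()) -> bool:
    """Return True if the user should confirm a request to this port."""
    if port in EXPECTED_PORTS:
        return False
    return port < RESERVED_LIMIT or port in other_well_known
```

   For instance, a URL naming port 25 would prompt for confirmation,
   whereas one naming port 63, or omitting the port altogether, would
   not.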

   Finally, implementations should take care not to cache authentication
   information.

8. Acknowledgements

   Thanks to Jeff Allen, Lorcan Dempsey, Patrik Faltstrom, Jon Knight,
   William F. Maton, Larry Masinter, and Scott Williamson for their
   comments on draft versions of this document.

   This work was supported by grants from the UK Electronic Libraries
   Programme (eLib) and the European Commission's Telematics for
   Research Programme.

9. References

   Request For Comments (RFC) and Internet Draft documents are available
   from <URL:ftp://ftp.internic.net> and numerous mirror sites.

         [1]         P. Deutsch, R. Schoultz, P. Faltstrom and C.
                     Weider.  "Architecture of the WHOIS++ service", RFC
                     1835. August 1995.


         [2]         T. Berners-Lee, L. Masinter and M. McCahill (eds).
                     "Uniform Resource Locators (URL)", RFC 1738.
                     December 1994.


         [3]         J. Postel.  "Simple Mail Transfer Protocol", RFC
                     821.  August 1982.




         [4]         J. Postel, J. K. Reynolds.  "File Transfer
                     Protocol", RFC 959.  October 1985.


         [5]         The Unicode Standard, Worldwide Character Encoding,
                     Version 1.0, Volume 1, Addison-Wesley, 1990. ISBN
                     0-201-56788-1.


         [6]         The Unicode Standard, Worldwide Character Encoding,
                     Version 1.0, Volume 2, Addison-Wesley, 1992. ISBN
                     0-201-60845-6.


         [7]         K. Harrenstien, M.K. Stahl, E.J. Feinler.
                     "NICNAME/WHOIS", RFC 954. October 1985.


         [8]         S. Williamson & M. Kosters.  "Referral Whois
                     Protocol (RWhois)", RFC 1714.  November 1994.


         [9]         T. Berners-Lee, D. Connolly.  "Hypertext Markup
                     Language - 2.0", RFC 1866.  November 1995.


         [10]        R. Fielding. "Relative Uniform Resource Locators",
                     RFC 1808.  June 1995.


         [11]        T. Berners-Lee, R. Fielding, H. Frystyk.
                     "Hypertext Transfer Protocol -- HTTP/1.0", RFC
                     1945.  May 1996.

10. Author's address

   Martin Hamilton
   Department of Computer Studies
   Loughborough University of Technology
   Leics. LE11 3TU, UK

   Email: m.t.hamilton@lut.ac.uk