Network Working Group                                     Sam X. Sun
INTERNET-DRAFT                                                  CNRI
Expires Jan 16, 1999                                   July 16, 1998
draft-sun-handle-system-01.txt


        Handle System: A Persistent Global Name Service
                        Overview and Syntax

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   To learn the current status of any Internet-Draft, please check
   the "1id-abstracts.txt" listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East
   Coast), or ftp.isi.edu (US West Coast).

Abstract

   The Handle System (r) is a comprehensive system for assigning,
   managing, and resolving persistent identifiers, known as
   'handles', for digital objects and other resources on the Internet.
   Handles can be used as Uniform Resource Names (URNs). The Handle
   System defines: (a) an open set of protocols, (b) a global namespace,
   and (c) a distributed service model that provides the global name
   service. The system allows Internet resources to be named as handles.
   A handle may contain the information necessary to locate and access
   its named resources. This associated information can be changed as
   needed to reflect the current state of the identified resource
   without changing the handle, thus allowing the name of the item to
   persist over changes of location and other state information.
   Combined with a centrally administered naming authority registration
   service, the Handle System provides a general purpose, distributed
   global name service for the reliable management of information on
   networks over long periods of time. (Note that in this document we
   do not attempt to distinguish between the terms 'name' and
   'identifier' and will use them interchangeably.)


1. Introduction

   The Handle System is a distributed information system that provides
   a persistent naming service for use on networks such as the
   Internet. Handles can be used to identify any network resources.

   Each handle may be assigned a set of typed values that describe
   its named object. The Handle System provides the handle resolution
   service that allows these values to be retrieved. It also provides
   the handle administration service that allows individual handles to
   have their own administrator(s) assigned, and to be administered
   over the distributed environment.

   The Handle System ensures that every handle is unique within the
   context of the Handle System and may be retained and resolved over
   long time periods. The resolution information associated with each
   handle can be changed as needed, allowing the handle to persist over
   changes in location and other states of the named resource.

   Specifically, the Handle System was designed to address the
   following problems in network resource identification:

   * Persistence

     A named resource can outlast any specific computer system or
     organization. Any resource name which is inextricably linked
     with a specific system or name of an organization will not be
     able to survive the demise or radical change of that computer
     system or organization. By separating the object's name from
     location, ownership, and other state information, the Handle
     System allows that identifier to persist over time.

   * Location independence

     With handles, the name of the item is unrelated to the location
     of the item. This allows easy reorganization of information.
     Handles make it possible to transfer resources from one
     organization to another without affecting or breaking the
     existing user references (i.e., handles) to those resources.
     This is not possible using location-based references.

   * Multiple instances of an item

     A single handle can refer to more than one instance of a network
     resource. A network service may thus define multiple entry points
     for its service with a single handle name. This allows the service
     to distribute its service load into multiple instances.


   The Handle System has been implemented and is currently in use in
   a number of prototype projects, including efforts with the Library
   of Congress, the Association of American Publishers, the Defense
   Technical Information Center, and the United States Information
   Agency.

   This is the first of a series of planned documents that will
   specify the handle protocol and services, and relate the Handle
   System to other IETF activities in URN/URI/URL working groups.
   This document provides a concise overview of the system and the
   syntax of handles. Additional information can be found on CNRI
   and related project web sites [4, 5, 6, 8, 16, 17, 18, 19].


2. Handle Syntax

   Every handle in the Handle System is defined in two parts:
   its naming authority, otherwise known as its prefix, and a unique
   local name under that naming authority, otherwise known as its
   suffix. The Handle System protocol mandates UTF-8 [2] as the only
   encoding for any handles specified in the protocol packet.

   The naming authority identifies the administrative unit of the
   underlying handles. It is globally unique and will be persistent
   once obtained. Naming authorities may consist of any UTF-8 encoded
   characters defined in the Unicode 2.0 [1] standard except '.'
   (%x2E) or '/' (%x2F). ASCII characters in the naming authority
   are case insensitive and are converted into upper case before
   resolution takes place.

   The local name under the naming authority may consist of any
   UTF-8 encoded characters defined in the Unicode 2.0 standard. It
   does not impose any reserved or excluded characters. ASCII
   characters within the local name are case insensitive, and are
   converted into upper case before resolution takes place.

   The following is the handle syntax described in ABNF [21]
   notation:

      <Handle>           = <Naming Authority> "/" <Local Name>

      <Naming Authority> = *( <Naming Authority>  "." ) <NA Name>

      <Local Name>       = 1*( %x00-FF )
                            ; any octets that map to UTF-8 encoded
                            ; Unicode 2.0 characters.

        <NA Name>          = 1*( %x00-2D  /  %x30-FF )
                            ; any octets that map to UTF-8 encoded
                            ; Unicode 2.0 characters except octets
                            ; %x2E-2F which map to ASCII characters
                            ; '.' and '/'.

   Here are some examples of valid handles that may be used in the
   Handle System protocol:

      cnri.dlib/july95-arms

      10.1002/0002-8231(199601)47:1<1:SPOTEO>2.3.TX;2-K

      any-printable-characters/a-zA-Z0-9!@#$%^&*()_"<>,.?/`~|\

      handles-in-germany/Universität-Karlsruhe
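
   The syntax rules above can be sketched in code. The following
   Python fragment is illustrative only and is not part of the Handle
   System specification; the function names parse_handle,
   normalize_handle, and encode_for_wire are hypothetical. It assumes
   that the naming authority ends at the first '/' (a naming authority
   cannot contain '/', while a local name can) and that only ASCII
   letters are folded to upper case.

```python
import string
from typing import Tuple

# Maps ASCII a-z to A-Z only; non-ASCII characters are left untouched.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def parse_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (naming authority, local name)."""
    # The naming authority may not contain '/', so the first '/' is the
    # separator; any later '/' belongs to the local name.
    authority, sep, local_name = handle.partition("/")
    if not sep or not local_name:
        raise ValueError("expected <naming authority>/<local name>")
    # Each '.'-separated segment (<NA Name>) needs at least one character.
    if any(segment == "" for segment in authority.split(".")):
        raise ValueError("naming authority has an empty segment")
    return authority, local_name


def normalize_handle(handle: str) -> str:
    """Validate a handle and convert its ASCII letters to upper case."""
    authority, local_name = parse_handle(handle)
    return (authority + "/" + local_name).translate(_ASCII_UPPER)


def encode_for_wire(handle: str) -> bytes:
    """Return the UTF-8 octets of the normalized handle."""
    return normalize_handle(handle).encode("utf-8")
```

   Under these assumptions, normalize_handle("cnri.dlib/july95-arms")
   yields "CNRI.DLIB/JULY95-ARMS".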


3. Handle data

   A handle within the Handle System is associated with, and can
   be resolved to, one or more elements of typed data. Examples of
   data types in use include URLs, object request brokers, and
   other URNs. Other examples might include e-mail addresses or
   public key certificates. There is a controlled set of named
   types accepted by the system. This list can be extended as
   needed at the system level.

   Each handle also has its own administrative data. The
   administrative data, e.g., permissions to create handles or
   edit handle data, is initially provided by the handle server when
   the handle is first created. The administrative data can be used
   to define the handle administrator that manages the handle data.
   This administrative data is not returned as part of handle
   resolution but is used for handle administration only.

   Other than the relationship between the Global Handle Registry
   and local handle services described in section 5, there are no
   hierarchical relationships assumed among handle records.  Note,
   however, that handles can include in their associated data
   references to other handles, thus allowing hierarchical or other
   relationships to be constructed as needed.


4. Using Handles in the World Wide Web

4.1 Handle URI syntax

   The Handle Syntax in section 2 defines the encoding rules for
   handles transferred over the wire via the Handle System protocol.
   Handles may also be referenced as a URI [22], which can be used in
   Web browsers or in HTML mark-up documents to refer to persistent
   Internet resources. The Handle URI syntax defines the syntax
   rule for handles specified in the URI format.

   Handles defined as a Handle URI may be resolved by the Handle
   System Resolver [4]. The Handle System Resolver will convert the
   URI into the Handle (as defined in section 2) before doing the
   resolution.

   The Handle URI Syntax is defined as follows:

      <Handle URI> = "hdl:" [ <Modifier> "@" ] <HandleRef>

        <Modifier>   = [ <Encoding> ] [ "type=" <Type-id> ]

        <Encoding>   = 1*40( %x01-7F )
                                ; A registered charset name [23] from IANA,
                                ; which may be any printable ASCII characters.

        <Type-id>    = 1*(%x30-39)
                                ; digits 0 - 9.

      <HandleRef>  = 1*( %x00-FF )
                                ; Octets that encode a <Handle> using the
                                ; <Encoding> in the optional <Modifier>.
                                ; If no <Modifier> is specified, UTF-8
                                ; is the default encoding.


   When UTF-8 is the encoding used, the Handle URI Syntax has two
   reserved characters, % and ". The character % is used for hex
   encoding, which is necessary to allow any handle to be specified
   from a standard keyboard. The character " is reserved to allow
   handles to be separated from the surrounding text in HTML
   documents. Reserved characters must be hex encoded when used in
   the URI context. The choice of % and hex encoding is also
   compatible with current URI practice. Because some browser
   implementations incorrectly drop the # character when processing a
   URI regardless of its scheme, hex encoding of the # character is
   also recommended.

   Examples of handles using Handle URI Syntax are:


      hdl:cnri.dlib/july95-arms

      (which refers to handle "cnri.dlib/july95-arms")

   and

      hdl:handle-with-hex-encoding/handle%25abc

      (which refers to handle "handle-with-hex-encoding/handle%abc")


   It's worth noting that the handle namespace by itself does not
   impose any hex or escape encoding, nor does the underlying
   Handle System. The reserved characters and hex encoding are
   introduced only when handles are used in the URI context. It is
   the client software's responsibility to decode any hex encoding
   in the handle URI before sending the handles out for resolution.
   On systems where another character set encoding is used, it is
   also the client software's responsibility to convert a natively
   displayed handle to its UTF-8 encoding before sending it out
   for resolution.
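
   The client-side conversion described above might look like the
   following Python sketch. It is an illustration under stated
   assumptions, not a reference implementation: the function names
   handle_to_uri and uri_to_handle are hypothetical, only the default
   UTF-8 case is covered, and the optional <Modifier> "@" prefix is
   not parsed.

```python
from urllib.parse import unquote

# '%' and '"' are reserved in the Handle URI Syntax; '#' is encoded as
# well because some browsers drop it when processing a URI.
_MUST_ENCODE = '%"#'


def handle_to_uri(handle: str) -> str:
    """Hex encode the reserved characters and add the 'hdl:' scheme."""
    encoded = "".join(
        "%{:02X}".format(ord(ch)) if ch in _MUST_ENCODE else ch
        for ch in handle
    )
    return "hdl:" + encoded


def uri_to_handle(uri: str) -> str:
    """Strip the 'hdl:' scheme and undo any hex encoding."""
    if not uri.startswith("hdl:"):
        raise ValueError("not a Handle URI")
    # unquote() turns each %XX escape back into the octet it stands for
    # and decodes the result as UTF-8.
    return unquote(uri[len("hdl:"):], encoding="utf-8", errors="strict")
```

   With this sketch, "hdl:handle-with-hex-encoding/handle%25abc" is
   decoded to the handle "handle-with-hex-encoding/handle%abc" before
   it is sent out for resolution.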

4.2 Handle Resolution service from Web browsers

   Handles specified using Handle URI Syntax (i.e., hdl:<HandleRef>)
   can be resolved from a Web browser directly using the Handle
   System Resolver [4]. The Resolver is a freely available
   extension to the current popular Web browsers. It resolves
   handles into corresponding URIs, which are then retrieved by
   the browser in the normal fashion.  This is the suggested way
   to resolve the handles in the future, because it provides
   better performance, is more scalable, and is locally
   configurable.

   Handles can also be resolved through proxy services using Handle
   Proxy Syntax (i.e., http://<proxy>/<handle>). In this case, the
   proxy server performs the handle resolution task, and sends
   the resulting URL to the client browser for processing.
   Currently, CNRI provides global handle proxy servers at
   "hdl.handle.net" and "dx.doi.org". The proxy server allows
   handles to be resolved without additional software for the
   client. For example, the handle "cnri.dlib/july95-arms" may be
   entered as "http://hdl.handle.net/cnri.dlib/july95-arms", which
   is resolvable by any browser.

   It is worth noting that even though the proxy server approach
   is straightforward and does not require any client software
   customization, it has the effect of tying the handles to the
   proxy server's URL location. Hence the selection of a proxy
   server should be made with care.
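
   As a small illustration of the proxy approach, the Python sketch
   below builds a proxy URL from a handle. The function name proxy_url
   is hypothetical, and the sketch assumes that the proxy accepts
   handles whose non-URL-safe characters are percent-encoded as UTF-8;
   this document does not specify that behavior.

```python
from urllib.parse import quote


def proxy_url(handle: str, proxy: str = "hdl.handle.net") -> str:
    """Build an http URL that asks the given proxy to resolve a handle."""
    # quote() percent-encodes the handle as UTF-8; '/' is kept literal
    # so the naming authority and local name stay readable.
    return "http://" + proxy + "/" + quote(handle, safe="/")
```

   Because the proxy host name becomes part of every reference built
   this way, it is passed as an explicit parameter rather than being
   fixed in the code.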

4.3 Creating handles for network resources

   The Handle System allows handles to be created in a distributed
   fashion. Organizations in need of providing a naming service
   for their persistent internet resources will be able to contact
   CNRI or other organizations to register for their own handle
   naming authority, as well as their own local handle services.
   This will enable them to create handles for their own
   organizational use. Policies and procedures for Naming Authority
   registration are currently under development.

   As an initiative for general public service, CNRI has established
   a public handle registration service for the IETF community. This
   service provides an open channel to allow individuals to create
   handles and experiment with the handle system. The service is
   provided for testing purposes only. Future availability of this
   service is not guaranteed. Details on how to use this service, as
   well as its terms and conditions can be obtained from
   http://www.handle.net/ietf/handle/register_handle.html.


5. Handle System Service Architecture

   The Handle System is distributed, scalable, and designed for
   widespread deployment. The current implementation consists of one
   global service and many local handle services. Each handle
   service consists of one or more physically distributed handle
   servers. (Currently, the global service consists of two servers
   in Virginia and two in California. A European location is
   planned.) Each handle server can have one or more secondary
   servers for mirroring. In addition, handle caching servers are
   provided for faster resolution service for a local environment,
   and they can also be used to provide proxy service through
   firewalls.

5.1 Handle services

   The Handle System consists of many services. Each service is
   responsible for part of the handle namespace. One specific
   service, called the Global Handle Registry, is globally unique,
   and has a special function, which is to know of the existence,
   location, and namespace responsibilities of all other public
   services, or local handle services. There can be an unlimited
   number of local handle services, managed by various organizations.
   In the current implementation each local handle service is
   registered with the Global Handle Registry to ensure efficient
   resolution. Policies and procedures for disconnected local handle
   services are under development. The primary issue here is to
   guarantee identifier uniqueness in disconnected systems.

5.2 Handle servers within a service

   Each handle service consists of one or more handle servers.
   Typically, each handle server runs on a separate computer but
   multiple handle servers can run on a single computer. Within a
   handle service, the distribution of handles across its constituent
   servers is determined by a hash table such that each of N servers
   within a service will be responsible for approximately 1/N of the
   handles. The number of servers can be adjusted as required to meet
   the needs of a service.
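
   The idea of hash-based distribution can be sketched as follows.
   This document does not specify the hash function or the table
   layout used by the Handle System, so the use of SHA-256 and a
   simple modulo here is purely an assumption for illustration, as is
   the function name server_index.

```python
import hashlib
import string

# Maps ASCII a-z to A-Z only; non-ASCII characters are left untouched.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def server_index(handle: str, num_servers: int) -> int:
    """Map a handle to one of num_servers servers (0-based index)."""
    if num_servers < 1:
        raise ValueError("a service needs at least one server")
    # Fold ASCII case first so that equivalent handles hash identically.
    key = handle.translate(_ASCII_UPPER).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 octets as an integer and reduce modulo N;
    # a well-mixed hash gives each server roughly 1/N of the handles.
    return int.from_bytes(digest[:8], "big") % num_servers
```

   Note that with a plain modulo, changing the number of servers
   reassigns most handles to a different server; how the actual
   implementation handles such a change is not covered here.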

5.3 Server replication

   Additionally, it may be desirable to mirror the contents of any of
   the handle servers within a service, presumably on a separate
   computer. This is referred to as replication and is accomplished
   by creating one or more additional servers whose sole purpose is
   to mirror the contents of the original server. Within each set of
   replicated servers, the initial server is called the primary server
   and all others are called secondary servers. The creation and
   administration of handles always takes place on the primary server,
   but resolution can use either the primary or any of its secondaries.
   This provides fault tolerance, as well as the potential for
   performance improvement.
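
   The division of labor between primary and secondary servers can be
   sketched as below. The class and method names are hypothetical,
   and the random ordering of candidates is an assumed policy, not
   one described in this document.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplicaSet:
    """One primary handle server together with its mirrors."""
    primary: str
    secondaries: List[str] = field(default_factory=list)

    def server_for_administration(self) -> str:
        # Creation and administration always go to the primary.
        return self.primary

    def servers_for_resolution(self) -> List[str]:
        # Resolution may use the primary or any secondary. Returning
        # every replica in random order spreads load and lets a client
        # try the next entry if one server is unreachable.
        candidates = [self.primary] + self.secondaries
        random.shuffle(candidates)
        return candidates
```

   A client would try the servers returned by servers_for_resolution
   in order until one answers, which is where the fault tolerance
   comes from.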

5.4 Caching Server

   The Handle System Caching Server has been built to reduce the
   network traffic between handle clients and handle services and
   its use is strongly encouraged. Caching handle data or routing
   information on the caching server allows some handle resolution
   to be performed within an organization's local area network.
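
   A caching server's core behavior can be sketched as a thin wrapper
   around an upstream resolver. The sketch is hypothetical: the class
   name CachingResolver and the fixed time-to-live are assumptions
   made for illustration, since this document does not describe how
   cached handle data expires.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingResolver:
    """Answer repeated resolutions locally until the entry expires."""

    def __init__(self, resolve: Callable[[str], List[str]],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve      # upstream resolution function
        self._ttl = ttl_seconds      # assumed lifetime of a cache entry
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def resolve(self, handle: str) -> List[str]:
        now = self._clock()
        entry = self._cache.get(handle)
        if entry is not None and now < entry[0]:
            return entry[1]          # still fresh: no network traffic
        values = self._resolve(handle)
        self._cache[handle] = (now + self._ttl, values)
        return values
```

   An expiry time is used because the data associated with a handle
   can change; a cache that never expired would keep serving stale
   values.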

5.5 Proxy Server

   The Handle System Proxy Server has been developed to act as a
   client to the Handle System, allowing handles to be resolved using
   Handle Proxy Syntax (i.e., http://<proxy server>/<handle>). Using
   this syntax, the browser passes a handle to the proxy server,
   which in turn passes the handle to the appropriate handle
   service for resolution. If the handle can be resolved into one
   or more URLs, a URL is returned from the handle
   server to the proxy, and from the proxy to the client browser.

5.6 Handle System Resolver

   The Handle System Resolver [4] is a software component which
   extends Netscape or Microsoft Web browsers, and allows handles
   to be resolved using Handle URI Syntax (i.e., hdl:<handle>). Using
   this syntax, the browser passes the handle directly to the
   appropriate handle service for resolution. If the handle can
   be resolved into one or more URLs, one of the URLs is returned to
   the browser which then transparently retrieves and displays the
   intended content.


6. Handle resolution

   Handle clients and handle services use the Handle Resolution
   Protocol [6] to conduct resolution transactions. The Handle
   Resolution Protocol uses registered port number 2641. By
   default, a handle resolution request will be answered with
   all of the typed data associated with a handle, with the
   exception of the administrative data. It is also possible
   to request data only of a certain type.
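
   The default and type-restricted forms of a resolution request can
   be modeled as in the following sketch. It does not reflect the
   wire format of the protocol; the class names and the type labels
   "URL" and "EMAIL" are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HandleValue:
    type: str   # hypothetical type label, e.g. "URL"
    data: str


@dataclass
class HandleRecord:
    handle: str
    values: List[HandleValue] = field(default_factory=list)
    # Kept separately: used for administration, never for resolution.
    admin_data: List[HandleValue] = field(default_factory=list)

    def resolve(self, wanted_type: Optional[str] = None) -> List[HandleValue]:
        """Return all typed values, or only those of wanted_type."""
        if wanted_type is None:
            return list(self.values)
        return [v for v in self.values if v.type == wanted_type]
```

   Keeping the administrative data in a separate field makes it
   structurally impossible for resolve() to return it by accident.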

   Handle clients that do not know which handle service to
   query for a given handle start with the Global Handle
   Registry, which is guaranteed to know which service contains
   a given handle. Within a given service, a client uses the hash
   table specific to the service to discover the individual
   server, or set of replicated servers, which can resolve the
   given handle.
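
   The two-step lookup described above can be sketched as follows.
   The registry contents, host names, function name locate_server,
   and hash function are all assumptions made for illustration; the
   actual data held by the Global Handle Registry and the hash table
   used by each service are not specified in this document.

```python
import hashlib
import string
from typing import Dict, List

# Maps ASCII a-z to A-Z only; non-ASCII characters are left untouched.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)

# Hypothetical registry contents: naming authority -> servers of the
# local handle service responsible for that part of the namespace.
GLOBAL_REGISTRY: Dict[str, List[str]] = {
    "CNRI.DLIB": ["server-a.example", "server-b.example"],
}


def locate_server(handle: str, registry: Dict[str, List[str]]) -> str:
    """Find the server that should be asked to resolve a handle."""
    normalized = handle.translate(_ASCII_UPPER)
    # Step 1: the registry maps the naming authority to a service.
    authority = normalized.partition("/")[0]
    servers = registry[authority]   # raises KeyError if unknown
    # Step 2: a per-service hash selects one server within it.
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

   A client that already knows the responsible service can skip the
   first step and go directly to the hash-based server selection.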

   A number of handle resolution clients have been constructed,
   all of which utilize the Handle Client Library [5], which
   is currently implemented as a C library. The clients include
   a Web proxy server, the Handle System Resolver [4], and the
   Grail Web browser [7].


7. Handle administration

   Handle System administration is carried out using the
   Handle System Administration Protocol [8]. This protocol
   allows the creation and administration of handles and their
   associated data within the Handle System.  A series of APIs
   currently under construction on top of this protocol will be
   made publicly available.


8. Security Considerations

   The Handle System has been designed to enable secure
   transactions between clients and servers and to allow
   secure and stable storage of handle data. Development and
   documentation of secure practices and policies is underway.

   A handle does not in itself pose a security threat. When
   specified or used in a URL context, it is subject to all
   the security considerations in the URL specification [3].


9. Handle System and URL/URN/URI

   While the Handle System is designed to be usable in many
   contexts and is not a subset or extension of current UR* schemes,
   it can be used in conjunction with those schemes. When used
   within those schemes it is, of course, subject to their
   constraints. The Handle System is designed to meet all the
   fundamental requirements outlined in the URN/URI specifications
   [9,10]. On the other hand, the Handle System differs from the
   current proposed URN implementations [11,12,13] discussed in the
   IETF URN working group in the following ways.

   First of all, the Handle System defines a namespace independent
   of URI, and is not subject to the current namespace constraints
   of URI. The namespace of handles is Unicode based, and imposes no
   reserved or excluded characters on the handle string. This
   allows handles to be specified in any national language natively
   in a globally unique and unambiguous manner. The elimination of
   any reserved characters also allows any legacy naming system,
   such as SICI [14], to be used with little or no change.

   The Handle System is designed to support, instead of exclude, the
   use of user friendly names in any native language. There are
   situations in which using descriptive names may hurt the persistence
   of the name once the identified object changes its association.
   Objects of this nature may be better served using non-descriptive
   names; for example, digits only. On the other hand, there are
   objects for which descriptive names are desirable.

   The current URN/URI was defined "generally to be for machine,
   rather than human, consumption" [20]. It uses a subset of the
   ASCII character set, and requires a set of reserved/excluded
   characters. A Human Friendly Name Service is expected to work
   with it.

   URN services may be used to resolve handles from the Handle System.
   For example, the handle "cnri.dlib/july95-arms" may be specified as
   "urn:hdl:cnri.dlib/july95-arms". This will allow any URN-aware
   browser to resolve the handle as a URN. Handles specified as a URN
   must follow the URN syntax [13].


10. History and Acknowledgment

   The initial design and implementation of the Handle System was
   part of the Computer Science Technical Reports project, funded by
   the Defense Advanced Research Projects Agency (DARPA) under Grant
   No. MDA972-92-J-1029. One aspect of this project was to develop a
   framework for the underlying infrastructure of digital libraries.
   It is described in a paper by Robert Kahn and Robert Wilensky [15].
   The first implementation was created at CNRI in the fall of 1994.
   Subsequent work on the Handle System has been supported in part
   by the Advanced Research Projects Agency under Grant No.
   MDA972-92-J-1029.

   The following people have contributed to the Handle System design
   and implementation: David Ely, William Arms, Navjeet Chabbewal,
   Judith Grass, Robert Kahn, Timothy Kendall, Connie McLindon,
   Charles Orth, Ed Overly, Varna Puvvada, John Stewart, Allison
   Yu-McNamara, Ron Ely, Catherine Rey, Jane Euler, Larry Lannom,
   and Sam Sun. We also want to acknowledge the contribution of the
   other members of the Computer Science Technical Reports project.


11. Author's Address

   Sam X. Sun
   1895 Preston White Dr.
   Suite 100
   Reston, VA 20191-5434
   (703) 620-8990
   ssun@cnri.reston.va.us


12. References

   [1]  The Unicode Consortium, "The Unicode Standard,
        Version 2.0", Addison-Wesley Developers Press, 1996.
        ISBN 0-201-48345-9

   [2]  Yergeau, Francois, "UTF-8, a transformation format of Unicode
        and ISO 10646", RFC 2044, October 1996.
        http://ds.internic.net/rfc/rfc2044.txt

   [3]  Berners-Lee, T., Masinter, L., McCahill, M., et al.,
        "Uniform Resource Locators (URL)", RFC 1738, December 1994.
        http://ds.internic.net/rfc/rfc1738.txt

   [4]  Handle System Resolver.
        http://www.handle.net/resolver/

   [5]  Handle System Client Library download site.
        http://www.handle.net/download.html

   [6]  Handle Resolution Protocol.
        http://www.handle.net/client_spec.html

   [7]  The Grail Internet Browser.
        http://grail.cnri.reston.va.us/grail/

   [8]  Handle Administration Protocol.
        http://www.handle.net/handle_admin.html

   [9]  Sollins, K. and L. Masinter, "Functional Requirements
        for Uniform Resource Names", RFC 1737, December 1994.
        http://ds.internic.net/rfc/rfc1737.txt

   [10] Berners-Lee, T., "Universal Resource Identifiers
        in WWW", RFC 1630, June 1994.
        http://ds.internic.net/rfc/rfc1630.txt

   [11] Daniel, Ron and Michael Mealling, "Resolution of
        Uniform Resource Identifiers using the Domain Name
        System", RFC 2168, June 1997.
        http://ds.internic.net/rfc/rfc2168.txt

   [12] Daniel, Jr., Ron, "A Trivial Convention for using
        HTTP in URN Resolution", RFC 2169, June 1997.
        http://ds.internic.net/rfc/rfc2169.txt

   [13] Moats, Ryan, "URN Syntax", RFC 2141, May 1997.
        http://ds.internic.net/rfc/rfc2141.txt

   [14] Serial Item and Contribution Identifier (SICI) Standard.
        http://sunsite.berkeley.edu/SICI/

   [15] Kahn, Robert and Wilensky, Robert. "A Framework for
        Distributed Digital Object Services", May, 1995.
        http://www.cnri.reston.va.us/tmp_hp/k-w.html

   [16] Digital Object Identifier System.
        http://hdl.handle.net/10.1000/1

   [17] National Digital Library Program.
        http://hdl.handle.net/4263537/003

   [18] The CNRI Registry.
        http://hdl.handle.net/4263537/001

   [19] Defense Virtual Library.
        http://hdl.handle.net/4263537/002

   [20] Sollins, K., "Architectural Principles of Uniform Resource
        Name Resolution", September 26, 1997, Work in Progress.
        ftp://ftp.ietf.org/internet-drafts/draft-ietf-urn-req-frame-04.txt

   [21] D. Crocker, Ed., P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997,
        http://info.internet.isi.edu/in-notes/rfc/files/rfc2234.txt

   [22] T. Berners-Lee, L. Masinter, R. Fielding, "Uniform Resource
        Identifiers (URI): Generic Syntax", work in progress,
        June 1998, ftp://ftp.ietf.org/internet-drafts/draft-fielding-
        uri-syntax-03.txt

   [23] List of IANA registered charset names.
        ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets

