Network Working Group                                    E. Hammer-Lahav
Internet-Draft                                                    Yahoo!
Intended status: Informational                            March 23, 2009
Expires: September 24, 2009


                Link-based Resource Descriptor Discovery
                       draft-hammer-discovery-03

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 24, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This memo describes LRDD (pronounced 'lard'), a process for obtaining
   information about a resource identified by a URI.  The 'information
   about a resource', a resource descriptor, provides machine-readable



Hammer-Lahav           Expires September 24, 2009               [Page 1]


Internet-Draft            Descriptor Discovery                March 2009


   information that aims to increase interoperability and enhance the
   interaction with the resource.  This memo only defines the process
   for locating and obtaining the descriptor, but leaves the descriptor
   format and its interpretation out of scope.


Table of Contents

   1.  Introduction
   2.  Notational Conventions
   3.  The describedby Link Relation
   4.  Identifying Descriptor Location
     4.1.  Method Selection
     4.2.  The <LINK> Element
     4.3.  The HTTP Link Header
     4.4.  The Host Metadata Document
   5.  Obtaining Resource Descriptor
   6.  The Link-Pattern host-meta Field
     6.1.  Template Syntax
   7.  Security Considerations
   8.  IANA Considerations
     8.1.  The Link-Pattern host-meta Field
     8.2.  The describedby Relation Type
   Appendix A.   Descriptor Discovery vs. Service Discovery
   Appendix B.   Methods Suitability Analysis
   Appendix B.1. Requirements
   Appendix B.2. Analysis
   Appendix C.   Acknowledgments
   Appendix D.   Document History
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address


1.  Introduction

   This memo defines a process for locating descriptors for resources
   identified with URIs.  Resource descriptors are documents (usually
   based on well-known serialization languages such as XML, RDF, and
   JSON) that provide machine-readable information about resources
   (resource metadata) in order to promote interoperability and to
   assist in interacting with unknown resources that support known
   interfaces.

   While many methods provide the ability to link a resource to its
   metadata, none of these methods fully addresses the requirements of a
   uniform and easily implementable process.  These requirements include
   the ability for resources to self-declare the location of their
   descriptors, the ability to access descriptors directly without
   interacting with the resource, and support for a wide range of
   platforms and scales of deployment.  A suitable process must also be
   fully compliant with existing web protocols and support
   extensibility.  These requirements, and the analysis used as the
   basis for this memo, are explained in detail in Appendix B.

   For example, a web page about an upcoming meeting can provide in its
   descriptor document the location of the meeting organizer's free/busy
   information to potentially negotiate a different time.  A social
   network profile page descriptor can identify the location of the
   user's address book as well as accounts on other sites.  A web
   service implementing an API with optional components can advertise
   which of these are supported.

   This memo describes the first step in the discovery process in which
   the resource descriptor document is located and retrieved.  Other
   steps, which are outside the scope of this memo, include parsing the
   descriptor document based on its format (such as POWDER [POWDER], XRD
   [XRD], and Metalink [I-D.bryan-metalink]) and utilizing it based on
   the application.

   Discovery can be performed before, after, or without obtaining a
   representation of the resource.  Performing discovery ahead of
   accessing a representation allows the client not to rely on
   assumptions about the properties of the resource.  Performing
   discovery after a representation has been obtained enables further
   interaction with it.

   Given the wide range of 'information about a resource', no single
   descriptor format can adequately accommodate such scope.  However,
   there is great value in making the process of locating the descriptor
   uniform across formats.  While HTTP is the most common protocol used
   in association with discovery and is explicitly specified in this
   memo, other protocols MAY be used.

   Please discuss this draft on the www-talk@w3.org [1] mailing list.


2.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document uses the Augmented Backus-Naur Form (ABNF) notation of
   [RFC2616].  Additionally, the following rules are included from
   [RFC3986]: reserved and unreserved, and from
   [I-D.nottingham-http-link-header]: link-param.


3.  The describedby Link Relation

   The methods described in this memo express the location of the
   resource descriptor as a link relation, utilizing the link framework
   defined by [I-D.nottingham-http-link-header].  The association of a
   descriptor document with the resource it describes is declared using
   the "describedby" link relation type.

   The "describedby" link relation is defined in [POWDER] and registered
   as:

      The relationship A "describedby" B asserts that resource B
      provides a description of resource A. There are no constraints on
      the format or representation of either A or B, neither are there
      any further constraints on either resource.

   Since a single resource can have many descriptors, the "describedby"
   link relation has a one-to-many structure (the question whether a
   single descriptor can describe multiple resources is outside the
   scope of this memo).  In the case of multiple "describedby" links
   obtained from a single method, selecting which link to use is
   application-specific.

   To promote interoperability, applications referencing this memo
   SHOULD clearly define the application-specific criteria used to
   select between "describedby" links.  This MAY be done by:

   o  Supporting a single descriptor format, or defining an order of
      precedence for multiple descriptor formats.  Applications MAY
      require the presence of the link "type" attribute with the mime-
      type of the required format.

   o  Using the "describedby" relation type together with another
      application-specific relation type in the same link.  The
      application-specific relation type can be registered or an
      extension.

   o  Specifying additional link attributes using link-extensions.

   Link selection MUST NOT depend on the order in which multiple links
   are obtained from a single method.  Applications MUST NOT impose
   constraints on the usage of the "describedby" relation type as it is
   likely to be used by other applications in association with the same
   resource.
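
   As an illustration of the first option listed in this section, the
   following Python sketch selects a single "describedby" link using an
   order of precedence over descriptor formats.  The dictionary-based
   link representation, the function name, and the media types in the
   usage example are assumptions made for this sketch and are not
   defined by this memo.  Ties are broken by comparing target URIs, so
   the result does not depend on the order in which links were obtained.

```python
from typing import Iterable, Mapping, Optional, Sequence

# Hypothetical link shape: {"href": ..., "rel": ..., "type": ...}
Link = Mapping[str, str]


def select_descriptor_link(
    links: Iterable[Link], preferred_types: Sequence[str]
) -> Optional[Link]:
    """Return one "describedby" link, chosen by media-type precedence."""
    # Keep only links whose "rel" value contains the describedby relation.
    candidates = [
        link for link in links
        if "describedby" in link.get("rel", "").lower().split()
    ]
    for media_type in preferred_types:
        matches = [
            link for link in candidates
            if link.get("type", "").lower() == media_type.lower()
        ]
        if matches:
            # Deterministic tie-break that ignores the input order.
            return min(matches, key=lambda link: link.get("href", ""))
    return None
```

   An application preferring POWDER over some other format would pass
   its own ordered list of media types, for example
   ["application/powder+xml", "application/xrd+xml"].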


4.  Identifying Descriptor Location

   The descriptor location (URI) is a function of the resource URI.
   This section defines three methods which together satisfy the
   requirements defined in Appendix B.  While each method on its own
   satisfies the requirements partially, together they provide enough
   flexibility for most use cases.  Each of the following three methods
   is performed by using the resource URI to identify its descriptor
   URI.

   In many cases, a request for one URI leads to requesting other URIs,
   as is the case with HTTP redirections.  Because the decision whether
   to use such URIs is application-specific, discovery is constrained to
   a single URI identifying the resource.  Any other resource URI
   received MUST be considered a separate and discrete input into the
   discovery function.  If a resource URI obtained during the
   performance of these methods is found to be more relevant to the
   application, the discovery process MUST be restarted with the new
   resource URI as its input.

   For example, an HTTP HEAD request for URI A returns a redirect (307)
   response with a set of "describedby" links, and identifies the
   temporary location of the representation at URI B. An HTTP HEAD
   request for URI B returns a successful (200) response with its own
   set of "describedby" links.  An application MAY choose to define a
   process in which the two sets of links are obtained, prioritized, and
   utilized; however, it MUST do so by explicitly instructing the client
   to perform discovery multiple times, as each is considered a separate
   and distinct discovery.

4.1.  Method Selection

   Each method presents a different set of requirements.  The criteria
   used to determine which methods a server SHOULD support and client
   SHOULD attempt are based on a combination of factors:

   o  The ability to offer and obtain a representation of the resource
      by dereferencing its URI.

   o  The availability of a representation supporting <LINK> markup
      compatible with [I-D.nottingham-http-link-header].

   o  The availability of an HTTP representation of the resource and the
      ability to provide and access link information in its response
      header.

   The methods are listed in descending order of the restrictiveness of
   their requirements, from the most specialized to the most generic.
   This ordering, however, does not imply the order in which multiple
   applicable methods should be attempted.  Because different methods
   are more appropriate in different circumstances, it is up to each
   application to define how they should be used together.

   To promote interoperability, applications referencing this memo MUST
   clearly define the relationship between the three methods as either:

   o  equal, where all methods MUST produce the same set of resource
      descriptors and clients MAY attempt any method according to their
      capabilities, or

   o  with an application-specific order of precedence, where methods
      MUST be attempted in a specific order.
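
   The second option, an application-specific order of precedence, could
   be sketched in Python as follows.  The method signature and the rule
   that a method yielding no links falls through to the next one are
   assumptions of this sketch; an application referencing this memo
   would define its own rule.

```python
from typing import Callable, List, Optional, Sequence

# A method takes a resource URI and returns the descriptor URIs it found,
# or None when the method fails (hypothetical signature for this sketch).
DiscoveryMethod = Callable[[str], Optional[List[str]]]


def discover(resource_uri: str, methods: Sequence[DiscoveryMethod]) -> List[str]:
    """Attempt each method in order; stop at the first that yields links."""
    for method in methods:
        links = method(resource_uri)
        if links:  # None (failure) and [] (no links) both fall through
            return links
    return []
```

   The order of the "methods" argument carries the application-defined
   precedence; the discovery function itself stays unaware of which
   method is which.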

4.2.  The <LINK> Element

   The <LINK> element method is limited to resources with an available
   markup representation that supports typed-relations using the <LINK>
   element, such as HTML [W3C.REC-html401-19991224], XHTML
   [W3C.REC-xhtml1-20020801], and Atom [RFC4287].  Other markup formats
   are permitted as long as the semantics of their <LINK> elements are
   fully compatible with the link framework defined in
   [I-D.nottingham-http-link-header].  This method requires the
   retrieval of a resource representation.  While HTTP is the most
   common transport for such documents, this method is transport
   independent.

   For example:

     <LINK href="http://example.com/resource;about"
             rel="describedby" type="application/powder+xml">

   A client trying to obtain the location of the resource's descriptor
   using this method SHALL:

   1.  Retrieve a representation of the resource using the applicable
       transport for that resource URI.  If the markup document is
       obtained using HTTP, it MUST only be used by the client if the
       document is a valid representation of the resource identified by
       the HTTP request URI, typically in a response with a successful
       (2xx) or redirection (3xx) status code.  If no such valid
       representation of the request URI is found, the method fails.

   2.  Parse the document as defined by its format specification and
       look for <LINK> elements with a "rel" attribute value containing
       the "describedby" relation.  The client MUST obey the document
       markup schema and ignore any invalid elements (such as <LINK>
       elements outside the <HEAD> section of an HTML document).  This
       prevents unintentional markup in other parts of the document from
       being used for discovery purposes, which could have a serious
       impact on usability and security.

   3.  Narrow down the selection if more than one "describedby" link is
       found, following the application-specific criteria.  The
       descriptor location is obtained from the value of the "href"
       attribute in the selected <LINK> element.

   <LINK> elements MAY include other relation types together with
   "describedby" in a single "rel" attribute (for example
   'rel="describedby copyright"').  Clients MUST properly process such
   multiple-relation "rel" attributes as defined by the format
   specification.
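
   Steps 2 and 3 of this method might look like the following Python
   sketch for an HTML representation.  It uses the standard library
   html.parser module, which is a lenient tokenizer and not a validating
   parser; the class and function names are hypothetical, and treating a
   <BODY> start tag as the end of an unclosed <HEAD> is an assumption of
   this sketch.  Relative "href" values are returned unresolved.

```python
from html.parser import HTMLParser
from typing import Dict, List


class _DescribedByCollector(HTMLParser):
    """Collects <LINK> elements with rel "describedby" found inside <HEAD>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.links: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False  # assumption: <BODY> ends an unclosed <HEAD>
        elif tag == "link" and self.in_head:
            attributes = {name: value or "" for name, value in attrs}
            relations = attributes.get("rel", "").lower().split()
            if "describedby" in relations and attributes.get("href"):
                self.links.append(attributes)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_describedby_links(html_text: str) -> List[Dict[str, str]]:
    """Return the attributes of every qualifying <LINK> element."""
    collector = _DescribedByCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

   <LINK> elements that appear outside <HEAD> are skipped, which matches
   the requirement to ignore invalid elements.  Selecting among several
   returned links remains application-specific.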

4.3.  The HTTP Link Header

   The HTTP Link header method is limited to resources for which an HTTP
   GET or HEAD request returns a 2xx, 3xx, or 4xx HTTP response
   [RFC2616].  This method uses the Link header defined in
   [I-D.nottingham-http-link-header] and requires the retrieval of a
   resource representation header.

   For example:

     Link: <http://example.com/resource;about>; rel="describedby";
               type="application/powder+xml"

   A client trying to obtain the location of the resource's descriptor
   using this method SHALL:

   1.  Make an HTTP (or HTTPS as required) GET or HEAD request to the
       resource URI to obtain a valid response header.  If the HTTP
       response carries a status code other than successful (2xx),
       redirection (3xx), or client error (4xx), the method fails.

   2.  Parse the HTTP response header and look for Link headers with a
       "rel" parameter value containing the "describedby" relation.

   3.  Narrow down the selection if more than one "describedby" link is
       found, following the application-specific criteria.  The
       descriptor location is obtained from the "<>" enclosed URI-
       reference in the selected Link header.

   Link headers MAY include other relation types together with
   "describedby" in a single "rel" parameter (for example
   'rel="describedby copyright"').  Clients MUST properly process such
   multiple-relation "rel" parameters as defined by
   [I-D.nottingham-http-link-header].
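
   Step 2 of this method can be approximated with the Python sketch
   below, which takes Link header field values that have already been
   extracted (and unfolded) from the response.  It is a simplified
   reading of the Link header syntax and is not a complete
   implementation: backslash-escaped quotes inside quoted strings are
   not handled, and parsing stops at the first malformed link-value.
   All function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# One link-value: <URI-reference> plus parameters, ended by "," or the end.
_LINK_VALUE = re.compile(
    r'\s*<([^>]*)>'
    r'((?:\s*;\s*[^\s;,="]+(?:\s*=\s*(?:"[^"]*"|[^\s;,"]*))?)*)'
    r'\s*(?:,|$)'
)
# One parameter: ;name, ;name=token or ;name="quoted string"
_PARAM = re.compile(r'\s*;\s*([^\s;,="]+)(?:\s*=\s*(?:"([^"]*)"|([^\s;,"]*)))?')


def parse_link_header(value: str) -> List[Dict[str, str]]:
    """Split one Link header field value into its link-values."""
    links = []
    position = 0
    while position < len(value):
        match = _LINK_VALUE.match(value, position)
        if match is None:
            break  # stop at the first malformed link-value
        link = {"href": match.group(1).strip()}
        for param in _PARAM.finditer(match.group(2)):
            quoted, token = param.group(2), param.group(3)
            # The first occurrence of a parameter wins.
            link.setdefault(
                param.group(1).lower(),
                quoted if quoted is not None else (token or ""),
            )
        links.append(link)
        position = match.end()
    return links


def describedby_links(header_values: Iterable[str]) -> List[Dict[str, str]]:
    """Keep links whose "rel" parameter contains the describedby relation."""
    return [
        link
        for value in header_values
        for link in parse_link_header(value)
        if "describedby" in link.get("rel", "").lower().split()
    ]
```

   The "rel" value is split on whitespace so that a value such as
   "describedby copyright" still matches.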

4.4.  The Host Metadata Document

   The host metadata document method is available for any resource
   identified by a URI whose authority supports the host-meta document
   defined in [I-D.nottingham-site-meta].  This method does not require
   obtaining any representation of the resource, and operates solely
   using the resource URI.

   The link relation between the resource URI and the descriptor URI is
   obtained by using a template contained in the host-meta document.  By
   applying the host-wide template to an individual resource URI, a
   resource-specific link is produced which can be used to indicate the
   location of the descriptor document for that resource, bypassing the
   need to access or provide a representation for it.

   For example (line breaks are for formatting only, and are not allowed
   in the document):

     Link-Pattern: <{uri};about>; rel="describedby";
                      type="application/powder+xml"

   A client trying to obtain the location of the resource's descriptor
   using this method SHALL:

   1.  Retrieve the host-meta document for the resource URI's authority
       as defined by [I-D.nottingham-site-meta] section 4.  If the
       request fails to retrieve a valid host-meta document, the method
       fails.

   2.  Parse the host-meta document and look for Link-Pattern fields
       with a "rel" parameter value containing the "describedby"
       relation.

   3.  Narrow down the selection if more than one "describedby" link is
       found, following the application-specific criteria.  The
       descriptor location is constructed by applying the template
       obtained from the selected Link-Pattern field to the resource URI
       as described by Section 6.1.

   Link-Pattern MAY include other relation types together with
   "describedby" in a single "rel" parameter (for example
   'rel="describedby copyright"').  Clients MUST properly process such
   multiple-relation "rel" parameters as defined by Section 6.
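
   Step 2 of this method could be sketched in Python as shown below.
   The sketch assumes that the host-meta document has already been
   retrieved and that it consists of header-style "Name: value" lines
   with case-insensitive field names; the authoritative format is the
   one defined by [I-D.nottingham-site-meta].  The function name and the
   returned dictionary shape are hypothetical.

```python
import re
from typing import Dict, List

# <template> plus link parameters, ended by "," or the end of the value.
_PATTERN_VALUE = re.compile(
    r'\s*<([^>]*)>'
    r'((?:\s*;\s*[^\s;,="]+(?:\s*=\s*(?:"[^"]*"|[^\s;,"]*))?)*)'
    r'\s*(?:,|$)'
)
_PARAM = re.compile(r'\s*;\s*([^\s;,="]+)(?:\s*=\s*(?:"([^"]*)"|([^\s;,"]*)))?')


def link_patterns(host_meta_text: str, rel: str = "describedby") -> List[Dict[str, str]]:
    """Return Link-Pattern entries whose "rel" contains the given relation."""
    found = []
    for line in host_meta_text.splitlines():
        name, separator, value = line.partition(":")
        if not separator or name.strip().lower() != "link-pattern":
            continue
        position = 0
        while position < len(value):
            match = _PATTERN_VALUE.match(value, position)
            if match is None:
                break  # ignore the rest of a malformed field
            entry: Dict[str, str] = {}
            for param in _PARAM.finditer(match.group(2)):
                quoted, token = param.group(2), param.group(3)
                entry.setdefault(
                    param.group(1).lower(),
                    quoted if quoted is not None else (token or ""),
                )
            entry["template"] = match.group(1)
            if rel in entry.get("rel", "").lower().split():
                found.append(entry)
            position = match.end()
    return found
```

   Each returned "template" value still has to be applied to the
   resource URI as described in Section 6.1 before it identifies a
   descriptor.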


5.  Obtaining Resource Descriptor

   Once the desired descriptor URI has been obtained, the descriptor
   document is retrieved.  If the descriptor URI scheme is "http" or
   "https", the document is obtained via an HTTP (or HTTPS as required)
   GET request to the identified URI.  The client MUST obey HTTP
   redirections (3xx), and the descriptor document is considered valid
   only if retrieved with a successful HTTP response status (2xx).
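
   A minimal Python sketch of this retrieval step, using only the
   standard library, is shown below.  The function name, the timeout
   value, and the choice to return None on any failure are assumptions
   of the sketch.  It relies on urllib following HTTP redirections
   automatically and reporting 4xx and 5xx responses as errors.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlsplit


def fetch_descriptor(descriptor_uri: str, timeout: float = 10.0) -> Optional[bytes]:
    """Return the descriptor body, or None if no valid 2xx response is obtained."""
    scheme = urlsplit(descriptor_uri).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError("this sketch only handles http and https URIs")
    try:
        # urlopen follows HTTP redirects itself and raises HTTPError
        # (an OSError subclass) for 4xx and 5xx responses.
        with urllib.request.urlopen(descriptor_uri, timeout=timeout) as response:
            if 200 <= response.status < 300:
                return response.read()
    except OSError:
        return None
    return None
```

   Descriptor URIs with other schemes are outside what this sketch
   covers and would need their own retrieval logic.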


6.  The Link-Pattern host-meta Field

   The Link host-meta field [I-D.nottingham-site-meta] conveys a link
   relation between all resource URIs under the host-meta authority and
   a common target URI.  However, there are cases in which relations of
   different resources with the same authority do not share the same
   target URI, but do follow a common pattern in how the target URI is
   constructed.

   For example, a news site with multiple authors can provide
   information about each article's author by appending a suffix (such
   as ";by") to the URI of each article.  Each article has a unique
   author, but all share the same pattern of where that information is
   located.  The same information can be provided using an HTTP Link
   header or HTML <LINK> element, but less efficiently than with a
   single pattern that covers every article:

     Link-Pattern: <{uri};by>; rel="author"

   The Link-Pattern host-meta field uses a slightly modified syntax of
   the HTTP Link header [I-D.nottingham-http-link-header] to convey
   relations whose context is individual resources with the same
   authority as the host-meta document, and whose target is constructed
   by applying a template to the context URI.  The field is not specific
   to any relation type and MAY be used to express any relations
   supported by the Link header [I-D.nottingham-http-link-header].

   The Link-Pattern host-meta field differs from the HTTP Link header in
   the following respects:

   o  The "<>" enclosed token is not a valid URI, but instead contains a
      template as defined in Section 6.1.

   o  Its context URI is defined as the individual resource URI used as
      input to the template.

   o  If the resulting target URI expressed by the template is relative,
      its base URI is the root resource of the authority.


     Link-Pattern   = "Link-Pattern" ":" #pattern-value

     pattern-value  = "<" template ">" *( ";" link-param )

     template       = *( uri-char | "{" [ "%" ] var-name "}" )

     uri-char       = ( reserved | unreserved )

     var-name       = "scheme" | "authority" | "path"
                    | "query"  | "fragment"  | "userinfo"
                    | "host"   | "port"      | "uri"

   [[ should this spec define a filter/map parameter that will allow
   applying link patterns to subsets of the host-meta scope?  This can
   use a regular expression match or something similar to robots.txt.
   If the spec will end up not directly supporting this feature, I will
   add a note suggesting that such a feature could be defined elsewhere
   as an extension. ]]
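
   The "template" rule of the Link-Pattern grammar in this section can
   be checked mechanically.  The following Python sketch translates the
   rule into a regular expression, taking the "reserved" and
   "unreserved" character sets from [RFC3986]; the function name is
   hypothetical and the sketch validates syntax only.

```python
import re

VAR_NAMES = (
    "scheme", "authority", "path", "query", "fragment",
    "userinfo", "host", "port", "uri",
)
# uri-char = reserved | unreserved (character sets from RFC 3986)
_URI_CHAR = r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]"
# "{" [ "%" ] var-name "}"
_VARIABLE = r"\{%?(?:" + "|".join(VAR_NAMES) + r")\}"
_TEMPLATE = re.compile(r"(?:" + _URI_CHAR + r"|" + _VARIABLE + r")*")


def is_valid_template(template: str) -> bool:
    """Return True if the string matches the "template" ABNF rule."""
    return _TEMPLATE.fullmatch(template) is not None
```

   Because braces and double quotes are not part of uri-char, a template
   with an unknown variable name or a stray quote character is rejected.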

6.1.  Template Syntax

   The template syntax provides a simple format for URI transformation.
   A template is a string containing brace-enclosed ("{}") variable
   names marking the parts of the string that are to be substituted by
   the variable values.  A template is transformed into a URI by
   substituting the variables with their calculated value.  If a
   variable name is prefixed by "%", any character in the variable value
   other than unreserved MUST be percent-encoded per [RFC3986].

   To construct a URI using a template, the input URI is parsed into its
   URI components and each component value assigned to a variable name.
   The template variable substitution is based on the URI vocabulary
   defined by [RFC3986] section 3 and includes: "scheme", "authority",
   "path", "query", "fragment", "userinfo", "host", and "port".  In
   addition, it defines the "uri" variable as the entire input URI
   excluding the fragment component and the "#" fragment separator.

     foo://william@example.com:8080/over/there?name=ferret#nose
     \_/   \______________________/\_________/ \_________/ \__/
      |              |                  |           |        |
     scheme      authority             path       query   fragment

     foo://william@example.com:8080/over/there?name=ferret#nose
           \_____/ \_________/ \__/
              |         |        |
          userinfo     host     port

     foo://william@example.com:8080/over/there?name=ferret#nose
     \___________________________________________________/
                              |
                             uri

   For example, given the input URI "http://example.com/r/1?f=xml#top",
   each of the following templates will produce the associated output
   URI:

     http://example.org?q={%uri} -->
     http://example.org?q=http%3A%2F%2Fexample.com%2Fr%2F1%3Ff%3Dxml

     http://meta.{host}:8080{path}?{query} -->
     http://meta.example.com:8080/r/1?f=xml

     https://{authority}/v1{path}#{fragment} -->
     https://example.com/v1/r/1#top
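
   The substitution described in this section can be sketched in Python
   as follows.  The input URI is split with the generic expression from
   [RFC3986] Appendix B, and the authority is further split into
   userinfo, host, and port.  Treating a missing component as an empty
   string is an assumption of this sketch, as are the function names.
   Resolving a relative result against the root of the authority is not
   covered.

```python
import re
from typing import Dict
from urllib.parse import quote

# Generic URI parsing expression from RFC 3986, Appendix B.
_URI = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?", re.DOTALL
)
# authority = [ userinfo "@" ] host [ ":" port ]
_AUTHORITY = re.compile(r"^(?:([^@]*)@)?(\[[^\]]*\]|[^:]*)(?::(.*))?$", re.DOTALL)
_VARIABLE = re.compile(
    r"\{(%?)(scheme|authority|path|query|fragment|userinfo|host|port|uri)\}"
)


def uri_variables(uri: str) -> Dict[str, str]:
    """Map each template variable name to its value for the input URI."""
    parts = _URI.match(uri)
    authority = parts.group(4) or ""
    userinfo, host, port = _AUTHORITY.match(authority).groups()
    return {
        "scheme": parts.group(2) or "",
        "authority": authority,
        "path": parts.group(5) or "",
        "query": parts.group(7) or "",
        "fragment": parts.group(9) or "",
        "userinfo": userinfo or "",
        "host": host or "",
        "port": port or "",
        # The entire input URI without the fragment and its "#" separator.
        "uri": uri.split("#", 1)[0],
    }


def expand_template(template: str, resource_uri: str) -> str:
    """Substitute every {name} or {%name} in the template."""
    values = uri_variables(resource_uri)

    def substitute(match):
        value = values[match.group(2)]
        # A "%" prefix percent-encodes everything except unreserved characters.
        return quote(value, safe="") if match.group(1) else value

    return _VARIABLE.sub(substitute, template)
```

   Applied to the input URI and the three templates of the example in
   this section, expand_template() is intended to reproduce the output
   URIs shown there.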


7.  Security Considerations

   The methods used to perform discovery are not secure, private or
   integrity-guaranteed, and due caution should be exercised when using
   them.  Applications that perform discovery should consider the attack
   vectors opened by automatically following, trusting, or otherwise
   using links gathered from <LINK> elements, HTTP Link headers, or
   host-meta documents.


8.  IANA Considerations

8.1.  The Link-Pattern host-meta Field

   This specification registers the Link-Pattern host-meta field in the
   host-meta Field Registry [I-D.nottingham-site-meta].

   Field Name:  Link-Pattern

   Change controller:  IETF

   Specification document(s):  [[ this document ]]

   Related information:  [I-D.nottingham-http-link-header]

8.2.  The describedby Relation Type

   [[ this section will be removed if the "describedby" relation type is
   registered by the time it is published ]]

   This specification registers the "describedby" relation type in the
   Link Relation Type Registry [I-D.nottingham-http-link-header].

   o  Relation Name: describedby

   o  Description: The relationship A "describedby" B asserts that
      resource B provides a description of resource A. There are no
      constraints on the format or representation of either A or B,
      neither are there any further constraints on either resource.

   o  Documentation: [POWDER]


Appendix A.  Descriptor Discovery vs. Service Discovery

   Descriptor discovery provides a process for obtaining information
   about a resource identified with a URI.  It allows servers to
   describe their resources in a machine-readable format, enabling
   automatic interoperability by user-agents and resource consuming
   applications.  Discovery enables applications to utilize a wide range
   of web services and resources across multiple providers without the
   need to know about their capabilities in advance, reducing the need
   for manual configuration and resource-specific software.

   When discussing discovery, it is important to differentiate between
   descriptor discovery and service discovery.  Both types attempt to
   associate capabilities with resources, but they approach it from
   opposite ends.

   Service discovery centers on identifying the location of qualified
   resources, typically finding an endpoint capable of certain protocols
   and capabilities.  In contrast, descriptor discovery begins with a
   resource, trying to find which capabilities it supports.

   A simple way to distinguish between the two types of discovery is to
   define the questions they are each trying to answer:

   Descriptor-Discovery:  Given a resource, what are its attributes:
      capabilities, characteristics, and relationships to other
      resources?

   Service-Discovery:  Given a set of attributes, which available
      resources match the desired set and what is their location?

   While this memo deals exclusively with descriptor discovery, it is
   important to note that the two discovery types are closely related
   and are usually used in tandem.  In fact, a typical use case will
   switch between service discovery and descriptor discovery multiple
   times in a single workflow, and can start with either one.

   One reason for this dependency between the two discovery types is
   that resource descriptors usually contain not only a list of
   capabilities, but also relationships to other resources.  Since those
   relationships are usually typed, the process in which an application
   chooses which links to use is in fact service discovery.

   Applications use descriptor discovery to obtain the list of links,
   and service discovery to choose the relevant links.  In another
   common example, the application uses service discovery to find a
   resource with a given capability, then uses descriptor discovery to
   find out what other capabilities it supports.
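
   The difference in direction between the two discovery types can be
   shown with a toy Python sketch.  The in-memory catalog, the resource
   URIs, and the capability names are all hypothetical; a real client
   would obtain this information from descriptors located through the
   discovery process rather than from a local table.

```python
from typing import Dict, List, Set

# Hypothetical catalog mapping resource URIs to the capabilities that
# their descriptors list.
CATALOG: Dict[str, Set[str]] = {
    "http://example.com/alice": {"address-book", "free-busy"},
    "http://example.com/bob": {"free-busy"},
    "http://example.org/feed": {"atom-authoring"},
}


def descriptor_discovery(resource: str) -> Set[str]:
    """Given a resource, what are its attributes?"""
    return CATALOG.get(resource, set())


def service_discovery(required: Set[str]) -> List[str]:
    """Given a set of attributes, which resources match?"""
    return sorted(uri for uri, caps in CATALOG.items() if required <= caps)
```

   The first function starts from a resource and returns attributes; the
   second starts from attributes and returns resources.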


Appendix B.  Methods Suitability Analysis

   Due to the wide range of use cases requiring resource descriptors,
   and the desire to reuse as much as possible, no single solution has
   been found to sufficiently cover the requirements for linking between
   the resource URI and the descriptor URI.  The following analysis
   attempts to list all the methods proposed for addressing descriptor
   discovery.  It is included here to provide background information as
   to why certain methods have been selected while others were rejected
   from the discovery process.  It has been updated to match the terms
   used in this memo and its structure.

Appendix B.1.  Requirements

   Getting from a resource URI to its descriptor document can be
   implemented in many ways.  The problem is that none of the current
   methods address all of the requirements presented by the common use
   cases.  The requirements are simple, but the more we try to address,
   the less elegant and accessible the process becomes.  While working
   on the now-defunct XRDS-Simple specification [XRDS-Simple] and
   talking to companies and individuals about it, the following
   requirements emerged for any proposed process:

   Self Declaration:

         Allow resources to declare the availability of descriptor
         information and its location.  When a resource is accessed, it
         needs to have a way to communicate to the client that it
         supports the discovery protocol and to indicate the location
         of such a descriptor.

         This is useful when the client is able to interact, or is
         already interacting, with the resource but can enhance its
         interaction with additional information.  For example,
         accessing a blog page is enhanced if the client can learn that
         the page was generated from an Atom feed or Atom entry and that
         the feed supports Atom authoring.

   Direct Descriptor Access:

         Enable direct retrieval of the resource descriptor without
         interacting with the resource itself.  Before a resource is
         accessed, the client should have a way to obtain the resource
         descriptor without accessing the resource.  This is important
         for two reasons.

         First, accessing an unknown resource may have undesirable
         consequences.  After all, the information contained in the
         descriptor is supposed to inform the client how to interact
         with the resource.  Second, it is more efficient: it removes
         the need to first obtain the resource in order to get its
         descriptor (reducing HTTP round-trips, network bandwidth, and
         application latency).

   Web Architecture Compliant:

         Work with well-established web infrastructure.  This may sound
         obvious but it is in fact the most complex requirement.
         Deploying new extensions to the HTTP protocol is a complicated
         endeavor.  Besides getting applications to support a new
         header, method, or content negotiation, existing caches and
         proxies must be enhanced to properly handle these requests, and
         they must not fail to perform their normal duties without such
         enhancements.

         For example, a new content negotiation method may cause an
         existing cache to serve the wrong data to a non-discovery
         client due to its inability to distinguish the metadata request
         from the resource representation request.

   Scale and Technology Agnostic:

         Support large and small web providers regardless of the size of
         operations and deployment.  Any solution must work for a small
         hosted web site as well as the world's largest search engine.
         It must be flexible enough to allow developers with restricted
         access to the full HTTP protocol (such as limited access to
         request or response headers) to be able to both provide and
         consume resource descriptors.  Any solution should also support
         caching as much as possible and allow reuse of source code and
         data.

   Extensible:

         Accommodate future enhancements and unknown descriptor formats.
         It should support the existing set of descriptor formats such
         as XRD and POWDER, as well as new descriptor relationships that
         might emerge in the future.  In addition, the solution should
         not depend on the descriptor format itself and should work
         equally well with any document format; it should aim to keep
         the road and destination separate.

Appendix B.2.  Analysis

   The following is a list of proposed and implemented methods trying to
   address descriptor discovery.  Each method is reviewed for its
   compliance with the requirements identified previously.  The [-],
   [+], or [+-] symbols next to each requirement indicate how well the
   method complies with the requirement.

Appendix B.2.1.  HTTP Response Header

   When a resource representation is retrieved using an HTTP GET
   request, the server includes in the response a header pointing to the
   location of the descriptor document.  For example, POWDER uses the
   "Link" response header to create an association between the resource
   and its descriptor.  XRDS [XRDS] (based on the Yadis protocol
   [Yadis]) uses a similar approach, but since the Link header was not
   available when Yadis was first drafted, it defines a custom header
   X-XRDS-Location which serves a similar but less generic purpose.

   [+] Self Declaration -  using the Link header, any resource can point
      to its descriptor documents.

   [-] Direct Descriptor Access -  the header is only accessible when
      requesting the resource itself via an HTTP GET request.  While
      HTTP GET is meant to be a safe operation, it is still possible for
      some resources to have side effects.

   [+] Web Architecture Compliant -  uses the Link header which is an
      IETF Internet Standard [[ currently a standards-track draft ]], and
      is consistent with HTTP protocol design.

   [-] Scale and Technology Agnostic -  since discovery accounts for a
      small percentage of resource requests, the extra Link header is
      wasteful.  For some hosted servers, access to HTTP headers is
      limited and will prevent implementation.

   [+] Extensible -  the Link header provides built-in extensibility by
      allowing new link relations, mime-types, and other extensions.

   Minimum roundtrips to retrieve the resource descriptor: 2
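
   The client side of this method can be sketched as follows.  This is
   a minimal illustration in Python, not part of any specification: the
   function name is hypothetical, and the parsing is deliberately
   simplified (it assumes no commas or semicolons appear inside quoted
   parameter values).  It extracts the descriptor URIs from a Link
   header value by looking for the "describedby" relation type.

```python
import re

# One link-value: <URI> followed by its ;-separated parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One parameter: name=token or name="quoted string".
_PARAM = re.compile(r';\s*([A-Za-z0-9*_-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def find_descriptor_links(link_header, rel="describedby"):
    """Return the URIs in a Link header value whose rel includes `rel`."""
    uris = []
    for uri, raw_params in _LINK_VALUE.findall(link_header):
        params = {}
        for name, quoted, token in _PARAM.findall(raw_params):
            # Keep the first occurrence of each parameter.
            params.setdefault(name.lower(), quoted or token)
        # A rel value may hold several space-separated relation types.
        if rel in params.get("rel", "").lower().split():
            uris.append(uri)
    return uris
```

   The second round trip is then an ordinary request for the returned
   URI.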

Appendix B.2.2.  HTTP Response Header Via HEAD

   Same as the HTTP Response Header method but used with an HTTP HEAD
   request.  The idea of using the HEAD method is to solve the wasteful
   overhead of including the Link header in every reply.  By limiting
   the appearance of the Link header only to HEAD responses, typical GET
   requests are not encumbered by the extra bytes.

   [+] Self Declaration -  Same as the HTTP Response Header method.

   [-] Direct Descriptor Access -  Same as the HTTP Response Header
      method.

   [-] Web Architecture Compliant -  HTTP HEAD should return the exact
      same response as HTTP GET with the sole exception that the
      response body is omitted.  By adding headers only to the HEAD
      response, this solution violates the HTTP protocol and might not
      work properly with proxies, as they can answer a HEAD request
      with the headers of a cached GET response.

   [+] Scale and Technology Agnostic -  solves the wasted bandwidth
      associated with the HTTP Response Header method, but still suffers
      from the limitation imposed by requiring access to HTTP headers.

   [+] Extensible -  Same as the HTTP Response Header method.

   Minimum roundtrips to retrieve the resource descriptor: 2
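
   The proxy problem noted above can be illustrated with a toy
   simulation.  Everything in this Python sketch is hypothetical (the
   origin function, the cache class, and the URIs); it models only one
   behavior, an intermediary that reuses stored GET headers to answer a
   later HEAD request.

```python
DESCRIPTOR_LINK = '<http://example.com/resource;about>; rel="describedby"'


def origin_headers(method):
    """Hypothetical origin that adds the Link header to HEAD responses only."""
    headers = {"Content-Type": "text/html"}
    if method == "HEAD":
        headers["Link"] = DESCRIPTOR_LINK
    return headers


class HeaderCache:
    """Toy cache that reuses stored GET headers to answer HEAD requests."""

    def __init__(self):
        self._stored = {}

    def request(self, method, uri):
        if uri in self._stored:
            # Answered from the cache; the origin is never consulted.
            return dict(self._stored[uri])
        headers = origin_headers(method)
        if method == "GET":
            self._stored[uri] = headers
        return headers


cache = HeaderCache()
cache.request("GET", "http://example.com/resource")
via_cache = cache.request("HEAD", "http://example.com/resource")
direct = origin_headers("HEAD")
```

   In this sketch "direct" carries the Link header while "via_cache"
   does not, so a discovery client behind such a cache never learns the
   descriptor location.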

Appendix B.2.3.  HTTP Content Negotiation

   Using the HTTP Accept request header or Transparent Content
   Negotiation as defined in [RFC2295], the client informs the server it
   is interested in the descriptor and not the resource itself, to which
   the server responds with the descriptor document or its location.  In
   Yadis, the client sends an HTTP GET (or HEAD) request to the resource
   URI with an Accept header naming the application/xrds+xml content
   type.  This informs the server of the client's discovery interest,
   and the server may in turn reply with the descriptor document itself,
   redirect to it, or return its location via the X-XRDS-Location
   response header.

   [-] Self Declaration -  does not address this, as it focuses on the
      client declaring its intentions.

   [+] Direct Descriptor Access -  provides a simple method for directly
      requesting the descriptor document.

   [-] Web Architecture Compliant -  while it can be argued that the
      descriptor can be considered another representation of the
      resource, it is very much external to it.  Using the Accept header
      to request a separate resource (as opposed to a different
      representation of the same resource) violates web architecture.
      It also prevents using the discovery content-type as a valid
      (self-standing) web resource having its own descriptor.

   [-] Scale and Technology Agnostic -  requires access to HTTP request
      and response headers, as well as the registration of multiple
      handlers for the same resource URI based on the Accept header.  In
      addition, improper use or implementation of the Vary header in
      conjunction with the Accept header will cause caches to serve the
      descriptor document instead of the resource itself - a great
      concern to large providers with frequently visited front-pages.

   [-] Extensible -  applies an implicit relation type to the descriptor
      mime-type, limiting descriptor formats to a single purpose.  It
      also prevents existing mime-types from being used as descriptor
      formats.

   Minimum roundtrips to retrieve the resource descriptor: 1
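
   The three Yadis outcomes described above (descriptor document,
   redirect, or X-XRDS-Location header) can be told apart by a client
   roughly as follows.  This Python sketch is an illustration only: the
   function names, the order in which the cases are checked, and the
   set of redirect status codes are assumptions, not taken from the
   Yadis specification.

```python
XRDS_TYPE = "application/xrds+xml"


def discovery_request_headers():
    """Headers for a Yadis-style GET (or HEAD) on the resource URI."""
    return {"Accept": XRDS_TYPE}


def interpret_response(status, headers):
    """Classify a response as (kind, uri).

    kind is "document", "redirect", "location", or "none"; uri is the
    descriptor location when one is given, otherwise None.
    """
    h = {name.lower(): value for name, value in headers.items()}
    media_type = h.get("content-type", "").split(";")[0].strip().lower()
    if status in (301, 302, 303, 307) and "location" in h:
        return ("redirect", h["location"])
    if status == 200 and media_type == XRDS_TYPE:
        # The response body is the descriptor document itself.
        return ("document", None)
    if "x-xrds-location" in h:
        return ("location", h["x-xrds-location"])
    return ("none", None)
```

   Only the "document" case achieves the single round trip counted
   above; the other two require a further request.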

Appendix B.2.4.  HTTP Header Negotiation

   Similar to the HTTP Content Negotiation method, this solution uses a
   custom HTTP request header to inform the server of the client's
   discovery intentions.  The server responds by serving the same
   resource representation (via an HTTP GET or HEAD request) with the
   relevant Link headers.  It attempts to solve the HTTP Response Header
   waste issue by allowing the client to explicitly request the
   inclusion of Link headers.  One such header could be called
   "Request-links", informing the server that the client would like it
   to include certain Link headers of a given "rel" type in its reply.

   [+] Self Declaration -  same as HTTP Response Header with the option
      of selective inclusion.

   [-] Direct Descriptor Access -  does not address.

   [-] Web Architecture Compliant -  HTTP does not include any mechanism
      for header negotiation and any custom solution will break existing
      caches.

   [+-] Scale and Technology Agnostic -  requires advanced access to
      HTTP headers on both the client and server sides, but solves the
      bandwidth waste issue of the HTTP Response Header method.

   [+] Extensible -  builds on top of Link header extensibility.

   Minimum roundtrips to retrieve the resource descriptor: 2
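
   The server side of this idea can be sketched as follows.  The
   "Request-links" header is only named in this analysis and has no
   defined syntax; the space-separated list of "rel" types used in this
   Python sketch, like the function name, is purely an assumption.

```python
def response_headers(request_headers, links):
    """Build response headers, adding Link only for requested rel types.

    `links` maps a rel type to a descriptor URI.  The value syntax of
    "Request-links" (space-separated rel types) is an assumption.
    """
    requested = set()
    for name, value in request_headers.items():
        if name.lower() == "request-links":
            requested = set(value.lower().split())
    headers = [("Content-Type", "text/html")]
    for rel, uri in links.items():
        if rel in requested:
            headers.append(("Link", '<%s>; rel="%s"' % (uri, rel)))
    return headers
```

   A request without the custom header receives no Link headers, which
   is what removes the per-response overhead.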

Appendix B.2.5.  <Link> Element

   Embeds the location of the descriptor document within the resource
   representation by leveraging the HTML <Link> header element (as
   opposed to the HTTP header).  Applies to HTML resource
   representations or similar markup-based formats with support for
   "Link"-like elements such as Atom.  POWDER uses the <Link> element in
   this manner, while XRDS uses the HTML <meta> element with an
   "http-equiv" attribute equal to X-XRDS-Location (to create an
   embedded version of the X-XRDS-Location custom header).

   [+] Self Declaration -  similar to HTTP Response Header method but
      limited to HTML resources.

   [-] Direct Descriptor Access -  the method requires fetching the
      entire resource representation in order to obtain the descriptor
      location.  In addition, it requires changing the resource HTML
      representation which makes discovery an intrusive process.

   [+] Web Architecture Compliant -  uses the <Link> element as
      designed.

   [+] Scale and Technology Agnostic -  while this solution requires
      direct retrieval of the resource and manipulation of its content,
      it is extremely accessible on many platforms.

   [-] Extensible -  extensibility is restricted to HTML representations
      or similar markup formats with support for a similar element.

   Minimum roundtrips to retrieve the resource descriptor: 2
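
   Extracting the descriptor location from a retrieved HTML
   representation can be sketched as follows.  This Python illustration
   uses the standard library HTML parser; the class and function names
   are hypothetical, and for brevity it does not check that the
   elements appear inside the document head.

```python
from html.parser import HTMLParser


class DescriptorLinkFinder(HTMLParser):
    """Collect descriptor locations declared in an HTML document."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        a = {name: (value or "") for name, value in attrs}
        if tag == "link" and "describedby" in a.get("rel", "").lower().split():
            self.locations.append(a.get("href", ""))
        elif tag == "meta" and a.get("http-equiv", "").lower() == "x-xrds-location":
            self.locations.append(a.get("content", ""))


def find_descriptor_locations(html_text):
    """Return descriptor locations in document order."""
    finder = DescriptorLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.locations
```

   Note that the whole representation has to be fetched and parsed
   before the descriptor location is known, which is the cost recorded
   under Direct Descriptor Access above.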




Appendix B.2.6.  HTTP OPTIONS Method

   The HTTP OPTIONS method is used to interact with the HTTP server with
   regard to its capabilities and communication-related information
   about its resources.  The OPTIONS method, together with an optional
   request header, can be used to request both the descriptor location
   and descriptor content itself.

   [-] Self Declaration -  does not address.

   [+] Direct Descriptor Access -  provides a clean mechanism for
      requesting descriptor information about a resource without
      interacting with it.

   [+] Web Architecture Compliant -  uses an existing HTTP feature.

   [-] Scale and Technology Agnostic -  requires client and server
      access to the OPTIONS HTTP method.  It also does not support
      caching, which makes this solution inefficient.

   [+] Extensible -  built into the OPTIONS method.

   Minimum roundtrips to retrieve the resource descriptor: 1
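
   A client request for this method could be constructed as follows.
   This Python sketch only builds the request and does not send it.
   The function name is hypothetical, and since this analysis does not
   say which optional request header would be used, the Accept header
   in the usage note is an assumption.

```python
import urllib.request


def build_options_request(resource_uri, extra_headers=None):
    """Build (but do not send) an OPTIONS request for a resource URI."""
    request = urllib.request.Request(resource_uri, method="OPTIONS")
    for name, value in (extra_headers or {}).items():
        request.add_header(name, value)
    return request
```

   For example, build_options_request("http://example.com/resource",
   {"Accept": "application/xrds+xml"}) yields a request that can be
   passed to urllib.request.urlopen().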

Appendix B.2.7.  WebDAV PROPFIND Method

   Similar to the HTTP OPTIONS method, the WebDAV PROPFIND method
   defined in [RFC4918] can be used to request resource-specific
   properties, one of which can hold the location of the descriptor
   document.  PROPFIND, unlike OPTIONS, cannot return the descriptor
   itself, unless it is returned in the required PROPFIND schema (a
   multi-status XML element).  Other alternatives include URIQA [URIQA],
   an HTTP extension which defines a method called MGET, and ARK
   (Archival Resource Key) [ARK] - a method similar to PROPFIND that
   allows the retrieval of resource attributes using keys (which
   describe the resource).

   [-] Self Declaration -  does not address.

   [+-] Direct Descriptor Access -  does not require interaction with
      the resource, but does require at least two requests to get the
      descriptor (get location, get document).

   [+] Web Architecture Compliant -  uses an HTTP extension with less
      support than core HTTP, but is still based on published standards.

   [-] Scale and Technology Agnostic -  same as the HTTP OPTIONS Method.

   [+-] Extensible -  uses extensible protocols but at the same time
      depends on solutions that have already gone beyond the standard
      HTTP protocol, which makes further extensions more complex and
      unsupported.

   Minimum roundtrips to retrieve the resource descriptor: 2
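
   The two halves of a PROPFIND-based lookup can be sketched as
   follows.  No property for descriptor locations is defined by
   [RFC4918]; the "descriptor-location" property and its namespace in
   this Python sketch are hypothetical, as are the function names.  For
   brevity the sketch does not inspect the per-property status codes of
   the multi-status response.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
EX = "http://example.com/ns"  # hypothetical property namespace


def propfind_body():
    """Build a PROPFIND request body asking for one property."""
    root = ET.Element("{%s}propfind" % DAV)
    prop = ET.SubElement(root, "{%s}prop" % DAV)
    ET.SubElement(prop, "{%s}descriptor-location" % EX)
    return ET.tostring(root, encoding="unicode")


def descriptor_location(multistatus_xml):
    """Extract the property value from a multi-status body, if present."""
    root = ET.fromstring(multistatus_xml)
    element = root.find(".//{%s}descriptor-location" % EX)
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()
```

   The value returned by descriptor_location() is then fetched with a
   second request, which accounts for the two round trips counted
   above.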

Appendix B.2.8.  Custom HTTP Method

   Similar to the HTTP OPTIONS Method, a new method can be defined (such
   as DISCOVER) to return (or redirect to) the descriptor document.  The
   new method can allow caching.

   [-] Self Declaration -  does not address.

   [+] Direct Descriptor Access -  same as the HTTP OPTIONS Method.

   [-] Web Architecture Compliant -  depends heavily on extending every
      platform to support the extension.  Unlikely to be supported by
      existing proxy services and caches.

   [-] Scale and Technology Agnostic -  same as HTTP OPTIONS Method with
      the additional burden on smaller sites requiring access to the new
      protocol.

   [+] Extensible -  new protocol that can extend as needed.

   Minimum roundtrips to retrieve the resource descriptor: 1

Appendix B.2.9.  Static Resource URI Transformation

   Instead of using HTTP facilities to access the descriptor location,
   this method defines a template to transform any resource URI to the
   descriptor document URI.  This can be done by adding a prefix or
   suffix to the resource URI, which turns it into a new resource URI.
   The new URI points to the descriptor document.  For example, to fetch
   the descriptor document for http://example.com/resource, the client
   makes an HTTP GET request to http://example.com/resource;about using
   a static template that adds the ";about" suffix.

   [-] Self Declaration -  does not address.

   [+] Direct Descriptor Access -  creates a unique URI for the
      descriptor document.

   [+-] Web Architecture Compliant -  uses basic HTTP facilities but
      intrudes on the domain authority namespace as it defines a static
      template for URI transformation that is not likely to be
      compatible with many existing URI naming conventions.

   [+-] Scale and Technology Agnostic -  depends on the static mapping
      chosen.  Some hosted environments will have a problem gaining
      access to the mapped URI based on the URI format chosen.

   [-] Extensible -  provides a very specific and limited method to map
      between resources and their descriptor, since each relation type
      must mint its own static template.

   Minimum roundtrips to retrieve the resource descriptor: 1
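
   The ";about" example above amounts to a one-line transformation.
   The following Python sketch is an illustration only: the function
   name is hypothetical, and appending the suffix to the path component
   (keeping any query and dropping any fragment) is an assumption about
   how such a template would treat those components.

```python
from urllib.parse import urlsplit, urlunsplit


def static_descriptor_uri(resource_uri, suffix=";about"):
    """Map a resource URI to its descriptor URI using a fixed suffix."""
    parts = urlsplit(resource_uri)
    # Append the suffix to the path; keep the query, drop the fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + suffix, parts.query, "")
    )
```

   Because the suffix is fixed for every site, a resource whose own URI
   already ends in ";about" cannot be distinguished from a descriptor,
   which is one form of the namespace intrusion noted above.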

Appendix B.2.10.  Dynamic Resource URI Transformation

   Same as the Static Resource URI Transformation method but with the
   ability for each domain authority to specify its own discovery
   transformation template.  This can be done by placing a
   configuration file at a known location (such as robots.txt) which
   contains the template needed to perform the URL mapping.  The client
   first obtains the configuration document (which may be cached using
   normal HTTP facilities), parses it, then uses that information to
   transform the resource URI and access the descriptor document.

   [+-] Self Declaration -  does not address individual resources, but
      allows entire domains to declare their support (and how to use
      it).

   [+-] Direct Descriptor Access -  once the mapping template has been
      obtained, descriptors can be accessed directly.

   [+-] Web Architecture Compliant -  uses an existing known-location
      design pattern (such as robots.txt) and standard HTTP facilities.
      The use of a known-location is not ideal and is considered a
      violation of web architecture, but can be tolerated if it serves
      as the last of its kind.  An alternative to the known-location
      approach can be using DNS to store either the location of the
      mapping or the map template itself, but DNS adds a layer of
      complexity not always available.

   [+-] Scale and Technology Agnostic -  works well at the URI authority
      level (domain) but is inefficient at the URI path level (resource
      path) and harder to implement when different paths within the same
      domain need to use different templates.  With the decreasing cost
      of custom domain and sub-domain hosting, this will not be an issue
      for most services, but it does require sharing configuration at
      the domain/sub-domain level.

   [+-] Extensible -  can be, depending on the schema used to format the
      known-location configuration document.

   Minimum roundtrips to retrieve the resource descriptor: initially 2,
   1 after caching
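
   The fetch, cache, and transform sequence described above can be
   sketched as follows.  This Python illustration makes several
   assumptions: the template syntax with a single percent-encoded
   "{uri}" variable, the class and function names, and the
   "fetch_template" callback, which stands in for retrieving and parsing
   the configuration document of an authority.

```python
from urllib.parse import quote, urlsplit


def expand_template(template, resource_uri):
    """Replace the (assumed) {uri} variable with the encoded resource URI."""
    return template.replace("{uri}", quote(resource_uri, safe=""))


class TemplateDiscovery:
    """Maps resource URIs to descriptor URIs, one template per authority."""

    def __init__(self, fetch_template):
        self._fetch_template = fetch_template  # hypothetical callback
        self._cache = {}
        self.fetches = 0  # number of configuration retrievals so far

    def descriptor_uri(self, resource_uri):
        authority = urlsplit(resource_uri).netloc
        if authority not in self._cache:
            # First contact with this authority: extra round trip.
            self._cache[authority] = self._fetch_template(authority)
            self.fetches += 1
        return expand_template(self._cache[authority], resource_uri)
```

   The first lookup for an authority costs the extra retrieval; later
   lookups for the same authority reuse the cached template, matching
   the round-trip count given above.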


Appendix C.  Acknowledgments

   With the exception of the host-meta template extension, very little
   of this memo is original work.  Many communities and individuals have
   been working on solving discovery for many years and this work is a
   direct result of their hard and dedicated efforts.

   Inspiration for this memo derived from previous work on a descriptor
   format called XRDS-Simple, which in turn derived from another
   descriptor format, XRDS.  Previous discovery workflows include Yadis
   which is currently used by the OpenID community.  While suffering
   from significant shortcomings, Yadis was a breakthrough approach to
   performing discovery using extremely restricted hosting environments,
   and this memo has strived to preserve as much of that spirit as
   possible.

   The use of Link elements and headers and the introduction of the
   "describedby" relation type in this memo is a direct result of the
   dedicated work and contribution of Phil Archer to the W3C POWDER
   specification and Jonathan Rees to the W3C review of Uniform Access
   to Information About.  The host-meta approach was first proposed by
   Mark Nottingham as an alternative to attaching links directly to
   resource representations.

   The author wishes to thank the OASIS XRI community for their
   support, encouragement, and enthusiasm for this work.  Special thanks
   go to Lisa Dusseault, Joseph Holsten, Mark Nottingham, John Panzer,
   Drummond Reed, and Jonathan Rees for their invaluable feedback.

   The author takes all responsibility for errors and omissions.


Appendix D.  Document History

   [[ to be removed by the RFC editor before publication as an RFC ]]

   -03

   o  Added protocol name LRDD (pronounced 'lard').

   o  Fixed Link-Pattern examples to include missing semicolons.

   -02

   o  Changed focus from an HTTP-based process to Link-based process.

   o  Completely revised and restructured document for better clarity.

   o  Realigned the methods to produce consistent results and changed
      the way redirections and client-errors are handled.

   o  Updated to use newer version of site-meta, now called host-meta,
      including a new plaintext-based format to replace the previous XML
      format.

   o  Renamed Link-Template to Link-Pattern to avoid future conflict
      with a previously proposed Link-Template HTTP header.

   o  Removed support for the "scheme" Link-Template parameter.

   o  Replaced restrictions with interoperability recommendations.

   o  Added IANA considerations per new host-meta registry requirements.

   -01

   o  Renamed 'resource discovery' to 'descriptor discovery'.

   o  Added informative reference to Metalink.

   o  Clarified that the resource descriptor URI can use any URI scheme,
      not just "http" or "https".

   o  Removed comment regarding redirects when using <LINK> Elements.

   o  Clarified that HTTPS must be used with "https" URIs for both Link
      headers and host-meta retrieval.

   o  Removed DNS verification step for host-meta with schemes other
      than "http" and "https".  Replaced with a general discussion of
      authority and a security consideration comment.

   o  Organized host-meta section into another sub-section level.

   o  Enlarged the template vocabulary from a single "uri" variable to
      include smaller URI components.

   o  Added informative reference to RFC 2295 in analysis appendix.

   -00

   o  Initial draft.


9.  References

9.1.  Normative References

   [I-D.nottingham-http-link-header]
              Nottingham, M., "Link Relations and HTTP Header Linking",
              draft-nottingham-http-link-header-03 (work in progress),
              November 2008.

   [I-D.nottingham-site-meta]
              Nottingham, M. and E. Hammer-Lahav, "Host Metadata for the
              Web", draft-nottingham-site-meta-01 (work in progress),
              February 2009.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2295]  Holtman, K. and A. Mutz, "Transparent Content Negotiation
              in HTTP", RFC 2295, March 1998.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC2818]  Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
              Syndication Format", RFC 4287, December 2005.

   [RFC4918]  Dusseault, L., "HTTP Extensions for Web Distributed
              Authoring and Versioning (WebDAV)", RFC 4918, June 2007.

   [W3C.REC-html401-19991224]
              Raggett, D., Jacobs, I., and A. Hors, "HTML 4.01
              Specification", World Wide Web Consortium
              Recommendation REC-html401-19991224, December 1999,
              <http://www.w3.org/TR/1999/REC-html401-19991224>.

   [W3C.REC-xhtml1-20020801]
              Pemberton, S., "XHTML[TM] 1.0 The Extensible HyperText
              Markup Language (Second Edition)", World Wide Web
              Consortium Recommendation REC-xhtml1-20020801,
              August 2002,
              <http://www.w3.org/TR/2002/REC-xhtml1-20020801>.

9.2.  Informative References

   [ARK]      Kunze, J. and R. Rodgers, "The ARK Identifier Scheme",
              <http://www.cdlib.org/inside/diglib/ark/arkspec.html>.

   [I-D.bryan-metalink]
              Bryan, A., "The Metalink Download Description Format",
              draft-bryan-metalink-05 (work in progress), January 2009.

   [POWDER]   Archer, P., Ed., Smith, K., Ed., and A. Perego, Ed.,
              "POWDER: Protocol for Web Description Resources",
              <http://www.w3.org/TR/powder-dr/>.

   [URIQA]    Nokia, "The URI Query Agent Model",
              <http://sw.nokia.com/uriqa/URIQA.html>.

   [XRD]      Hammer-Lahav, E., Ed., "XRD 1.0 [[ replace with new XRD
              specification reference ]]".

   [XRDS]     Wachob, G., Reed, D., Chasen, L., Tan, W., and S.
              Churchill, "Extensible Resource Identifier (XRI)
              Resolution V2.0", <http://docs.oasis-open.org/xri/2.0/
              specs/xri-resolution-V2.0.html>.

   [XRDS-Simple]
              Hammer-Lahav, E., "XRDS-Simple 1.0",
              <http://xrds-simple.net/core/1.0/>.

   [Yadis]    Miller, J., "Yadis Specification 1.0",
              <http://yadis.org/papers/yadis-v1.0.pdf>.

URIs

   [1]  <http://lists.w3.org/Archives/Public/www-talk/>


Author's Address

   Eran Hammer-Lahav
   Yahoo!

   Email: eran@hueniverse.com
   URI:   http://hueniverse.com