INTERNET-DRAFT                                                    B. Bos
draft-bos-http-redirect-00.txt                                     W3C/INRIA
Expires 1 January 2000                                      30 June 1999


          Handling of fragment identifiers in redirected URLs


Status of this memo

   This document is [probably going to be] an Internet-Draft and is in
   full conformance with all provisions of Section 10 of RFC
   2026[RFC2026].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.


Abstract

   The HTTP 1.1 specification describes how a server can answer a
   request with a redirection, instructing the client to get the
   resource from a different URL. It doesn't explain what to do with any
   fragment identifier that might have been on the original URL, and
   this omission has resulted in different clients handling fragments in
   different ways. This draft gives rules intended to make fragment
   handling consistent across future HTTP clients.

   Comments on this draft can be sent to bert@w3.org


Description of the problem

   The HTTP 1.1 protocol[HTTP] contains a facility whereby servers can
   inform clients that the resource they requested is not available at
   the requested address, but at some other. The server sends back a



Bos                                                             [Page 1]


draft               Fragment IDs in redirected URLs         30 June 1999


   status code such as 301 or 302 and the correct URI of the resource.
   Clients then typically issue a new request, to the same or to a
   different server, with the new URI.

   URIs may contain a fragment identifier, indicated by a # (hash mark)
   in the URI[URI]. For example:

      http://www.w3.org/TR/REC-xml-names#NT-NCName

   A client that is retrieving this fragment will ask a server for the
   resource "http://www.w3.org/TR/REC-xml-names" and will then locate
   the fragment "NT-NCName" in that resource. It depends on the client
   and on the type of the resource what is done with the fragment.  A
   browser displaying an HTML[HTML] page usually scrolls the viewport
   so that the indicated fragment is at the top.
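
   As an illustration only, the split between the part of the URI that
   is sent to the server and the fragment that stays with the client
   can be sketched in Python with the standard urllib.parse.urldefrag
   function. The variable names are chosen for this sketch and are not
   part of this draft.

```python
from urllib.parse import urldefrag

uri = "http://www.w3.org/TR/REC-xml-names#NT-NCName"

# urldefrag returns the URI without the fragment, plus the fragment.
resource_uri, fragment = urldefrag(uri)

# Only resource_uri is sent in the HTTP request; the fragment is kept
# by the client and applied to the retrieved resource afterwards.
print(resource_uri)
print(fragment)
```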

   In the example, the fragment identifier is a single name, but again
   depending on the type of resource, it may be a complex expression.

   The problem is what happens when a URI with a fragment identifier
   gets redirected. Assume that when the client sends the URL
   "http://www.w3.org/TR/REC-xml-names" to a server, it will receive a
   status code 301, which means "Moved permanently", and a new URL.
   Let's assume the new URL is

      http://www.w3.org/TR/REC-xml-names/

   i.e., with an extra slash compared to the original URL. The question
   is whether the client should interpret this as

      http://www.w3.org/TR/REC-xml-names/#NT-NCName

   or as

      http://www.w3.org/TR/REC-xml-names/

   The former assumes that the document may have changed location, but
   that it is still the same document and it still contains the same
   fragment. The latter assumes that, because the document changed
   location, it probably also changed contents, and doesn't have that
   fragment anymore.

   The HTTP 1.1 specification talks about a single resource which is
   available at one or more locations or in one or more representations,
   so the former interpretation appears to be the right one. It may be
   the case that some of those alternative representations do not allow
   fragments to be identified, but we will have to assume that at least
   one of them does.

   But HTTP 1.1 doesn't talk explicitly about fragment identifiers,
   which has resulted in the sad fact that at the time of writing, there
   are clients that drop the fragment identifier upon a redirect.
   Anecdotal evidence suggests that in fact only about one third of Web
   browsers re-apply the fragment identifier to the redirected URL.

   This draft therefore explains how to apply the fragment identifier in
   case of a redirection.


Detailed specification

   There are different cases, depending on which type of redirection is
   used, and on whether the new URI itself contains a fragment
   identifier.

   We assume that a client issued an HTTP GET request for a particular
   URI (referred to as the "original URI"). This draft does not specify
   what happens with other kinds of requests, such as HEAD, PUT and
   POST.

   If the server returns a response code of 300 ("multiple choices"),
   301 ("moved permanently"), 302 ("moved temporarily") or 303 ("see
   other"), and if the server also returns one or more URIs where the
   resource can be found, then the client SHOULD treat each new URI as
   if the fragment identifier of the original URI were appended to it.

   The exception is when a returned URI already has a fragment
   identifier. In that case the original fragment identifier MUST NOT be
   added to it.
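
   The rule and its exception can be sketched as follows. This is a
   minimal illustration, not a normative algorithm: the function name
   inherit_fragment is hypothetical, and treating any "#" in the
   returned URI (even one followed by an empty fragment) as "already
   has a fragment identifier" is an assumption of this sketch.

```python
from urllib.parse import urldefrag


def inherit_fragment(original_uri: str, new_uri: str) -> str:
    """Return new_uri, carrying over the original fragment if allowed."""
    # Exception: a returned URI that already has a fragment keeps it.
    # (Assumption: any "#" counts, even with an empty fragment.)
    if "#" in new_uri:
        return new_uri

    _, original_fragment = urldefrag(original_uri)
    if not original_fragment:
        # Nothing to carry over.
        return new_uri

    # Treat the new URI as if the original fragment were appended.
    return new_uri + "#" + original_fragment
```

   With the example used earlier, inherit_fragment applied to
   "http://www.w3.org/TR/REC-xml-names#NT-NCName" and the returned URI
   "http://www.w3.org/TR/REC-xml-names/" yields
   "http://www.w3.org/TR/REC-xml-names/#NT-NCName".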

   If the client retrieves the resource using the new URI and the
   resource turns out to be of a type that doesn't allow fragments to be
   identified, then the client SHOULD silently ignore the fragment ID
   and not issue an error message.

   The response codes 304 ("not modified") and 305 ("use proxy") both
   indicate that the resource can be found in a different way, but do
   not specify a new URI for the resource. The resource is still
   identified by the original URI with the original fragment identifier.
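
   Taken together, the cases in this section might be dispatched on the
   response code as in the sketch below. The names
   REDIRECT_WITH_NEW_URI and uris_after_response are hypothetical.
   Returning the original URI for response codes that this draft does
   not discuss is merely a default chosen for the sketch.

```python
# Response codes for which the server may return one or more new URIs.
REDIRECT_WITH_NEW_URI = {300, 301, 302, 303}


def uris_after_response(status, original_uri, returned_uris=()):
    """Return the URIs (with fragments) the client should work with."""
    _, _, fragment = original_uri.partition("#")

    if status in REDIRECT_WITH_NEW_URI and returned_uris:
        result = []
        for uri in returned_uris:
            if "#" in uri or not fragment:
                # Already has a fragment, or there is none to add.
                result.append(uri)
            else:
                result.append(uri + "#" + fragment)
        return result

    # 304 and 305: the resource is still identified by the original
    # URI with its original fragment. Other codes fall through here
    # only as a default of this sketch.
    return [original_uri]
```

   A 300 response may list several URIs; each one is treated
   independently, so a URI that carries its own fragment is left
   unchanged while the others inherit the original fragment.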


Open issue

   If a resource is available in several representations (as indicated
   by the 300 response code), it may be the case that some of these
   representations would be able to identify the fragment, but not using
   the same fragment identifier. For example, one of the representations
   may be an HTML file with elements carrying ID attributes, while
   another may be a PostScript file with page numbers. The author of
   both may consider them to be the same resource and may want to map
   page numbers to IDs and vice versa. There is currently no way for a
   server to tell a client about such mappings of fragment identifiers
   between different representations of a resource.

   A suggestion for a future version of HTTP may be to add an (optional)
   Fragment header to the request, which holds the fragment identifier.

   Even simpler may be to allow an HTTP request to contain a fragment
   identifier.


Security considerations

   No new security considerations are added to those already present in
   HTTP 1.1.


References


   [HTML]
      Dave Raggett, Arnaud Le Hors, Ian Jacobs. "HTML 4.0
      Specification." December 1997, revised April 1998. W3C
      Recommendation REC-html40-19980424. Available at URL
      http://www.w3.org/TR/REC-html40/


   [HTTP]
      R. Fielding, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee.
      "Hypertext Transfer Protocol -- HTTP/1.1." January 1997. Internet
      RFC 2068. Available at URL
      http://www.w3.org/Protocols/rfc2068/rfc2068


   [RFC2026]
      S. Bradner. "The Internet Standards Process -- Revision 3."
      October 1996. Internet RFC 2026. Available at URL
      ftp://ftp.nordu.net/rfc/rfc2026.txt


   [URI]
      T. Berners-Lee, L. Masinter, M. McCahill. "Uniform Resource
      Locators (URL)." December 1994. Internet RFC 1738. Available at
      URL ftp://ftp.nordu.net/rfc/rfc1738.txt





Author's address


      Bert Bos
      W3C/INRIA
      2004, route des Lucioles
      B.P. 93
      06902 Sophia Antipolis Cedex
      France

      tel: +33 (0)4 92 38 76 92
      e-mail: bert@w3.org