Uniform Resource Locators (URL)
RFC 1738
Type: RFC - Proposed Standard (December 1994; Errata)
Was draft-ietf-uri-url (uribof WG)
Authors: Tim Berners-Lee, Larry Masinter, Mark McCahill
Last updated: 2020-01-21
Stream: Legacy

Network Working Group
Request for Comments: 1738
Category: Standards Track

T. Berners-Lee, CERN
L. Masinter, Xerox Corporation
M. McCahill, University of Minnesota
Editors

December 1994

Uniform Resource Locators (URL)

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Abstract

This document specifies a Uniform Resource Locator (URL), the syntax and semantics of formalized information for location and access of resources via the Internet.

1. Introduction

This document describes the syntax and semantics for a compact string representation for a resource available via the Internet. These strings are called "Uniform Resource Locators" (URLs).

The specification is derived from concepts introduced by the World-Wide Web global information initiative, whose use of such objects dates from 1990 and is described in "Universal Resource Identifiers in WWW", RFC 1630. The specification of URLs is designed to meet the requirements laid out in "Functional Requirements for Internet Resource Locators" [12].

This document was written by the URI working group of the Internet Engineering Task Force. Comments may be addressed to the editors, or to the URI-WG <uri@bunyip.com>. Discussions of the group are archived at <URL:http://www.acl.lanl.gov/URI/archive/uri-archive.index.html>

2. General URL Syntax

Just as there are many different methods of access to resources, there are several schemes for describing the location of such resources.

The generic syntax for URLs provides a framework for new schemes to be established using protocols other than those defined in this document.

URLs are used to `locate' resources, by providing an abstract identification of the resource location. Having located a resource, a system may perform a variety of operations on the resource, as might be characterized by such words as `access', `update', `replace', `find attributes'. In general, only the `access' method needs to be specified for any URL scheme.

2.1. The main parts of URLs

A full BNF description of the URL syntax is given in Section 5.

In general, URLs are written as follows:

    <scheme>:<scheme-specific-part>

A URL contains the name of the scheme being used (<scheme>) followed by a colon and then a string (the <scheme-specific-part>) whose interpretation depends on the scheme.

Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http").

2.2. URL Character Encoding Issues

URLs are sequences of characters, i.e., letters, digits, and special characters. A URL may be represented in a variety of ways: e.g., ink on paper, or a sequence of octets in a coded character set. The interpretation of a URL depends only on the identity of the characters used.

In most URL schemes, the sequences of characters in different parts of a URL are used to represent sequences of octets used in Internet protocols. For example, in the ftp scheme, the host name, directory name and file names are such sequences of octets, represented by parts of the URL. Within those parts, an octet may be represented by the character which has that octet as its code within the US-ASCII [20] coded character set.
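
The general form given in Section 2.1 can be illustrated with a short Python sketch. It is not part of the specification: the function name `split_url` and the choice to raise `ValueError` on malformed input are assumptions made for illustration. The sketch splits a URL at the first colon, checks the scheme against the allowed characters, and folds upper case to lower case as the text recommends.

```python
import re

# Characters allowed in a scheme name per Section 2.1: letters, digits,
# "+", "." and "-". Upper case letters are accepted here and folded to
# lower case afterwards, following the resiliency advice in the text.
_SCHEME_RE = re.compile(r"[A-Za-z0-9+.-]+")


def split_url(url):
    """Split a URL into (scheme, scheme_specific_part).

    Hypothetical helper for illustration; not defined by the RFC.
    """
    scheme, colon, rest = url.partition(":")
    if not colon or not _SCHEME_RE.fullmatch(scheme):
        raise ValueError("not of the form <scheme>:<scheme-specific-part>")
    return scheme.lower(), rest


print(split_url("HTTP://www.acl.lanl.gov/URI/archive/uri-archive.index.html"))
```

The scheme-specific part is returned untouched, since its interpretation depends on the scheme.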

In addition, octets may be encoded by a character triplet consisting of the character "%" followed by the two hexadecimal digits (from "0123456789ABCDEF") which form the hexadecimal value of the octet. (The characters "abcdef" may also be used in hexadecimal encodings.)

Octets must be encoded if they have no corresponding graphic character within the US-ASCII coded character set, if the use of the
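
The "%" triplet encoding described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the names `encode_octet`, `decode_triplet` and `has_graphic_ascii_char` are hypothetical, and the range 0x21 to 0x7E used for graphic US-ASCII characters is an assumption of this sketch rather than something stated in the text above.

```python
HEX_DIGITS = "0123456789ABCDEF"


def encode_octet(octet):
    """Return the "%XX" triplet for a single octet (0-255)."""
    if not 0 <= octet <= 255:
        raise ValueError("an octet must be in the range 0-255")
    return "%" + HEX_DIGITS[octet >> 4] + HEX_DIGITS[octet & 0x0F]


def decode_triplet(triplet):
    """Return the octet for a "%XX" triplet.

    Lower case "abcdef" is accepted, as the text allows.
    """
    if len(triplet) != 3 or triplet[0] != "%":
        raise ValueError("expected '%' followed by two hexadecimal digits")
    # str.index raises ValueError if the character is not a hex digit.
    high = HEX_DIGITS.index(triplet[1].upper())
    low = HEX_DIGITS.index(triplet[2].upper())
    return (high << 4) | low


def has_graphic_ascii_char(octet):
    """Assumption: graphic US-ASCII characters are the codes 0x21-0x7E."""
    return 0x21 <= octet <= 0x7E


print(encode_octet(0xE9))                 # an octet outside US-ASCII
print(decode_triplet("%7e"))              # lower case hex digits
print(has_graphic_ascii_char(0xE9))
```

Decoding is written out digit by digit rather than with `int(..., 16)`, because `int` also accepts signs, whitespace and underscores that are not valid in a triplet.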