A URN Namespace for Public Identifiers
RFC 3151

Document Type RFC - Informational (August 2001; No errata)
Was draft-walsh-urn-publicid (individual)
Authors Norman Walsh, John Cowan, Paul Grosso
Last updated 2013-03-02
Network Working Group                                           N. Walsh
Request for Comments: 3151                        Sun Microsystems, Inc.
Category: Informational                                         J. Cowan
                                              Reuters Health Information
                                                               P. Grosso
                                                         Arbortext, Inc.
                                                             August 2001

                 A URN Namespace for Public Identifiers

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.


Abstract

   This document describes a URN (Uniform Resource Name) namespace that
   is designed to allow Public Identifiers to be expressed in URI
   (Uniform Resource Identifier) syntax.

1. Introduction

   XML [1] external entities have two identifiers: a system identifier
   and a public identifier.  The system identifier is a URI, by
   definition, but the public identifier is simply a string.

   Historically, the system identifier of an external entity has been a
   local, or system-specific identifier while the public identifier has
   been a more global, persistent name.

   Unfortunately, public identifiers do not fit neatly into the existing
   web architecture because they are not legal URIs.  Many new
   specifications (XSLT, XML Schema, etc.) have the implicit or explicit
   requirement that all external identifiers be URIs.

   The purpose of this namespace is to allow public identifiers to be
   encoded in URNs in a reliable, comparable way.

Walsh, et al.                Informational                      [Page 1]
RFC 3151         A URN Namespace for Public Identifiers      August 2001

   This document describes a scheme for representing public identifiers
   as URNs by introducing a public identifier namespace, "publicid".

   This namespace specification is for a formal namespace.

1.1 Public Identifiers

   Any string which consists only of the public identifier characters
   (defined by Production 13 of Extensible Markup Language (XML) 1.0
   Second Edition [1]) is a legal public identifier.
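
   As an illustration only, the character set restriction can be checked
   with a short Python sketch.  The function name is_legal_public_id is
   our own and is not defined by this specification or by XML [1]; the
   character class transcribes the PubidChar production (Production 13).
   Note that the tab character (#x9) is not a PubidChar.

```python
import re

# PubidChar per Production 13 of XML 1.0 (Second Edition):
#   #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
# The name _PUBID_CHARS is illustrative, not part of any specification.
_PUBID_CHARS = re.compile(r"[ \r\na-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")


def is_legal_public_id(s: str) -> bool:
    """Return True if every character of s is a public identifier character."""
    return _PUBID_CHARS.fullmatch(s) is not None
```

   For example, is_legal_public_id("-//OASIS//DTD DocBook XML V4.1.2//EN")
   holds, while a string containing "<" or a tab is rejected.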

   In addition to the character set restriction, public identifiers must
   be normalized by changing all strings of whitespace (the characters
   #x20, #x9, #xD, and #xA) to single space characters (#x20), and
   removing all leading and trailing whitespace.

   In keeping with this specification's goal of allowing public
   identifiers to be encoded in a reliable, comparable way, this
   specification mandates that public identifiers be normalized before
   encoding them into URNs.  Throughout this specification, we assume
   that normalization has already been performed.
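
   The normalization described above might be sketched in Python as
   follows.  This is a minimal illustration, not a normative
   implementation; the name normalize_public_id is hypothetical.

```python
import re

# Runs of the four whitespace characters named above: #x20, #x9, #xD, #xA.
_WS_RUN = re.compile(r"[\x20\x09\x0D\x0A]+")


def normalize_public_id(s: str) -> str:
    """Collapse whitespace runs to one space, then trim both ends."""
    collapsed = _WS_RUN.sub(" ", s)
    # After collapsing, any leading or trailing whitespace is a single #x20.
    return collapsed.strip(" ")
```

   Because the function is idempotent, applying it to an identifier that
   is already normalized leaves the identifier unchanged.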

1.2 Formal Public Identifiers

   SGML [2] defines a restricted form of public identifier called a
   "Formal Public Identifier" (FPI).

   FPIs are strings composed from the same range of characters as public
   identifiers, but with an explicit internal structure.  The structure
   of Formal Public Identifiers is normatively described in SGML [2]; we
   review it here for convenience.

   Most Formal Public Identifiers consist of the following fields, in
   this order: an owner identifier, a public text class, a public text
   description, a public text language or public text designating
   sequence, and an optional public text display version.

   Owner identifiers may begin with "-//" or "+//"; otherwise "//" is
   used to delimit fields in the FPI (with the exception of the public
   text class which is delimited from the public text description by a
   space).

   In other words, most FPIs look like this:

      owner//class description//language//version

   and most owners begin with "+//" or "-//", although they are not
   required to.  Here are some example FPIs:

   +//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML
   -//OASIS//DTD DocBook XML V4.1.2//EN
   -//ArborText::prod//DTD Help Navigation Document::19970708//EN
   ISO/IEC 10179:1996//DTD DSSSL Architecture//EN
   ISO 8879:1986//ENTITIES Added Latin 1//EN
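
   To make the general shape concrete, a deliberately naive Python
   sketch can split an FPI of the common form shown above into its
   fields.  The names FpiFields and split_fpi are our own, and the
   sketch does not attempt to follow the normative rules in SGML [2];
   it only handles identifiers of the form
   owner//class description//language//version.

```python
from typing import NamedTuple, Optional


class FpiFields(NamedTuple):
    """Illustrative container; not defined by SGML or this document."""
    owner: str
    text_class: str
    description: str
    language: str
    version: Optional[str]


def split_fpi(fpi: str) -> FpiFields:
    """Naively split a normalized FPI of the common form into fields."""
    # A leading "+//" or "-//" is part of the owner identifier,
    # so set it aside before splitting on the "//" delimiter.
    prefix, rest = "", fpi
    if fpi.startswith(("+//", "-//")):
        prefix, rest = fpi[:3], fpi[3:]

    parts = rest.split("//")
    if len(parts) not in (3, 4):
        raise ValueError(f"not of the common FPI form: {fpi!r}")

    # The class is separated from the description by a space, not "//".
    text_class, sep, description = parts[1].partition(" ")
    if not sep:
        raise ValueError(f"missing class/description separator: {fpi!r}")

    version = parts[3] if len(parts) == 4 else None
    return FpiFields(prefix + parts[0], text_class, description,
                     parts[2], version)
```

   Applied to the first example above, this sketch yields the owner
   "+//IDN python.org", the class "DTD", the description "XML Bookmark
   Exchange Language 1.0", the language "EN", and the version "XML".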

   This document describes an algorithm for encoding public identifiers
   into URNs that explicitly allows the structured nature of formal
   public identifiers to be preserved.  However, an algorithm for
   correctly identifying a Formal Public Identifier and determining the
   various fields within it is out of scope for this document and not
   necessary for the implementation of this URN namespace.

2. Specification Template

   Namespace ID:

      "publicid" requested.

   Registration Information: