Canonical XML Version 1.0
RFC 3076

 
Document Type RFC - Informational (March 2001; No errata)
Last updated 2013-03-02
Stream IETF
Network Working Group                                           J. Boyer
Request for Comments: 3076                       PureEdge Solutions Inc.
Category: Informational                                       March 2001

                       Canonical XML Version 1.0

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

Abstract

   Any XML (Extensible Markup Language) document is part of a set of XML
   documents that are logically equivalent within an application
   context, but which vary in physical representation based on syntactic
   changes permitted by XML 1.0 and Namespaces in XML.  This
   specification describes a method for generating a physical
   representation, the canonical form, of an XML document that accounts
   for the permissible changes.  Except for limitations regarding a few
   unusual cases, if two documents have the same canonical form, then
   the two documents are logically equivalent within the given
   application context.  Note that two documents may have differing
   canonical forms yet still be equivalent in a given context based on
   application-specific equivalence rules for which no generalized XML
   specification could account.

Boyer                        Informational                      [Page 1]
RFC 3076                     Canonical XML                    March 2001

Table of Contents

   1. Introduction...............................................  2
   1.1 Terminology...............................................  3
   1.2 Applications..............................................  4
   1.3 Limitations...............................................  4
   2. XML Canonicalization.......................................  6
   2.1 Data Model................................................  6
   2.2 Document Order............................................ 10
   2.3 Processing Model.......................................... 10
   2.4 Document Subsets.......................................... 13
   3. Examples of XML Canonicalization........................... 14
   3.1 PIs, Comments, and Outside of Document Element............ 14
   3.2 Whitespace in Document Content............................ 15
   3.3 Start and End Tags........................................ 16
   3.4 Character Modifications and Character References.......... 17
   3.5 Entity References......................................... 19
   3.6 UTF-8 Encoding............................................ 19
   3.7 Document Subsets.......................................... 20
   4. Resolutions................................................ 21
   4.1 No XML Declaration........................................ 21
   4.2 No Character Model Normalization.......................... 21
   4.3 Handling of Whitespace Outside Document Element........... 22
   4.4 No Namespace Prefix Rewriting............................. 22
   4.5 Order of Namespace Declarations and Attributes............ 23
   4.6 Superfluous Namespace Declarations........................ 23
   4.7 Propagation of Default Namespace Declaration in Document
       Subsets................................................... 24
   4.8 Sorting Attributes by Namespace URI....................... 24
   Security Considerations....................................... 24
   References.................................................... 25
   Author's Address.............................................. 26
   Acknowledgements.............................................. 27
   Full Copyright Statement...................................... 28

1. Introduction

   The XML 1.0 Recommendation [XML] specifies the syntax of a class of
   resources called XML documents.  The Namespaces in XML Recommendation
   [Names] specifies additional syntax and semantics for XML documents.
   It is possible for XML documents which are equivalent for the
   purposes of many applications to differ in physical representation.
   For example, they may differ in their entity structure, attribute
   ordering, and character encoding.  It is the goal of this
   specification to establish a method for determining whether two
   documents are identical, or whether an application has not changed a
   document, except for transformations permitted by XML 1.0 and
   Namespaces.
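
   As an illustration of the attribute-ordering case mentioned above,
   the following minimal sketch compares documents by canonical form.
   It assumes Python 3.8 or later and uses the standard library function
   xml.etree.ElementTree.canonicalize, which implements the later
   Canonical XML 2.0 rather than the Version 1.0 method described in
   this memo, so it only approximates the behavior specified here.  The
   sample documents and variable names are hypothetical.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical documents that differ only in physical representation:
# attribute order, quote style, whitespace inside the start tag, and
# empty-element syntax versus a start/end tag pair.
doc_a = '<doc b="2" a="1"><item/></doc>'
doc_b = "<doc a='1'   b='2'><item></item></doc>"

# A third hypothetical document whose attribute value actually differs.
doc_c = '<doc a="1" b="3"><item/></doc>'

# NOTE: canonicalize() implements Canonical XML 2.0, not Version 1.0.
# It is used here only to illustrate comparison by canonical form.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
canon_c = canonicalize(doc_c)

# Equal canonical forms indicate the documents differ only in ways the
# canonicalization method accounts for.
same_form = canon_a == canon_b

# A change in content survives canonicalization, so the forms differ.
changed = canon_a != canon_c
```

   In this sketch, same_form is true because the differences between
   doc_a and doc_b are purely syntactic, while changed is true because
   doc_c carries a different attribute value.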

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",