|Internet-Draft||JSON Encoding for HTTP Field Values||September 2020|
|Reschke||Expires 5 March 2021|
- Network Working Group
- Intended Status:
A JSON Encoding for HTTP Field Values
This document establishes a convention for use of JSON-encoded field values in HTTP fields.¶
This note is to be removed before publishing as an RFC.¶
Distribution of this document is unlimited. Although this is not a work item of the HTTPbis Working Group, comments should be sent to the Hypertext Transfer Protocol (HTTP) mailing list at email@example.com, which may be joined by sending a message with subject "subscribe" to firstname.lastname@example.org.¶
XML versions and latest edits for this document are available from <http://greenbytes.de/tech/webdav/#draft-reschke-http-jfv>.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 5 March 2021.¶
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
- There is no common syntax for complex field values. Several well-known fields use similar-looking syntaxes, but it is hard to write generic parsing code that both correctly handles valid field values and rejects invalid ones.¶
- The HTTP message format allows fields to repeat, so field syntax needs to be designed so that these cases are either meaningful or can be unambiguously detected and rejected.¶
- HTTP does not define a character encoding scheme ([RFC6365], Section 2), so fields are either stuck with US-ASCII ([RFC0020]), or need out-of-band information to decide what encoding scheme is used. Furthermore, APIs usually assume a default encoding scheme in order to map from octet sequences to strings (for instance, [XMLHttpRequest] uses the IDL type "ByteString", effectively resulting in the ISO-8859-1 character encoding scheme [ISO-8859-1] being used).¶
This specification addresses the issues listed above by defining both a generic JSON-based ([RFC8259]) data model and a concrete wire format that can be used in definitions of new fields, where the goals were:¶
In HTTP, fields with the same field name can occur multiple times within a single message (Section 5.1 of [HTTP]). When this happens, recipients are allowed to combine the field values using commas as delimiters. This rule matches JSON's array format nicely (Section 5 of [RFC8259]). Thus, the basic data model used here is the JSON array.¶
Field definitions that need only a single value can restrict themselves to arrays of length 1, and are encouraged to define error handling in case more values are received (such as "first wins", "last wins", or "abort with fatal error message").¶
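As an illustration of the error-handling choices named above, the following Python sketch reduces an already parsed array to a single value. The function name `single_value` and the policy labels "first", "last", and "abort" are hypothetical and are not defined by this specification.

```python
def single_value(items, policy="abort"):
    """Reduce a parsed JSON array to one value for a single-valued field.

    `policy` is a hypothetical knob for this sketch:
    "first" = first wins, "last" = last wins, "abort" = reject extra values.
    """
    if not items:
        raise ValueError("field carries no value")
    if len(items) == 1:
        return items[0]
    if policy == "first":
        return items[0]
    if policy == "last":
        return items[-1]
    raise ValueError("single-valued field received %d values" % len(items))
```

A field definition would pick exactly one of these behaviors and state it, rather than leaving the choice to recipients.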
JSON arrays are mapped to field values by creating a sequence of serialized member elements, separated by commas and optionally whitespace. This is equivalent to using the full JSON array format, while leaving out the "begin-array" ('[') and "end-array" (']') delimiters.¶
CR    = %x0D    ; carriage return
HTAB  = %x09    ; horizontal tab
LF    = %x0A    ; line feed
SP    = %x20    ; space
VCHAR = %x21-7E ; visible (printing) characters¶
Characters in JSON strings that are not allowed or discouraged in HTTP field values - that is, not in the "VCHAR" definition - need to be represented using JSON's "backslash" escaping mechanism ([RFC8259], Section 7).¶
The control characters CR, LF, and HTAB do not appear inside JSON strings, but can be used outside (line breaks, indentation etc.). These characters need to be either stripped or replaced by space characters (ABNF "SP").¶
json-field-value = #json-field-item
json-field-item  = JSON-Text
                 ; see [RFC8259], Section 2,
                 ; post-processed so that only VCHAR characters
                 ; are used¶
To map a JSON array to an HTTP field value, process each array element separately by:¶
- generating the JSON representation,¶
- stripping all JSON control characters (CR, HTAB, LF), or replacing them by space ("SP") characters,¶
- replacing all remaining non-VCHAR characters by the equivalent backslash-escape sequence ([RFC8259], Section 7).¶
The resulting list of strings is transformed into an HTTP field value by combining them using comma (%x2C) plus optional SP as delimiter, and encoding the resulting string into an octet sequence using the US-ASCII character encoding scheme ([RFC0020]).¶
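The serialization steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a normative implementation: the function names are hypothetical, and it relies on the standard `json` module, whose `ensure_ascii=True` mode backslash-escapes every character outside %x20-7E. Note that this leaves SP characters inside strings unescaped, which is an assumption of this sketch.

```python
import json


def to_field_value(items):
    """Serialize a list of JSON-compatible values to a field value string."""
    parts = []
    for item in items:
        # Without `indent`, json.dumps emits no CR, LF, or HTAB outside
        # strings, so there is nothing to strip. ensure_ascii=True escapes
        # control and non-ASCII characters inside strings. allow_nan=False
        # rejects NaN/Infinity, which are not valid JSON.
        parts.append(
            json.dumps(item, ensure_ascii=True, allow_nan=False,
                       separators=(",", ":"))
        )
    # Comma plus optional SP as delimiter between array elements.
    return ", ".join(parts)


def to_field_octets(items):
    """Encode the field value as US-ASCII octets."""
    return to_field_value(items).encode("ascii")
```

For example, `to_field_value([{"destination": "Münster"}])` yields a string in which "ü" appears as the escape sequence `\u00fc`.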
To map a set of HTTP field instances to a JSON array:¶
- combine all field instances into a single field as per Section 5.1 of [HTTP],¶
- add a leading begin-array ("[") octet and a trailing end-array ("]") octet, then¶
- run the resulting octet sequence through a JSON parser.¶
The result of the parsing operation is either an error (in which case the field value needs to be considered invalid), or a JSON array.¶
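The parsing steps above might look like this in Python. This is a sketch under stated assumptions: field instances arrive as `bytes` objects, the name `parse_field_instances` is hypothetical, and any failure is reported as `ValueError`. Python's `json` module accepts the non-standard constants NaN and Infinity by default, so the sketch rejects them explicitly.

```python
import json


def _reject_constant(name):
    # Called by json.loads for "NaN", "Infinity", and "-Infinity".
    raise ValueError("not valid JSON: " + name)


def parse_field_instances(instances):
    """Map a list of field instance values (bytes) to a JSON array (list).

    Raises ValueError if the combined field value is invalid.
    """
    # Combine all instances into a single comma-separated field value.
    combined = b", ".join(instances)
    # Add the begin-array and end-array octets.
    wrapped = b"[" + combined + b"]"
    # Field values produced by this encoding are pure US-ASCII;
    # UnicodeDecodeError is a subclass of ValueError.
    text = wrapped.decode("ascii")
    # json.JSONDecodeError is also a subclass of ValueError.
    return json.loads(text, parse_constant=_reject_constant)
```

Because the wrapped text starts with "[" and must parse as one complete JSON text, a successful result is always a list.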
Specifications defining new HTTP fields need to take the considerations listed in Section 5.7 of [HTTP] into account. Many of these will already be accounted for by using the format defined in this specification.¶
Readers of HTTP-related specifications frequently expect an ABNF definition of the field value syntax. This is not really needed here, as the actual syntax is JSON text, as defined in Section 2 of [RFC8259].¶
Thus, a very simple way to use this JSON encoding is to cite this specification - specifically the "json-field-value" ABNF production defined in Section 2 - and otherwise not to discuss the details of the field syntax at all.¶
This frees the specification from defining the concrete on-the-wire syntax. What's left is defining the field value in terms of a JSON array. An important aspect is the question of extensibility, e.g. how recipients ought to treat unknown field names. In general, a "must ignore" approach will allow protocols to evolve without versioning or even using entirely new field names.¶
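One way to read the "must ignore" approach is that a recipient keeps only the object members it understands and silently drops the rest. The Python sketch below illustrates that reading. The member names "name" and "q" and the function name are hypothetical and only serve the example.

```python
# Hypothetical set of member names understood by this recipient.
KNOWN_MEMBERS = frozenset({"name", "q"})


def known_members(item):
    """Return only the members this recipient understands; ignore the rest."""
    if not isinstance(item, dict):
        raise ValueError("expected a JSON object")
    return {name: value for name, value in item.items()
            if name in KNOWN_MEMBERS}
```

A sender can then add new members later without breaking recipients that were written against an earlier definition of the field.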
This JSON-based syntax will only apply to newly introduced fields, thus backwards compatibility is not a problem. That being said, it is conceivable that there is existing code that might trip over double quotes not being used for HTTP's quoted-string syntax (Section 5.4.1 of [HTTP]).¶
As described in Section 4 of [RFC8259], JSON parser implementations differ in the handling of duplicate object names. Therefore, senders MUST NOT use duplicate object names, and recipients SHOULD either treat field values with duplicate names as invalid (consistent with [RFC7493], Section 2.3) or use the lexically last value (consistent with [ECMA-262], Section 24.3.1.1).¶
Furthermore, the ordering of object members is not significant and cannot be relied upon.¶
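Both recipient behaviors for duplicate object names can be sketched with Python's standard `json` module. By default, `json.loads` keeps the last value for a repeated name. Passing an `object_pairs_hook` makes it possible to reject duplicates instead. The function names below are hypothetical.

```python
import json


def _reject_duplicates(pairs):
    """object_pairs_hook that treats duplicate object names as an error."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("duplicate object name: %r" % name)
        result[name] = value
    return result


def parse_strict(text):
    """Treat values with duplicate names as invalid."""
    return json.loads(text, object_pairs_hook=_reject_duplicates)


def parse_last_wins(text):
    """Use the lexically last value for a duplicated name (json default)."""
    return json.loads(text)
```

Neither function depends on the order of object members, in line with the note above that ordering is not significant.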
In current versions of HTTP, field values are represented by octet sequences, usually used to transmit ASCII characters, with restrictions on the use of certain control characters, and no associated default character encoding, nor a way to describe it ([HTTP], Section 5). HTTP/2 does not change this.¶
This specification maps all characters which can cause problems to JSON escape sequences, thereby solving the HTTP field internationalization problem.¶
Future specifications of HTTP might change to allow non-ASCII characters natively. In that case, fields using the syntax defined by this specification would have a simple migration path (by simply no longer requiring non-ASCII characters to be escaped).¶
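The migration path can be illustrated with a short Python sketch: the escaped form required today and a hypothetical future unescaped form decode to the same string, so a JSON parser needs no change. The variable names are illustrative only.

```python
import json

value = "Münster"

# Form required by this specification: non-ASCII characters are escaped.
escaped = json.dumps(value, ensure_ascii=True)

# Hypothetical future form, if HTTP allowed non-ASCII characters natively.
native = json.dumps(value, ensure_ascii=False)

# Both forms decode to the same string.
same = json.loads(escaped) == json.loads(native) == value
```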
Other than that, any syntax that makes extensions easy can be used to smuggle information through field values; however, this concern is shared with other widely used formats, such as those using parameters in the form of name/value pairs.¶
- Fielding, R., Ed., Nottingham, M., Ed., and J. F. Reschke, Ed., "HTTP Semantics", Work in Progress, Internet-Draft, draft-ietf-httpbis-semantics-11, <https://tools.ietf.org/html/draft-ietf-httpbis-semantics-11>.
- Cerf, V., "ASCII format for network interchange", STD 80, RFC 20, DOI 10.17487/RFC0020, <https://www.rfc-editor.org/info/rfc20>.
- Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, <https://www.rfc-editor.org/info/rfc5234>.
- Bray, T., Ed., "The I-JSON Message Format", RFC 7493, DOI 10.17487/RFC7493, <https://www.rfc-editor.org/info/rfc7493>.
- Ecma International, "ECMA-262 6th Edition, The ECMAScript 2015 Language Specification", Standard ECMA-262, <http://www.ecma-international.org/ecma-262/6.0/>.
- Nottingham, M. and P-H. Kamp, "Structured Field Values for HTTP", Work in Progress, Internet-Draft, draft-ietf-httpbis-header-structure-19, <https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-19>.
- International Organization for Standardization, "Information technology -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1", ISO/IEC 8859-1:1998.
- Hoffman, P. and J. Klensin, "Terminology Used in Internationalization in the IETF", BCP 166, RFC 6365, DOI 10.17487/RFC6365, <https://www.rfc-editor.org/info/rfc6365>.
- WhatWG, "XMLHttpRequest", <https://xhr.spec.whatwg.org/>.
- West, M., "Clear Site Data", W3C Working Draft WD-clear-site-data-20171130, <https://www.w3.org/TR/2017/WD-clear-site-data-20171130/>. Latest version available at <https://www.w3.org/TR/clear-site-data/>.
- Clelland, I., "Feature Policy", W3C Editor's Draft, <https://w3c.github.io/webappsec-feature-policy/>.
- Creager, D., Grigorik, I., Meyer, P., and M. West, "Reporting API", W3C Working Draft WD-reporting-1-20180925, <https://www.w3.org/TR/2018/WD-reporting-1-20180925/>. Latest version available at <https://www.w3.org/TR/reporting-1/>.
This section is to be removed before publishing as an RFC.¶
Since work started on this document, various specifications have adopted this format. At least one of these moved away after the HTTP Working Group decided to focus on [HSTRUCT] (see thread starting at <https://lists.w3.org/Archives/Public/ietf-http-wg/2016OctDec/0505.html>).¶
The sections below summarize the current usage of this format.¶
Used in earlier versions of "Clear Site Data". The current version replaces the use of JSON with a custom syntax that happens to be somewhat compatible with an array of JSON strings (see Section 3.1 of [CLEARSITE] and <https://lists.w3.org/Archives/Public/ietf-http-wg/2017AprJun/0214.html> for feedback).¶
This section is to be removed before publishing as an RFC.¶
Mention slightly increased risk of smuggling information in header field values.¶
Mention Kazuho Oku's proposal for abbreviated forms.¶
Added a bit of text about the motivation for a concrete JSON subset (ack Cory Benfield).¶
Expand I18N section.¶
Between June and December 2016, this was a work item of the HTTP working group (see <https://datatracker.ietf.org/doc/draft-ietf-httpbis-jfv/>). Work (if any) continues now on <https://datatracker.ietf.org/doc/draft-reschke-http-jfv/>.¶
Changes made while this was a work item of the HTTP Working Group:¶
Added example for "Accept-Encoding" (inspired by Kazuho's feedback), showing a potential way to optimize the format when default values apply.¶
Add interop discussion, building on I-JSON and ECMA-262 (see <https://github.com/httpwg/http-extensions/issues/225>).¶
Move non-essential parts into appendix.¶
Updated XHR reference.¶
Add meat to "Using this Format in Header Field Definitions".¶
Add a few lines on the relation to "Key".¶
Summarize current use of the format.¶
RFC 5987 is obsoleted by RFC 8187.¶
Update CLEARSITE comment.¶
Update JSON and HSTRUCT references.¶
FEATUREPOL doesn't use JSON syntax anymore.¶
Update HSTRUCT reference.¶
Update notes about CLEARSITE and FEATUREPOL.¶
Update HSTRUCT and FEATUREPOL references.¶
Update note about REPORTING.¶
Changed category to "informational".¶
Update HSTRUCT reference.¶
Update note about FEATUREPOL (now using Structured Fields).¶
Remove discussion about the relation to KEY (as that spec is dormant: <https://datatracker.ietf.org/doc/draft-ietf-httpbis-key/>).¶
Remove appendices "Examples" and "Discussion".¶
Mark "Use of JSON Field Value Encoding in the Wild" for removal in RFC.¶