Internet-Draft                                            M. Milutinovic
Expires: Mar 16, 2020                                        UC Berkeley
Intended status: Proposed Standard                             M. Toomim
                                                       Invisible College
                                                              B. Bellomy
                                                       Invisible College
                                                            Nov 18, 2019

                             Range Patch
                  draft-toomim-httpbis-range-patch-00

Abstract

   A uniform approach for expressing changes to state over HTTP.


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   https://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   https://www.ietf.org/shadow.html



Table of Contents

   1. Introduction
   2. Range Patch
      2.1. Multiple Range Patches
      2.2. Stand-Alone Range Patch
      2.3. URI Fragment Identifiers
   3. Range Units
      3.1. Bytes Range Unit
      3.2. JSON Range Unit
      3.3. Lines Range Unit
   4. IANA Considerations
      4.1. Range Unit Registrations
      4.2. The +patch Structured Syntax Suffix
   5. Checking Capabilities
   6. Race Conditions
   7. Security Considerations
   8. Conventions
   9. Copyright Notice
   10. References
      10.1. Normative References
      10.2. Informative References



1.  Introduction

   This document describes a uniform approach for expressing changes to
   state over HTTP.  It builds upon [RFC7233] and details how patches
   can be defined using range units, ranges, and content.  Any patch is
   expressed in the form:

     "range X in units Y of the data was replaced with content Z"

   Range units define how original content (being patched) should be
   parsed to obtain a region of the content which is being patched, and
   then how that region is replaced with new content.

2.  Range Patch

   [RFC7233] effectively already defines how a patch operating on byte
   units can be represented over HTTP, using Content-Range,
   Content-Type, and Content-Length HTTP headers.  Example:

      HTTP/1.1 206 Partial Content
      Date: Wed, 15 Nov 1995 06:25:24 GMT
      Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
      Content-Range: bytes 21010-47021/47022
      Content-Length: 26012
      Content-Type: image/gif

      ... 26012 bytes of partial image data ...

   The same approach can be used to describe a range inside content
   interpreted not as bytes but, for example, as JSON [RFC8259] or a
   JSON-compatible structure.  We define such a JSON range unit in
   Section 3.2.  For example, given the following JSON document:

      {"foo": {"bar": [
        {"some": "thing"},
        {"no": "thing"},
        {"mo": "re"},
        {"baz": {"1": {"two": "tree"}}}
      ]}}

   One might make the following request:

      GET /api/document/1 HTTP/1.1
      Host: example.com
      Accept: application/json
      Range: json=/foo/bar/3/baz



   And receive the following response:

      HTTP/1.1 206 Partial Content
      Date: Thu, 31 Oct 2019 07:51:08 GMT
      Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
      Content-Range: json /foo/bar/3/baz
      Content-Length: 22
      Content-Type: application/json

      {"1": {"two": "tree"}}

   [RFC7233] defines and allows a Range header only for the GET request
   method.  In this document, we define the behavior for other request
   methods.  Which methods a given resource supports and which
   methods accept range patches as defined in this document is left to
   the server to define.

   When issuing a non-GET request to a resource, a range patch can be
   provided using the Range header field:

      PATCH /api/image/1 HTTP/1.1
      Host: example.com
      Range: bytes=21010-47021
      Content-Length: 26012
      Content-Type: image/gif

      ... 26012 bytes of new partial image data ...

   And for JSON:

      PATCH /api/document/1 HTTP/1.1
      Host: example.com
      Range: json=/foo/bar/3/baz
      Content-Length: 25
      Content-Type: application/json

      {"2": {"three": "flour"}}

   A patch with empty contents corresponds to deletion of existing
   content at the specified range.  A patch with a zero-length range but
   non-empty contents corresponds to inserting content immediately
   before the location of the zero-length range.  A patch with non-empty
   contents at a non-zero-length range corresponds to replacing existing
   content at the range with new content.
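
   As an illustration only, the three cases above can be sketched in
   Python for a byte string.  The function name apply_patch and the
   half-open (start, end) representation of a range are assumptions of
   this sketch and are not defined by this document.

```python
def apply_patch(data: bytes, start: int, end: int, content: bytes) -> bytes:
    """Replace the half-open range [start, end) of data with content.

    Hypothetical helper: the (start, end) pair is an internal
    representation chosen for this sketch, not a wire format.
    """
    if not 0 <= start <= end <= len(data):
        raise ValueError("range not satisfiable")
    return data[:start] + content + data[end:]


original = b"hello world"

# Empty contents: deletion of the existing content at the range.
deleted = apply_patch(original, 5, 11, b"")

# Zero-length range with non-empty contents: insertion before offset 5.
inserted = apply_patch(original, 5, 5, b",")

# Non-empty contents at a non-zero-length range: replacement.
replaced = apply_patch(original, 6, 11, b"there")

print(deleted, inserted, replaced)
```

   The same splice applies to other range units once a range has been
   resolved to a start and an end position in the parsed structure.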

   When a server supports the Range header with non-GET requests, it
   MUST NOT ignore the Range header when it is used with a non-GET
   request.  When a server does not support the Range header with
   non-GET requests, it SHOULD generate a 416 (Range Not Satisfiable) or
   a 400 (Bad Request) response when a non-GET request with a Range
   header is made.  Proxies SHOULD NOT drop the Range header for non-GET
   requests.  To ensure correct handling of non-GET requests with the
   Range header, a requester can check the server's support for it as
   described in Section 5.
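
   A server-side check along these lines might look as follows.  This
   is a minimal sketch: the function name check_range_support and the
   range_methods parameter (the set of methods for which a resource
   accepts range patches) are hypothetical, and 416 is chosen here
   although 400 is equally permitted by the text above.

```python
from typing import Mapping, Optional, Set


def check_range_support(
    method: str, headers: Mapping[str, str], range_methods: Set[str]
) -> Optional[int]:
    """Return an error status for an unsupported ranged request, else None.

    Hypothetical helper; header names are compared case-insensitively.
    """
    has_range = any(name.lower() == "range" for name in headers)
    if method == "GET" or not has_range:
        return None  # nothing specific to range patches to check
    if method in range_methods:
        return None  # supported: the handler must honour the Range header
    return 416  # Range Not Satisfiable (400 would also be acceptable)


print(check_range_support("PUT", {"Range": "bytes=0-1"}, {"PATCH"}))
```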


2.1.  Multiple Range Patches

   Multiple range patches can also be combined in one request.  This is
   done by reusing the multipart/byteranges payload that [RFC7233]
   defines for transferring multiple parts, as described in Section 4.1
   of [RFC7233].

   When issuing a non-GET request to a resource, multiple range patches
   can be provided as well:

      PATCH /api/document/1 HTTP/1.1
      Host: example.com
      Content-Length: 200
      Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

      --THIS_STRING_SEPARATES
      Content-Type: application/json
      Range: json=/foo/bar/2/mo

      42
      --THIS_STRING_SEPARATES
      Content-Type: application/json
      Range: json=/foo/bar/1/no

      "person"
      --THIS_STRING_SEPARATES--

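   A client could assemble such a request body as sketched below.  The
   function name build_multipart_patch and its (range, JSON text) input
   pairs are assumptions of this sketch.  The body is terminated with
   the closing delimiter used by multipart payloads, and Content-Length
   is computed from the encoded body rather than written by hand.

```python
from typing import Dict, List, Tuple


def build_multipart_patch(
    parts: List[Tuple[str, str]], boundary: str = "THIS_STRING_SEPARATES"
) -> Tuple[Dict[str, str], bytes]:
    """Build headers and body for a request carrying several range patches.

    Hypothetical helper: each part is a (Range header value, JSON text) pair.
    """
    lines: List[str] = []
    for range_value, json_text in parts:
        lines += [
            f"--{boundary}",
            "Content-Type: application/json",
            f"Range: {range_value}",
            "",  # empty line separating part headers from part content
            json_text,
        ]
    lines.append(f"--{boundary}--")  # closing delimiter
    body = ("\r\n".join(lines) + "\r\n").encode("utf-8")
    headers = {
        "Content-Type": f"multipart/byteranges; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_multipart_patch(
    [("json=/foo/bar/2/mo", "42"), ("json=/foo/bar/1/no", '"person"')]
)
print(headers["Content-Length"])
print(body.decode("utf-8"))
```
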
2.2.  Stand-Alone Range Patch

   When range patches are transmitted outside of an HTTP session, a
   stand-alone range patch format can be used.  For example, in this
   format a patch can be stored in a file or sent to a mailing list, or
   a version control system can display the patch in the range patch
   format.  The format reuses structure from HTTP and consists of
   headers separated from the patch body by an empty line.  Only the
   Content-Range header is required.  Example:

      Content-Range: json /foo/bar/3/baz

      {"1": {"two": "tree"}}
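
   Reading a single stand-alone range patch can be sketched as follows.
   The function name parse_standalone_patch is hypothetical, and
   accepting both CRLF and bare LF line endings is an assumption of
   this sketch, since the format above does not state which line
   ending is used.

```python
from typing import Tuple


def parse_standalone_patch(raw: bytes) -> Tuple[str, str, bytes]:
    """Split a stand-alone range patch into (unit, range spec, body).

    Hypothetical helper; the body is returned as uninterpreted bytes.
    """
    # Locate the first empty line, whichever line ending style is used.
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(index, sep) for index, sep in found if index != -1]
    if not found:
        raise ValueError("no empty line between headers and body")
    index, sep = min(found)
    head, body = raw[:index], raw[index + len(sep):]

    headers = {}
    for line in head.decode("ascii").splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if "content-range" not in headers:
        raise ValueError("Content-Range header is required")

    unit, _, range_spec = headers["content-range"].partition(" ")
    return unit, range_spec.strip(), body


raw = b'Content-Range: json /foo/bar/3/baz\n\n{"1": {"two": "tree"}}'
print(parse_standalone_patch(raw))
```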

   Additional headers can be provided.  The format can also be used for
   multiple range patches.  In such a case, the patch starts with a
   Content-Type header defining the boundary.  Example:

      Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

      --THIS_STRING_SEPARATES
      Content-Range: json /foo/bar/2/mo

      42
      --THIS_STRING_SEPARATES
      Content-Range: json /foo/bar/1/no

      "person"
      --THIS_STRING_SEPARATES--

   Stand-alone range patches can be transmitted over HTTP as-is as well.
   This can be used to provide the patch which has been used in a
   previous non-GET request.  A Content-Type with "+patch" suffix
   identifies such stand-alone range patch.  For example, the patch used
   in the PATCH request example above could be retrieved as:

      HTTP/1.1 200 OK
      Date: Thu, 31 Oct 2019 07:51:08 GMT
      Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
      Content-Length: 62
      Content-Type: application/json+patch

      Content-Range: json /foo/bar/3/baz

      {"2": {"three": "flour"}}

   Stand-alone range patches are binary data.

2.3.  URI Fragment Identifiers

   For media types which support range patches, ranges can be used as
   URI fragment identifiers as well.  For example, the URI:

      /api/document/1#json=/foo/bar/0

   identifies a fragment with the following content:

      {"some": "thing"}

   Multiple ranges are supported as well and they identify multiple
   fragments:

      /api/document/1#json=/foo/bar/0,/foo/bar/1
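
   Splitting such a fragment identifier into a range unit and its
   ranges can be sketched as below.  The function name
   parse_range_fragment is hypothetical.  Splitting on every "," is an
   assumption of this sketch: this document does not say how a comma
   inside a range specification would be escaped.

```python
from typing import List, Tuple
from urllib.parse import urlsplit


def parse_range_fragment(uri: str) -> Tuple[str, List[str]]:
    """Return (range unit, list of range specifications) from a URI fragment.

    Hypothetical helper; assumes "," never occurs inside a range.
    """
    fragment = urlsplit(uri).fragment
    unit, separator, specs = fragment.partition("=")
    if not unit or not separator or not specs:
        raise ValueError("fragment is not of the form unit=range[,range...]")
    return unit, specs.split(",")


print(parse_range_fragment("/api/document/1#json=/foo/bar/0,/foo/bar/1"))
```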


3.  Range Units

   Range units define how content is parsed into a structure.  Each
   range unit defines a corresponding range specification, which is a
   string describing a range under that unit.

   Different range units can be compatible with content expressed
   through different media types.

3.1.  Bytes Range Unit

   Bytes range unit is already specified in [RFC7233].  We extend it by
   allowing a zero-length range using a zero-length-byte-range-spec.

      zero-length-byte-range-spec = 1*DIGIT

   A zero-length range is a byte offset used to identify a location
   immediately before which new content can be inserted with a patch.

   Additionally, we note that the range "-0" is allowed, is a
   zero-length range, and identifies a location immediately after the
   last byte of data.  This allows appending bytes to data.

   Note that bytes range unit operates on encoded content as specified
   in any Content-Encoding header.  That holds both for GET and non-GET
   requests.
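
   The following sketch resolves a single bytes range specification,
   including the zero-length forms introduced above, to a half-open
   (start, end) pair of offsets.  The function name resolve_bytes_range
   and the half-open result are assumptions of this sketch.  The
   handling of "first-last", "first-", and "-suffix" follows [RFC7233].

```python
import re
from typing import Tuple

_SPEC = re.compile(r"([0-9]*)(-?)([0-9]*)")


def resolve_bytes_range(spec: str, length: int) -> Tuple[int, int]:
    """Resolve one bytes range spec to half-open offsets (start, end).

    Hypothetical helper; length is the current size of the data in bytes.
    """
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"malformed range {spec!r}")
    first, dash, last = match.groups()
    if not dash:
        if not first:
            raise ValueError("empty range")
        start = end = int(first)  # zero-length-byte-range-spec
    elif not first:
        if not last:
            raise ValueError("malformed range '-'")
        count = min(int(last), length)  # suffix form; "-0" selects nothing
        start, end = length - count, length
    else:
        start = int(first)
        end = min(int(last) + 1, length) if last else length
        if start >= end:
            raise ValueError(f"range {spec!r} not satisfiable")
    if start > length:
        raise ValueError(f"range {spec!r} not satisfiable")
    return start, end


print(resolve_bytes_range("21010-47021", 47022))
print(resolve_bytes_range("-0", 47022))
```

   With this representation, "-0" resolves to an empty range located
   at the end of the data, so a patch at that range appends bytes.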


3.2.  JSON Range Unit

   The JSON range unit operates on JSON and JSON-compatible data
   structures.  Its range specification is based on JSON Pointer as
   described in [RFC6901].  The content of the range MUST always be
   valid JSON by itself.  The JSON range unit is identified with "json".

   JSON Pointer provides capabilities to identify a single element of a
   data structure.  Here we extend it to allow a range of elements for
   arrays and strings, by extending the rules in Section 4 of [RFC6901]
   for how a reference token modifies which value is referenced:

   o If the currently referenced value is a JSON array, the reference
     token can be composed of two sets of digits (according to the
     ABNF syntax for array indices as specified in Section 4 of
     [RFC6901]), delimited by the character "-".  Each set of digits
     represents an unsigned base-10 integer value.  The first integer
     value MUST be smaller than the number of elements in the array.
     The second integer value MUST be smaller than or equal to the
     number of elements in the array.  The second integer value MUST be
     larger than or equal to the first integer value.  If any of these
     requirements is violated, an error condition is raised.

     The new referenced value is a new array with a subset of elements
     starting at the zero-based index of the first integer value, and
     ending at the element before the zero-based index of the second
     integer value (the first index is inclusive, the second index is
     exclusive).

   o If the currently referenced value is a JSON array, the reference
     token can be the character "-".  The new referenced value is a
     zero-length array corresponding to the position immediately after
     the end of the current array.  This design makes such JSON pointer
     compatible with the use of JSON pointers in JSON Patch [RFC6902].
     This allows appending array elements to an array.

   o If the currently referenced value is a JSON string, the scheme for
     JSON arrays is used to index into a string and makes the new
     referenced value a substring of the currently referenced value.
     String indexing is done by code units.

   A range of elements can be specified only as the last reference token
   in JSON range.  It follows that a range of elements can be specified
   only once.

   For example, given the JSON document:

      {
        "foo": [
          "bar",
          "baz",
          "bax"
        ]
      }

   The following JSON strings evaluate to the accompanying JSON values:

      "/foo"       ["bar", "baz", "bax"]
      "/foo/0"     "bar"
      "/foo/0-1"   ["bar"]
      "/foo/1-3"   ["baz", "bax"]
      "/foo/1-1"   []
      "/foo/-"     []
      "/foo/3-3"   // error
      "/foo/4-4"   // error
      "/foo/1-0"   // error
      "/foo/1-4"   // error
      "/foo/1-3/0" // error
      "/foo/0/1-3" "ar"

   The JSON ranges "/foo/1-1" and "/foo/-" are of little utility on
   their own, but serve as zero-length ranges to identify a location
   immediately before which new content can be inserted with a patch.
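
   The evaluation rules of this section can be sketched in Python as
   follows.  The function name evaluate_json_range is hypothetical.
   Two choices are assumptions of this sketch: a single index applied
   to a string is rejected, and strings are indexed by Python code
   points, which coincide with code units only for some encodings and
   characters.

```python
import re

_INDEX = r"(?:0|[1-9][0-9]*)"  # array index syntax from RFC 6901
_SINGLE = re.compile(_INDEX)
_RANGE = re.compile(rf"({_INDEX})-({_INDEX})")


def evaluate_json_range(document, pointer):
    """Evaluate a JSON range (an extended JSON pointer) against document."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a JSON range must be empty or start with '/'")
    # Unescape per RFC 6901: "~1" first, then "~0".
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in pointer[1:].split("/")]
    value = document
    for position, token in enumerate(tokens):
        is_last = position == len(tokens) - 1
        if isinstance(value, dict):
            if token not in value:
                raise ValueError(f"no member named {token!r}")
            value = value[token]
            continue
        if not isinstance(value, (list, str)):
            raise ValueError(f"cannot apply {token!r} to a scalar")
        size = len(value)
        match = _RANGE.fullmatch(token)
        if token == "-" or match:
            if not is_last:
                raise ValueError("a range must be the last reference token")
            if token == "-":
                start = end = size  # position just after the last element
            else:
                start, end = int(match.group(1)), int(match.group(2))
                if not (start < size and start <= end <= size):
                    raise ValueError(f"range {token!r} is out of bounds")
            value = value[start:end]
        elif isinstance(value, list) and _SINGLE.fullmatch(token):
            if int(token) >= size:
                raise ValueError(f"index {token} is out of bounds")
            value = value[int(token)]
        else:
            raise ValueError(f"invalid reference token {token!r}")
    return value


doc = {"foo": ["bar", "baz", "bax"]}
print(evaluate_json_range(doc, "/foo/1-3"))
print(evaluate_json_range(doc, "/foo/0/1-3"))
```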

   JSON range unit operates always on non-encoded content, ignoring any
   Content-Encoding header.  That holds both for GET and non-GET
   requests.


3.3.  Lines Range Unit

   For textual contents, the lines range unit operates on lines.  Line
   positions are numbered starting with zero (with line position zero
   always being identical to character position zero).  Ranges
   identified by lines include the line endings.  If the content does
   not contain any line endings, then it consists of a single (the
   first) line.

   Implementers should be aware of the fact that line endings in textual
   contents can be represented by other characters or character
   sequences than CR+LF.  Besides the CR and LF, there are also NEL and
   CR+NEL.  In general, the encoding of line endings can also depend on
   the character encoding of textual contents, and implementations have
   to take this into account where necessary.

   Lines range unit is identified with "lines".  Lines range
   specification is defined by:

      lines-range-spec = first-line "-" second-line
      first-line = 1*DIGIT
      second-line = 1*DIGIT

   Each lines range consists of two sets of digits, delimited by the
   character "-".  Each set of digits represents an unsigned base-10
   integer value.  The first integer value MUST be smaller than the
   number of lines in contents.  The second integer value MUST be
   smaller than or equal to the number of lines in contents.  The second
   integer value MUST be larger than or equal to the first integer
   value.

   The range consists of the lines starting at the line corresponding to
   the first integer value and ending at the line before the line
   corresponding to the second integer value (the first integer is
   inclusive, the second integer is exclusive).

   A lines range where the first and second integer values are equal is
   empty and is of little utility on its own, but serves as a
   zero-length range to identify a location immediately before which new
   content can be inserted with a patch.

   Additionally, a lines range specification can be the character "-",
   which represents a zero-length range and identifies a location
   immediately after the last line of textual contents.  This allows
   appending lines to textual contents.

   Lines range unit operates always on non-encoded content, ignoring any
   Content-Encoding header.  That holds both for GET and non-GET
   requests.
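
   Selecting a lines range from decoded text can be sketched as
   follows.  The function name select_lines is hypothetical.  The
   sketch relies on Python's str.splitlines, which recognizes CR, LF,
   CR+LF, and NEL but also several other separators, and which treats
   CR+NEL as two line endings, so it is an approximation of the rules
   above rather than an exact implementation.

```python
import re

_LINES = re.compile(r"([0-9]+)-([0-9]+)")


def select_lines(text: str, spec: str) -> str:
    """Return the part of text selected by a lines range specification.

    Hypothetical helper; line endings are kept as part of each line.
    """
    # Content without any line ending still consists of a single line.
    lines = text.splitlines(keepends=True) or [""]
    if spec == "-":
        return ""  # zero-length range just after the last line
    match = _LINES.fullmatch(spec)
    if match is None:
        raise ValueError(f"malformed lines range {spec!r}")
    first, second = int(match.group(1)), int(match.group(2))
    if not (first < len(lines) and first <= second <= len(lines)):
        raise ValueError(f"lines range {spec!r} is out of bounds")
    return "".join(lines[first:second])


text = "a\nb\r\nc"
print(repr(select_lines(text, "0-1")))
print(repr(select_lines(text, "1-3")))
```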


4.  IANA Considerations

4.1.  Range Unit Registrations

   This document registers the following range units:

   +-------------+---------------------------------------+-------------+
   | Range Unit  | Description                           | Reference   |
   | Name        |                                       |             |
   +-------------+---------------------------------------+-------------+
   | json        | a JSON pointer range on JSON and      | Section 3.2 |
   |             | JSON-compatible data structures       |             |
   +-------------+---------------------------------------+-------------+
   | lines       | a range of lines of textual contents  | Section 3.3 |
   +-------------+---------------------------------------+-------------+

   The change controller is: "IETF (iesg@ietf.org) - Internet
   Engineering Task Force".

4.2.  The +patch Structured Syntax Suffix

   This document registers the following media type structured syntax
   suffix:

   Name:  Range patch

   +suffix:  +patch

   References:  See Section 2.2 of this document.

   Encoding considerations:  Stand-alone range patches are binary data.

   Fragment identifier considerations:

      The syntax and semantics of fragment identifiers specified for
      +patch SHOULD be as specified for range patches themselves.  (At
      publication of this document, there is no fragment identification
      syntax defined for range patches themselves.)

      The syntax and semantics for fragment identifiers for a specific
      "xxx/yyy+patch" SHOULD be processed as follows:

         For cases defined in +patch, where the fragment identifier
         resolves per the +patch rules, then process as specified in
         +patch.

         For cases defined in +patch, where the fragment identifier does
         not resolve per the +patch rules, then such fragment SHOULD
         identify a fragment which is obtained by intersection of the
         fragment identifier and the underlying range patch range
         specification for "xxx/yyy+patch".

         For cases not defined in +patch, then such fragment SHOULD
         identify a fragment which is obtained by intersection of the
         fragment identifier and the underlying range patch range
         specification for "xxx/yyy+patch".

   Interoperability considerations:  n/a

   Security considerations:  See Section 7 of this document.

   Contact:  IETF HTTP Working Group (ietf-http-wg@w3.org)

   Author/Change controller:

      IETF (iesg@ietf.org) - Internet Engineering Task Force


5.  Checking Capabilities

   A server may or may not support non-GET requests with a Range
   header.  The default behavior of servers is simply to ignore unknown
   or unsupported headers.  In the case of a range patch, this
   implies that a request issuing a patch to a specific subsection of a
   resource might be interpreted by a server as a request to overwrite
   the entire resource with the patch, leaving the resource in a
   corrupted state.

   To determine whether or not the server can fulfill such a request
   correctly, the requester may first issue an OPTIONS request:

      OPTIONS /api/document/1 HTTP/1.1
      Host: example.com
      Range-Request-Method: PATCH
      Range-Request-Units: json,bytes

   To which the server may reply in the affirmative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods: PATCH
      Range-Request-Allow-Units: json,bytes
      Version: 33a64df551425fcc55e4d42a148795d9f25f89d4

   In the partial negative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods: PATCH
      Range-Request-Allow-Units: json
      Version: 33a64df551425fcc55e4d42a148795d9f25f89d4

   Or in the complete negative:

      HTTP/1.1 204 No Content
      Connection: keep-alive
      Range-Request-Allow-Methods:
      Range-Request-Allow-Units:

   Empty header fields are allowed per Section 2.1 of [RFC2616].

   Also note the presence of the Version header, discussed in
   Section 6.  The server may preemptively send this to obviate the need
   for another GET prior to a range patch request.
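
   On the requester side, interpreting the OPTIONS response can be
   sketched as below.  The function name supports_range_patch is
   hypothetical, and the sketch assumes the response headers are
   already available as a mapping of names to values.

```python
from typing import List, Mapping


def _parse_list(value: str) -> List[str]:
    """Split a comma-separated header value, dropping empty elements."""
    return [item.strip() for item in value.split(",") if item.strip()]


def supports_range_patch(
    headers: Mapping[str, str], method: str, unit: str
) -> bool:
    """Tell whether an OPTIONS response allows method with the range unit.

    Hypothetical helper; a missing or empty header means "not allowed".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    methods = _parse_list(lowered.get("range-request-allow-methods", ""))
    units = _parse_list(lowered.get("range-request-allow-units", ""))
    return method in methods and unit in units


partial = {
    "Range-Request-Allow-Methods": "PATCH",
    "Range-Request-Allow-Units": "json",
}
print(supports_range_patch(partial, "PATCH", "json"))
print(supports_range_patch(partial, "PATCH", "bytes"))
```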


6.  Race Conditions

   As with standard PUT, POST, and PATCH requests, a non-GET request
   with a Range header carries the risk of a mid-air collision with
   another simultaneous request.  If one requester updates a resource,
   and another requester, not being aware of that update, issues a
   second update, the resource may be left in an unexpected state.

   Standard PUT, POST, and PATCH requests handle this with the ETag and
   If-Match headers.  However, the values of these headers vary based
   on the Content-Encoding of the request.  Alternatively, requests can
   use the versioning in [Braid-HTTP] to determine the ordering of
   simultaneous requests, and can specify consistency guarantees with
   [Merge-Types].

   The server may return a Version header in response to HTTP requests
   directed at a given resource.  Correspondingly, when issuing a range
   patch, the requester may include a Version header containing the
   version of the resource it intends to update.  If the server cannot
   merge the patch at the given version, it must return a 409 Conflict
   response.
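
   The simplest possible server behavior, one that never merges and
   only accepts patches made against the current version, can be
   sketched as below.  Everything here is an assumption made for
   illustration: the class name Resource, deriving the version from a
   SHA-1 digest of the content, and returning 204 on success.  The
   actual versioning scheme is the one in [Braid-HTTP].

```python
import hashlib
from typing import Optional


class Resource:
    """Hypothetical in-memory resource guarding patches with a version."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    @property
    def version(self) -> str:
        # Assumption of this sketch: the version is a digest of the content.
        return hashlib.sha1(self.data).hexdigest()

    def patch(self, start: int, end: int, content: bytes,
              version: Optional[str] = None) -> int:
        """Apply a byte-range patch and return an HTTP status code."""
        if version is not None and version != self.version:
            return 409  # Conflict: this sketch cannot merge older versions
        self.data = self.data[:start] + content + self.data[end:]
        return 204


resource = Resource(b"hello world")
seen = resource.version  # both requesters read the same version

first = resource.patch(6, 11, b"there", version=seen)
second = resource.patch(0, 5, b"howdy", version=seen)
print(first, second, resource.data)
```

   The second request is rejected because the resource changed after
   the requester last observed it.  A server able to merge concurrent
   patches would apply it instead.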

7.  Security Considerations

   Both GET and non-GET requests with a Range header are potentially
   susceptible to denial-of-service attacks because of the effort
   required to compute or apply the patch.

8.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

9.  Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7233]  Fielding, R., Lafon, Y., and J. Reschke, "Hypertext
              Transfer Protocol (HTTP/1.1): Range Requests", RFC 7233,
              June 2014.

   [RFC6901]  Bryan, P., Zyp, K., and M. Nottingham, "JavaScript Object
              Notation (JSON) Pointer", RFC 6901, April 2013.

   [RFC6902]  Bryan, P. and M. Nottingham, "JavaScript Object Notation
              (JSON) Patch", RFC 6902, April 2013.


10.2.  Informative References

   [Merge-Types] draft-toomim-httpbis-merge-types-00

   [Braid-HTTP] draft-toomim-httpbis-braid-http-00

   [RFC8259]  Bray, T., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 8259, December 2017.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee,
              "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616,
              June 1999.

Authors' Addresses

   For more information, the authors of this document are best contacted
   via Internet mail:


   Mitar Milutinovic
   UC Berkeley, EECS Department
   775 Soda Hall #1776
   Berkeley, CA 94720-1776

   EMail: mitar.ietf@tnode.com
   Web:   https://mitar.tnode.com/


   Michael Toomim
   Invisible College, Berkeley
   2053 Berkeley Way
   Berkeley, CA 94704

   EMail: toomim@gmail.com
   Web:   https://invisible.college/@toomim


   Bryn Bellomy
   Invisible College, Berkeley
   2053 Berkeley Way
   Berkeley, CA 94704

   EMail: bryn@signals.io
   Web:   https://invisible.college/@bryn