Skip to main content

Header Delta-Compression for HTTP/2.0
draft-rpeon-httpbis-header-compression-00

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Replaced".
Author Roberto Peon
Last updated 2012-10-13
Replaced by draft-ietf-httpbis-header-compression, RFC 7541
RFC stream (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-rpeon-httpbis-header-compression-00
Network Working Group                                            R. Peon
Internet-Draft                                               Google, Inc
Intended status: Informational                              Oct 12, 2012
Expires: April 15, 2013

                 Header Delta-Compression for HTTP/2.0
               draft-rpeon-httpbis-header-compression-00

Abstract

   This document describes a mechanism for compressing streams of groups
   of key-value pairs, often known as Headers in an HTTP session.  See
   RFC 2616 [RFC2616] or successors for more information about headers.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 15, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Peon                     Expires April 15, 2013                 [Page 1]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

Table of Contents

   1.  Overview . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Definitions  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Example-data-model . . . . . . . . . . . . . . . . . . . . . .  3
   4.  Compression-overview . . . . . . . . . . . . . . . . . . . . .  4
   5.  Decompression-overview . . . . . . . . . . . . . . . . . . . .  6
   6.  On-the-wire-overview . . . . . . . . . . . . . . . . . . . . .  7
   7.  Operations . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 10
   9.  Requirements Notation  . . . . . . . . . . . . . . . . . . . . 10
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
   11. Normative References . . . . . . . . . . . . . . . . . . . . . 10
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 10

Peon                     Expires April 15, 2013                 [Page 2]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

1.  Overview

   There have been several problems pointed out with the use of the gzip
   compressor in SPDY [SPDY].  The biggest of these problems is that it
   is possible for a smart attacker to inject content into the
   compressor, and then to test hypotheses about the prior contents of
   the compressor by examining the output size after each such content
   injection.  The other issue is that gzip often consumes more CPU than
   many would like, especially in situations where one is doing forward
   or reverse proxying.  The compressor proposed here intends to solve
   the first issue and significantly mitigate the second, while
   providing compression that is not too much worse than gzip.

2.  Definitions

         user-agent: The program or device which a human interacts with
         directly and which typically initiates the transport layer
         connection or session

         client: Synonym for user-agent

         server: The computer or device which typically accepts a
         connection, stores, and serves data

         proxy: An entity acting as a server for the client, and a
         client for the server

         header: An complete set of key-value pairs, either request-
         headers, or response headers as defined in RFC2616 [RFC2616]
         section 5.3 or 6.2, respectively

3.  Example-data-model

   The author's current experimental implementation uses something very
   close to the following data-model for doing compression.

Peon                     Expires April 15, 2013                 [Page 3]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

            struct ValEntry {
              KeyMapIterator krt;
              ValMapIterator vrt;
              LRUIterator lrt;
              Index val_idx;
            };

            typedef map<string, ValEntry*> ValMap;

            struct KeyEntry {
              int refcnt;
              Index key_idx;
              ValMap valmap;
            }
            typedef map<string, KeyEntry> KeyMap;
            typedef set<Index> HeaderGroup;
            typedef map<Index, HeaderGroup> HeaderGroups;

            struct CompressorState {
              HeaderGroups header_groups;
              KeyMap key_map;
              list<ValEntry*> lru;
            };

            typedef map<KeyIdx, string>; KeyIdxMap;
            typedef map<ValIdx, pair<KeyIdxMap::iterator,string>; ValIdxMap;

            struct DecompressorState {
              KeyIdxMap key_idx_map;
              ValIdxMap val_idx_map;
              list<KeyIdxMap::iterator> lru;
            };

4.  Compression-overview

   Note that this only describes conceptually how the algorithm works
   and is very much pseudo-code

   User-agents and servers SHOULD NOT NOT use EREF operations.

   User-agents and servers SHOULD use huffman-coding when so doing
   reduces the size of the bytes on the wire

   Proxies MAY use EREF operations when throughput is more important
   than client experienced latency.

Peon                     Expires April 15, 2013                 [Page 4]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

   Proxies MAY choose to encode all strings as literal strings and skip
   huffman coding.

Compress(headers):
  first_used_val_idx = CurrentValIdx()
  old_keymap = keymap

  For header in headers:
    if key in keymap:
      keymap[key].refcnt += 1

  for header in headers:
    key_map_iterator = lookup(header.key)
    if (!key_map_iterator):
      key_iterator = insert_new_key(header.key)
      insert_new_value_from_key(key_iterator, header.val)
      instructions.append(KVSto(key,val))
    else if (val_iterator = lookup_val_in_keymap_entry(key_map_iterator,
                                                        header.val)):
      // we've found that the key,value pair already exists in the compressor
      // we now must check to see if it is already visible, in which case
      // there is nothing to do. If it isn't visible, however, we'll toggle
      // the visibility on.
      if (!header_group.includes(val_iterator_>val_idx)):
        instructions.append(Toggle(val_iterator_>val_idx));
    else:
      // we know we found a key, but not the entire key_value.
      // thus, we emit a clone() instructions.
      instructions.append(Clone(key_iterator_>key_id, header.val)

    if key in old_keymap:
      keymap[key].refcnt += 1

    for header_group_entry in header_group:
      if header_group_entry not in header:
        instructions.append(header_group_entry_>idx())
    for instruction in instructions:
      ExecuteInstruction(instructions)
    SerializeInstructions(output_buffer, instructions, first_used_val_idx)

ExecuteInstruction(instruction):
  while (instruction.size_delta() + state_size &gt; max_state_size):
    RemoveOldestLRUItem()
  if instruction.opcode == "clone":
    InsertKV(instruction.lookup_key(), instruction.val(), GetNextLRUIdx())
  elif instruction.opcode == "kvsto":
    InsertKV(instruction.key(), instruction.val(), GetNextLRUIdx())

Peon                     Expires April 15, 2013                 [Page 5]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

  elif instruction.opcode == "togglerange":
    for (i=instruction.start; i &lt;= instruction.last; ++i)
      bool prev_state = ToggleIdx(i)
      if prev_state == NOT_VISIBLE:
        MoveToFrontOfLRU(i)
  elif instruction.opcode == "toggle":
    bool prev_state = ToggleIdx(instruction.index)
    if prev_state == NOT_VISIBLE:
      MoveToFrontOfLRU(instruction.index)

SerializeInstructions(output_buffer, instructions, first_val_idx):
  // Note that huffman-encoding is optional-- a bit (somwhere) would
  // indicate that it was or was not used for any particular string.
  output_buffer.WriteValIdx(first_val_idx)
  for toggle in filter(TOGGLE, instructions):
    output_buffer.WriteBits(TOGGLE)
    output_buffer.WriteValIdx(toggle.val_idx)
  for toggle in filter(TOGGLE_RANGE, instructions):
    output_buffer.WriteBits(TOGGLE_RANGE)
    output_buffer.WriteValIdx(toggle.start)
    output_buffer.WriteValIdx(toggle.last)
  for clone in filter(CLONE, instructions):
    output_buffer.WriteBits(CLONE)
    output_buffer.WriteKeyIdx(clone.key_idx)
    output_buffer.WriteString(huffman.encode(clone.val))
  for kvsto in filter(KVSTO, instructions):
    output_buffer.WriteString(huffman.encode(kvsto.key))
    output_buffer.WriteString(huffman.encode(kvsto.val))
  for eref in filter(EREF, instructions):
    output_buffer.WriteString(huffman.encode(kvsto.key))
    output_buffer.WriteString(huffman.encode(kvsto.val))

5.  Decompression-overview

   Note that this only describes conceptually how the algorithm works
   and is very much pseudo-code

Peon                     Expires April 15, 2013                 [Page 6]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

  Decode():
    header_group = headers.header_group // as received from sender
    val_idx = headers_frame.first_val_idx // as received from sender.
    header_groups[header_group].ephemereal_headers.clear()
    for instruction in instructions:
      ExecuteInstruction(instruction, header_group)

    ExecuteInstruction(instruction, header_group):
      while (instruction.size_delta() + state_size &gt; max_state_size):
        RemoveOldestLRUItem()
      if instruction.opcode == TOGGLE:
        if instruction.idx in header_groups[header_group]:
          header_groups[header_group].remove(instruction.idx)
        else:
          header_groups[header_group].insert(instruction.idx)
          MoveToHeadOfLRU(instruction.idx)
      elif instruction.opcode == TOGGLE_GROUP:
        for i=instruction.begin; i <= instruction.last; ++i:
          ExecuteInstruction(Toggle(i), header_group);
      elif instruction.opcode == CLONE:
        KeyIdxMap::iterator kimi = key_idx_map.lookup(instruction.key_idx)
        val_idx_map.insert(++val_idx, pair(kimi, instruction.val))
      elif instruction.opcode == KVSTO:
        key_idx = GetNextKeyIdx()
        KeyIdxMap::iterator kimi = key_idx_map.insert(key_idx,
                                                      instruction.key)
        val_idx_map.insert(++val_idx, pair(kimi, instruction.val))
      elif instruction.opcode == EREF:
        header_groups[header_group].ephemereal_headers.append(
            pair(instruction.key, instruction.val))

    // This would return the complete set of headers, e.g. for an HTTP
    // request or response

    InflateHeaders(header_group):
      Headers headers
      for idx in header_groups[header_group]:
        headers.append(ValIdxMap[idx].first->first, ValIdxMap[idx].second)
      headers.append(header_groups[header_group].ephemereal_headers);
      return headers

6.  On-the-wire-overview

   This would replace the current HEADERS frame described in the SPDY/2
   draft specification.

Peon                     Expires April 15, 2013                 [Page 7]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

   The header-group-id MUST be emitted immediately after the HEADERS
   frame boilerplate.

   A field indicating the start of the ID-sequence space MUST be
   immediately after the header-group-id

   A sequence of 'Toggle' or 'ToggleRange' operations MUST follow, if
   any of these operations are to be emitted at all

   A sequence of 'Clone' or 'KVSto' operations MUST then follow, if any
   of these operations are to be emitted at all

   A sequence of 'ERef' operations MUST then follow, if any of these are
   to be emitted.

   If the size of the aggregate set of instructions is larger than the
   frame-length, the 'finished' bit of the HEADERS frame boilerplate
   MUST be set to 0.  This will indicate that the next subsequent
   HEADERS frame on that stream is a continuation of a previous HEADERS
   frame

   If all of the 'Toggle', 'ToggleRange', 'Clone', and 'KVSTo'
   operations have been emitted, the 'finished' bit of the HEADERS frame
   which encapsulates the last instruction MUST be set to 1.

   The rough ABNF for this would be:

HEADERS_FRAME: HEADER_FRAME_BOILERPLATE GROUP-ID FIRST-VAL-IDX INSTRUCTIONS
HEADER_FRAME_BOILERPLATE: LENGTH STREAM_ID HEADERS_OPCODE HEADER_DONE_FLAG
HEADER_DONE_FLAG: (TRUE | FALSE)
INSTRUCTIONS : *(TOGGLE | TOGGLERANGE) *( CLONE | KVSTO) *(EREF)
TOGGLE: TOGGLE_OPCODE VAL_IDX
TOGGLERANGE: TOGGLERANGE_OPCODE VAL_IDX_START VAL_IDX_END
CLONE:  CLONE_OPCODE KEY_IDX VAL_STRING
KVSTO:  KVSTO_OPCODE KEY_STRING VAL_STRING
EREF:   EREF_OPCODE KEY_STRING VAL_STRING

KEY_STRING: MAYBE_COMPRESSED_STRING
VAL_STRING: MAYBE_COMPRESSED_STRING
MAYBE_COMPRESSED_STRING:  TRUE HUFFMAN_COMPRESSED_STRING |
                         FALSE STRING_LITERAL
VAL_IDX_START: int
VAL_IDX_END: int
VAL_IDX: int

   A complete set of headers has only been expressed when the HEADERS
   frame has the HEADERS_DONE bit set to true.  If the headers being

Peon                     Expires April 15, 2013                 [Page 8]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

   serialized do not fit into one HEADERS frame, the HEADERS_DONE flag
   for all but the last HEADERS frame MUST be set to false

   There are alternate serializations which have various interesting
   properties.  Here are some interesting variations among these
   alternatives:

         Serialize all constant-length information before serializing
         any string data.  This allows for a clear separation of byte-
         aligned vs. potentially non-byte-aligned data

         Have explicit sections for each opcode type, with a sentinel
         denoting the end of each range of opcode.  Since the number of
         opcode types is small, this could reduce the overhead consumed
         in repeating the opcode.

         As above, but have a count for the number of each opcode
         instead of using a sentinel

         Use huffman coding on the entire payload of the HEADERS frame,
         not just the strings.  TOGGLES are far more common than other
         operations, and using fewer bits for these could be beneficial

7.  Operations

   Toggle: This encodes a single ValIdx, which, if already present in
   the HeaderGroup associated with the current HEADERS frame, will be
   removed, else added.  If added, then the LRU entry which the index
   references MUST be moved to the head of the LRU.

   ToggleRange: This incodes two ValIdx, denoting an inclusive range of
   ValIdx which shall be toggled as per the Toggle operation, above.

   Clone: This encodes a KeyIdx which references a pre-existing key, and
   references a new string.  The Clone operation inserts a new ValEntry
   into the ValMap associated with the KeyMap entry identified by the
   KeyIdx.  The ValIdx for this new ValEntry is the ++current_val_idx;
   The new ValEntry MUST be inserted at the head of the LRU.

   KVSto: This encodes a new key string and new value string.  The key
   will be inserted into the KeyMap, and assigned a new KeyIdx.  The
   value string will be inserted into the Valmap associated with the
   KeyMap entry that was just added.  As with Clone, a new unique ValIdx
   will be associated with it in a monotonically increasing sequence by
   using the value of ++current_val_idx.  The new ValEntry MUST be
   inserted at the head of the LRU.

Peon                     Expires April 15, 2013                 [Page 9]
Internet-Draft          HTTP/2 Header Compression               Oct 2012

   Eref: This encodes a key and value literal which, while interpreted
   as part of the header frame, doesn't cause any persistent change in
   the compressor state.  ERef stands for "ephemereal reference".

8.  Security Considerations

   The compressor algorithm described here is expected to be immune to
   the current attacks against encrypted stream-based compressors such
   as TLS+gzip, but more scrutiny is warranted.  The reason that it is
   believed that the algorithm(s) expressed here is immune is that any
   backreference to a header key or value always requires a whole-text
   match, and thus any probe of the compression context confirms no
   hypothesis unless the attacker has guessed the entire plaintext.

9.  Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

10.  Acknowledgements

11.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [SPDY]     Belshe, M. and R. Peon, "SPDY PROTOCOL",
              <http://tools.ietf.org/html/draft-mbelshe-httpbis-spdy>.

Author's Address

   Roberto Peon
   Google, Inc

   Email: fenix@google.com

Peon                     Expires April 15, 2013                [Page 10]