The binarypack JSON-like representation format
draft-bormann-apparea-bpack-00
The information below is for an old version of the document.
| Document | Type | Active Internet-Draft (individual) | |
|---|---|---|---|
| Author | Carsten Bormann | ||
| Last updated | 2012-10-11 | ||
| Stream | (None) | ||
| Formats | plain text htmlized pdfized bibtex | ||
| Stream | Stream state | (No stream defined) | |
| Consensus boilerplate | Unknown | ||
| RFC Editor Note | (None) | ||
| IESG | IESG state | I-D Exists | |
| Telechat date | (None) | ||
| Responsible AD | (None) | ||
| Send notices to | (None) |
draft-bormann-apparea-bpack-00
APP area C. Bormann
Internet-Draft Universitaet Bremen TZI
Intended status: Informational October 11, 2012
Expires: April 14, 2013
The binarypack JSON-like representation format
draft-bormann-apparea-bpack-00
Abstract
JSON (RFC 4627) is an extremely successful format for the
representation of structured information, supporting Boolean values,
numbers, strings, arrays, and tables. Recently, a number of
applications have started to look for binary representation formats
that solve a similar problem. In particular, constrained node
networks can benefit from such a binary representation format.
A very successful binary representation that is otherwise comparable
to JSON is MessagePack. Recently, a slight modification of
MessagePack has been defined that allows for distinguishing UTF-8
strings from binary data. This draft, as an independent effort,
documents the resulting BinaryPack format.
The current version -00 of this document is an early draft that is
intended to spark further discussion.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 14, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
Bormann Expires April 14, 2013 [Page 1]
Internet-Draft binarypack October 2012
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3
1.2. Notation . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. The binarypack Representation Format . . . . . . . . . . . . . 5
2.1. Data Types . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2. Integers . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3. Floating Point Values . . . . . . . . . . . . . . . . . . 6
2.4. Special Values . . . . . . . . . . . . . . . . . . . . . . 6
2.5. Opaque Byte Strings . . . . . . . . . . . . . . . . . . . 7
2.6. UTF-8 Strings . . . . . . . . . . . . . . . . . . . . . . 7
2.7. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.8. Tables . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1. JSON roundtripping . . . . . . . . . . . . . . . . . . . . 9
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
5. Security Considerations . . . . . . . . . . . . . . . . . . . 11
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
7.1. Normative References . . . . . . . . . . . . . . . . . . . 13
7.2. Informative References . . . . . . . . . . . . . . . . . . 13
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 14
Bormann Expires April 14, 2013 [Page 2]
Internet-Draft binarypack October 2012
1. Introduction
(To be written - for now please see the Abstract.)
A description of the MessagePack binary representation format can be
found in [msgpack].
An implementation of BinaryPack is available in [binarypack].
1.1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
The term "byte" is used in its now customary sense as a synonym for
"octet".
All multi-byte integers in this protocol are interpreted in network
byte order.
Where arithmetic is used, this specification uses the notation
familiar from the programming language C, except that the operator
"**" stands for exponentiation.
1.2. Notation
This specification uses a trivial notation for code bytes and the
bitfields in them the meaning of which should be mostly obvious.
More formally speaking, the meaning of the notation is:
Potential values for the code bytes themselves are expressed by
templates that represent 8-bit most-significant-bit-first binary
numbers (without any special prefix), where 0 stands for 0, 1 for 1,
and variable segments in these code byte templates are indicated by
sequences of the same letter such as kkkkkkk or ssss, the length of
which indicates the length of the variable segment in bits.
In the notation of values derived from the code bytes, 0b is used as
a prefix for expressing binary numbers in most-significant-bit first
notation (akin to the use of 0x for most-significant-digit-first
hexadecimal numbers in the C programming language). Where the
abovementioned sequences of letters are then referenced in such a
binary number in the text, the intention is that the value from these
bitfields in the actual code byte be inserted.
Example: The code byte template
Bormann Expires April 14, 2013 [Page 3]
Internet-Draft binarypack October 2012
101nssss
stands for a byte that starts (most-significant-bit-first) with the
bits 1, 0, and 1, and continues with five variable bits, the first of
which is referenced as "n" and the next four are referenced as
"ssss". Based on this code byte template, a reference to
0b0ssss000
means a binary number composed from a zero bit, the four bits that
are in the "ssss" field (for 101nssss, the four least significant
bits) in the actual byte encountered, kept in the same order, and
three more zero bits.
Also, 0xhh stands for the hexadecimal value hh, and 1b, 2b, 4b, 8b,
nb stand for 1, 2, 4, 8, or n bytes of data following; (1b) etc.
stand for the numerical value as an integer interpreted in network
byte order; nd stands for n data objects, each in binarypack
representation format.
Bormann Expires April 14, 2013 [Page 4]
Internet-Draft binarypack October 2012
2. The binarypack Representation Format
2.1. Data Types
The binarypack representation format is able to represent the
following data types:
o Integers (represented in signed and unsigned forms)
o Floating point values (in IEEE 754 32-bit and 64-bit forms)
o special values nil, false, true
o opaque ("raw") byte strings
o UTF-8 strings
o arrays, which can contain any combination of data types
o tables (often called maps, hashes, dictionaries; objects in JSON),
which contain pairs, key and value, which may in turn be of any
data type
This list is mostly faithful to JSON [RFC4627], which however does
not distinguish integer from floating point number types. Based on
recent discussions on the use of binary representation formats, this
specification distinguishes UTF-8 strings from opaque binary strings.
(Interestingly, this was already done in the binaryjs implementation
of a "95 % MessagePack" format [binarypack], so the author of the
present specification could just lazily copy that.)
2.2. Integers
binarypack provides a number of representations for integer values,
assuming that these occur often. The encoder is free to choose any
of these representations that is able to represent the desired value.
Bormann Expires April 14, 2013 [Page 5]
Internet-Draft binarypack October 2012
+----------+--------------+---------------------------+
| Bits | Value | Description |
+----------+--------------+---------------------------+
| 0nnnnnnn | 0bnnnnnnn | Positive Fixnum (0..127) |
| | | |
| 111nnnnn | 0bnnnnn - 32 | Negative Fixnum (-32..-1) |
| | | |
| 0xcc 1b | 1b as uint | Unsigned Integer |
| | | |
| 0xcd 2b | 2b as uint | Unsigned Integer |
| | | |
| 0xce 4b | 4b as uint | Unsigned Integer |
| | | |
| 0xcf 8b | 8b as uint | Unsigned Integer |
| | | |
| 0xd0 1b | 1b as sint | Signed Integer |
| | | |
| 0xd1 2b | 2b as sint | Signed Integer |
| | | |
| 0xd2 4b | 4b as sint | Signed Integer |
| | | |
| 0xd3 8b | 8b as sint | Signed Integer |
+----------+--------------+---------------------------+
2.3. Floating Point Values
binarypack provides 32-bit and 64-bit IEEE 754 values.
+---------+-----------------------+-------------+
| Bits | Value | Description |
+---------+-----------------------+-------------+
| 0xca 4b | 4b as 32-bit IEEE 754 | Float |
| | | |
| 0xcb 8b | 8b as 64-bit IEEE 754 | Double |
+---------+-----------------------+-------------+
2.4. Special Values
+------+-------+---------------+
| Bits | Value | Description |
+------+-------+---------------+
| 0xc0 | nil | null, nothing |
| | | |
| 0xc2 | false | Boolean false |
| | | |
| 0xc3 | true | Boolean true |
+------+-------+---------------+
Bormann Expires April 14, 2013 [Page 6]
Internet-Draft binarypack October 2012
2.5. Opaque Byte Strings
+-------------+------------+----------------------------------+
| Bits | Value | Description |
+-------------+------------+----------------------------------+
| 1010nnnn nb | n = 0bnnnn | Short byte string (0..15 bytes) |
| | | |
| 0xda 2b nb | n = (2b) | byte string (0..(2**16-1) bytes) |
| | | |
| 0xdb 4b nb | n = (4b) | byte string (0..(2**32-1) bytes) |
+-------------+------------+----------------------------------+
2.6. UTF-8 Strings
+-------------+------------+-----------------------------------+
| Bits | Value | Description |
+-------------+------------+-----------------------------------+
| 1011nnnn nb | n = 0bnnnn | Short UTF-8 string (0..15 bytes) |
| | | |
| 0xd8 2b nb | n = (2b) | UTF-8 string (0..(2**16-1) bytes) |
| | | |
| 0xd9 4b nb | n = (4b) | UTF-8 string (0..(2**32-1) bytes) |
+-------------+------------+-----------------------------------+
The assumption is that the UTF-8 strings are in net unicode
[RFC5198].
2.7. Arrays
+-------------+------------+------------------------------------+
| Bits | Value | Description |
+-------------+------------+------------------------------------+
| 1001nnnn nd | n = 0bnnnn | Short array (0..15 data elements) |
| | | |
| 0xdc 2b nd | n = (2b) | array (0..(2**16-1) data elements) |
| | | |
| 0xdd 4b nd | n = (4b) | array (0..(2**32-1) data elements) |
+-------------+------------+------------------------------------+
Bormann Expires April 14, 2013 [Page 7]
Internet-Draft binarypack October 2012
2.8. Tables
+-------------+----------------+---------------------------------+
| Bits | Value | Description |
+-------------+----------------+---------------------------------+
| 1000nnnn nd | n = 2 * 0bnnnn | Short table (0..15 data pairs) |
| | | |
| 0xde 2b nd | n = 2 * (2b) | table (0..(2**16-1) data pairs) |
| | | |
| 0xdf 4b nd | n = 2 * (4b) | table (0..(2**32-1) data pairs) |
+-------------+----------------+---------------------------------+
The sequence of n elements is a sequence of pairs of one key followed
by its associated value.
Bormann Expires April 14, 2013 [Page 8]
Internet-Draft binarypack October 2012
3. Discussion
This draft tries to be faithful to the successful MessagePack
[msgpack] format, with the exception of enabling the distinction
between opaque binary byte strings and UTF-8 byte strings, as
introduced in the binaryjs project [binarypack].
Little analysis has been made whether a slightly different bit
allocation (e.g., using up fewer of the code combination for fixnums)
would be advantageous.
A short floating point (e.g., based on the 16-bit IEEE 754 floating
point value) might be useful. Adding decimal floating point values
probably is not so useful, except where high fidelity to JSON is
desired.
Some additional data types might be useful for some protocols, e.g.
UUIDs. This would further increase the distance from JSON created by
distinguishing opaque and UTF-8 strings.
3.1. JSON roundtripping
binarypack enables mostly lossless translation to JSON. JSON
[RFC4627]. JSON roundtripping, however, is not necessarily the
primary design goal of binarypack, but it is a consideration.
In the translation of binarypack to JSON, opaque byte strings SHOULD
be converted to equivalent base64url [RFC4648] UTF-8 strings.
Without a schema, it is hard to do the inverse consistently, as
base64url encoded byte strings are not specially marked up in JSON.
When translating binarypack floating point values to JSON, the usual
problem of converting binary fractions to decimal representation
arises. In the other direction, the choice of a floating point
format may be hard to do properly. Clearly, any number that can be
transformed from a 64-bit IEEE 754 number to a 32-bit IEEE 754 number
without loss of information can be represented as the latter.
Without schema information, it may be hard to find other cases where
the precision maybe is not that important.
Bormann Expires April 14, 2013 [Page 9]
Internet-Draft binarypack October 2012
4. IANA Considerations
Once this has received some discussion, we will understand how
exactly to register Internet media types for this.
Bormann Expires April 14, 2013 [Page 10]
Internet-Draft binarypack October 2012
5. Security Considerations
(Nothing but generic warnings about correctly implementing protocol
encoders/decoders so far; this section will certainly grow as
additional security considerations become known.)
Bormann Expires April 14, 2013 [Page 11]
Internet-Draft binarypack October 2012
6. Acknowledgements
MessagePack was developed and promoted by Sadayuki Furuhashi
("frsyuki").
BinaryPack is a minor derivation of MessagePack that was developed by
Eric Zhang for the binaryjs project.
The author of the present specification deserves absolutely no
credits whatsoever for any of this.
Bormann Expires April 14, 2013 [Page 12]
Internet-Draft binarypack October 2012
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data
Encodings", RFC 4648, October 2006.
[RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network
Interchange", RFC 5198, March 2008.
7.2. Informative References
[RFC4627] Crockford, D., "The application/json Media Type for
JavaScript Object Notation (JSON)", RFC 4627, July 2006.
[binarypack]
Zhang, E., "BinaryPack for Javascript browsers", 2012,
<https://github.com/binaryjs/js-binarypack>.
[msgpack] Ohta, K. and S. Colebourne, "MessagePack format
specification", 2011, <http://wiki.msgpack.org/display/
MSGPACK/Format+specification>.
Bormann Expires April 14, 2013 [Page 13]
Internet-Draft binarypack October 2012
Author's Address
Carsten Bormann
Universitaet Bremen TZI
Postfach 330440
Bremen D-28359
Germany
Phone: +49-421-218-63921
Email: cabo@tzi.org
Bormann Expires April 14, 2013 [Page 14]