Network Working Group B. Greevenbosch
Internet-Draft Huawei Technologies
Intended status: Informational January 2, 2014
Expires: July 6, 2014
CBOR data definition language: a notational convention to express CBOR
data structures.
draft-greevenbosch-appsawg-cbor-cddl-00
Abstract
This document proposes a notational convention to express CBOR data
structures. Its main goal is to make it easy to express message
structures for protocols that use CBOR.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 6, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Greevenbosch Expires July 6, 2014 [Page 1]
Internet-Draft CBOR notation January 2014
Table of Contents
1. Requirements notation . . . . . . . . . . . . . . . . . . . . 2
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
4. Notational conventions . . . . . . . . . . . . . . . . . . . 3
4.1. General conventions . . . . . . . . . . . . . . . . . . . 3
4.2. Keywords for data types . . . . . . . . . . . . . . . . . 3
4.3. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 4
4.4. Structures . . . . . . . . . . . . . . . . . . . . . . . 4
4.5. Maps . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4.6. Constants . . . . . . . . . . . . . . . . . . . . . . . . 5
4.7. Tags . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.8. Optional variables . . . . . . . . . . . . . . . . . . . 8
5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.1. Moves in a computer game . . . . . . . . . . . . . . . . 9
5.2. Fruit . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6. Philosophy . . . . . . . . . . . . . . . . . . . . . . . . . 14
7. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 14
8. Security considerations . . . . . . . . . . . . . . . . . . . 14
9. IANA considerations . . . . . . . . . . . . . . . . . . . . . 15
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15
11. Normative References . . . . . . . . . . . . . . . . . . . . 15
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 15
1. Requirements notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Introduction
In this document, a notational convention to express CBOR [RFC7049]
data structures is defined.
The main goal for the convention is to provide a unified notation
that can be used when defining protocols that use CBOR.
The CBOR notational convention has the following goals:
(G1) Able to provide an unambiguous description of CBOR data
structures.
(G2) Easy for humans to read and write.
(G3) Flexibility to express the freedoms of choice in the CBOR data
format.
(G4) Possibility to restrict format choices where appropriate.
(G5) Able to express common CBOR data types and structures.
(G6) Human and machine readable and processable.
(G7) Usable for automatic verification of whether CBOR data is
compliant to a predefined format.
3. Definitions
The following terms are used in this document:
"datatype" the format of a variable.
"variable" a data component encoded in CBOR.
4. Notational conventions
4.1. General conventions
The basic syntax is as follows:
o Each field has a name and a type.
o The name is written first, followed by a colon and then the type.
The declaration is finished with a semicolon. Whitespace may
appear around the colon and semicolon, as well as in front of the
name.
o The name does not appear in the actual CBOR encoding.
o If there is a following field, then the semicolon that finishes
the previous declaration is followed by whitespace and then the
name of the following field.
o Comments are preceded by a '#' character.
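As an illustration of these conventions, the following Python sketch
splits a single declaration into its name, type and comment. The
regular expression and the function name parse_declaration are
assumptions made for this example; this document does not define a
formal grammar.

```python
import re

# Hypothetical pattern for one declaration: name, colon, type, semicolon,
# and an optional trailing comment introduced by '#'.
DECLARATION = re.compile(
    r"^\s*(?P<name>\w+)\s*:\s*(?P<type>[^;#]+?)\s*;\s*(?:#\s*(?P<comment>.*))?$"
)


def parse_declaration(line):
    """Return (name, type, comment), or None if the line does not match."""
    match = DECLARATION.match(line)
    if match is None:
        return None
    return match.group("name"), match.group("type"), match.group("comment")
```

The sketch handles one declaration per line and does not cover
structures, "const" constructs or the '?' prefix for optional
variables.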
4.2. Keywords for data types
The following keywords are used:
"bool" Boolean value (major type 7, additional information 20 or
21).
"bstr" A byte string (major type 2).
"float(16)" IEEE 754 half-precision float (major type 7, additional
information 25).
"float(32)" IEEE 754 single-precision float (major type 7,
additional information 26).
"float(64)" IEEE 754 double-precision float (major type 7,
additional information 27).
"int" An unsigned integer (major type 0) or a negative integer
(major type 1).
"nint" A negative integer (major type 1).
"simple" Simple value (major type 7, additional information 24).
"tstr" Text string (major type 3).
"uint" An unsigned integer (major type 0).
4.3. Arrays
Arrays can be of fixed length or of variable length. Both fixed
length and variable length arrays can be implemented as definite and
indefinite length arrays.
A fixed length array is indicated by '[' and ']' characters behind
its type, where the number in between specifies the number of elements.
A variable length array can be indicated with a "*" behind its type.
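To illustrate the difference between definite and indefinite length
encoding, the following Python sketch encodes the same two-element
array "uint[2]" both ways. The helper names encode_head, encode_uint,
definite_array and indefinite_array are assumptions of this example;
the byte layout follows RFC 7049.

```python
def encode_head(major, n):
    """Encode a major type with an unsigned argument n (RFC 7049 rules)."""
    if n < 0:
        raise ValueError("argument must be non-negative")
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_uint(n):
    """Encode an unsigned integer (major type 0)."""
    return encode_head(0, n)


def definite_array(encoded_items):
    """Array (major type 4) whose element count is stated up front."""
    return encode_head(4, len(encoded_items)) + b"".join(encoded_items)


def indefinite_array(encoded_items):
    """Array opened with 0x9F and closed with the 0xFF break byte."""
    return b"\x9f" + b"".join(encoded_items) + b"\xff"
```

For the position (5,7), definite_array yields 82 05 07, while
indefinite_array yields 9F 05 07 FF. Both are valid encodings of the
same fixed length array.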
4.4. Structures
Structures are a logical grouping of CBOR fields.
A structure has a name, which can be used as a value type for other
fields. The name is followed by a '{' character and the definitions
of the variables inside of the structure. The structure is closed by
a '}' character.
A structure MAY be encoded as an array, in which case its name is
preceded by a '*' character. Otherwise there is no CBOR encoding for
the grouping.
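The effect of the '*' prefix can be sketched in Python as follows.
The function name encode_structure is an assumption of this example,
and the sketch only covers the definite length case with fewer than
24 fields.

```python
def encode_structure(encoded_fields, starred):
    """Concatenate already-encoded fields, wrapping them in an array if starred."""
    body = b"".join(encoded_fields)
    if not starred:
        # No CBOR encoding for the grouping: the fields simply follow each other.
        return body
    if len(encoded_fields) >= 24:
        raise ValueError("this sketch only handles fewer than 24 fields")
    # Major type 4 (array) with the field count in the additional information.
    return bytes([0x80 | len(encoded_fields)]) + body
```

With two fields encoded as 05 and 07, a starred structure yields
82 05 07, whereas an unstarred structure yields only 05 07.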
4.5. Maps
If an entity is a map (major type 5), its datatype has the form
map( x, y )
where the keys have datatype x and the values have datatype y.
If either x or y is unspecified (i.e. free to choose per entry), it
is replaced by a '.'.
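As a sketch of how a verifier might treat "map( x, y )", the following
Python code checks an already decoded map against a key and value
datatype, with '.' accepting anything. The table CHECKS, the function
name matches_map and the representation of decoded CBOR as Python
values are assumptions of this example.

```python
def _is_uint(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


# Hypothetical checks for a few datatype keywords; '.' means "unspecified".
CHECKS = {
    "uint": _is_uint,
    "tstr": lambda value: isinstance(value, str),
    "bstr": lambda value: isinstance(value, bytes),
    ".": lambda value: True,
}


def matches_map(decoded, key_type, value_type):
    """Return True if every key and value of a decoded map has the given datatype."""
    if not isinstance(decoded, dict):
        return False
    return all(
        CHECKS[key_type](key) and CHECKS[value_type](value)
        for key, value in decoded.items()
    )
```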
4.6. Constants
In some contexts, it is useful to give special values a name. These
constants are defined using the "const" construct.
The "const" construct has the form
x const( y ) {
... bundle of constants ...
}
where x is a name for the bundle of constants, and y is the datatype
of the values.
The bundle of constants consists of a list of name value pairs. The
list is encapsulated by a starting '{' and a closing '}' character.
The name x defines a datatype that can be used for variables that
take values from the const struct.
For example, the "const" construct
Weekday const( uint ) {
Sunday : 1;
Monday : 2;
Tuesday : 3;
Wednesday : 4;
Thursday : 5;
Friday : 6;
Saturday : 7;
}
defines integer values associated with the week days.
A variable using this structure could be declared as follows:
weekday : Weekday;
and would be encoded as an unsigned int.
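For implementers, a named "const" construct maps naturally onto an
enumeration. The following Python sketch mirrors the Weekday example
with the standard library's IntEnum; the function name
is_valid_weekday is an assumption of this example.

```python
from enum import IntEnum


class Weekday(IntEnum):
    """Mirror of the Weekday const( uint ) construct."""
    Sunday = 1
    Monday = 2
    Tuesday = 3
    Wednesday = 4
    Thursday = 5
    Friday = 6
    Saturday = 7


def is_valid_weekday(value):
    """Return True if a decoded unsigned integer is one of the named constants."""
    try:
        Weekday(value)
    except ValueError:
        return False
    return True
```

Because all values are below 24, each weekday is encoded as a single
byte; for instance Weekday.Monday is encoded as 02.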
Since the weekdays are defined as part of a Weekday structure, they
can also be referenced as "Weekday.Sunday", "Weekday.Monday", ...,
"Weekday.Saturday".
The definition could also have been as follows:
const {
Sunday : 1;
Monday : 2;
Tuesday : 3;
Wednesday : 4;
Thursday : 5;
Friday : 6;
Saturday : 7;
}
In this case, the weekdays are just referred to as "Sunday",
"Monday", ..., "Saturday". However, since it has no name, the
"const" construct cannot be used as a datatype for CBOR variables.
Since this also makes the suffix "(uint)" superfluous, that suffix
has been omitted.
The definition of the datatype can also be left to the definition of
the variable, in which case the datatype is enclosed in round
brackets and follows the name of the "const" construct. In this case
the datatype is omitted in the definition of the constants.
The following example illustrates this:
Weekday const {
Sunday : 1;
Monday : 2;
Tuesday : 3;
Wednesday : 4;
Thursday : 5;
Friday : 6;
Saturday : 7;
}
weekday : Weekday( uint );
TBD: there may be too many options for this. We could consider
omitting the "const( x )" syntax and mandate definition of the true
datatype when defining a CBOR variable.
4.7. Tags
A variable can have an associated CBOR tag (major type 6). This is
indicated by the tag encapsulated between the square brackets '[' and
']', just before the variable's datatype definition.
For example, the following defines a positive bignum N:
N : [2]bstr;
The tag may also be indicated using values from the following "const"
struct:
Tag const {
StandardDT : 0;
EpochDT : 1;
PBigNum : 2;
NBigNum : 3;
DFraction : 4;
BigFloat : 5;
URI : 32;
Base64URL : 33;
Base64 : 34;
RegEx : 35;
MimeMsg : 36;
}
We refer to [RFC7049] for the semantics of these tags.
Using above constants, the definition of N can also be as follows:
N : [Tag.PBigNum]bstr;
An abbreviation of a tagged datatype can be defined using the
following construct:
x = [y]z;
where x is the abbreviation, y is the tag and z is the datatype.
For example, once again we can define N, now as follows:
BigNum = [Tag.PBigNum]bstr;
N : BigNum;
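On the wire, "[2]bstr" is a tag head (major type 6, value 2) followed
by a byte string. The following Python sketch encodes a positive
bignum this way; the helper names encode_head and
encode_positive_bignum are assumptions of this example, and the byte
layout follows RFC 7049.

```python
def encode_head(major, n):
    """Encode a major type with an unsigned argument n (RFC 7049 rules)."""
    if n < 0:
        raise ValueError("argument must be non-negative")
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_positive_bignum(n):
    """Encode n as tag 2 (PBigNum) wrapping a big-endian byte string."""
    if n < 0:
        raise ValueError("a positive bignum cannot be negative")
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return encode_head(6, 2) + encode_head(2, len(payload)) + payload
```

For N = 2^64, the result is C2 49 followed by the nine bytes
01 00 00 00 00 00 00 00 00.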
4.8. Optional variables
There may be variables or structures whose inclusion is optional. In
this case, the name of the variable is preceded by a '?'.
For example, the following defines a CBOR structure that is dependent
on a boolean value.
*MainStruct {
whichForm : bool;
?data1 : Form1; # when whichForm == true
?data2 : Form2; # when whichForm == false
}
Form1 {
anInteger : int;
aTextString : tstr;
}
Form2 {
aFloat : float(16);
aBinaryString : bstr;
}
Notice that it is not possible to define the relationship between
"whichForm" and inclusion of either "data1" or "data2" with CBOR
content rules. Such a relationship should be communicated to the
implementer by other means, for example in the text of the
specification that uses the CBOR structure, or with comments as was
done in this example.
Protocol designers should exhibit utmost care when defining CBOR
structures with optional variables, especially when some of these
variables have the same datatype.
For example, the following CBOR data structure is ambiguous:
*DataStruct {
?OptionalVariable : uint;
MandatoryVariable : uint;
?AnotherOptionalVariable : uint;
}
Since optional variables are often detected from their datatype, it
is RECOMMENDED not to have a sequence of consecutive variables of the
same datatype when some of these variables are optional.
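The ambiguity can be made concrete by counting how many ways a
received sequence of unsigned integers can be assigned to the fields
of DataStruct. The following Python sketch does so under the
assumption that all fields share one datatype, so only the number of
received items is known; the names FIELDS and interpretations are
assumptions of this example.

```python
from itertools import combinations

# (name, is_optional) for each field of DataStruct, in declaration order.
FIELDS = [
    ("OptionalVariable", True),
    ("MandatoryVariable", False),
    ("AnotherOptionalVariable", True),
]


def interpretations(fields, count):
    """List every set of fields that could account for `count` received items."""
    optional = [name for name, is_optional in fields if is_optional]
    mandatory_count = len(fields) - len(optional)
    extra = count - mandatory_count
    if extra < 0 or extra > len(optional):
        return []
    results = []
    for present in combinations(optional, extra):
        results.append(
            [name for name, is_optional in fields
             if not is_optional or name in present]
        )
    return results
```

For two received items the function returns two interpretations
(either optional variable may be the one that is present), so a
receiver cannot tell them apart. For one or three items there is
exactly one interpretation.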
5. Examples
This section contains various examples of structures defined using
the CBOR notational convention.
5.1. Moves in a computer game
A multiplayer computer game uses CBOR to exchange moves between the
players. To ensure a good game experience, the move information
needs to be exchanged quickly and frequently, so the game sends it
in the compact CBOR format. Figure 1 shows the definition of the
CBOR information exchange format.
*UpdateMsg {
move_no : uint; # increases for each move
player_info : PlayerInfo; # general information
moves : Moves*; # moves in this message
}
PlayerInfo {
alias : tstr;
player_id : uint;
experience : Experience;
gold : uint;
supplies : map( Supplies, uint );
avg_strength : float(16);
}
Experience const( uint ) {
Beginner : 0;
Amateur : 1;
Professional : 2;
Expert : 3;
}
Supplies const( uint ) {
Wood : 0;
Iron : 1;
Grain : 2;
}
*Moves {
unit_id : uint;
unit_strength : uint; # between 0 and 100
source_pos : uint[2]; # (x,y)
target_pos : uint[2]; # (x,y)
}
Figure 1: CBOR definition of an information exchange format for a
computer game
Player "Johnny" makes two moves. The game server has assigned Johnny
the ID 0x7a3b871f. Johnny is an amateur player, and currently has
1200 gold. He has 13 units of wood, 70 units of iron and 29 units of
grain. He has several units, with an average strength of 30.25.
In move 250, Johnny moves the unit with ID 19 and strength 20 from
(5,7) to (6,9), and the unit with ID 87 and strength 40 from (7,10)
to (6,10).
This information is coded in CBOR as depicted in Figure 2.
9F
18 FA # move 250
66 4A 6F 68 6E 6E 79 # "Johnny"
1A 7A 3B 87 1F # player_id
01 # experience, "amateur"
19 04 B0 # 1200 gold as uint
A3 # begin map "supplies" with 3 elements
00 # "wood":
0D # 13 as uint
01 # "iron":
18 46 # 70 as uint
02 # "grain":
18 1D # 29 as uint
F9 4F 90 # average strength 30.25 half-precision float
9F # indefinite length "moves" array
84 # 4-element array Moves
13 # unit id 19 as uint
14 # strength 20 as uint
82 # 2-element array source_pos
05 # source_pos.x=5
07 # source_pos.y=7
82 # 2-element array target_pos
06 # target_pos.x=6
09 # target_pos.y=9
84 # 4-element array Moves
18 57 # unit id 87
18 28 # strength 40
82 # 2-element array source_pos
07 # source_pos.x=7
0A # source_pos.y=10
82 # 2-element array target_pos
06 # target_pos.x=6
0A # target_pos.y=10
FF # end of "moves" array
FF
Figure 2: CBOR instance for game example
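The "moves" part of Figure 2 can be checked mechanically. The
following Python sketch is a minimal decoder that only understands
unsigned integers and arrays (definite and indefinite length); the
function name decode_item is an assumption of this example, and a
real implementation would need the remaining major types.

```python
def decode_item(data, pos=0):
    """Decode one uint or array starting at pos; return (value, next position)."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if major == 4 and info == 31:
        # Indefinite length array: read items until the 0xFF break byte.
        items = []
        while data[pos] != 0xFF:
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos + 1
    if info < 24:
        argument = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 following bytes
        argument = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("unsupported additional information")
    if major == 0:
        return argument, pos
    if major == 4:
        items = []
        for _ in range(argument):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    raise ValueError("this sketch only handles major types 0 and 4")
```

Applied to the bytes from the indefinite length "moves" array up to
and including its closing FF, the decoder returns the two Moves
instances as nested lists: unit 19 with strength 20 moving from [5, 7]
to [6, 9], and unit 87 with strength 40 moving from [7, 10] to [6, 10].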
5.2. Fruit
Figure 3 contains an example for a CBOR structure that contains
information about fruit.
fruitlist : Fruit*;
*Fruit {
name : tstr;
colour : Colour*;
avg_weight : float( 16 );
price : uint;
international_names : map( Lang, tstr );
rfu : bstr; # reserved for future use
}
Colour const( uint ) {
black : 0;
red : 1;
green : 2;
yellow : 3;
blue : 4;
magenta : 5;
cyan : 6;
white : 7;
orange : 8;
pink : 9;
purple : 10;
brown : 11;
grey : 12;
}
Lang const( tstr ) {
Chinese : "CN";
Dutch : "NL";
English : "EN";
French : "FR";
German : "DE";
}
Figure 3: Example CBOR structure
For example, apples can be red, yellow or green. They have an
average weight of 0.195kg and a price of 30 cents. Chinese for
"apple" in UTF-8 is [ E8 8B B9 E6 9E 9C ], the Dutch word is "appel"
and the French word "pomme".
For simplicity, let's assume that the colour of oranges can only be
orange. They have an average weight of 0.230kg and a price of 50
cents. Chinese for "orange" in UTF-8 is [ E6 A9 99 E5 AD 90 ], the
Dutch word is "sinaasappel" and the German word "Orange".
This information would be encoded as depicted in Figure 4.
9F # indefinite length "fruitlist" array
86 # First "Fruit" instance, 6 elements
65 # text string "name" length 5
61 70 70 6C 65 # "apple"
83 # array for "Colour", 3 elements
01 # "red" as uint
02 # "green" as uint
03 # "yellow" as uint
F9 # Floating point half precision
32 3D # "avg_weight" 0.195
18 1E # "price" 30 as uint
A3 # map "international_names", 3 pairs
62 43 4E # text string length 2, "CN"
66 E8 8B B9 E6 9E 9C # Chinese word for apple
62 4E 4C # "NL"
65 61 70 70 65 6C # "appel"
62 46 52 # "FR"
65 70 6F 6D 6D 65 # "pomme"
40 # byte string "rfu", 0 bytes length
86 # Second "Fruit" instance
66 # text string "name" length 6
6F 72 61 6E 67 65 # "orange"
81 # array for "Colour", 1 element
08 # "orange" as uint
F9 # Floating point half precision
33 5C # "avg_weight" 0.230
18 32 # "price" 50 as uint
A3 # map "international_names", 3 pairs
62 43 4E # text string length 2, "CN"
66 E6 A9 99 E5 AD 90 # Chinese word for orange
62 4E 4C # "NL"
6B 73 69 6E 61 61 73 61 70 70 65 6C # "sinaasappel"
62 44 45 # "DE"
66 4F 72 61 6E 67 65 # "Orange"
40 # byte string "rfu", 0 bytes length
FF # end of "fruitlist" array
Figure 4: Example CBOR instance
Notice that if the "Fruit" structure did not have the preceding "*",
the two "Fruit" instance arrays would have been omitted. In
addition, the "fruitlist" array would have had 12 elements instead of
2. (Although for "fruitlist" the indefinite length approach was
chosen, such that the number of elements is not explicitly
signalled.)
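The half-precision values used in the examples can be reproduced with
Python's struct module, whose "e" format packs an IEEE 754
half-precision float. The function name encode_half is an assumption
of this example.

```python
import struct


def encode_half(value):
    """Encode a float as CBOR float(16): initial byte 0xF9 plus two bytes."""
    return b"\xf9" + struct.pack(">e", value)
```

With this sketch, 0.195 yields F9 32 3D, 0.230 yields F9 33 5C, and
the average strength 30.25 from the game example yields F9 4F 90.
Note that 0.195 and 0.230 are rounded to the nearest representable
half-precision value, so a decoder will not recover them exactly.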
6. Philosophy
The CBOR notational convention can be used to efficiently define the
layout of CBOR data.
In addition, it has been specified such that a machine can verify
whether or not CBOR data is compliant to its definition.
The extent to which the data description must be enforced by an
application is left solely to the implementers and specifiers of that
application. For example, an application may decide not to verify
the data structure at all, and use the CBOR content rules solely as a
means to indicate the structure of the data to the programmer. On
the other hand, the application may also implement a verification
method that goes as far as verifying that variables that depend on
the "const" construction actually only take values defined in that
construction.
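The two ends of this spectrum can be sketched in Python as follows,
using the Experience constants from the game example. The names
EXPERIENCE, is_uint and verify_experience, and the "strict" switch,
are assumptions of this example.

```python
# Values of the Experience const( uint ) construct.
EXPERIENCE = {"Beginner": 0, "Amateur": 1, "Professional": 2, "Expert": 3}


def is_uint(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def verify_experience(value, strict):
    """Check a decoded value: datatype only, or also membership of the consts."""
    if not is_uint(value):
        return False
    if strict:
        return value in EXPERIENCE.values()
    return True
```

A lenient application accepts any unsigned integer, while a strict
one rejects values such as 7 that are not defined in the "const"
construct.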
The content rules do not specify the length of a CBOR integer, but
this can be done in the text specification of a protocol that uses
CBOR.
7. Open Issues
At least the following issues need further consideration:
o Whether or not to allow optional variables.
o Removal of some "const" construct possibilities.
o Definition of constants for missing tags.
o More extensive security considerations.
o The various flavours of consts and tags increase implementation
complexity of a verifier. It is to be considered which flavours
provide enough benefit to justify their implementation complexity.
o For optional inclusion, one could define structures such as
"switch"/"case". However, this would again increase complexity,
and would cater only for cases where inclusion is dependent on a
simple variable.
8. Security considerations
This document presents a content rules language for expressing CBOR
data structures. As such, it does not introduce any security issues
by itself, although specifications of protocols that use CBOR
naturally need security analysis when defined.
9. IANA considerations
This document does not require any IANA registrations.
10. Acknowledgements
For this draft, there has been inspiration from the C and Pascal
languages, MPEG's conventions for describing structures in the ISO
base media file format, and Andrew Lee Newton's "JSON Content Rules"
draft.
Useful feedback came from Carsten Bormann.
11. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object
Representation (CBOR)", RFC 7049, October 2013.
Author's Address
Bert Greevenbosch
Huawei Technologies Co., Ltd.
Huawei Industrial Base
Bantian, Longgang District
Shenzhen 518129
P.R. China
Email: bert.greevenbosch@huawei.com