Network Working Group                                             Y. Nir
Internet-Draft                                               Check Point
Intended status: Informational                          January 27, 2014
Expires: July 31, 2014


                ChaCha20 and Poly1305 for IETF protocols
                  draft-nir-cfrg-chacha20-poly1305-00

Abstract

   This document defines the ChaCha20 stream cipher, as well as the use
   of the Poly1305 authenticator, both as stand-alone algorithms, and as
   a :"combined mode", or Authenticated Encryption with Additional Data
   (AEAD) algorithm.

   This document does not introduce any new crypto, but is meant to
   serve as a stable reference and an implementation guide.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 31, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.




Nir                       Expires July 31, 2014                 [Page 1]


Internet-Draft             ChaCha20 & Poly1305              January 2014


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Conventions Used in This Document  . . . . . . . . . . . .  3
   2.  The Algorithms . . . . . . . . . . . . . . . . . . . . . . . .  4
     2.1.  The ChaCha Quarter Round . . . . . . . . . . . . . . . . .  4
       2.1.1.  Test Vector for the ChaCha Quarter Round . . . . . . .  4
     2.2.  A Quarter Round on the ChaCha State  . . . . . . . . . . .  5
       2.2.1.  Test Vector for the Quarter Round on the ChaCha
               state  . . . . . . . . . . . . . . . . . . . . . . . .  5
     2.3.  The ChaCha20 block Function  . . . . . . . . . . . . . . .  6
       2.3.1.  Test Vector for the ChaCha20 Block Function  . . . . .  7
     2.4.  The ChaCha20 encryption algorithm  . . . . . . . . . . . .  8
       2.4.1.  Example and Test Vector for the ChaCha20 Cipher  . . .  8
     2.5.  The Poly1305 algorithm . . . . . . . . . . . . . . . . . . 10
       2.5.1.  Poly1305 Example and Test Vector . . . . . . . . . . . 12
     2.6.  Generating the Poly1305 key using ChaCha20 . . . . . . . . 13
     2.7.  AEAD Construction  . . . . . . . . . . . . . . . . . . . . 14
       2.7.1.  Example and Test Vector for AEAD_CHACHA20-POLY1305 . . 15
   3.  Implementation Advice  . . . . . . . . . . . . . . . . . . . . 17
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . . 17
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 19
   6.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 19
     7.1.  Normative References . . . . . . . . . . . . . . . . . . . 19
     7.2.  Informative References . . . . . . . . . . . . . . . . . . 19
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 20
























Nir                       Expires July 31, 2014                 [Page 2]


Internet-Draft             ChaCha20 & Poly1305              January 2014


1.  Introduction

   The Advanced Encryption Standard (AES - [FIPS-197]) has become the
   gold standard in encryption.  Its efficient design, wide
   implementation, and hardware support allow for high performance in
   many areas.  On most modern platforms, AES is anywhere from 4x to 10x
   as fast as the previous most-used cipher, 3-key Data Encryption
   Standard (3DES - [FIPS-46]), which makes it not only the best choice,
   but the only choice.

   The problem is that if future advances in cryptanalysis reveal a
   weakness in AES, users will be in an unenviable position.  With the
   only other widely supported cipher being the much slower 3DES, it is
   not feasible to re-configure implementations to use 3DES.
   [standby-cipher] describes this issue and the need for a standby
   cipher in greater detail.

   This document defines such a standby cipher.  We use ChaCha20
   ([chacha]) with or without the Poly1305 ([poly1305]) authenticator.
   These algorithms are not just fast and secure.  They are fast even if
   software-only C-language implementations, allowing for much quicker
   deployment when compared with algorithms such as AES that are
   significantly accelerated by hardware implementations.

   These document does not introduce these new algorithms.  They have
   been defined in scientific papers by D. J. Bernstein, which are
   referenced by this document.  The purpose of this document is to
   serve as a stable reference for IETF documents making use of these
   algorithms.

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The description of the ChaCha algorithm will at various time refer to
   the ChaCha state as a "vector" or as a "matrix".  This follows the
   use of these terms in DJB's paper.  The matrix notation is more
   visually convenient, and gives a better notion as to why some rounds
   are called "column rounds" while others are called "diagonal rounds".
   Here's a diagram of how to martices relate to vectors (using the C
   language convention of zero being the index origin).

      0   1   2   3
      4   5   6   7
      8   9  10  11
     12  13  14  15



Nir                       Expires July 31, 2014                 [Page 3]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   The elements in this vector or matrix are 32-bit unsigned integers.

   The algorithm name is "ChaCha".  "ChaCha20" is a specific instance
   where 20 "rounds" (or 80 quarter rounds - see Section 2.1) are used.
   Other variations are defined, with 8 or 12 rounds, but in this
   document we only describe the 20-round ChaCha, so the names "ChaCha"
   and "ChaCha20" will be used interchangeably.


2.  The Algorithms

   The subsections below describe the algorithms used and the AEAD
   construction.

2.1.  The ChaCha Quarter Round

   The basic operation of the ChaCha algorithm is the quarter round.  It
   operates on four 32-bit unsigned integers, denoted a, b, c, and d.
   The operation is as follows (in C-like notation):
   o  a += b; d ^= a; d <<<= 16;
   o  c += d; b ^= c; b <<<= 12;
   o  a += b; d ^= a; d <<<= 8;
   o  c += d; b ^= c; b <<<= 7;
   Where "+" denotes integer addition without carry, "^" denotes a
   bitwise XOR, and "<<< n" denotes an n-bit left rotation (towards the
   high bits).

   For example, let's see the add, XOR and roll operations from the
   first line with sample numbers:
   o  b = 0x01020304
   o  a = 0x11111111
   o  d = 0x01234567
   o  a = a + b = 0x11111111 + 0x01020304 = 0x12131415
   o  d = d ^ a = 0x01234567 ^ 0x12131415 = 0x13305172
   o  d = d<<<16 = 0x51721330

2.1.1.  Test Vector for the ChaCha Quarter Round

   For a test vector, we will use the same numbers as in the example,
   adding something random for c.
   o  a = 0x11111111
   o  b = 0x01020304
   o  c = 0x9b8d6f43
   o  d = 0x01234567

   After running a Quarter Round on these 4 numbers, we get these:





Nir                       Expires July 31, 2014                 [Page 4]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   o  a = 0xea2a92f4
   o  b = 0xcb1cf8ce
   o  c = 0x4581472e
   o  d = 0x5881c4bb

2.2.  A Quarter Round on the ChaCha State

   The ChaCha state does not have 4 integer numbers, but 16.  So the
   quarter round operation works on only 4 of them - hence the name.
   Each quarter round operates on 4 pre-determined numbers in the ChaCha
   state.  We will denote by QUATERROUND(x,y,z,w) a quarter-round
   operation on the numbers at indexes x, y, z, and w of the ChaCha
   state when viewed as a vector.  For example, if we apply
   QUARTERROUND(1,5,9,13) to a state, this means running the quarter
   round operation on the elements marked with an asterisk, while
   leaving the others alone:

      0  *a   2   3
      4  *b   6   7
      8  *c  10  11
     12  *d  14  15

   Note that this run of quarter round is part of what is called a
   "column round".

2.2.1.  Test Vector for the Quarter Round on the ChaCha state

   For a test vector, we will use a ChaCha state that was generated
   randomly:

   Sample ChaCha State

       879531e0  c5ecf37d  516461b1  c9a62f8a
       44c20ef3  3390af7f  d9fc690b  2a5f714c
       53372767  b00a5631  974c541a  359e9963
       5c971061  3d631689  2098d9d6  91dbd320

   We will apply the QUARTERROUND(2,7,8,13) operation to this state.
   For obvious reasons, this one is part of what is called a "diagonal
   round":











Nir                       Expires July 31, 2014                 [Page 5]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   After applying QUARTERROUND(2,7,8,13)

       879531e0  c5ecf37d  bdb886dc  c9a62f8a
       44c20ef3  3390af7f  d9fc690b  cfacafd2
       e46bea80  b00a5631  974c541a  359e9963
       5c971061  ccc07c79  2098d9d6  91dbd320

   Note that only the numbers in positions 2, 7, 8, and 13 changed.

2.3.  The ChaCha20 block Function

   The ChaCha block function transforms a ChaCha state by running
   multiple quarter rounds.

   The inputs to ChaCha20 are:
   o  A 256-bit key, treated as a concatenation of 8 32-bit little-
      endian integers.
   o  A 32-bit sender ID, treated as a little-endian integer.
   o  A 64-bit nonce, treated as a concatenation of 2 32-bit little-
      endian integers.
   o  A 32-bit block count parameter, treated as a 32-bit little-endian
      integer.

   The output is 64 random-looking bytes.

   The ChaCha algorithm described here uses a 256-bit key.  The original
   algorithm also specified 128-bit keys and 8- and 12-round variants,
   but these are out of scope for this document.  In this section we
   describe the ChaCha block function.

   The ChaCha20 as follows:
   o  The first 4 words (0-3) are constants: 0x61707865, 0x3320646e,
      0x79622d32, 0x6b206574.
   o  The next 8 words (4-11) are taken from the 256-bit key by reading
      the bytes in little-endian order, in 4-byte chunks.
   o  Word 12 is a block counter.  Since each block is 64-byte, a 32-bit
      word is enough for 256 Gigabytes of data.
   o  Word 13 is called Sender ID, or SID.  Note that in the original
      ChaCha20 word 13 is also part of the block count, in case the
      encrypted data exceeds 256 gigabytes.  That is not necessary for
      protocols such as TLS, IPsec, or SSH.
   o  Words 14-15 are a nonce, which should not be repeated for the same
      combination of key and sender ID.  The 14th word is the least
      significant 32 bits of the input nonce (nonce & 0xffffffff), while
      the 15th word is the most significant 32 bits (nonce >> 32).

   ChaCha20 runs 20 rounds, alternating between "column" and "diagonal"
   rounds.  Each round is 4 quarter-rounds, and they are run as follows.



Nir                       Expires July 31, 2014                 [Page 6]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   Rounds 1-4 are part of the "column" round, while 5-8 are part of the
   "diagonal" round:
   1.  QUARTERROUND ( 0, 4, 8,12)
   2.  QUARTERROUND ( 1, 5, 9,13)
   3.  QUARTERROUND ( 2, 6,10,14)
   4.  QUARTERROUND ( 3, 7,11,15)
   5.  QUARTERROUND ( 0, 5,10,15)
   6.  QUARTERROUND ( 1, 6,11,12)
   7.  QUARTERROUND ( 2, 7, 8,13)
   8.  QUARTERROUND ( 3, 4, 9,14)

   At the end of 20 rounds, the original input words are added to the
   output words, and the result is serialized by sequencing the words
   one-by-one in little-endian order.

2.3.1.  Test Vector for the ChaCha20 Block Function

   For a test vector, we will use the following inputs to the ChaCha20
   block function:
   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.  The key is a sequence of
      octets with no particular structure before we copy it into the
      ChaCha state.
   o  Nonce = 74 (00:00:00:00:00:00:00:4a)
   o  Sender ID = 00:00:00:09.  Usually this will be all zeros, but
      we've set it to non-zero here so it will be visually conspicuous.
   o  Block Count = 1.

   After setting up the ChaCha state, it looks like this:

   ChaCha State with the key set up.

       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  09000000  4a000000  00000000

   After running 20 rounds (10 column rounds interleaved with 10
   diagonal rounds), the ChaCha state looks like this:

   ChaCha State after 20 rounds

       837778ab  e238d763  a67ae21e  5950bb2f
       c4f2d0c7  fc62bb2f  8fa018fc  3f5ec7b7
       335271c2  f29489f3  eabda8fc  82e46ebd
       d19c12b4  b04e16de  9e83d0cb  4e3c50a2

   Finally we add the original state to the result (simple vector or



Nir                       Expires July 31, 2014                 [Page 7]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   matrix addition), giving this:

   ChaCha State at the end of the ChaCha20 operation

       e4e7f110  15593bd1  1fdd0f50  c47120a3
       c7f4d1c7  0368c033  9aaa2204  4e6cd4c3
       466482d2  09aa9f07  05d7c214  a2028bd9
       d19c12b5  b94e16de  e883d0cb  4e3c50a2

2.4.  The ChaCha20 encryption algorithm

   ChaCha20 is a stream cipher designed by D. J. Bernstein.  It is a
   refinement of the Salsa20 algorithm, and uses a 256-bit key.

   ChaCha20 successively calls the ChaCha20 block function, with the
   same key, sender ID and nonce, and with successively increasing block
   counter parameters.  The resulting state is then serialized by
   writing the numbers in little-endian order.  Concatenating the
   results from the successive blocks forms a key stream, which is then
   XOR-ed with the plaintext.  There is no requirement for the plaintext
   to be an integral multiple of 512-bits.  If there is extra keystream
   from the last block, it is discarded.  Specific protocols MAY require
   that the plaintext and ciphertext have certain length.  Such
   protocols need to specify how the plaintext is padded, and how much
   padding it receives.

   The inputs to ChaCha20 are:
   o  A 256-bit key
   o  A 32-bit sender ID.  Ordinarily, this will be zero.  However, in
      protocols that allow multiple senders for the same keys, different
      sender IDs MAY be used.
   o  A 32-bit initial counter.  This can be set to any number, but will
      usually be zero or one.  It makes sense to use 1 if we use the
      zero block for something else, such as generating a one-time
      authenticator key as part of an AEAD algorithm.
   o  A 64-bit nonce.  In some protocols, this is known as the Initial
      Vector.
   o  an arbitrary-length plaintext

   The output is an encrypted message of the same length.

2.4.1.  Example and Test Vector for the ChaCha20 Cipher

   For a test vector, we will use the following inputs to the ChaCha20
   block function:
   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.




Nir                       Expires July 31, 2014                 [Page 8]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   o  Nonce = 74 (00:00:00:00:00:00:00:4a).
   o  Sender ID = zero (0).
   o  Initial Counter = 1.

   We use the following for the plaintext.  It was chosen to be long
   enough to require more than one block, but not so long that it would
   make this example cumbersome (so, less than 3 blocks):

   Plaintext Sunscreen:
   000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c|Ladies and Gentl
   016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73|emen of the clas
   032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63|s of '99: If I c
   048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f|ould offer you o
   064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20|nly one tip for
   080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73|the future, suns
   096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69|creen would be i
   112  74 2e                                          |t.

   The following figure shows 4 ChaCha state matrices:
   1.  First block as it is set up.
   2.  Second block as it is set up.  Note that these blocks are only
       two bits apart - only the counter in position 12 is different.
   3.  Third block is the first block after the ChaCha20 block
       operation.
   4.  Final block is the second block after the ChaCha20 block
       operation was applied.
   After that, we show the keystream.

   First block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  00000000  4a000000  00000000


   Second block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000002  00000000  4a000000  00000000


   First block after block operation:
       f3514f22  e1d91b40  6f27de2f  ed1d63b8
       821f138c  e2062c3d  ecca4f7e  78cff39e
       a30a3b8a  920a6072  cd7479b5  34932bed
       40ba4c79  cd343ec6  4c2c21ea  b7417df0




Nir                       Expires July 31, 2014                 [Page 9]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   Second block after block operation:
       9f74a669  410f633f  28feca22  7ec44dec
       6d34d426  738cb970  3ac5e9f3  45590cc4
       da6e8b39  892c831a  cdea67c1  2b7e1d90
       037463f3  a11a2073  e8bcfb88  edc49139


   Keystream:
   22:4f:51:f3:40:1b:d9:e1:2f:de:27:6f:b8:63:1d:ed:8c:13:1f:82:3d:2c:06
   e2:7e:4f:ca:ec:9e:f3:cf:78:8a:3b:0a:a3:72:60:0a:92:b5:79:74:cd:ed:2b
   93:34:79:4c:ba:40:c6:3e:34:cd:ea:21:2c:4c:f0:7d:41:b7:69:a6:74:9f:3f
   63:0f:41:22:ca:fe:28:ec:4d:c4:7e:26:d4:34:6d:70:b9:8c:73:f3:e9:c5:3a
   c4:0c:59:45:39:8b:6e:da:1a:83:2c:89:c1:67:ea:cd:90:1d:7e:2b:f3:63

   Finally, we XOR the Keystream with the plaintext, yielding the
   Ciphertext:

   Ciphertext Sunscreen:
   000  6e 2e 35 9a 25 68 f9 80 41 ba 07 28 dd 0d 69 81|n.5.%h..A..(..i.
   016  e9 7e 7a ec 1d 43 60 c2 0a 27 af cc fd 9f ae 0b|.~z..C`..'......
   032  f9 1b 65 c5 52 47 33 ab 8f 59 3d ab cd 62 b3 57|..e.RG3..Y=..b.W
   048  16 39 d6 24 e6 51 52 ab 8f 53 0c 35 9f 08 61 d8|.9.$.QR..S.5..a.
   064  07 ca 0d bf 50 0d 6a 61 56 a3 8e 08 8a 22 b6 5e|....P.jaV....".^
   080  52 bc 51 4d 16 cc f8 06 81 8c e9 1a b7 79 37 36|R.QM.........y76
   096  5a f9 0b bf 74 a3 5b e6 b4 0b 8e ed f2 78 5e 42|Z...t.[......x^B
   112  87 4d                                          |.M

2.5.  The Poly1305 algorithm

   Poly1305 is a one-time authenticator designed by D. J. Bernstein.
   Poly1305 takes a 32-byte one-time key and a message and produces a
   16-byte tag.

   The original article ([poly1305]) is entitled "The Poly1305-AES
   message-authentication code", and the MAC function there requires a
   128-bit AES key, a 128-bit "additional key", and a 128-bit (non-
   secret) nonce.  AES is used there for encrypting the nonce, so as to
   get a unique (and secret) 128-bit string, but as the paper states,
   "There is nothing special about AES here.  One can replace AES with
   an arbitrary keyed function from an arbitrary set of nonces to 16-
   byte strings.".

   Regardless of how the key is generated, the key is partitioned into
   two parts, called "r" and "k". "k" MUST be unique and secret for each
   invocation (that is why it was originally obtained by encrypting a
   nonce), while "r" MAY be constant, but needs to be modified as
   follows before being used: ("r" is treated as a 16-octet little-
   endian number):



Nir                       Expires July 31, 2014                [Page 10]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   o  r[3], r[7], r[11], and r[15] are required to have their top four
      bits clear (be smaller than 16)
   o  r[4], r[8], and r[12] are required to have their bottom two bits
      clear (be divisible by 4)

   The following sample code clamps "r" to be appropriate.  Note that
   the parameter is a 32-byte key, and "r" is considered to be at offset
   16, so it's the 2nd half of the key:

   /*
   poly1305aes_test_clamp.c version 20050207
   D. J. Bernstein
   Public domain.
   */

   #include "poly1305aes_test.h"

   void poly1305aes_test_clamp(unsigned char kr[32])
   {
   #define r (kr + 16)
     r[3] &= 15;
     r[7] &= 15;
     r[11] &= 15;
     r[15] &= 15;
     r[4] &= 252;
     r[8] &= 252;
     r[12] &= 252;
   }

   The "k" portion MUST NOT be re-used, but it is perfectly acceptable
   to generate both "k" and "r" uniquely each time.  Because each of
   them is 128-bit, randomly generating them (see Section 2.6) is also
   acceptable.

   The inputs to Poly1305 are:
   o  A 256-bit one-time key
   o  An arbitrary length message

   The output is a 128-bit tag.

   First, the "r" value should be clamped.

   Next, set the constant "P" (for "polynomial") be 2^130-5:
   3fffffffffffffffffffffffffffffffb.  Also set a variable "accumulator"
   to zero.

   Next, divide the message into 16-byte blocks.  The last one might be
   shorter:



Nir                       Expires July 31, 2014                [Page 11]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   o  Read the block as a little-endian number.
   o  Add one bit beyond the number of octets.  For a 16-byte block this
      is equivalent to adding 2^128 to the number.  For the shorter
      block it can be 2^120, 2^112, or any power of two that is evenly
      divisible by 8, all the way down to 2^8.
   o  If the block is not 17 bytes long (the last block), pad it with
      zeros.  This is meaningless if you're treating it them as numbers.
   o  Add this number to the accumulator.
   o  Multiply by "r"
   o  Set the accumulator to the result modulu p.  To summarize: Acc =
      ((Acc+block)*r) % p.

   Finally, the value of the secret key "k" is added to the accumulator,
   and the 128 least significant bits are serialized in little-endian
   order to form the tag.

2.5.1.  Poly1305 Example and Test Vector

   For our example, we will dispense with generating the one-time key
   using AES, and assume that we got the following keying material:
   o  Key Material: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:01:
      03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
   o  k as an octet string: 01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:
      f5:1b
   o  k as a 128-bit number: 1BF54941AFF6BF4AFDB20DFB8A800301
   o  r before clamping: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8
   o  Clamped r as a number: 806D5400E52447C036D555408BED685.

   For our message, we'll use a short text:

   Message to be Authenticated:
   000  43 72 79 70 74 6f 67 72 61 70 68 69 63 20 46 6f|Cryptographic Fo
   016  72 75 6d 20 52 65 73 65 61 72 63 68 20 47 72 6f|rum Research Gro
   032  75 70                                          |up

   Since Poly1305 works in 16-byte chunks, the 34-byte message divides
   into 3 blocks.  In the following calculation, "Acc" denotes the
   accumulator and "Block" the current block:

   Block #1

   Acc = 00
   Block = 6f4620636968706172676f7470797243
   Block with 0x01 byte = 016f4620636968706172676f7470797243
   Acc + block = 016f4620636968706172676f7470797243
   (Acc+Block) * r =
        B83FE991CA66800489155DCD69E8426BA2779453994AC90ED284034DA565ECF
   Acc = ((Acc+Block)*r) % P = 2C88C77849D64AE9147DDEB88E69C83FC



Nir                       Expires July 31, 2014                [Page 12]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   Block #2

   Acc = 2C88C77849D64AE9147DDEB88E69C83FC
   Block = 6f7247206863726165736552206d7572
   Block with 0x01 byte = 016f7247206863726165736552206d7572
   Acc + block = 437FEBEA505C820F2AD5150DB0709F96E
   (Acc+Block) * r =
        21DCC992D0C659BA4036F65BB7F88562AE59B32C2B3B8F7EFC8B00F78E548A26
   Acc = ((Acc+Block)*r) % P = 2D8ADAF23B0337FA7CCCFB4EA344B30DE

   Last Block

   Acc = 2D8ADAF23B0337FA7CCCFB4EA344B30DE
   Block = 7075
   Block with 0x01 byte = 017075
   Acc + block = 2D8ADAF23B0337FA7CCCFB4EA344CA153
   (Acc + Block) * r =
        16D8E08A0F3FE1DE4FE4A15486ACA7A270A29F1E6C849221E4A6798B8E45321F
   ((Acc + Block) * r) % P = 28D31B7CAFF946C77C8844335369D03A7

   Adding k we get this number, and serialize if to get the tag:

   Acc + k = 2A927010CAF8B2BC2C6365130C11D06A8

   Tag: a8:06:1d:c1:30:51:36:c6:c2:2b:8b:af:0c:01:27:a9

2.6.  Generating the Poly1305 key using ChaCha20

   As said in Section 2.5, it is acceptable to generate the one-time
   Poly1305 pseudo-randomly.  This section proposes such a method.

   To generate such a key pair (k,r), we will use the ChaCha20 block
   function described in Section 2.3.  This assumes that we have a 256-
   bit session key for the MAC function, such as SK_ai and SK_ar in
   IKEv2, the integrity key in ESP and AH, or the client_write_MAC_key
   and server_write_MAC_key in TLS.  Any document that specifies the use
   of Poly1305 as a MAC algorithm for some protocol must specify that
   256 bits are allocated for the integrity key.

   The method is to call the block function with the following
   parameters:
   o  The 256-bit session integrity key is used as the ChaCha20 key.
   o  The specific protocol specifies the sender ID, but for any sender,
      all calls are made with the same sender ID.  Usually, this will be
      set to zero.  For the AEAD protocol specified in Section 2.7, the
      sender ID is set to the same value as for the encryption.





Nir                       Expires July 31, 2014                [Page 13]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   o  The block counter is set to zero.
   o  The protocol will specify a 64-bit nonce.  This MUST be unique per
      invocation with the same key, so it MUST NOT be randomly
      generated.  A counter is a good way to implement this, but other
      methods, such as an LFSR are also acceptable.

   After running the block function, we have a 512-bit state.  We take
   the first 256 bits or the serialized state, and use those as the one-
   time Poly1305 key.  The other 256 bits are discarded.

   Note that while many protocols have provisions for a nonce for
   encryption algorithms (often called Initialization Vectors, or IVs),
   they usually don't have such a provision for the MAC function.  In
   that case the per-invocation nonce will have to come from somewhere
   else, such as a message counter.

2.7.  AEAD Construction

   Note: Much of the content of this document, including this AEAD
   construction is taken from Adam Langley's draft ([agl-draft]) for the
   use of these algorithms in TLS.  The AEAD construction described here
   is called AEAD_CHACHA20-POLY1305.

   AEAD_CHACHA20-POLY1305 is an authenticated encryption with additional
   data algorithm.  The inputs to AEAD_CHACHA20-POLY1305 are:
   o  A 256-bit key
   o  A 32-bit sender ID
   o  A 64-bit nonce - different for each invocation with the
      combination of key and sender ID.
   o  An arbitrary length plaintext
   o  Arbitrary length additional data

   The ChaCha20 and Poly1305 primitives are combined into an AEAD that
   takes a 256-bit key and 64-bit IV as follows:
   o  First, a Poly1305 one-time key is generated from the 256-bit key,
      sender ID and nonce using the procedure described in Section 2.6.
   o  The ChaCha20 encryption function is called to encrypt the
      plaintext, using the same key, sender ID, nonce and plaintext, and
      with the initial counter set to 1.
   o  The Poly1305 function is called with the Poly1305 key calculated
      above, and a message constructed as a concatenation of the
      following:
      *  The additional data
      *  The length of the additional data in octets (as a 64-bit
         little-endian integer).  TBD: bit count rather than octets?
         network order?





Nir                       Expires July 31, 2014                [Page 14]


Internet-Draft             ChaCha20 & Poly1305              January 2014


      *  The ciphertext
      *  The length of the ciphertext in octets (as a 64-bit little-
         endian integer).  TBD: bit count rather than octets? network
         order?

   Decryption is pretty much the same thing.

   The output from the AEAD is twofold:
   o  A ciphertext of the same length as the plaintext.
   o  A 128-bit tag (the result of the Poly1305 function.

   A few notes about this design:
   1.  The amount of encrypted data possible in a single invocation is
       2^32-1 blocks of 64 bytes each, for a total of 247,877,906,880
       bytes, or nearly 256 GB.  This should be enough for traffic
       protocols such as IPsec and TLS, but may be too small for file
       and/or disk encryption.  For such uses, we can return to the
       original design, eliminate the Sender ID parameter, and use the
       integer at position 13 as the top 32 bits of a 64-bit block
       counter, increasing the total message size to over a million
       petabytes (1,180,591,620,717,411,303,360 bytes to be exact).
   2.  Despite the previous item, the ciphertext length field in the
       construction of the buffer on which Poly1305 runs limits the
       ciphertext (and hence, the plaintext) size to 2^64 bytes, or
       sixteen thousand petabytes (18,446,744,073,709,551,616 bytes to
       be exact).
   3.  TBD: perhaps we allocate only 6 bits for sender ID, leaving 58
       bits for block counter, which means the maximum number of blocks
       is just enough to fill the 64-bit octet counter.

2.7.1.  Example and Test Vector for AEAD_CHACHA20-POLY1305

   For a test vector, we will use the following inputs to the
   AEAD_CHACHA20-POLY1305 function:

   Plaintext:
   000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c|Ladies and Gentl
   016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73|emen of the clas
   032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63|s of '99: If I c
   048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f|ould offer you o
   064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20|nly one tip for
   080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73|the future, suns
   096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69|creen would be i
   112  74 2e                                          |t.


   Key:
   000  80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f|................



Nir                       Expires July 31, 2014                [Page 15]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   016  90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f|................


   IV:
   000  40 41 42 43 44 45 46 47                          @ABCDEFG


   Set up for generating poly1305 one-time key (sender id=7):
       61707865  3320646e  79622d32  6b206574
       83828180  87868584  8b8a8988  8f8e8d8c
       93929190  97969594  9b9a9998  9f9e9d9c
       00000000  00000007  43424140  47464544


   After generating Poly1305 one-time key:
       252bac7b  af47b42d  557ab609  8455e9a4
       73d6e10a  ebd97510  7875932a  ff53d53e
       decc7ea2  b44ddbad  e49c17d1  d8430bc9
       8c94b7bc  8b7d4b4b  3927f67d  1669a432


   Poly1305 Key:
   000  7b ac 2b 25 2d b4 47 af 09 b6 7a 55 a4 e9 55 84|{.+%-.G...zU..U.
   016  0a e1 d6 73 10 75 d9 eb 2a 93 75 78 3e d5 53 ff|...s.u..*.ux>.S.

   Poly1305 r = 455E9A4057AB6080F47B42C052BAC7B
   Poly1305 K = FF53D53E7875932AEBD9751073D6E10A


   Keystream bytes:
   9f:7b:e9:5d:01:fd:40:ba:15:e2:8f:fb:36:81:0a:ae:
   c1:c0:88:3f:09:01:6e:de:dd:8a:d0:87:55:82:03:a5:
   4e:9e:cb:38:ac:8e:5e:2b:b8:da:b2:0f:fa:db:52:e8:
   75:04:b2:6e:be:69:6d:4f:60:a4:85:cf:11:b8:1b:59:
   fc:b1:c4:5f:42:19:ee:ac:ec:6a:de:c3:4e:66:69:78:
   8e:db:41:c4:9c:a3:01:e1:27:e0:ac:ab:3b:44:b9:cf:
   5c:86:bb:95:e0:6b:0d:f2:90:1a:b6:45:e4:ab:e6:22:
   15:38


   Ciphertext:
   000  d3 1a 8d 34 64 8e 60 db 7b 86 af bc 53 ef 7e c2|...4d.`.{...S.~.
   016  a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7 36 ee 62 d6|...Q)n......6.b.
   032  3d be a4 5e 8c a9 67 12 82 fa fb 69 da 92 72 8b|=..^..g....i..r.
   048  1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6 7e cd 3b 36|.q.....)....~.;6
   064  92 dd bd 7f 2d 77 8b 8c 98 03 ae e3 28 09 1b 58|...-w......(..X
   080  fa b3 24 e4 fa d6 75 94 55 85 80 8b 48 31 d7 bc|..$...u.U...H1..
   096  3f f4 de f0 8e 4b 7a 9d e5 76 d2 65 86 ce c6 4b|?....Kz..v.e...K



Nir                       Expires July 31, 2014                [Page 16]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   112  61 16                                          |a.


   AEAD Construction for Poly1305:
   000  50 51 52 53 0c 00 00 00 00 00 00 00 0c 00 00 00|PQRS............
   016  00 00 00 00 d3 1a 8d 34 64 8e 60 db 7b 86 af bc|.......4d.`.{...
   032  53 ef 7e c2 a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7|S.~....Q)n......
   048  36 ee 62 d6 3d be a4 5e 8c a9 67 12 82 fa fb 69|6.b.=..^..g....i
   064  da 92 72 8b 1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6|..r..q.....)....
   080  7e cd 3b 36 92 dd bd 7f 2d 77 8b 8c 98 03 ae e3|~.;6...-w......
   096  28 09 1b 58 fa b3 24 e4 fa d6 75 94 55 85 80 8b|(..X..$...u.U...
   112  48 31 d7 bc 3f f4 de f0 8e 4b 7a 9d e5 76 d2 65|H1..?....Kz..v.e
   128  86 ce c6 4b 61 16 72 00 00 00 00 00 00 00      |...Ka.r.......


   Tag:
   58:7f:5c:12:9f:27:f8:e9:2e:a7:e3:2d:9f:2a:76:f2


3.  Implementation Advice

   Each block of ChaCha20 involves 16 move operations and one increment
   operation for loading the state, 80 each of XOR, addition and Roll
   operations for the rounds, 16 more add operations and 16 XOR
   operations for protecting the plaintext.  Section 2.3 describes the
   ChaCha block function as "adding the original input words".  This
   implies that before starting the rounds on the ChaCha state, it is
   copied aside only to be added in later.  This would be correct, but
   it saves a few operations to instead copy the state and do the work
   on the copy.  This way, for the next block you don't need to recreate
   the state, but only to increment the block counter.  This saves
   approximately 5.5% of the cycles.

   It is NOT RECOMMENDED to use a generic big number library such as the
   one in OpenSSL for the arithmetic operations in Poly1305.  Such
   libraries use dynamic allocation to be able to handle any-sized
   integer, but that flexibility comes at the expense of performance as
   well as side-channel security.  More efficient implementations that
   run in constant time are easy to find, and one is even available in
   [poly1305] paper (although DJB describes its performance as
   "intolerable").


4.  Security Considerations

   The ChaCha20 cipher is designed to provide 256-bit security.

   The Poly1305 authenticator is designed to ensure that forged messages



Nir                       Expires July 31, 2014                [Page 17]


Internet-Draft             ChaCha20 & Poly1305              January 2014


   are rejected with a probability of 1-(n/(2^102)) for a 16n-byte
   message, even after sending 2^64 legitimate messages, so it is SUF-
   CMA in the terminology of [AE].

   Proving the security of either of these is beyond the scope of this
   document.  Such proofs are available in the referenced academic
   papers.

   The most important security consideration in implementing this draft
   is the uniqueness of the nonce used in ChaCha20.  Counters and LFSRs
   are both acceptable ways of generating unique nonces, as is
   encrypting a counter using a 64-bit cipher such as DES.  Note that it
   is not acceptable to use a truncation of a counter encrypted with a
   128-bit or 256-bit cipher, because such a truncation may repeat after
   a short time.

   The same kind of collision risk as in the above paragraph also exists
   in the selection of the Poly1305 key in both the AEAD construction
   (Section 2.7).  ChaCha20 is used to generate a 512-bit value, which
   is unique to each packet, but this uniqueness is not guaranteed for
   its 256-bit truncation, which serves as the actual key for Poly1305.
   While uniqueness is not guaranteed, it is still very likely.
   Birthday paradox calculations show that generating Poly1305 keys
   needs to happen over 2^101 times (way more than the 2^64 allowed by
   IPsec or TLS) to drive the chances of collision up to
   0.00000000000001%.  This is why we may safely ignore the possibility
   of collision.

   The algorithms presented here were designed to be easy to implement
   in constant time to avoid side-channel vulnerabilities.  The
   operations used in ChaCha20 are all additions, XORs, and fixed
   rotations.  All of these can and should be implemented in constant
   time.  Access to offsets into the ChaCha state and the number of
   operations do not depend on any property of the key, eliminating the
   chance of information about the key leaking through the timing of
   cache misses.

   For Poly1305, the operations are addition, multiplication and
   modulus, all on >128-bit numbers.  This can be done in constant time,
   but a naive implementation (such as using some generic big number
   library) will not be constant time.  For example, if the
   multiplication is performed as a separate operation from the modulus,
   the result will some times be under 2^256 and some times be above
   2^256.  Implementers should be careful about timing side-channels for
   Poly1305 by using the appropriate implementation of these operations.






Nir                       Expires July 31, 2014                [Page 18]


Internet-Draft             ChaCha20 & Poly1305              January 2014


5.  IANA Considerations

   There are no IANA considerations for this document.


6.  Acknowledgements

   None of the algorithms here are my own.  ChaCha20 and Poly1305 were
   invented by Daniel J. Bernstein, and the AEAD construction was
   invented by Adam Langley.

   Much of the text in this document was inspired by Adam Langley's
   draft for TLS, and by "inspired" I mean "shamelessly copied".  The
   author would also like to thank Adam, Watson Ladd and Dave McGrew for
   allaying his fears about writing a document describing crypto.


7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [chacha]   Bernstein, D., "ChaCha, a variant of Salsa20", Jan 2008.

   [poly1305]
              Bernstein, D., "The Poly1305-AES message-authentication
              code", Mar 2005.

7.2.  Informative References

   [AE]       Bellare, M. and C. Namprempre, "Authenticated Encryption:
              Relations among notions and analysis of the generic
              composition paradigm",
              <http://cseweb.ucsd.edu/~mihir/papers/oem.html>.

   [FIPS-197]
              National Institute of Standards and Technology, "Advanced
              Encryption Standard (AES)", FIPS PUB 197, November 2001.

   [FIPS-46]  National Institute of Standards and Technology, "Data
              Encryption Standard", FIPS PUB 46-2, December 1993,
              <http://www.itl.nist.gov/fipspubs/fip46-2.htm>.

   [agl-draft]
              Langley, A. and W. Chang, "ChaCha20 and Poly1305 based
              Cipher Suites for TLS", draft-agl-tls-chacha20poly1305-04



Nir                       Expires July 31, 2014                [Page 19]


Internet-Draft             ChaCha20 & Poly1305              January 2014


              (work in progress), November 2013.

   [standby-cipher]
              McGrew, D., Grieco, A., and Y. Sheffer, "Selection of
              Future Cryptographic Standards",
              draft-mcgrew-standby-cipher (work in progress).


Author's Address

   Yoav Nir
   Check Point Software Technologies Ltd.
   5 Hasolelim st.
   Tel Aviv  6789735
   Israel

   Email: synp71@live.com


































Nir                       Expires July 31, 2014                [Page 20]