Crypto Forum Research Group                              David A. McGrew
Internet Draft                                       Cisco Systems, Inc.
Expires June, 2003                                         October, 2002



     The Truncated Multi-Modular Hash Function (TMMH), Version Two
                     <draft-irtf-cfrg-tmmh-00.txt>


Status of this Memo

   This document is an Internet Draft and is in full conformance with
   all provisions of Section 10 of RFC-2026. Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and working groups.  Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet Drafts as reference
   material or to cite them other than as "work in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.


1. Abstract

   TMMH is a `universal' hash function suitable for message
   authentication in the Wegman-Carter paradigm.  It is simple, quick,
   and especially appropriate for Digital Signal Processors and other
   processors with a fast multiply operation.  This document specifies
   version two of TMMH, which eliminates the large storage requirement
   for hashing large messages that was present in version one.


2. The TMMH Specification.

   TMMH is a hash function that maps a key and a message to a hash
   value.  TMMH uses sixteen bit unsigned integers as a basic data
   unit, and uses the multiply-and-accumulate operation as its core.
   TMMH can be used as a message authentication code, as described in
   Section 5.



McGrew                                                          [Page 1]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


   The key, message, and hash value are treated as a sequence of
   unsigned sixteen bit integers in network byte order.  In the
   following, we call such an integer a word.  The number of words in
   the hash value is denoted as TAG_WORDS.  The number of words in the
   key is 35 + 6 * TAG_WORDS.  The message can have any length between
   zero and 65,536 octets (32,768 words).

   The message may contain an odd number of octets, in which case the
   all-zero octet is appended to the message to align it on a word
   boundary.  The number of octets in the message before this padding
   operation is denoted as MSG_LEN, and is used in the computations
   below.

   In the following, we define p to be equal to 2^16 + 1, and we use
   the symbol to * denote integer multiplication and the symbol +32 to
   denote integer addition modulo 2^32.

   TMMH uses several key-dependent internal data structures: the
   length multiplier array L, and an array of subkeys A. The length
   multiplier array L is an array of words, the ith element of which
   is denoted as L[i], with i ranging from zero to (TAG_WORDS - 1).  A
   subkey is an array consisting of (TAG_WORDS + 7) words, and the ith
   element of the subkey S is denoted as S[i].  Five subkeys are used
   in TMMH; the subkey array is denoted as A, and the ith subkey is
   denoted as A[i], with i ranging from zero to 4.

     Figure 1.  An illustration of how the arrays L and A are assigned
     from the words of the TMMH key K.  In this example, TAG_WORDS is
     equal to two.  Here K[i] denotes the ith word of the TMMH key
     (where i is a hexadecimal number).

          +----+----+----+----+----+----+----+----+----+----+----+---
    Key   |K[0]|K[1]|K[2]|K[3]|K[4]|K[5]|K[6]|K[7]|K[8]|K[9]|K[a]|...
          +----+----+----+----+----+----+----+----+----+----+----+---
          +----+----+--------------------------------------------+---
    Field |L[0]|L[1]|                    A[0]                    |...
          +----+----+--------------------------------------------+---

   The length multiplier array L and the subkey array A are taken from
   the TMMH key K as follows.

      1.  The value L[i] is set to K[i], for i from zero to
          TAG_WORDS-1.

      2.  The value A[j][i] (the ith element of the jth subkey) is
          set to K[ TAG_WORDS + i + (TAG_WORDS + 7) * j ].

   This process is illustrated in Figure 1.



McGrew                                                          [Page 2]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


   The function V(S, M) defined below maps a subkey S and a data
   string M to a 32-bit unsigned integer, and is defined as

     V(S,M) = S[1] * M[1] +32 S[2] * M[2] +32 ... +32 S[8] * M[8]

   with the convention that M is padded by appending null words if its
   length is less than eight.  Here +32 denotes integer addition
   modulo 2^32.  The length of the subkey K may be greater than eight,
   but the excess words are ignored by the function V.  (The
   definition of V is such that the most significant words of the
   subkey may be ignored simplifies the exposition below.)

   The function U(S, M) is defined as

     U(S, M) = [ V(S, M) modulo p ] modulo 2^16,

   and it maps a subkey S and a data string M to a single word.

   The core of TMMH is a `compression' function C which maps a subkey
   value and an input string to an output string which is about eight
   times smaller than the input string.  To compute C(S, D) for a
   given subkey value S and data string D of w words, do the
   following.

     1) Divide up D into blocks of eight words each, padding the last
        block by appending null words until its length is a multiple
        of eight if needed:

        D = D[0] || D[1] || ... || D[ ceil(w/8) ]

        where D[i] is the ith block, || denotes concatenation, and
        ceil(x) denotes the largest integer not less than x.

     2) Apply the function U to each block, using the subkey value S
        each time, then concatenate the outputs as follows:

        C(S, D) = U(S, D[0]) || U(S, D[1]) || ...
                         ... || U(S, D[ ceil(w/8) ].

   We next give a definition of TMMH that applies only when TAG_WORDS
   is one, as this case is simpler than the more general case.  When
   TAG_WORDS is one, TMMH is computed using the following algorithm:

      set X to M and set i to zero
      while the number of words in X is greater than eight, do
         set X to C(A[i], X)
         increment i
      end while



McGrew                                                          [Page 3]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


      return [ [ [ L[0] * MSG_LEN ] +32 V(A[i], X) ] mod p ] mod 2^16

   In the general case of TAG_WORDS > 1, the jth word of the TMMH tag
   is computed using the following algorithm:

      set X to M and set i to zero
      while the number of words in X is greater than eight, do
         set X to C((A[i] << j), X)
         increment i
      end while
      return [ [ [ L[j] * MSG_LEN ]
                   +32 V((A[i] << j), X) ] mod p ] mod 2^16

   where (x << j) denotes the word-wise left-shift of x, the result of
   which has null words in its j rightmost positions.  For example, if
   x is the string { 5ea7 f27a c536 2192 11be ea35 db9d 63d6 fa8a } of
   words (in hexadecimal), then the value of (x << 1) is { f27a c536
   2192 11be ea35 db9d 63d6 fa8a 0000 }.  Note that the shift
   operation causes all of the words of a given subkey to be used by
   the function V.


2.1 Implementation Notes

   The reduction modulo p can be implemented without the explicit use
   of a division operation, as it is close to 2^16 (See Section 3.1 of
   [MMH] for details).  The reduction modulo 2^16 can be implemented
   by truncating all but the lower sixteen bits of a higher-precision
   value.

   It is possible to implement TMMH in a scatter/gather approach or
   with an API consisting of initial, update, and final operations.
   One way to do this is to note that TMMH defines a tree of hash
   values, and to implement TMMH as a postorder traversal of that
   tree.


3. Test Vectors

   This section provides test vectors which can be used to test an
   implementation of TMMH.  The key, message, and outputs are
   expressed as octet sequences, with each octet in hexadecimal.


3.1 Test Vector One

   TAG_WORDS: 2
   key:     { e627 6a01 5ea7 f27a c536 2192 11be ea35



McGrew                                                          [Page 4]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


              db9d 63d6 fa8a fc45 e08b d216 ced2 7853
              1a82 22f5 90fb 1c29 708e d06f 82c3 bee6
              4f21 6f33 65c0 d211 c25e 9138 4fa3 7c1f
              61ac 3489 2976 8c19 8252 ddbf cad3 c28f
              68d6 58dd 504f 2bbf 0278 70b7 cfca }
   L:       { e627 6a01 }
   A[0]:    { 5ea7 f27a c536 2192 11be ea35 db9d 63d6 fa8a }
   A[1]:    { fc45 e08b d216 ced2 7853 1a82 22f5 90fb 1c29 }
   A[2]:    { 708e d06f 82c3 bee6 4f21 6f33 65c0 d211 c25e }
   A[3]:    { 9138 4fa3 7c1f 61ac 3489 2976 8c19 8252 ddbf }
   A[4]:    { cad3 c28f 68d6 58dd 504f 2bbf 0278 70b7 cfca }
   message: { 6015 f141 5ba1 29a0 f604 0d1c 02d9 aa8a 7931 }
   tag:     { 8a82 4bb0 }

3.2 Test Vector Two

   TAG_WORDS: 2
   key:     { 2337 47e0 1564 671b 6f80 dcdd a6cc 5ff1
              3e5d 88eb 612e 7c99 02e8 d8b2 77a5 09f9
              f0bc 9997 1dc9 d478 396f 9602 8538 aa7f
              16a0 a456 e77e 5262 a1dc 6b06 5e67 b2d5
              74ee 5045 82c1 310a 28a5 bb49 f15a 3834
              59c8 f3a 36f7 1b8c 953d bf74 2080 }
   message: { 5741 5220 4953 2050 4541 4345 2c20 4652
              4545 444f 4d20 4953 2053 4c41 5645 5259
              2c20 4947 4e4f 5241 4e43 4520 4953 2053
              5452 454e 4754 4800 }
   tag:     { 1d0f 7536 }

   N.B. This message is given by the 56-octet ASCII string
   "WAR IS PEACE, FREEDOM IS SLAVERY, IGNORANCE IS STRENGTH"
   in network byte order.

3.3 Test Vector Three

  TAG_WORDS: 2
  key:     { 0001 0001 0001 0001 0001 0001 0001 0001
             0001 0001 0001 0001 0001 0001 0001 0001
             0001 0001 0001 0001 0001 0001 0001 0001
             0001 0001 0001 0001 0001 0001 0001 0001
             0001 0001 0001 0001 0001 0001 0001 0001
             0001 0001 0001 0001 0001 0001 0001 }
  message: the word 0001 repeated 32,768 times
  tag:     { 7fff 7fff }

4. Optimizations

   It is not necessary to store the entire TMMH key in memory if an



McGrew                                                          [Page 5]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


   upper bound on the length of the messages to be hashed is known.
   For example, if the messages of a particular protocol are no
   greater than 256 octets in length, then only the first three subkey
   values need to be stored; the other subkeys are not needed in the
   computation of TMMH for messages of that length.

   To describe this property, we define the parameter MAX_HASH_LENGTH,
   which denotes the maximum value which MSG_LEN may take.  The number
   of subkeys which must be stored are described in terms of this
   parameter in Figure 2.

      Figure 2.  The storage and the number of subkeys as a
      function of MAX_HASH_LENGTH.

      Subkeys  MAX_HASH_LENGTH  Storage (octets)
      ---------------------------------------------
      1        16               14 +  4 * TAG_WORDS
      2        128              28 +  6 * TAG_WORDS
      3        1024             42 +  8 * TAG_WORDS
      4        8192             56 + 10 * TAG_WORDS
      5        65536            70 + 12 * TAG_WORDS


5. Using TMMH as a Message Authentication Code

   TMMH can be used to make a Wegman-Carter type message
   authentication code.  To use TMMH to compute the authentication tag
   of a message, the TMMH hash value of that message is computed, then
   that value is combined with a pseudorandom string of TAG_LENGTH
   octets.  The combining operation is word-wise addition modulo 2^16.
   The pseudoranom strings MUST NOT be correlated with each other nor
   with the messages they protect.

   A message/authentication tag pair is verified as follows.  Compute
   a new authentication tag using the algorithm above, and compare it
   to the tag associated with the message.  If the two are equal, then
   the message/tag pair is valid; otherwise, it is not.


6. Security Considerations

   TMMH is (2^(-11.1)^TAG_WORDS)-Delta Universal with respect to
   addition modulo 2^16.  This property enables TMMH to be used as a
   message authentication code in the Wegman-Carter paradigm [WC81].
   When used in this manner, the probability with which an adversary
   can forge a tag for a given message is no more than
   2^(-11.1)^TAG_WORDS.  This value is actually an upper bound; it
   is slightly less for shorter messages.



McGrew                                                          [Page 6]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


   These security claims for TMMH follow from the derivations in
   [MMH].  The interested reader should adapt Definition 2 of that
   document to TMMH as a starting point.

   The cyclic use of the words of the key is known as a Toeplitz
   construction, and is known to be secure (see Section 2.4 of [MMH]
   and the references therein).


7. History


   This document is based on the Internet Draft
   draft-mcgrew-saag-tmmh-01.txt, which was submitted to SAAG in
   November, 2001 and which expired in May, 2002.

   The second version of TMMH improves upon the initial version by
   enabling long messages to be hashed without requiring the key to be
   large.  Version two essentially uses MMH recursively, and handles
   the variable message length in the same manner as version one.

   Changes to draft-mcgrew-saag-tmmh-01.txt from version 00 of that
   document include:

     * a correction of one of the test vector cases,

     * the addition of a section on using TMMH in a MAC,

     * some scattered corrections and clarifications,

     * removed references to other Internet Drafts.

   TMMH is an extension of the MMH hash function [MMH].  Unlike the
   function introduced in that seminal work, TMMH can deal efficiently
   and securely with variable-length messages, and is defined for both
   sixteen and thirty-two bit words.

   The PacketCable Security specification includes a MMH variant for
   RTP authentication [PC].  This variant can only be used for fixed
   message lengths, and is not interoperable with TMMH.

   To the best knowledge of the authors, this specification is not
   covered by any patents [H01].


8. Acknowledgments

   Thanks are due to Mats Naslund, Karl Norrman, Doug Smith, Scott



McGrew                                                          [Page 7]


Internet Draft     Truncated Multi-Modular Hash (TMMH)     October, 2002


   Fluhrer, and der Mouse for their comments and corrections.



9. Contact Information

   Questions and comments about this Internet Draft SHOULD be directed
   to:

   David A. "I'm not gonna pay a lot for this Presentation Layer" McGrew
   Cisco Systems, Inc.
   mcgrew@cisco.com

   The discussion list for the Crypto Forurm Research Group is:

   cfrg@ietf.org


11. References

   [B97]   Bradner, S., Key words for use in RFCs to Indicate
           Requirement Levels, RFC 2119, March 1997.

   [WC81]  M. Wegman and L. Carter, New hash functions and their
           use in authentication and set equality, J. of Computer
           and System Sciences, vol. 22, 1981.

   [H01]   Halevi, S. Personal communication, May, 2001.

   [MMH]   Halevi, S., and Krawczyk, H.  MMH: Software Authentication
           in the Gbit/second rates, Fast Software Encryption
           Workshop, 1997.  Also available online at
           http://www.research.ibm.com/people/s/shaih/pubs/.

   [PC]    The PacketCable Security Specification,
           PKT-SP-SEC-I03-010626, Sections 9.8.1 and 9.8.2.  Available
           online at http://www.packetcable.com/specifications.html

   [S92]   Stinson, D. R.  Universal hashing and authentication codes.
           Advances in Cryptology - CRYPTO '91, Lecture Notes in
           Computer Science 576 (1992), pp. 74-85.










McGrew                                                          [Page 8]