Skip to main content

Kemeleon Encodings
draft-veitch-kemeleon-00

Document Type Active Internet-Draft (individual)
Authors Felix Günther , Douglas Stebila , Shannon Veitch
Last updated 2024-11-29
RFC stream (None)
Intended RFC status (None)
Formats
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-veitch-kemeleon-00
Network Working Group                                         F. Günther
Internet-Draft                              IBM Research Europe - Zurich
Intended status: Informational                                D. Stebila
Expires: 2 June 2025                              University of Waterloo
                                                               S. Veitch
                                                              ETH Zurich
                                                        29 November 2024

                           Kemeleon Encodings
                        draft-veitch-kemeleon-00

Abstract

   This document specifies Kemeleon encoding algorithms for encoding ML-
   KEM public keys and ciphertexts as random bytestrings.  Kemeleon
   encodings provide obfuscation of public keys and ciphertexts, relying
   on module LWE assumptions.  This document specifies a number of
   variants of these encodings, with differing failure rates, output
   sizes, and performance profiles.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://ssveitch.github.io/draft-kemeleon/draft-kemeleon.html.
   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-veitch-kemeleon/.

   Source for this draft and an issue tracker can be found at
   https://github.com/ssveitch/draft-kemeleon.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Günther, et al.            Expires 2 June 2025                  [Page 1]
Internet-Draft                  Kemeleon                   November 2024

   This Internet-Draft will expire on 2 June 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Conventions and Definitions . . . . . . . . . . . . . . . . .   3
   3.  Notation / ML-KEM Background  . . . . . . . . . . . . . . . .   3
   4.  Kemeleon Encoding . . . . . . . . . . . . . . . . . . . . . .   4
     4.1.  Common Functions  . . . . . . . . . . . . . . . . . . . .   4
     4.2.  Encoding Public Keys  . . . . . . . . . . . . . . . . . .   6
     4.3.  Encoding Ciphertexts  . . . . . . . . . . . . . . . . . .   7
     4.4.  Non-Rejection Sampling Variant  . . . . . . . . . . . . .   7
     4.5.  Faster Arithmetic Variant . . . . . . . . . . . . . . . .   9
     4.6.  Deterministic Encoding  . . . . . . . . . . . . . . . . .   9
     4.7.  Summary of Encodings  . . . . . . . . . . . . . . . . . .  10
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  11
     5.1.  Randomness Sampling . . . . . . . . . . . . . . . . . . .  11
     5.2.  Timing Side-Channels  . . . . . . . . . . . . . . . . . .  11
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  11
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  11
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .  11
     7.2.  Informative References  . . . . . . . . . . . . . . . . .  12
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  12

1.  Introduction

   ML-KEM [FIPS203] is a post-quantum key-encapsulation mechanism (KEM)
   recently standardized by NIST, Many applications are transitioning
   from classical Diffie-Hellman (DH) based solutions to constructions
   based on ML-KEM.  The use of Elligator and related Hash-to-Curve
   [RFC9380] algorithms are ubiquitous in DH-based protocols where DH
   shares are required to be encoded as, and look indistinguishable
   from, random bytestrings.  For example, applications using Elligator
   include protocols used for censorship circumvention in Tor [OBFS4],

Günther, et al.            Expires 2 June 2025                  [Page 2]
Internet-Draft                  Kemeleon                   November 2024

   password-authenticated key exchange (PAKE) protocols [CPACE]
   [OPAQUE], and private set intersection (PSI) [ECDH-PSI].

   For the post-quantum transition, an analogous encoding for (ML-)KEM
   public keys and ciphertexts to random bytestrings is required.  This
   document specifies such an encoding, Kemeleon, for ML-KEM public keys
   and ciphertexts.  Kemeleon was introduced in [GSV24] for building an
   (post-quantum) "obfuscated" KEM whose public keys and ciphertexts are
   indistinguishable from random.  Beyond the original construction,
   this document additionally specifies variants that avoid the encoding
   failing or the use of large integer computations, or allow for a
   deterministic encoding.  Aside from these variants, it is notable
   that the Kemeleon encodings of public keys results in smaller
   representations than in the original ML-KEM specification.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Notation / ML-KEM Background

   A KEM consists of three algorithms:

   *  KeyGen() -> (pk, sk): A probabilistic key generation algorithm
      that, with no input, generates a public key pk and a secret key
      sk.

   *  Encaps(pk) -> (c, K): A probabilistic encapsulation algorithm that
      takes as input a public key pk, and outputs a ciphertext ct and
      shared secret key K.

   *  Decaps(sk, c) -> K: A decapsulation algorithm that takes as input
      a secret key sk and ciphertext c, and outputs a shared secret key
      K.

   The following variables and functions are adopted from [FIPS203]:

   *  q = 3329, n = 256

   *  Compress_d : x -> round((2d/q)*x) mod 2d (Equation 4.7)

   *  Decompress_d : y -> round((q/2d)*y) (Equation 4.8)

Günther, et al.            Expires 2 June 2025                  [Page 3]
Internet-Draft                  Kemeleon                   November 2024

   *  remaining parameters k, d_u, d_v, etc. are defined by the
      respective ML-KEM parameter set -- this document writes du and dv
      in place of d_u, d_v in pseudocode

   ML-KEM.KeyGen() (Section 7.1 [FIPS203]) produces a public key, pk,
   (termed an encapsulation key in [FIPS203]) and a private key, sk,
   (decapsulation key).  Public keys consist of byte-encoded vectors of
   coefficients in Z_q, where each coefficient is encoded in 12 bits,
   together with a 32-byte seed for generating the matrix A.  ML-
   KEM.Encaps(pk) (Section 7.2 [FIPS203]) produces ciphertexts
   consisting of byte-encoded compressed vectors of cofficients, where
   each coefficient in Z_q is compressed by a certain number of bits
   (depending on the ML-KEM parameter set).

   The following terms and notation are used throughout this document:

   *  msb(x) refers to the most significant bit of the value x

   *  a[i] denotes the ith position of a vector a of coefficients

   *  concat(x0, ..., xN): returns the concatenation of bytestrings.

4.  Kemeleon Encoding

   At a high level, the constructions in this document instantiate the
   following functions:

   *  EncodePk(pk) -> epk is the (possibly randomized) encoding
      algorithm that on input a public key, outputs an obfuscated public
      key or an error.

   *  DecodePk(epk) -> pk is the deterministic decoding algorithm that
      on input an obfuscated public key, outputs a public key.

   *  EncodeCtxt(c) -> ec is the (possibly randomized) encoding
      algorithm that on input a ciphertext, outputs an obfuscated
      ciphertext or an error.

   *  DecodeCtxt(ec) -> c is the deterministic decoding algorithm that
      on input an obfuscated ciphertext, outputs a ciphertext.

4.1.  Common Functions

   The following function maps a vector of k*n coefficients modulo q to
   a large integer, rejecting if the most significant bit of the integer
   is 1.

Günther, et al.            Expires 2 June 2025                  [Page 4]
Internet-Draft                  Kemeleon                   November 2024

   VectorEncode(a):
      r = 0
      for i from 1 to k*n:
         r += q^(i-1)*a[i]
      if msb(r) == 1:
         return err
      else:
         return r

   VectorDecode(r):
      for i from 1 to k*n:
         a[i] = r % q
         r = r // q
      return a

   The following algorithm samples an uncompressed pre-image of a
   coefficient c at random, where u is the decompressed value of c.  The
   mapping is based on the Compress_d, Decompress_d algorithms from
   (Section 4.2.1 [FIPS203]).

Günther, et al.            Expires 2 June 2025                  [Page 5]
Internet-Draft                  Kemeleon                   November 2024

   SamplePreimage(d,u,c):
      if d == 10:
         if Compress_d(u + 2) == c:
            rand <--$ [-1,0,1,2]
         else:
            rand <--$ [-1,0,1]
         return u + rand
      if d == 11:
         if Compress_d(u + 1) == c:
            rand <--$ [0,1]
         else if Compress_d(u - 1) == c:
            rand <--$ [-1,0]
         else:
            rand = 0
         return u + rand
      if d == 5:
         if c == 0:
            rand <--$ [-52,...,52]
         else:
            rand <--$ [-51,...,52]
         return u + rand
      if d == 4:
         if c == 0:
            rand <--$ [-104,...,104]
         else:
            rand <--$ [-104,...,103]
         return u + rand
      else:
         return err

4.2.  Encoding Public Keys

   The following algorithms encode ML-KEM public keys as random
   bytestrings. rho is the public seed used to generate the public
   matrix A [FIPS203].  This is already a random 32-byte string, so it
   is returned alongside the encoded value of t. t is a vector of k
   polynomials with n coefficients, but in the following pseudocode t is
   treated as a vector of k*n coefficients.

   Kemeleon.EncodePk(pk = (t, rho)):
      r = VectorEncode(t)
      if r == err:
         return err
      else:
         return concat(r,rho)

Günther, et al.            Expires 2 June 2025                  [Page 6]
Internet-Draft                  Kemeleon                   November 2024

   Kemeleon.DecodePk(epk):
      r,rho = epk # rho is fixed length
      t = VectorDecode(r)
      return (t, rho)

4.3.  Encoding Ciphertexts

   ML-KEM ciphertexts consist of two components: c_1, a vector of k
   polynomials with n coefficients mod 2^du, and c_2, a polynomial with
   n coefficients mod 2^dv.  The coefficients of these polynomials are
   not uniformly distributed, as a result of the compression step in
   encapsulation.  The following encoding function decompresses and
   recovers a random preimage of this compression step in order to
   recover the uniform distribution of coefficients.  Then, the same
   vector encoding step used for public keys is applied.  For the second
   ciphertext component, rejection sampling is performed to retain
   uniformity, rather than decompressing.

   Kemeleon.EncodeCtxt(c = (c_1,c_2)):
      u = Decompress_du(c_1)
      for i from 1 to k*n:
         u[i] = SamplePreimage(du,u[i],c_1[i])
      r = VectorEncode(u)
      if r == err:
         return err
      for i from 1 to n:
         if c_2[1] == 0:
            return err with prob. 1/ceil(q/(2^dv))
      return concat(r,c_2)

   Kemeleon.DecodeCtxt(ec):
      r,c_2 = ec # c_2 is fixed length
      u = VectorDecode(r)
      c_1 = Compress_du(u)
      return (c_1,c_2)

4.4.  Non-Rejection Sampling Variant

   Applying a technique from [ELL2] (Section 3.4), the original Kemeleon
   construction can be adapted to avoid rejection sampling.  This
   results in larger output sizes, but the encoding algorithm never
   fails.  Applying the technique from [ELL2], where r is the encoded
   vector before rejection occurs in VectorEncode, we then choose m at
   random from [0,floor((2^(b+t)-r)/(q^(k*n)))], where b =
   log_2(q^(k*n)) and t is a security parameter, and return r +
   m*q^(k*n).  This variant results in encoded values whose statistical
   distance from uniform is at most 2^-t.  This results in an increased
   output size of t bits, where t is the security parameter.  For

Günther, et al.            Expires 2 June 2025                  [Page 7]
Internet-Draft                  Kemeleon                   November 2024

   example, with t=128, this increases the output size by 16 bytes.

   For public key encodings, one can immediately replace VectorEncode
   and VectorDecode calls with calls to the following algorithms.

   VectorEncodeNR(a):
      r = 0
      t = sec_param # e.g. t = 128, 256, ...
      b = log_2(q^(k*n))
      for i from 1 to k*n:
         r += q^(i-1)*a[i]
      m <--$ [0,...,floor((2^(b+t)-r)/(q^(k*n)))]
      return r + m*q^(k*n)

   VectorDecodeNR(a):
      a = a % q^(k*n)
      for i from 1 to k*n:
         a[i] = r % q
         r = r // q
      return a

   Notably, the random value m need not be transmitted alongside the
   encoded values.

   For ciphertext encodings, one must also avoid rejection sampling
   based on coefficients of the second component of the ciphertext.
   Therefore, the new ciphertext encoding must decompress and
   VectorEncodeNR the second component of the ciphertext.  This more
   significantly increases the size of the encoded ciphertext.

Kemeleon.EncodeCtxtNR(c = (c_1,c_2)):
   u = Decompress_du(c_1)
   for i from 1 to k*n:
      u[i] = SamplePreimage(du,u[i],c_1[i])
   v = Decompress_dv(c_2)
   for i from 1 to n:
      v[i] = SamplePreimage(dv,v[i],c_2[i])
   w = [u,v] # treat u,v as a singular vector of (k+1)*n coefficients
   r = VectorEncodeNR(w) # this call should use k+1 rather than k when accumulating to a large integer
   return r

   Kemeleon.DecodeCtxtNR(ec):
      w = VectorDecodeNR(r)
      u,v = w # u, v are fixed length
      c_1 = Compress_du(u)
      c_2 = Compress_dv(v)
      return (c_1,c_2)

Günther, et al.            Expires 2 June 2025                  [Page 8]
Internet-Draft                  Kemeleon                   November 2024

4.5.  Faster Arithmetic Variant

   [OPEN ISSUE: Is the faster variant of interest?  If so, the following
   can be extended with a complete description.]

   Observing that q = 3329 = 13*2^8+1, a variant of Kemeleon with faster
   integer arithmetic can be specified.  First, the encoding rejects any
   polynomial with a coefficient equal to q-1 = 3328.  This ensures that
   all arithmetic can be computed with values modulo q-1 = 13*2^8.
   Then, note that rather than accumulating values to a large integer
   mod q^(k*n), it is only required to accumulate values to an integer
   mod 13^(k*n), while keeping track of the 8 lower order bits of each
   coefficient.  The output size of the encoding does not change, but
   this results in an increased rejection rate.

   In particular, Table 1 gives success probabilities for public key and
   ciphertext encodings:

    +=============+========================+==========================+
    | Parameter   | Pk success probability | Ctxt success probability |
    +=============+========================+==========================+
    | ML-KEM-512  |                   0.49 |                     0.45 |
    +-------------+------------------------+--------------------------+
    | ML-KEM-768  |                   0.29 |                     0.25 |
    +-------------+------------------------+--------------------------+
    | ML-KEM-1024 |                   0.53 |                     0.47 |
    +-------------+------------------------+--------------------------+

        Table 1: Success probabilities for faster Kemeleon encoding

4.6.  Deterministic Encoding

   The randomness used in Kemeleon ciphertext encodings MAY be derived
   in a deterministic manner.  To do so, following a call to Encap which
   returns a KEM key K and a ciphertext c, the following steps can be
   taken:

   *  Using a key derivation function (KDF), derive from the key K a new
      key K' and a seed for randomness rnd.

   *  The seed rnd can be used to generate the randomness required when
      encoding the ciphertext c.

   *  Use K' in place of K wherever applicable in the remainder of the
      protocol/system.

   *  Upon any call to Decap, apply the same KDF to derive the new key
      K', as required.

Günther, et al.            Expires 2 June 2025                  [Page 9]
Internet-Draft                  Kemeleon                   November 2024

   Deriving a new KEM key for use in the remainder of a system is
   crucial in order to ensure key separation (i.e., not using the
   original key K to derive randomness and for other purposes).

   The randomness used to encode a public key MAY be stored alongside
   the corresponding secret key, if it is subsequently needed.  See
   Section 5.1 for relevant discussion on keeping this randomness
   secret.

4.7.  Summary of Encodings

     +==============+=============+=============+===================+
     | Algorithm /  | Output size |     Success |        Additional |
     | Parameter    |     (bytes) | probability |    considerations |
     +==============+=============+=============+===================+
     | Kemeleon -   |    pk: 781, |   pk: 0.56, |  Large int (750B) |
     | ML-KEM512    |   ctxt: 877 |  ctxt: 0.51 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | Kemeleon -   |   pk: 1156, |   pk: 0.83, | Large int (1150B) |
     | ML-KEM768    |  ctxt: 1252 |  ctxt: 0.77 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | Kemeleon -   |   pk: 1530, |   pk: 0.62, | Large int (1500B) |
     | ML-KEM1024   |  ctxt: 1658 |  ctxt: 0.57 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonNR - |    pk: 797, |   pk: 1.00, | Large int (1123B) |
     | ML-KEM512    |  ctxt: 1140 |  ctxt: 1.00 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonNR - |   pk: 1172, |   pk: 1.00, | Large int (1498B) |
     | ML-KEM768    |  ctxt: 1514 |  ctxt: 1.00 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonNR - |   pk: 1546, |   pk: 1.00, | Large int (1872B) |
     | ML-KEM1024   |  ctxt: 1889 |  ctxt: 1.00 |        arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonFT - |    pk: 781, |   pk: 0.49, |       Smaller int |
     | ML-KEM512    |   ctxt: 877 |  ctxt: 0.45 | (235B) arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonFT - |   pk: 1156, |   pk: 0.29, |       Smaller int |
     | ML-KEM768    |  ctxt: 1252 |  ctxt: 0.25 | (355B) arithmetic |
     +--------------+-------------+-------------+-------------------+
     | KemeleonFT - |   pk: 1530, |   pk: 0.53, |       Smaller int |
     | ML-KEM1024   |  ctxt: 1658 |  ctxt: 0.47 | (475B) arithmetic |
     +--------------+-------------+-------------+-------------------+

       Table 2: Summary of Kemeleon Variants, NR = No Reject, FT =
                                  Faster

Günther, et al.            Expires 2 June 2025                 [Page 10]
Internet-Draft                  Kemeleon                   November 2024

5.  Security Considerations

   This section contains additional security considerations about the
   Kemeleon encodings described in this document.

   In general, the obfuscation properties of the Kemeleon encodings
   depend on module LWE assumptions similar to those underlying the IND-
   CCA security of ML-KEM; see [GSV24] for the detailed security
   analysis of the original Kemeleon encoding.

5.1.  Randomness Sampling

   Both public key and ciphertext encodings in the original Kemeleon
   encoding are randomized.  The randomness (or seed used to generate
   randomness) used in Kemeleon encodings MUST be kept secret.  In
   particular, public randomness enables distinguishing a Kemeleon-
   encoded value from a random bytestring: Decoding the value in
   question and re-encoding it with the public randomness will yield the
   original value if it was Kemeleon-encoded.

5.2.  Timing Side-Channels

   Beyond timing side-channel considerations for ML-KEM itself, care
   should be taken when using Kemeleon encodings, in particular those
   with a non-zero failure probability.  Rejecting and re-generating
   public keys or ciphertexts may leak information about the use of
   Kemeleon encodings, as might the overhead of the encoding itself.

6.  IANA Considerations

   This document has no IANA actions.

7.  References

7.1.  Normative References

   [ELL2]     Tibouchi, M., "Elligator Squared: Uniform Points on
              Elliptic Curves of Prime Order as Uniform Random Strings",
              2014, <https://eprint.iacr.org/2014/043>.

   [FIPS203]  "Module-lattice-based key-encapsulation mechanism
              standard", National Institute of Standards and Technology
              (U.S.), DOI 10.6028/nist.fips.203, August 2024,
              <https://doi.org/10.6028/nist.fips.203>.

   [GSV24]    Günther, F., Stebila, D., and S. Veitch, "Obfuscated Key
              Exchange", 2024, <https://eprint.iacr.org/2024/1086>.

Günther, et al.            Expires 2 June 2025                 [Page 11]
Internet-Draft                  Kemeleon                   November 2024

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC9380]  Faz-Hernandez, A., Scott, S., Sullivan, N., Wahby, R. S.,
              and C. A. Wood, "Hashing to Elliptic Curves", RFC 9380,
              DOI 10.17487/RFC9380, August 2023,
              <https://www.rfc-editor.org/rfc/rfc9380>.

7.2.  Informative References

   [CPACE]    Abdalla, M., Haase, B., and J. Hesse, "CPace, a balanced
              composable PAKE", Work in Progress, Internet-Draft, draft-
              irtf-cfrg-cpace-13, 14 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-
              cpace-13>.

   [ECDH-PSI] wangyuchen, Chang, Lu, Y., Hong, C., and J. Peng, "PSI
              based on ECDH", Work in Progress, Internet-Draft, draft-
              ecdh-psi-00, 21 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ecdh-psi-00>.

   [OBFS4]    "obfs4 (The obfourscator)", n.d.,
              <https://gitlab.torproject.org/tpo/anti-censorship/
              pluggable-transports/lyrebird/-/blob/HEAD/doc/
              obfs4-spec.txt>.

   [OPAQUE]   Bourdrez, D., Krawczyk, H., Lewi, K., and C. A. Wood, "The
              OPAQUE Augmented PAKE Protocol", Work in Progress,
              Internet-Draft, draft-irtf-cfrg-opaque-18, 21 November
              2024, <https://datatracker.ietf.org/doc/html/draft-irtf-
              cfrg-opaque-18>.

Authors' Addresses

   Felix Günther
   IBM Research Europe - Zurich
   Email: mail@felixguenther.info

   Douglas Stebila
   University of Waterloo
   Email: dstebila@uwaterloo.ca

Günther, et al.            Expires 2 June 2025                 [Page 12]
Internet-Draft                  Kemeleon                   November 2024

   Shannon Veitch
   ETH Zurich
   Email: shannon.veitch@inf.ethz.ch

Günther, et al.            Expires 2 June 2025                 [Page 13]