Skip to main content

Last Call Review of draft-josefsson-scrypt-kdf-03
review-josefsson-scrypt-kdf-03-genart-lc-kyzivat-2015-08-28-00

Request Review of draft-josefsson-scrypt-kdf
Requested revision No specific revision (document currently at 05)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2015-09-07
Requested 2015-08-13
Authors Colin Percival , Simon Josefsson
Draft last updated 2015-08-28
Completed reviews Genart Last Call review of -03 by Paul Kyzivat (diff)
Genart Telechat review of -04 by Paul Kyzivat (diff)
Secdir Last Call review of -03 by Joseph A. Salowey (diff)
Assignment Reviewer Paul Kyzivat
State Completed
Review review-josefsson-scrypt-kdf-03-genart-lc-kyzivat-2015-08-28
Reviewed revision 03 (document currently at 05)
Result On the Right Track
Completed 2015-08-28
review-josefsson-scrypt-kdf-03-genart-lc-kyzivat-2015-08-28-00
I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at <‚Äč
http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other comments you may receive.

Summary:

This draft is on the right track but has open issues, described in the
review.

Disclaimers:

I am not schooled about crypto algorithms and techniques. I have
reviewed this draft from the perspective of a software engineer, trying
to understand if I could implement from this specification.

I also know nothing about using ASN.1 to configure KDFs. So I can't
comment on section 6. This draft ought to be reviewed by somebody
capable of checking that.

I have not attempted to verify the test vectors in sections 7-12.

-------------
Major issues:

(Issue-1): Intended status

The intended status of this document is Informational. I question why it
is not a normative document. As best I can tell, this is the formal
specification of the algorithm. Those that use it would presumably want
to claim conformance to it. The introduction describes this as an
alternative to other KDF functions, including only one with an RFC
reference - RFC2898. That one is also informational, but it is a
restatement of an algorithm specified elsewhere, so that RFC can be
viewed as an informational supplement to the actual definition. The same
is not true of this document.

(Of course changing this to a normative document would require
significant changes, including adding 2119 language. And it probably
could then not be handled as an AD-sponsored document.)

(Issue-2): Type mismatch on call to scryptROMix

The scryptROMix function calls scryptBlockMix with an octet vector of
length 128*r octets. But the definition of scryptBlockMix specifies that
the input argument should be a vector of 2*r 64-octet blocks.

Clearly these don't match. One way to make them match would be to divide
the single 128*r octet vector into two 64-octet vectors, and then to
treat r as 2 inside of scryptBlockMix. I don't know if that is the intent.

(Issue-3): Definition of Integerify

The Integerify function is not clearly defined. The following is given
in the document:

   j = Integerify (X) mod N
       where Integerify (B[0] ... B[2 * r - 1]) is defined
       as the result of interpreting B[2 * r - 1] as a
       little-endian integer.

I can make no sense of this definition of Integerify. The description
implies that B must be an array containing elements up to index 2*r-1.
But the definition of B is "Input octet vector of length 128 * r
octets". Taking the definition literally, B[2*r-1] must be an octet, and
2*r must be less than 128. That seems like nonsense to me.

I found the following in the [SCRYPT] paper:

"We expect that for reasons of performance and simplicity,
implementors will restrict N to being a power of 2, in which case
the function Integerify can be replaced by reading the first (or
last) machine-length word from a k-bit block."

Simply reading a machine-length word ignores the differences between
little-endian and big-endian machines, and machines with different word
sizes. Conveniently, [SALSA20SPEC] defines a littleendian function that
yields a 32-bit integer from four bytes. That should be sufficient bits
for computing "j". So Integerify(X) could be defined as:

    littleendian(X[0],X[1],X[2],X[3])

or

    littleendian(X[128*r-4],X[128*r-3],X[128*r-2],X[128*r-1])

(I don't think it matters which, as long as everyone does it the same way.)

In any case, the language is ambiguous and needs to be clarified.

-------------
Minor issues:

(Issue-4): Identifiers reused for different meanings

In scrypt, "B" is an array of "p" vectors, each of which is 128*r
octets. In scryptROMix, "B" is a single vector of 128*r octets. In
scryptBlockMix, "B" is a vector of 2*r 64-octet blocks.

In both scrypt and scryptROMix "r" is the same block size parameter. But
in scryptROMix it is only used in the (broken) definition of Integerify.
In scryptBlockMix "r" is (apparently, if I have figured things out)
always 2, and unrelated to the other "r".

The document would be clearer if distinct identifiers were used for each
unique concept.

For those identifiers whose value is intended to be constant and common
across all the functions (such as "N"), it would be better to define
them once, globally.

(Issue-5): Confusing/misleading names/definitions of identifiers

The "Block size" parameter ("r") does not denote the size of a block. It
is a factor in the size of blocks, varying from function to function.
Exactly what concept it denotes, and how one would choose it, isn't
clear to me.

The definition of the "N" parameter (CPU/Memory cost parameter) isn't
especially clear. It appears that increasing N increases the cost both
of CPU and memory. But the "p" (parallelization) parameter acts as a
multiplier on N, also increasing the cost. It is far from clear how one
would choose appropriate values for N and p. For a given value of N*p,
is it better for N to be large, or p to be large?

I suggest that more thought be given to what these things mean in the
context of this application, and then choose identifier names and
descriptions accordingly. It may be better to refactor these some other way.

The ASN.1 in section 6 assigns names to several of these identifiers. It
would be helpful to readers if the names used in defining the algorithms
were also the names used here.

(Issue-6): Dubious stability of references

I looked for prior discussion of this draft, and found some on the saag
mailing list regarding the references.

The definition of the Salsa20 hash function in
http://cr.yp.to/snuffle/spec.pdf seems clear enough, but is the document
reference stable? It might be safer to replicate the definition in this
document (in an appendix) with attribution. It doesn't appear that there
is any copyright in the referenced document. I'll also note that call to
this hash function in scryptBlockMix is to "Salsa", not Salso20. It
ought to be consistent with the definition.

------------------------
Nits/editorial comments:

(Issue-7): IdNits reported errors

IdNits reports:

-- Obsolete informational reference (is this intentional?): RFC 5208
      (Obsoleted by RFC 5958)

This is triggered by the following in section 6:

"To be usable in PKCS#8 [RFC5208] and Asymmetric Key Packages [RFC5958]
the following extension of the PBES2-KDFs type is needed."

Since RFC5958 is referenced, and it obsoletes RFC5208, what is the
reason for referencing both?

(There is also an error about 2119, but it is bogus, triggered by the
use of OPTIONAL in the ASN.1.)

This completes my review.

	Thanks,
	Paul Kyzivat