Last Call Review of draft-ietf-marf-redaction-
review-ietf-marf-redaction-genart-lc-black-2012-01-11-00

Request Review of draft-ietf-marf-redaction
Requested rev. no specific revision (document currently at 08)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2012-01-17
Requested 2012-01-06
Draft last updated 2012-01-11
Completed reviews Genart Last Call review of -?? by David Black
Secdir Last Call review of -?? by Julien Laganier
Assignment Reviewer David Black
State Completed
Review review-ietf-marf-redaction-genart-lc-black-2012-01-11
Review completed: 2012-01-11

Review

I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please
see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may receive.

Document: draft-ietf-marf-redaction-04
Reviewer: David L. Black
Review Date: January 10, 2012
IETF LC End Date: January 18, 2012
IESG Telechat Date: January 19, 2012

Summary: This draft is on the right track but has open issues, described in the review.

This draft specifies a method for redacting information from email abuse reports
(e.g., hiding the local part [user] of an email address), while still allowing
correlation of the redacted information across related abuse reports from the same
source. The draft is short, clear, and well written.

There are two open issues:

[1] The first open issue is the absence of security guidance to ensure that this
redaction technique effectively hides the redacted information.  The redaction
technique is to concatenate a secret string (called the "redaction key") to the
information to be redacted, apply "any hashing/digest algorithm", convert the output
to base64 and use that base64 string to replace the redacted information.
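
For concreteness, the technique described above can be sketched in Python as
follows. This is an illustration only, not text from the draft: the function name
`redact`, the key-then-data concatenation order, the UTF-8 encoding, and the
SHA-256 default are all assumptions made for the example.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str, algorithm: str = "sha256") -> str:
    """Replace `value` with base64(hash(redaction_key + value)).

    Assumptions (not taken from the draft): key-first concatenation,
    UTF-8 encoding, and SHA-256 as the default digest.
    """
    data = (redaction_key + value).encode("utf-8")
    digest = hashlib.new(algorithm, data).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # The same input and key always give the same token, which is what
    # allows correlation across reports from the same source.
    print(redact("user", "potatoes"))
```

Because the output is deterministic for a given key, its secrecy rests entirely on
the strength of the key and of the hash, which is the subject of the concerns below.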

There are two important ways in which this technique could fail to effectively hide
the redacted information:
	- The secret string may inject insufficient entropy.
	- The hashing/digest algorithm may be weak.

To take an extreme example, if the secret string ("redaction key") consists of a
single ASCII character, and a short email local part is being redacted, then the
output is highly vulnerable to dictionary and brute force attacks because only 6 bits
of entropy are added (the result may look secure, but it's not).  Beyond this extreme
example, this is a potentially real concern - e.g., applying the rule of thumb that
ASCII text contains 4-5 bits of entropy per character, the example in Appendix A
uses a "redaction key" of "potatoes" that injects at most 40 bits of entropy -
is that sufficient for email redaction purposes?
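
To make the extreme example concrete, here is a hedged Python sketch of a
dictionary attack against a single-character redaction key. The helper names
(`redact`, `brute_force`), the candidate list, the key-first concatenation, and the
use of SHA-256 are assumptions for illustration; the point is only that the search
space is (number of possible keys) x (number of candidate local parts), which is
tiny when the key is one printable ASCII character.

```python
import base64
import hashlib
from typing import Iterable, Optional, Tuple


def redact(value: str, redaction_key: str) -> str:
    # Hypothetical redaction: base64(SHA-256(key + value)).
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def brute_force(target: str, candidates: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Try every one-character printable ASCII key against each candidate."""
    words = list(candidates)
    for code in range(0x20, 0x7F):  # printable ASCII, space through '~'
        key = chr(code)
        for local_part in words:
            if redact(local_part, key) == target:
                return key, local_part
    return None


if __name__ == "__main__":
    token = redact("bob", "k")  # victim used a one-character key
    print(brute_force(token, ["alice", "bob", "carol"]))
```

The same loop fails against a longer key only because the key space is no longer
enumerable, not because the hash output looks random.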

To take a silly example, if a CRC is used as the hash with that sort of short input,
the result is not particularly difficult to invert.

I suggest a couple of changes:
1) Change "any hashing/digest algorithm" to require use of a secure hash, and
	explain what is meant by "secure hash" in the security considerations section.
2) Require a minimum length of the "redaction key" string, and strongly suggest
	(SHOULD) that it be randomly generated (e.g., by running sufficient output
	of an entropy-rich random number generator through a base64 converter).

For the latter change, figure out the amount of entropy that should be used
for redaction - the recommended string length will be larger because printable
ASCII is not entropy-dense (at best it's good for 6 bits of entropy in each
8-bit character, and human-written text such as this message has significantly
less).
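
As a sketch of suggestion 2), the following Python shows one way to generate a
random "redaction key" and to estimate how many characters are needed for a given
entropy target. The 128-bit default is purely an assumption for the example (the
right figure is for the authors to determine), and the function names are
hypothetical.

```python
import base64
import math
import secrets


def generate_redaction_key(entropy_bits: int = 128) -> str:
    """Return a base64 string carrying `entropy_bits` of random data.

    The 128-bit default is an illustrative assumption, not a recommendation
    taken from the draft or from this review.
    """
    n_bytes = math.ceil(entropy_bits / 8)
    return base64.b64encode(secrets.token_bytes(n_bytes)).decode("ascii")


def min_key_chars(entropy_bits: int, bits_per_char: float) -> int:
    """Characters needed to reach `entropy_bits` at a given density."""
    return math.ceil(entropy_bits / bits_per_char)


if __name__ == "__main__":
    print(generate_redaction_key())
    print(min_key_chars(128, 6))  # base64-like text, about 6 bits per character
    print(min_key_chars(128, 4))  # lower bound of the 4-5 bit rule of thumb
```

Under these assumptions, a 128-bit target needs at least 22 characters at 6 bits
per character and 32 characters at 4 bits per character; an 8-character key at
5 bits per character tops out at the 40 bits noted above for "potatoes".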

From a pure security perspective, use of HMAC with specified secure hashes
(SHA2-family) and an approach of hashing the "redaction key" down to a binary
key for HMAC would be a stronger approach. I suggest that the authors consider this
approach, but there may be practical usage concerns that argue against adopting it.
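
A minimal Python sketch of the HMAC alternative follows, assuming SHA-256 both
for hashing the "redaction key" down to a binary key and as the HMAC hash. The
function names, the UTF-8 encoding, and the choice of SHA-256 from the SHA2
family are assumptions for illustration, not a proposed specification.

```python
import base64
import hashlib
import hmac


def derive_hmac_key(redaction_key: str) -> bytes:
    # Hash the textual redaction key down to a 32-byte binary key.
    return hashlib.sha256(redaction_key.encode("utf-8")).digest()


def redact_hmac(value: str, redaction_key: str) -> str:
    """Return base64(HMAC-SHA-256(derived_key, value))."""
    mac = hmac.new(derive_hmac_key(redaction_key),
                   value.encode("utf-8"),
                   hashlib.sha256)
    return base64.b64encode(mac.digest()).decode("ascii")


if __name__ == "__main__":
    # Still deterministic per key, so reports remain correlatable.
    print(redact_hmac("user", "potatoes"))
```

Note that hashing the key down does not add entropy: a weak "redaction key" remains
weak under HMAC, so the minimum-length and randomness suggestions still apply.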

[2] The second open issue is the absence of security considerations for the redaction
key.  The security considerations section needs to caution that the redaction key
is a secret key that must be managed and protected as a secret key.  Disclosure
of a redaction key removes the redaction from all reports that used that key.
As part of this, guidance should be provided on when and how to change the
redaction key in order to limit the effects of loss of secrecy for a single
redaction key.

Editorial Nit: I believe that "anonymization" is a better description of what
this draft is doing (as opposed to "redaction"), particularly as the result is
intended to be correlatable via string match across reports from the same source.

idnits 2.12.13 didn't find any nits.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david.black at emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------