Last Call Review of draft-ietf-marf-redaction-
review-ietf-marf-redaction-genart-lc-black-2012-01-20-00

Request Review of draft-ietf-marf-redaction
Requested rev. no specific revision (document currently at 08)
Type Last Call Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2012-01-17
Requested 2012-01-12
Draft last updated 2012-01-20
Completed reviews Genart Last Call review of -?? by David Black
Secdir Last Call review of -?? by Julien Laganier
Assignment Reviewer David Black
State Completed
Review review-ietf-marf-redaction-genart-lc-black-2012-01-20
Review completed: 2012-01-20

Review
review-ietf-marf-redaction-genart-lc-black-2012-01-20

Based on discussion with the authors, the -05 version of this draft resolves the
issues raised in the Gen-ART review of the -04 version.  An important element of
the approach taken to issue [1] has been to explain why the security requirements
for redaction are significantly weaker than the strength of the secure hashes
that are suggested by the draft.

Thanks,
--David

> -----Original Message-----
> From: Black, David
> Sent: Tuesday, January 10, 2012 9:44 PM
> To: ietf at cybernothing.org; Murray S. Kucherawy; gen-art at ietf.org; ietf at ietf.org
> Cc: Black, David; marf at ietf.org; presnick at qualcomm.com
> Subject: Gen-ART review of draft-ietf-marf-redaction-04
> 
> I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please
> see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
> 
> Please resolve these comments along with any other Last Call comments you may receive.
> 
> Document: draft-ietf-marf-redaction-04
> Reviewer: David L. Black
> Review Date: January 10, 2012
> IETF LC End Date: January 18, 2012
> IESG Telechat Date: January 19, 2012
> 
> Summary: This draft is on the right track but has open issues, described in the review.
> 
> This draft specifies a method for redacting information from email abuse reports
> (e.g., hiding the local part [user] of an email address), while still allowing
> correlation of the redacted information across related abuse reports from the same
> source. The draft is short, clear, and well written.
> 
> There are two open issues:
> 
> [1] The first open issue is the absence of security guidance to ensure that this
> redaction technique effectively hides the redacted information.  The redaction
> technique is to concatenate a secret string (called the "redaction key") to the
> information to be redacted, apply "any hashing/digest algorithm", convert the output
> to base64 and use that base64 string to replace the redacted information.
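
As a rough illustration of the steps just described, the technique might be sketched in Python as follows. The function name `redact`, the choice of SHA-256, the UTF-8 encoding, and the ordering (key first, then the value) are assumptions made for this sketch; the description above only says that the key is concatenated to the information and that "any hashing/digest algorithm" may be applied.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str) -> str:
    """Return base64(hash(key || value)) as a replacement for `value`.

    Assumptions (not taken from the draft): SHA-256 as the digest,
    UTF-8 encoding, and the key placed before the value.
    """
    data = (redaction_key + value).encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return base64.b64encode(digest).decode("ascii")


# The same value and key always produce the same token, which is what
# allows correlation across related reports from the same source.
token_a = redact("user", "potatoes")
token_b = redact("user", "potatoes")
token_c = redact("user", "a-different-key")
```

Here `token_a` and `token_b` match, while `token_c` differs because a different key was used.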
> 
> There are two important ways in which this technique could fail to effectively hide
> the redacted information:
> 	- The secret string may inject insufficient entropy.
> 	- The hashing/digest algorithm may be weak.
> 
> To take an extreme example, if the secret string ("redaction key") consists of a
> single ASCII character, and a short email local part is being redacted, then the
> output is highly vulnerable to dictionary and brute force attacks because only 6 bits
> of entropy are added (the result may look secure, but it's not).  Beyond this extreme
> example, this is a potentially real concern - e.g., applying the rule of thumb that
> ASCII text contains 4-5 bits of entropy per character, the example in Appendix A
> uses a "redaction key" of "potatoes" that injects at most 40 bits of entropy -
> is that sufficient for email redaction purposes?
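
The extreme example above can be made concrete with a small sketch. Everything here is hypothetical: the `redact` helper (SHA-256 over key followed by value), the guess list, and the use of the 95 printable ASCII characters as the key space are assumptions chosen only to show how little work a one-character key leaves for an attacker.

```python
import base64
import hashlib
import math


def redact(value: str, redaction_key: str) -> str:
    # Assumed construction: base64(SHA-256(key || value)).
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# All 95 printable ASCII characters (space through tilde).
PRINTABLE = [chr(code) for code in range(32, 127)]


def brute_force(token, candidates):
    """Try every one-character key against every candidate local part."""
    for key in PRINTABLE:
        for local_part in candidates:
            if redact(local_part, key) == token:
                return key, local_part
    return None


# A token observed in a report, produced with a one-character key.
observed = redact("alice", "k")
guesses = ["bob", "alice", "postmaster", "admin"]
recovered = brute_force(observed, guesses)  # at most 95 * 4 hash calls

# Upper bounds on added entropy under the rules of thumb quoted above.
single_char_bits = math.log2(len(PRINTABLE))  # between 6 and 7 bits
potatoes_bits = len("potatoes") * 5           # 8 characters * 5 bits = 40
```

The search recovers both the key and the local part after a few hundred hash computations, even though the token itself looks like an opaque 44-character string.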
> 
> To take a silly example, if a CRC is used as the hash with that sort of short input,
> the result is not particularly difficult to invert.
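
One structural reason a CRC is a poor fit here can be shown with Python's `zlib.crc32`. This sketch does not perform an inversion; it only demonstrates that CRC-32 is an affine function of its input, so checksums of related equal-length messages are related in a predictable way, which a secure hash is designed to avoid. The sample strings are made up for illustration.

```python
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR together byte strings that all have the same length."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


# Three equal-length inputs of the form key || value (hypothetical data).
a = b"potatoes-alice"
b = b"potatoes-carol"
c = b"potatoes-dave!"

combined = xor_bytes(a, b, c)

# For equal-length messages, the CRC of the XOR equals the XOR of the CRCs.
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
affine_holds = zlib.crc32(combined) == predicted
```

In addition, the output is only 32 bits wide, so even without exploiting this structure an exhaustive search over short inputs is cheap.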
> 
> I suggest a couple of changes:
> 1) Change "any hashing/digest algorithm" to require use of a secure hash, and
> 	explain what is meant by "secure hash" in the security considerations section.
> 2) Require a minimum length of the "redaction key" string, and strongly suggest
> 	(SHOULD) that it be randomly generated (e.g., by running sufficient output
> 	of an entropy-rich random number generator through a base64 converter).
> 
> For the latter change, figure out the amount of entropy that should be used
> for redaction - the recommended string length will be larger because printable
> ASCII is not entropy-dense (at best it's good for 6 bits of entropy in each
> 8-bit character, and human-written text such as this message has significantly
> less).
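
The suggestion to run random output through a base64 converter, and the arithmetic relating target entropy to string length, might look like the sketch below. The 128-bit default is purely an assumption for illustration; the review leaves the required amount of entropy for the authors to determine. The names `generate_redaction_key` and `min_key_chars` are hypothetical.

```python
import base64
import math
import secrets


def generate_redaction_key(entropy_bits: int = 128) -> str:
    """Draw `entropy_bits` of randomness and encode it as base64 text.

    The 128-bit default is an assumption, not a value from the draft.
    """
    n_bytes = math.ceil(entropy_bits / 8)
    return base64.b64encode(secrets.token_bytes(n_bytes)).decode("ascii")


def min_key_chars(entropy_bits: int, bits_per_char: float) -> int:
    """Characters needed to carry `entropy_bits` at a given density."""
    return math.ceil(entropy_bits / bits_per_char)


key = generate_redaction_key()

# Random base64 text carries 6 bits per character; the rule of thumb
# quoted earlier puts ordinary ASCII text at 4-5 bits per character.
chars_at_6_bits = min_key_chars(128, 6)  # 22 characters
chars_at_4_bits = min_key_chars(128, 4)  # 32 characters
```

With these assumed numbers, a randomly generated base64 key needs at least 22 characters (24 with padding for 16 random bytes), while a key written as ordinary text would need 32 or more to reach the same bound.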
> 
> From a pure security perspective, use of HMAC with specified secure hashes
> (SHA2-family) and an approach of hashing the "redaction key" down to a binary
> key for HMAC would be a stronger approach. I suggest that the authors consider
> this approach, but there may be practical usage concerns that suggest not adopting it.
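
A minimal sketch of the HMAC alternative, assuming SHA-256 as the SHA2-family hash both for deriving the binary key and for the HMAC itself. The function name `redact_hmac` and the UTF-8 encoding are assumptions; the review does not specify these details.

```python
import base64
import hashlib
import hmac


def redact_hmac(value: str, redaction_key: str) -> str:
    """Return base64(HMAC-SHA-256(SHA-256(key), value)).

    Assumption: SHA-256 is used to hash the textual redaction key
    down to a 32-byte binary key for HMAC.
    """
    binary_key = hashlib.sha256(redaction_key.encode("utf-8")).digest()
    mac = hmac.new(binary_key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")


# Tokens remain stable for a given key, so correlation by string match
# across reports still works.
hmac_token_a = redact_hmac("user", "potatoes")
hmac_token_b = redact_hmac("user", "potatoes")
hmac_token_c = redact_hmac("user", "a-different-key")
```

The output has the same shape as a plain hash-and-base64 token, so substituting it should not change how reports are matched against each other.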
> 
> [2] The second open issue is the absence of security considerations for the redaction
> key.  The security considerations section needs to caution that the redaction key
> is a secret key that must be managed and protected as a secret key.  Disclosure
> of a redaction key removes the redaction from all reports that used that key.
> As part of this, guidance should be provided on when and how to change the
> redaction key in order to limit the effects of loss of secrecy for a single
> redaction key.
> 
> Editorial Nit: I believe that "anonymization" is a better description of what
> this draft is doing (as opposed to "redaction"), particularly as the result is
> intended to be correlatable via string match across reports from the same source.
> 
> idnits 2.12.13 didn't find any nits.
> 
> Thanks,
> --David
> ----------------------------------------------------
> David L. Black, Distinguished Engineer
> EMC Corporation, 176 South St., Hopkinton, MA  01748
> +1 (508) 293-7953             FAX: +1 (508) 293-7786
> david.black at emc.com        Mobile: +1 (978) 394-7754
> ----------------------------------------------------