IP Flow Anonymization Support
RFC 6235

Note: This ballot was opened for revision 06 and is now closed.

(Dan Romascanu) Yes

(Ron Bonica) No Objection

(Stewart Bryant) (was Discuss) No Objection

Comment (2010-12-13)
"Binning can be seen as a special case of precision degradation; the operation is identical, except for in precision degradation the counter ranges are uniform, and in binning they need not be. For example, a common counter binning scheme for packet counters could be to bin values 1-2 together, and 3-infinity together, thereby separating potentially completely-opened TCP connections from unopened ones."

I am not a TCP specialist, so I do not understand the leap from the simple text "to bin values 1-2 together, and 3-infinity together" to the TCP connection example. However, I suspect that the grouping and the protocol state are in the wrong order.
Either way, a few more words of a different example would add clarity.
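
To make the distinction in the quoted text concrete, here is a minimal Python sketch of the two operations. It is an illustration only, not taken from the draft: the function names, the bin-edge representation, and the width of 10 are assumptions chosen for the example. Only the 1-2 / 3-infinity bins come from the quoted passage.

```python
from bisect import bisect_right

def degrade_precision(count, width):
    """Precision degradation: map a counter to a uniform range of fixed width."""
    low = (count // width) * width
    return (low, low + width - 1)

def bin_counter(count, edges):
    """Binning: map a counter to a range given by sorted lower bin edges.

    The ranges need not be uniform. The last bin is open-ended, which is
    represented here by an upper bound of None (a convention of this sketch).
    """
    i = bisect_right(edges, count) - 1
    if i < 0:
        raise ValueError("count is below the lowest bin edge")
    low = edges[i]
    high = edges[i + 1] - 1 if i + 1 < len(edges) else None
    return (low, high)

# Bin edges for the scheme in the quoted text: 1-2 together, 3-infinity together.
PACKET_BIN_EDGES = [1, 3]

if __name__ == "__main__":
    for packets in (1, 2, 3, 500):
        print(packets, bin_counter(packets, PACKET_BIN_EDGES))
    # Uniform ranges of width 10 (hypothetical width), for comparison.
    print(17, degrade_precision(17, 10))
```

In this sketch, packet counts of 1 and 2 fall into the same bin, while 3 and any larger count fall into the open-ended bin, so the exported value only reveals which side of the 2/3 boundary a flow is on.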



(Gonzalo Camarillo) No Objection

(Adrian Farrel) No Objection

Comment (2011-01-03 for -)
Thanks for bringing this work forward as Experimental. While it is not a requirement, I think it is helpful to the community if Experimental documents give some indication of the scope of the experiment. Where will these protocol extensions be used? How will they be kept separate from the wider IPFIX deployment? How will the success or failure of the experiment be judged? What is the plan, assuming success? (The latter is usually: return to update this document based on experience, and move it to the standards track.) Note that some of the answers were presented in the proto writeup.

I see questions from IANA in the data tracker that appear to have been answered by email during IETF last call. It would be good to get these answers transferred into the document or added as an RFC Editor Note. Additionally, the registry referenced is split into two ranges, but there is no advice to IANA about which range should be used.

(Russ Housley) No Objection

Alexey Melnikov No Objection

Comment (2010-12-26 for -)
5.1.  Stability

   If no information about stability is available, users of anonymised
   data MAY assume that the techniques used are stable across the entire
   dataset, but unstable across datasets.

This doesn't look like an implementation choice, so I think MAY is wrong here.

6.1.  Anonymisation Records and the Anonymisation Options Template

   First, reliability is important: an
   Exporting Process SHOULD export Anonymisation Records after the
   Templates they describe have been exported, and SHOULD export
   anonymisation records reliably.

What is the exact meaning of the last SHOULD?


TLS and DTLS need Informative References.

(Peter Saint-Andre) No Objection

(Robert Sparks) No Objection

(Sean Turner) (was Discuss) No Objection

Comment (2011-01-06)
#1) General

The authors should decide whether they're going to use American or British spellings.  The draft uses the American spellings for categorize, organize, minimize, behavior, and so on, but unaccountably uses the British "-ise" in anonymise and pseudonymise.  In the research literature, and in the other relevant RFCs, "anonymize" seems to be more popular, but either spelling type is fine so long as it's consistent.

#2) Section 1

It might be wise to repeat here (or even in the abstract) the note from the Security Considerations section that this draft is only meant to explain how to interchange anonymized data, not to provide any recommendations as to which anonymization techniques to use, or even any guarantee that any particular technique achieves any particular purpose.  Otherwise, it is easy to misread some parts of section 4 as promising that particular techniques will prevent particular attacks, which is not in fact the case for reasonable threat models. 

#3) Section 4.2

Brute-forcing a 48-bit MAC address is harder than brute-forcing a 32-bit IPv4 address, but not out of reach even for a hobbyist.
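
As a rough back-of-the-envelope check on the sizes involved, the following Python sketch compares the two search spaces. The guess rate is a purely hypothetical assumption for the arithmetic, not a measured figure and not something stated in the draft or in this comment. The helper names are likewise invented for the example.

```python
def search_space(bits):
    """Number of distinct values an identifier of the given bit length can take."""
    return 2 ** bits

def exhaustive_seconds(bits, guesses_per_second):
    """Time to try every value once at a given guess rate."""
    return search_space(bits) / guesses_per_second

# Hypothetical rate: one billion guesses per second (an assumption).
ASSUMED_RATE = 1e9

if __name__ == "__main__":
    for label, bits in (("IPv4 address", 32), ("MAC address", 48)):
        hours = exhaustive_seconds(bits, ASSUMED_RATE) / 3600
        print(f"{label}: 2^{bits} values, about {hours:.2f} hours at the assumed rate")
    ratio = search_space(48) // search_space(32)
    print(f"The 48-bit space is {ratio} times larger than the 32-bit space")
```

The 48-bit space is 2^16 times larger than the 32-bit one, so any estimate scales by that factor. Whether the result counts as feasible depends entirely on the rate assumed.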

#4) Section 4.3

There is existing research on the extent to which the beginning and ending times of related flows can be used to link an anonymized view of a flow to a non-anonymized view of the flow.  Can we add a pointer to Murdoch and Zieliński's "Sampled Traffic Analysis by Internet-Exchange-Level Adversaries" [ http://petworkshop.org/2007/papers/PET2007_preproc_Sampled_traffic.pdf ]?

#5) Section 4.3 & 4.4

There is a pretty extensive literature about the extent to which perturbing timing and volume information prevents correlation, linkability, and website fingerprinting. Check out the traffic analysis section of freehaven.net/anonbib, and also check out the literature on "stepping stone detection".

The results are unintuitive to many people; in general, to resist correlation and linkability attacks, you need to use perturbations of higher variance or bins of larger size than many implementors would expect.

Seems like there should be a reference added to these.