IPFIX Working Group                                            E. Boschi
Internet-Draft                                               B. Trammell
Intended status: Experimental                             Hitachi Europe
Expires: January 8, 2009                                    July 7, 2008

                     IP Flow Anonymisation Support

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on January 8, 2009.

Abstract


   This document describes anonymisation techniques for IP flow data.
   It provides a categorisation of common anonymisation schemes and
   defines the parameters needed to describe them.  It describes support
   for anonymisation within the IPFIX protocol, providing the basis for
   the definition of information models for configuring anonymisation
   techniques within an IPFIX Metering or Exporting Process, and for
   reporting the technique in use to an IPFIX Collecting Process.

Boschi & Trammell        Expires January 8, 2009                [Page 1]

Internet-Draft        IP Flow Anonymisation Support            July 2008

Table of Contents

   1.  Introduction
     1.1.  IPFIX Protocol Overview
     1.2.  IPFIX Documents Overview
   2.  Terminology
   3.  Categorisation of Anonymisation Techniques
   4.  Anonymisation of IP Flow Data
     4.1.  IP Address Anonymisation
     4.2.  Timestamp Anonymisation
     4.3.  Anonymisation of Other Flow Fields
   5.  Parameters for the Description of Anonymisation Techniques
   6.  Anonymisation Support in IPFIX
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   The standardisation of an IP flow information export protocol
   [RFC5101] and associated representations removes a technical barrier
   to the sharing of IP flow data across organizational boundaries and
   with network operations, security, and research communities for a
   wide variety of purposes.  However, with wider dissemination come
   greater risks to the privacy of the users of networks under
   measurement, and to the security of those networks.  While it is not
   a complete solution to the issues posed by distribution of IP flow
   information, anonymisation is an important tool for the protection of
   privacy within network measurement infrastructures.  Additionally,
   various jurisdictions define data protection laws and regulations
   that flow measurement activities must comply with, and anonymisation
   may be a part of such compliance [IMC07, FloCon08].

   This document presents a mechanism for representing anonymised data
   within IPFIX and guidelines for using it.  It begins with a
   categorization of anonymisation techniques.  It then describes
   applicability of each technique to commonly anonymisable fields of IP
   flow data, organized by information element data type and semantics
   as in [RFC5102]; enumerates the parameters required by each of the
   applicable anonymisation techniques; and provides guidelines for the
   use of each of these techniques in accordance with best practices in
   data protection.  Finally, it specifies a mechanism for exporting
   anonymised data and binding anonymisation metadata to templates using
   IPFIX Options.

1.1.  IPFIX Protocol Overview

   In the IPFIX protocol, { type, length, value } tuples are expressed
   in Templates containing { type, length } pairs, specifying which {
   value } fields are present in Data Records conforming to the
   Template, giving great flexibility as to what data is transmitted.

   Since Templates are sent very infrequently compared with Data
   Records, this results in significant bandwidth savings.

   Different Data Records may be transmitted simply by sending new
   Templates specifying the { type, length } pairs for the new data
   format.  See [RFC5101] for more information.
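
   The split between Templates and Data Records can be illustrated with
   a minimal sketch.  It assumes the Set and Template Record layouts of
   [RFC5101] and three Information Elements from [RFC5102]; the function
   names and the choice of fields are illustrative only, and the IPFIX
   Message Header is omitted.

```python
import ipaddress
import struct

TEMPLATE_SET_ID = 2   # Set ID that RFC 5101 reserves for Template Sets
TEMPLATE_ID = 256     # illustrative; Template IDs start at 256

# (Information Element ID, length in octets) pairs from RFC 5102
FIELDS = [
    (8, 4),    # sourceIPv4Address
    (12, 4),   # destinationIPv4Address
    (1, 8),    # octetDeltaCount
]


def encode_template_set(template_id, fields):
    """Encode one Template Record of { type, length } pairs in a Set."""
    record = struct.pack("!HH", template_id, len(fields))
    for ie_id, length in fields:
        record += struct.pack("!HH", ie_id, length)
    # Set header: Set ID and total Set length, header included
    return struct.pack("!HH", TEMPLATE_SET_ID, 4 + len(record)) + record


def encode_data_set(template_id, flows):
    """Encode Data Records carrying only the { value } fields."""
    body = b""
    for src, dst, octets in flows:
        body += ipaddress.IPv4Address(src).packed
        body += ipaddress.IPv4Address(dst).packed
        body += struct.pack("!Q", octets)
    # A Data Set is identified by the Template ID it conforms to
    return struct.pack("!HH", template_id, 4 + len(body)) + body
```

   In this sketch the field specifiers are sent once in the Template
   Set, while each Data Record carries only 16 octets of values.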

   The IPFIX information model [RFC5102] defines a large number of
   standard Information Elements which provide the necessary { type }
   information for Templates.

   The use of standard elements enables interoperability among different
   vendors' implementations.  Additionally, non-standard enterprise-
   specific elements may be defined for private use.

1.2.  IPFIX Documents Overview

   "Specification of the IPFIX Protocol for the Exchange of IP Traffic
   Flow Information" [RFC5101] (informally, the IPFIX Protocol document)
   and its associated documents define the IPFIX Protocol, which
   provides network engineers and administrators with access to IP
   traffic flow information.

   "Architecture for IP Flow Information Export" [I-D.ietf-ipfix-arch]
   (the IPFIX Architecture document) defines the architecture for the
   export of measured IP flow information out of an IPFIX Exporting
   Process to an IPFIX Collecting Process, and the basic terminology
   used to describe the elements of this architecture, per the
   requirements defined in "Requirements for IP Flow Information Export"
   [RFC3917].  The IPFIX Protocol document [RFC5101] then covers the
   details of the method for transporting IPFIX Data Records and
   Templates via a congestion-aware transport protocol from an IPFIX
   Exporting Process to an IPFIX Collecting Process.

   "Information Model for IP Flow Information Export" [RFC5102]
   (informally, the IPFIX Information Model document) describes the
   Information Elements used by IPFIX, including details on Information
   Element naming, numbering, and data type encoding.  Finally, "IPFIX
   Applicability" [I-D.ietf-ipfix-as] describes the various applications
   of the IPFIX protocol and their use of information exported via
   IPFIX, and relates the IPFIX architecture to other measurement
   architectures and frameworks.

   This document references the Protocol and Architecture documents for
   terminology and extends the IPFIX Information Model to provide new
   Information Elements for anonymisation metadata.

2.  Terminology

   The terminology used in this document is fully aligned with the
   terminology defined in [RFC5101].  Therefore, the terms defined in
   the IPFIX terminology are capitalized in this document, as in other
   IPFIX documents ([RFC5101], [RFC5102]).

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Categorisation of Anonymisation Techniques

   Anonymisation modifies a data set in order to protect the identity of
   the people or entities described by the data set from disclosure.
   With respect to network traffic data, anonymisation generally
   attempts to preserve some set of properties of the network traffic
   useful for a given application or applications, while ensuring the
   data cannot be traced back to the specific networks, hosts, or users
   generating the traffic.

   Anonymisation may be broadly split into three categories:
   generalisation, reversible substitution, and irreversible
   substitution.  When generalisation is used, identifying information
   is grouped into sets, and a single value is used to represent every
   element of each set.  Note that this may cause multiple records to
   become indistinguishable, thereby aggregating them into a single
   record.  Generalisation is an irreversible operation, in that the
   information needed to identify a single record from its "generalised
   value" is lost.
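
   As a minimal sketch of generalisation, the following groups transport
   port numbers into three sets and represents each set by its lowest
   port.  The choice of ports as the generalised field, the set
   boundaries (the IANA system, registered, and dynamic port ranges),
   and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def port_class(port):
    """Return the single value that stands for the set containing port."""
    if port < 1024:
        return 0        # system ports
    if port < 49152:
        return 1024     # registered ports
    return 49152        # dynamic ports


def generalise(records):
    """Generalise (port, packet count) records, merging those that
    become indistinguishable into a single record."""
    merged = defaultdict(int)
    for port, packets in records:
        merged[port_class(port)] += packets
    return dict(merged)


records = [(22, 10), (80, 5), (8080, 7), (51000, 1)]
print(generalise(records))   # {0: 15, 1024: 7, 49152: 1}
```

   The records for ports 22 and 80 collapse into one record, and nothing
   in the output allows them to be separated again.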

   Substitution (or pseudonymisation) substitutes a false identifier for
   a real one, and can be reversible or irreversible.  Reversible
   substitution uses an invertible or otherwise reversible function, so
   that the real identifier may be recovered later.  Irreversible
   substitution, by contrast, uses a one-way or randomising function, so
   that the real identifier cannot be recovered.
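
   The difference between the two kinds of substitution can be sketched
   as follows.  Both constructions are assumptions chosen for brevity,
   not techniques specified by this document: the reversible case uses a
   seeded random permutation of 16-bit values with its inverse table
   retained, and the irreversible case uses a truncated keyed hash whose
   key is assumed to be discarded after use.  The seeded generator is
   not cryptographically strong.

```python
import hashlib
import hmac
import random


class ReversibleSubstitution:
    """Permutation of 16-bit values; keeping the inverse table is what
    makes the substitution reversible."""

    def __init__(self, seed):
        rng = random.Random(seed)   # illustration only, not secure
        self.forward = list(range(65536))
        rng.shuffle(self.forward)
        self.inverse = [0] * 65536
        for real, fake in enumerate(self.forward):
            self.inverse[fake] = real

    def anonymise(self, value):
        return self.forward[value]

    def recover(self, value):
        return self.inverse[value]


def irreversible_substitute(value: bytes, key: bytes) -> str:
    """One-way keyed hash, truncated to 64 bits; once the key is
    discarded the real identifier cannot be recomputed or looked up."""
    return hmac.new(key, value, hashlib.sha256).hexdigest()[:16]


sub = ReversibleSubstitution(seed=2008)
assert sub.recover(sub.anonymise(443)) == 443
```

   In both cases equal inputs map to equal outputs, so relationships
   between records that share an identifier are preserved.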

   While anonymisation is generally applied at the resolution of single
   fields within a record, attacks against anonymisation use entire
   records and relationships between records within a data set.
   Therefore, fields which may not necessarily be identifying by
   themselves may be anonymised in order to increase the anonymity of
   the data set as a whole.

4.  Anonymisation of IP Flow Data

   Due to the restricted semantics of IP flow data, there is a
   relatively limited set of specific anonymisation techniques
   applicable to flow data, though each falls into one of the broad
   categories above.  Each type of field that may commonly appear in a
   flow record may have its own applicable specific techniques.

   Of all the fields in an IP flow record, the most attention in the
   literature has been paid to IP addresses [TODO: cite].  IP addresses
   are structured identifiers, that is, partial IP address prefixes may
   be used to identify networks just as full IP addresses identify
   hosts.  This leads to the application of prefix-preserving
   anonymisation of IP address information [TODO: cite].  Prefix-
   preserving anonymisation is a (generally irreversible) substitution
   technique which has the additional property that the structure of the
   IP address space is maintained in the anonymised data.
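
   The prefix-preserving property can be illustrated with a minimal
   sketch in which each output bit is the input bit, flipped or not
   according to a keyed function of the bits that precede it.  The use
   of HMAC-SHA256 as that keyed function, the encoding of its input, and
   the function names are assumptions made for illustration; this is not
   presented as the construction used by any particular tool.

```python
import hashlib
import hmac
import ipaddress


def prefix_preserving(address: str, key: bytes) -> str:
    """Anonymise an IPv4 address so that shared prefixes stay shared."""
    bits = int(ipaddress.IPv4Address(address))
    result = 0
    for i in range(32):
        prefix = bits >> (32 - i)            # the i most significant bits
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        original_bit = (bits >> (31 - i)) & 1
        result = (result << 1) | (original_bit ^ flip)
    return str(ipaddress.IPv4Address(result))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()
```

   Because the flip decision for bit i depends only on the preceding i
   bits, two addresses that share a k-bit prefix map to anonymised
   addresses that share a prefix of the same length.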

   While not identifiers in and of themselves, timestamps are vulnerable
   to fingerprinting attacks, wherein relationships between the start
   and end timestamps of flows within a data set can be used to identify
   hosts or networks [TODO: cite].  Therefore, a variety of
   anonymisation techniques are available, including loss of precision
   (a form of generalisation), or noise addition (substitution), which
   may or may not preserve the sequencing of flows in the data set.
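
   Both timestamp techniques can be sketched in a few lines.  The
   millisecond unit, the default granularity, the uniform noise
   distribution, and the function names are assumptions made for
   illustration.

```python
import random


def reduce_precision(ts_ms, granularity_ms=60_000):
    """Loss of precision: round a timestamp down to the granularity."""
    return ts_ms - (ts_ms % granularity_ms)


def add_noise(ts_ms, max_noise_ms, rng):
    """Noise addition: shift a timestamp by a bounded random offset."""
    return ts_ms + rng.randint(-max_noise_ms, max_noise_ms)


rng = random.Random()
starts = [1215432123456, 1215432123999, 1215432190000]
coarse = [reduce_precision(t) for t in starts]
noisy = [add_noise(t, 5_000, rng) for t in starts]
```

   In this sketch precision loss never reverses the order of two
   timestamps, although it may make them equal.  Independent noise of up
   to 5 seconds can reorder the first two flows, which start less than
   one second apart.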

   Counters and other flow values can also be used to break
   anonymisation in fingerprinting attacks, so the same techniques,
   precision loss and noise addition, are available for these fields as
   well.

   Of course, the simplest form of anonymisation and the most extreme
   form of generalisation is black-marker anonymisation, or full
   deletion of a field from each record of the flow data.  The black
   marker technique is available on any type of field in a flow record.
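
   A minimal sketch of black-marker anonymisation follows.  It assumes
   flow records held as dictionaries keyed by Information Element name,
   which is a representation chosen for illustration only.

```python
def black_marker(records, field):
    """Return copies of the records with the given field deleted."""
    return [
        {name: value for name, value in record.items() if name != field}
        for record in records
    ]


flows = [
    {"sourceIPv4Address": "192.0.2.17",
     "destinationTransportPort": 443,
     "octetDeltaCount": 1200},
]
print(black_marker(flows, "sourceIPv4Address"))
# [{'destinationTransportPort': 443, 'octetDeltaCount': 1200}]
```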

   [TODO: This section is incomplete; the set of techniques should be
   more exhaustive.]

4.1.  IP Address Anonymisation

   The following table gives an overview of the schemes for IP address
   anonymisation described in this document and their categorisation.

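        +-----------------------+----------------+---------------+
        | Scheme                | Action         | Reversibility |
        +-----------------------+----------------+---------------+
        | Truncation            | Generalisation | N             |
        | Scrambling            | Substitution   | Y             |
        | Prefix-preserving     | Substitution   | Y             |
        | Random noise addition | Substitution   | N             |
        +-----------------------+----------------+---------------+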
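
   Truncation, the first scheme in the table, can be sketched as zeroing
   the low-order bits of an address.  The function name and the
   keep_bits parameter are assumptions made for illustration.

```python
import ipaddress


def truncate(address: str, keep_bits: int) -> str:
    """Keep the keep_bits most significant bits of an IPv4 address and
    set the remaining bits to zero."""
    value = int(ipaddress.IPv4Address(address))
    mask = (0xFFFFFFFF << (32 - keep_bits)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value & mask))


print(truncate("192.0.2.17", 24))    # 192.0.2.0
print(truncate("192.0.2.200", 24))   # 192.0.2.0
```

   Both hosts map to the same value, which is why the table lists
   truncation as a generalisation that is not reversible.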

   [TODO: This section is incomplete; text here should expand on the
   table above.]

4.2.  Timestamp Anonymisation

   [TODO: as section 4.1]

   [EDITOR'S NOTE: Counters might go here, since they are subject to the
   same techniques for largely the same reasons.]

4.3.  Anonymisation of Other Flow Fields

   [TODO: as section 4.1]

   [EDITOR'S NOTE: Port Numbers go here.  Counters might, if not above.
   It might make sense to split this into flow key anonymisation versus
   flow value anonymisation.]

5.  Parameters for the Description of Anonymisation Techniques

   [TODO: see corresponding section of draft-ietf-psamp-sample-tech for
   the proposed structure of this section.]

6.  Anonymisation Support in IPFIX

   [TODO: Here we'll describe how the information specified above can be
   transmitted on the wire using an option template.  The idea is to
   scope the option to the Template ID and for each field specify which
   are anonymised, providing info on the output characteristics of the
   technique, and which ones aren't.]

   [EDITOR'S NOTE: Multiple anonymisation techniques applied to an IE at
   the same time are indicated with multiple elements of the same type
   (in application order, as in PSAMP).]

   [EDITOR'S NOTE: for blackmarking we'll recommend not to export the
   information at all following the data protection law principle that
   only necessary information should be exported.]

7.  Security Considerations

   [TODO: write this section.]

8.  IANA Considerations

   This document contains no actions for IANA.

9.  References

9.1.  Normative References

   [RFC5101]  Claise, B., "Specification of the IP Flow Information
              Export (IPFIX) Protocol for the Exchange of IP Traffic
              Flow Information", RFC 5101, January 2008.

   [RFC5102]  Quittek, J., Bryant, S., Claise, B., Aitken, P., and J.
              Meyer, "Information Model for IP Flow Information Export",
              RFC 5102, January 2008.

9.2.  Informative References

   [I-D.ietf-ipfix-arch]
              Sadasivan, G. and N. Brownlee, "Architecture Model for IP
              Flow Information Export", draft-ietf-ipfix-arch-02 (work
              in progress), October 2003.

   [I-D.ietf-ipfix-as]
              Zseby, T., "IPFIX Applicability", draft-ietf-ipfix-as-12
              (work in progress), July 2007.

   [I-D.ietf-ipfix-architecture]
              Sadasivan, G., "Architecture for IP Flow Information
              Export", draft-ietf-ipfix-architecture-12 (work in
              progress), September 2006.

   [I-D.ietf-ipfix-reducing-redundancy]
              Boschi, E., "Reducing Redundancy in IP Flow Information
              Export (IPFIX) and Packet Sampling (PSAMP) Reports",
              draft-ietf-ipfix-reducing-redundancy-04 (work in
              progress), May 2007.

   [RFC3917]  Quittek, J., Zseby, T., Claise, B., and S. Zander,
              "Requirements for IP Flow Information Export (IPFIX)",
              RFC 3917, October 2004.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Elisa Boschi
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich

   Phone: +41 44 632 70 57
   Email: elisa.boschi@hitachi-eu.com

   Brian Trammell
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich

   Phone: +41 44 632 70 13
   Email: trammell@tik.ee.ethz.ch

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

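   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.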

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
