Sampling and Filtering Techniques for IP Packet Selection
RFC 5475

Review by Christian Vogt:

Draft-ietf-psamp-sample-tech-10 identifies and defines packet selection
techniques that are needed for input reduction in network measurements.

The document is very thorough and certainly very valuable to the
PSAMP community.  For non-experts, however, it would be beneficial if
the document could be clarified according to the following comments:

(1)  Section 1, 2nd paragraph, claims that there are 3 types of packet
selection techniques:  sampling, filtering, and aggregation.  I would
argue that only sampling and filtering are *selection* techniques.
Aggregation is not a selection technique, although it does reduce the
input for network measurement like sampling and filtering do.

I would therefore remove aggregation from the list of packet selection
techniques.  This would not only be more precise, it would also
accommodate the fact that this document is about packet selection, but
does not get into aggregation.
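
A minimal sketch of this distinction follows. It is not taken from the
draft; the packet records and the function names (systematic_sample,
filter_by_src, aggregate_bytes_by_src) are hypothetical. Sampling and
filtering each return a subset of the original packets, whereas
aggregation returns a summary that no longer contains packets:

```python
from collections import Counter

# Hypothetical packet records; the field names are assumptions for illustration.
packets = [
    {"src": "10.0.0.1", "length": 100},
    {"src": "10.0.0.2", "length": 200},
    {"src": "10.0.0.1", "length": 300},
    {"src": "10.0.0.3", "length": 400},
    {"src": "10.0.0.1", "length": 500},
    {"src": "10.0.0.2", "length": 600},
]


def systematic_sample(pkts, n):
    """Sampling: keep every n-th packet, regardless of packet content."""
    return pkts[::n]


def filter_by_src(pkts, src):
    """Filtering: keep packets whose content matches a condition."""
    return [p for p in pkts if p["src"] == src]


def aggregate_bytes_by_src(pkts):
    """Aggregation: reduce packets to per-source byte totals (no packets remain)."""
    totals = Counter()
    for p in pkts:
        totals[p["src"]] += p["length"]
    return dict(totals)


sampled = systematic_sample(packets, 2)
filtered = filter_by_src(packets, "10.0.0.2")
totals = aggregate_bytes_by_src(packets)
```

Both `sampled` and `filtered` are lists of the original packet records;
`totals` is a mapping from source address to byte count.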

(2)  I was missing a statement of objectives at the beginning of the
document:  What is it that the document is trying to achieve?  Is it to
provide guidance to network administrators when deploying network
measurement equipment?  Is it requirements for network measurement
implementations?  Is it an informational document proposing a set of
definitions for the standardization community to use?

(3)  The document explains that hashing well-defined packet bits is a
technique that can be used for both filtering and emulated sampling.
IMO, it would be good to explain in the document when it is appropriate
to emulate sampling through hashing, and when proper sampling should be
applied instead.  The document currently provides only a very limited
explanation:  In section 4, 4th paragraph, it states that hashing-based
sampling is useful where a consistent set of packets is to be selected
by different devices (although even here it remains unclear why/whether
proper sampling would be inappropriate in such a situation).
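
The consistency property at issue can be sketched as follows. This is
an illustration under stated assumptions, not the draft's method:
SHA-256 is a stand-in hash chosen for convenience, and the packet
contents, the threshold, and the function names (hash_select,
random_select) are hypothetical. Two devices that hash the same packet
bytes make identical decisions, while two devices that sample
independently at random generally do not:

```python
import hashlib
import random


def hash_select(packet_bytes: bytes, threshold: int = 64) -> bool:
    """Select a packet if the first digest byte is below threshold (of 256).

    SHA-256 is an assumption made for this sketch only.
    """
    digest = hashlib.sha256(packet_bytes).digest()
    return digest[0] < threshold


def random_select(rng: random.Random, probability: float = 0.25) -> bool:
    """Select a packet with the given probability, ignoring its content."""
    return rng.random() < probability


# Hypothetical invariant packet content as seen by two observation points.
packets = [f"packet-{i}".encode() for i in range(1000)]

# Hash-based selection: each device decides from packet content alone.
device_a_hash = {p for p in packets if hash_select(p)}
device_b_hash = {p for p in packets if hash_select(p)}

# Random sampling: each device has its own independent random state.
rng_a = random.Random(1)
rng_b = random.Random(2)
device_a_rand = {p for p in packets if random_select(rng_a)}
device_b_rand = {p for p in packets if random_select(rng_b)}

print(device_a_hash == device_b_hash)  # same packets selected on both devices
print(device_a_rand == device_b_rand)  # independent draws select different sets
```

The sketch shows only that hashing gives consistent selection across
devices; it does not answer when random sampling would be the better
choice, which is the explanation the comment above asks the document to
provide.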

(4)  The table starting on page 27 summarizes packet selection
techniques.  The 3rd column is defined as functions that a particular
packet selection technique must execute in order to select a packet.
However, some of the fields in the 3rd column render this definition
circular because they include "selection function" or "filter function"
(i.e., a class of selection function).

This from Fred Baker of the OPS-DIRECTORATE:

My principal comment has to do with the IPR. AT&T, among others, has IPR that affects this technology and has filed a RAND statement with no comment on terms. While much of this document is presently implemented in vendor equipment and used in operational networks, trajectory sampling is not. Since the apparent objective of the document is to make trajectory sampling based on interesting filters commonly available for operational use, the IPR statement reduces the document's utility. Since this is a non-technical issue, I would not ask the working group for a change in the document; this is a comment to the working group and the authors.

A second comment has to do with the use of sampling in the first place. It isn't clear that sampled analysis meets the needs of the European Union's Data Retention mandates. As such, either the Data Retention mandate is creating a requirement for additional technology beyond the needs of the service providers (something they have stated they don't intend), or sampled data fails to meet some operational requirement that I don't know about. In any event, vendors and ISPs have to somehow come to terms with the disparate requirements, and can't simply use sampling to replace full-scale accounting. This comment is out of scope for psamp, but is relevant to ops-dir.

I reviewed this document to make sure it was in line with:
   the IPFIX protocol draft, so that one could use IPFIX to export the information
   the PSAMP protocol draft, so that one could use PSAMP (which is based on IPFIX) to export the information
   the PSAMP architecture draft
   the PSAMP information model (which is based on the IPFIX information model)

and all that appears "ok."

Regarding the document itself, I would say that it contains
   - the common sampling mechanisms used in routers
   - some more complex sampling mechanisms, based on the consensus
   - a very basic filtering mechanism (logical AND)
   - some hashing mechanisms, for trajectory sampling
Even if we had some requests to add some extra mechanisms, I would say that this draft is complete. Anyway, there is an IANA procedure for new mechanisms.

Why make it PS? Only because it includes the basic filtering mechanism and the hashing for trajectory sampling. However, as others mention, it is laden with IPR.

