Network Working Group                                      Pedro Marques
Internet Draft                                             Nischal Sheth
Expiration Date: April 11, 2005                         Juniper Networks
                                                           Robert Raszuk
Jared Mauch                                                 Barry Greene
NTT/Verio                                             Cisco Systems Inc.
                                                         Danny McPherson
                                                          Arbor Networks

                                                            October 2004


               Dissemination of flow specification rules


                   draft-marques-idr-flow-spec-01.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 11, 2005.








Marques, et al.                                                 [Page 1]


Internet Draft     draft-marques-idr-flow-spec-01.txt       October 2004


Copyright Notice

   Copyright (C) The Internet Society (2004).


Abstract

   This document defines a new BGP NLRI encoding format that can be used
   to distribute traffic flow specifications. This allows the routing
   system to propagate information regarding special treatment that is
   desired for sub-components of a particular IP prefix.

   Additionally it defines an application of that encoding format to
   traffic filtering of inter-domain flows such as what is necessary in
   order to mitigate (distributed) denial of service attacks.

   The information is carried via the Border Gateway Protocol (BGP),
   thereby reusing protocol algorithms, operational experience and
   administrative processes such as inter-provider peering agreements.



Table of Contents

 1      Introduction  ..............................................   3
 2      Flow specifications  .......................................   4
 3      Dissemination of Information  ..............................   5
 4      Traffic filtering  .........................................  10
 5      Validation procedure  ......................................  11
 6      Traffic Filtering Actions  .................................  12
 7      Monitoring  ................................................  12
 8      Security considerations  ...................................  13
 9      Acknowledgments  ...........................................  13
10      References  ................................................  13
11      Authors' Addresses  ........................................  13


1. Introduction

   Modern IP routers contain both the capability to forward traffic
   according to aggregate IP prefixes and the capability to identify and
   special-case particular flows of traffic. The latter capability is
   usually referred to as an ACL or firewall engine.

   While forwarding information is typically signaled dynamically
   across the network via routing protocols, there is no agreed-upon
   mechanism to dynamically signal flows across autonomous systems.

   For several applications, it may be necessary to exchange control
   information pertaining to aggregated traffic flow definitions which
   cannot be expressed using destination address prefixes only.

   An aggregated traffic flow is considered to be an n-tuple consisting
   of several matching criteria such as source and destination address
   prefixes, IP protocol and transport protocol port numbers.

   The intention of this document is to define a general procedure to
   encode such flow specification rules as a BGP NLRI which can be
   reused for several different control applications. Additionally, we
   define the mechanisms required to apply this definition to the
   problem of immediate concern to the authors: intra-provider and
   inter-provider distribution of traffic filtering rules to filter
   (Distributed) Denial of Service (DoS) attacks.

   By expanding routing information with flow specifications, the
   routing system can take advantage of the ACL/firewall capabilities in
   the router's forwarding path. Flow specifications can be seen as more
   specific routing entries to a unicast prefix and are expected to
   depend upon the existing unicast routing information.

   For example, a flow specification received from an external
   autonomous system will need to be validated against unicast routing
   before being accepted. If the aggregate traffic flow defined by the
   unicast destination prefix is forwarded to a given BGP peer, then the
   local system can install more specific flow rules which result in
   different forwarding behaviour as requested by that peer.

   The choice of BGP as the carrier of this control information is also
   justified by the fact that the key issues in terms of complexity are
   problems which are common to unicast route distribution and have
   already been solved in the current environment.

   From an algorithmic perspective, the main problem that presents
   itself is the distributed loop-free distribution of <key, attribute>
   pairs. The key, in this particular instance, being a flow
   specification.

   From an operational perspective, the use of BGP as the carrier for
   this information allows a network service provider to reuse both
   internal route distribution infrastructure (e.g., route reflector or
   confederation design) and existing external relationships (e.g.,
   inter-domain BGP sessions to a customer network).

   While it is certainly possible to address this problem using other
   mechanisms, the authors believe that this solution offers the
   substantial advantage of being an incremental addition to deployed
   mechanisms.


2. Flow specifications

   A flow specification is an n-tuple consisting of several matching
   criteria that can be applied to IP traffic. A given IP packet is said
   to match the defined flow if it matches all the specified criteria.

   A given flow may be associated with a set of attributes. Depending
   on the particular application, such attributes may or may not include
   reachability information (i.e., NEXT_HOP). Well-known or AS-specific
   community attributes can be used to encode a set of predetermined
   actions.

   A particular application is identified by a specific (AFI, SAFI) pair
   and corresponds to a distinct set of RIBs. Those RIBs should be
   treated independently from each other in order to assure non-inter-
   ference between distinct applications.

   BGP itself treats the NLRI as an opaque key to an entry in its
   databases. Entries that are placed in the Loc-RIB are then associated
   with a given set of semantics which is application dependent. This is
   consistent with existing BGP applications. For instance IP unicast
   routing (AFI=1, SAFI=1) and IP multicast reverse-path information
   (AFI=1, SAFI=2) are handled by BGP without any particular semantics
   being associated with them until installed in the Loc-RIB.

   Standard BGP policy mechanisms, such as UPDATE filtering by NLRI pre-
   fix and community matching, SHOULD apply to the newly defined NLRI-
   type. Network operators can also control propagation of such routing
   updates by enabling or disabling the exchange of a particular (AFI,
   SAFI) pair on a given BGP peering session.


3. Dissemination of Information

   We define a "Flow Specification" NLRI type that may include several
   components such as destination prefix, source prefix, protocol,
   ports, etc. This NLRI is treated as an opaque bit string prefix by
   BGP. Each bit string identifies a key to a database entry which a set
   of attributes can be associated with.

   This NLRI information is encoded using MP_REACH_NLRI and
   MP_UNREACH_NLRI attributes as defined in [BGP-MP]. Whenever the cor-
   responding application does not require Next Hop information, this
   shall be encoded as a 0 octet length Next Hop in the MP_REACH_NLRI
   attribute and ignored on receipt.

   The NLRI field of the MP_REACH_NLRI and MP_UNREACH_NLRI is encoded as
   a two byte NLRI length value in octets followed by a variable length
   NLRI value.

      +------------------------------+
      |    NLRI length (2 octets)    |
      +------------------------------+
      |    NLRI value  (variable)    |
      +------------------------------+
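
   The length/value framing above can be sketched as follows. This is
   only an illustration: the helper names frame_nlri and
   split_nlri_field are hypothetical, and the sketch assumes that the
   length is in network byte order and that several NLRI may be
   concatenated in one field, as is usual for BGP.

```python
import struct


def frame_nlri(value):
    """Prefix an NLRI value with its 2-octet length (assumed big-endian)."""
    return struct.pack("!H", len(value)) + value


def split_nlri_field(field):
    """Split an NLRI field into the individual NLRI values it carries."""
    values = []
    offset = 0
    while offset < len(field):
        if offset + 2 > len(field):
            raise ValueError("truncated NLRI length")
        (length,) = struct.unpack_from("!H", field, offset)
        offset += 2
        if offset + length > len(field):
            raise ValueError("NLRI value runs past the end of the field")
        values.append(field[offset:offset + length])
        offset += length
    return values
```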

   The Flow Specification NLRI-type consists of several optional subcom-
   ponents. A specific packet is considered to match the flow specifica-
   tion when it matches the intersection (AND) of all the components
   present in the specification.

   The following component types are defined:

     + Type 1 - Route Distinguisher

       Encoding: <type (1 octet), RD value (8 octets)>

       Route Distinguisher value, encoded as specified in [2547]. This
       allows this NLRI to carry information for more than one routing
       realm. This value should be omitted when distributing information
       for the Internet routing realm. When the RD is present, routes
       should also contain the Route Target extended community.

     + Type 2 - Destination Prefix

       Encoding: <type (1 octet), prefix length (1 octet), prefix>

       Defines the destination prefix to match. Prefixes are encoded as
       in BGP UPDATE messages: a length in bits is followed by enough
       octets to contain the prefix information.

     + Type 3 - Source Prefix

       Encoding: <type (1 octet), prefix-length (1 octet), prefix>

       Defines the source prefix to match.

     + Type 4 - IP Protocol

       Encoding: <type (1 octet), [op, value]+>

       Contains a set of {operator, value} pairs that are used to match
       the IP protocol value byte in IP packets.

       The operator byte is encoded as:

       7   6   5   4   3   2   1   0
     +---+---+---+---+---+---+---+---+
     | e | a |  len  | 0 |lt |gt |eq |
     +---+---+---+---+---+---+---+---+

     -i. End of List bit. Set in the last {op, value} pair in the list.

    -ii. And bit. If unset the previous term is logically ORed with the
         current one. If set the operation is a logical AND. It should
         be unset in the first operator byte of a sequence. The AND
         operator has higher priority than OR for the purposes of evalu-
         ating logical expressions.

   -iii. The length of the value field for this operand is given as
         (1 << len).

    -iv. lt - less than comparison between data and value.

     -v. gt - greater than comparison between data and value.

    -vi. eq - equality between data and value.

         The bits lt, gt, and eq can be combined to produce "less or
         equal", "greater or equal" and inequality values.

     + Type 5 - Port

       Encoding: <type (1 octet), [op, value]+>

       Defines a list of {operation, value} pairs that matches source OR
       destination TCP/UDP ports. This list is encoded using the numeric
       operand format defined above. Values are encoded as 1 or 2 byte
       quantities.

     + Type 6 - Destination port

       Encoding: <type (1 octet), [op, value]+>

       Defines a list of {operation, value} pairs used to match the des-
       tination port of a TCP or UDP packet. Values are encoded as 1 or
       2 byte quantities.

     + Type 7 - Source port

       Encoding: <type (1 octet), [op, value]+>

       Defines a list of {operation, value} pairs used to match the
       source port of a TCP or UDP packet. Values are encoded as 1 or 2
       byte quantities.

     + Type 8 - ICMP type

       Encoding: <type (1 octet),  [op, value]+>

       Defines a list of {operation, value} pairs used to match the type
       field of an ICMP packet. Values are encoded using a single byte.

     + Type 9 - ICMP code

       Encoding: <type (1 octet),  [op, value]+>

       Defines a list of {operation, value} pairs used to match the code
       field of an ICMP packet. Values are encoded using a single byte.

     + Type 10 - TCP flags

       Encoding: <type (1 octet),  [op, bitmask]+>

       Bitmask values are encoded using a single byte, using the bit
       definitions specified in the TCP header format [rfc793].

       This type uses the bitmask operand format, which differs from the
       numeric operator format in the lower nibble.


       7   6   5   4   3   2   1   0
     +---+---+---+---+---+---+---+---+
     | e | a |  len  | 0 | 0 |not| m |
     +---+---+---+---+---+---+---+---+

     -i. Top nibble (End of List bit, And bit and Length field), as
         defined for the numeric operator format.

    -ii. Not bit. If set, logical negation of operation.

   -iii. Match bit. If set, this is a bitwise match operation defined as
         "(data & value) == value"; if unset, (data & value) evaluates
         to true if any of the bits in the value mask are set in the
         data.


     + Type 11 - Packet length

       Encoding: <type (1 octet), [op, value]+>

       Match on the total IP packet length (excluding L2 but including
       IP header).  Values are encoded as 1 or 2 byte quantities.

     + Type 12 - DSCP

       Encoding: <type (1 octet), [op, value]+>

       Defines a list of {operation, value} pairs used to match the IP
       TOS octet.

     + Type 13 - Fragment

       Encoding: <type (1 octet), [op, bitmask]+>

       Uses bitmask operand format defined above.

       Bitmask values:
            -i. Bit 0 - Don't fragment

           -ii. Bit 1 - Is a fragment

          -iii. Bit 2 - First fragment

           -iv. Bit 3 - Last fragment


   Flow specification components must follow strict type ordering. A
   given component type may or may not be present in the specification,
   but if present it MUST precede any component of higher numeric type
   value.
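
   A receiver could check the ordering rule with a test along these
   lines. The function name is hypothetical, and rejecting a repeated
   component type is this sketch's reading of the rule rather than
   something the text states explicitly.

```python
def types_in_strict_order(component_types):
    """Return True if component type codes appear in increasing order.

    Assumption: a repeated type is rejected along with a misordered one.
    """
    pairs = zip(component_types, component_types[1:])
    return all(earlier < later for earlier, later in pairs)
```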

   If a given component type within a prefix is unknown, the prefix in
   question cannot be used for traffic filtering purposes by the
   receiver. Since a Flow Specification has the semantics of a logical
   AND of all components, an unknown component is treated as FALSE by
   definition, so the specification as a whole cannot be applied.
   However, for the purposes of BGP route propagation this prefix should
   still be transmitted, since BGP route distribution is independent of
   NLRI semantics.

   Flow specification components are to be interpreted as a bit match at
   a given packet offset. When more than one component in a flow speci-
   fication tests the same packet offset the behavior is undetermined.

   The <type, value> encoding is chosen in order to account for future
   extensibility.

   An example of a Flow Specification encoding for: "all packets to
   10.0.1/24 and TCP port 25".

        destination    proto      port
      +-------------+--------+-----------+
      02 18 0a 00 01 04 81 06 05 81 19     (hex)

      Decode for protocol:
      0x04 type
      0x81 operator = end-of-list, value size=1, =.
      0x06 value
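
   An encoder for the flow described in this example ("all packets to
   10.0.1/24 and TCP port 25") might look like the sketch below. The
   helper names are hypothetical, only IPv4 prefixes and values of at
   most two octets are handled, and each value is encoded in the
   smallest size that fits.

```python
import ipaddress

EQ = 0x01               # equality bit of the numeric operator byte
AND_BIT = 0x40
END_OF_LIST = 0x80


def encode_prefix(component_type, prefix):
    """Encode <type, prefix length in bits, prefix octets> (IPv4 only)."""
    net = ipaddress.ip_network(prefix)
    n_octets = (net.prefixlen + 7) // 8       # enough octets for the prefix
    packed = net.network_address.packed[:n_octets]
    return bytes([component_type, net.prefixlen]) + packed


def encode_numeric(component_type, terms):
    """Encode <type, [op, value]+> from (and_bit, comparison_bits, value)."""
    out = bytearray([component_type])
    for index, (and_bit, cmp_bits, value) in enumerate(terms):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("this sketch only encodes 1 or 2 byte values")
        size = 1 if value < 0x100 else 2
        op = ((size.bit_length() - 1) << 4) | cmp_bits   # len: 1 -> 0, 2 -> 1
        if and_bit:
            op |= AND_BIT
        if index == len(terms) - 1:
            op |= END_OF_LIST
        out.append(op)
        out += value.to_bytes(size, "big")
    return bytes(out)


# Destination 10.0.1/24 (type 2), protocol TCP = 6 (type 4), port 25 (type 5).
flow = (encode_prefix(2, "10.0.1.0/24")
        + encode_numeric(4, [(False, EQ, 6)])
        + encode_numeric(5, [(False, EQ, 25)]))
```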

   An example of a Flow Specification encoding for: "all packets to
   10.0.1/24 from 192/8 and port {range [137, 139] or 8080}".

        destination    source     port
      +-------------+---------+------------------------+
      02 18 0a 00 01 03 08 c0   05 03 89 45 8b 91 1f 90  (hex)

      Decode for port:
      0x05 type
      0x03 value size=1, >=
      0x89 value 137
      0x45 &, value size=1, <=
      0x8b value 139
      0x91 end-of-list, value-size=2, =
      0x1f90 value 8080

   This constitutes a NLRI with an NLRI length of 16 octets.
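
   Going the other way, a receiver could split such an NLRI value into
   its components roughly as sketched below. The function name is
   hypothetical. Because components carry no length field of their own,
   this sketch stops with an error at an unknown type instead of trying
   to skip it. The input is the encoding of the flow described in the
   prose of the second example (destination 10.0.1/24).

```python
def decode_components(nlri):
    """Split a Flow Specification NLRI value into (type, payload) tuples."""
    components = []
    offset = 0

    def take(count):
        # Consume exactly `count` octets or fail on truncated input.
        nonlocal offset
        chunk = nlri[offset:offset + count]
        if len(chunk) != count:
            raise ValueError("truncated flow specification")
        offset += count
        return chunk

    while offset < len(nlri):
        ctype = take(1)[0]
        if ctype == 1:                        # Route Distinguisher
            payload = take(8)
        elif ctype in (2, 3):                 # destination / source prefix
            bits = take(1)[0]
            payload = (bits, take((bits + 7) // 8))
        elif 4 <= ctype <= 13:                # [op, value]+ list
            payload = []
            while True:
                op = take(1)[0]
                size = 1 << ((op >> 4) & 0x03)
                payload.append((op, int.from_bytes(take(size), "big")))
                if op & 0x80:                 # End of List bit
                    break
        else:
            raise ValueError("unknown component type %d" % ctype)
        components.append((ctype, payload))
    return components


example = bytes.fromhex("02 18 0a 00 01 03 08 c0 05 03 89 45 8b 91 1f 90")
decoded = decode_components(example)
```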

   Implementations wishing to exchange flow specification rules MUST use
   BGP's Capability Advertisement facility to exchange the Multiprotocol
   Extension Capability Code (Code 1) as defined in [BGP-MP].  The (AFI,
   SAFI) pair carried in the Multiprotocol Extension capability MUST be
   the same as the one used to identify a particular application that
   uses this NLRI-type.


4. Traffic filtering

   Traffic filtering policies have traditionally been considered to be
   relatively static.  The popularity of traffic-based denial of service
   (DoS) attacks, which often require the network operator to be able
   to use traffic filters for detection and mitigation, brings with it
   requirements that are not fully satisfied by existing tools.

   Several techniques are currently used to control traffic filtering of
   DoS attacks.  Among those, one of the most common is to inject
   unicast route advertisements corresponding to a destination prefix
   being attacked. One variant of this technique marks such route
   advertisements with a community that gets translated into a discard
   next-hop by the receiving router. Other variants attract traffic to a
   particular node that serves as a deterministic drop point.

   Using unicast routing advertisements to distribute traffic filtering
   information has the advantage of using the existing infrastructure
   and inter-AS communication channels. This can allow, for instance, a
   service provider to accept filtering requests from customers for
   address space they own.

   There are several drawbacks, however. An issue that is immediately
   apparent is the granularity of filtering control: only destination
   prefixes may be specified. Another area of concern is the fact that
   filtering information is intermingled with routing information.

   The mechanism defined in this document is designed to address these
   limitations. We use the flow specification NLRI defined above to con-
   vey information about traffic filtering rules for traffic that should
   be discarded.

   This mechanism is designed primarily to allow an upstream autonomous
   system to perform inbound filtering, at its ingress routers, of
   traffic that a given downstream AS wishes to drop.

   In order to achieve that goal, we define an application-specific NLRI
   identifier (AFI=1, SAFI=TBD) along with specific semantic rules.  BGP
   routing updates containing this identifier use the flow specification
   NLRI encoding to convey particular aggregated flows that require
   special treatment.


5. Validation procedure

   Flow specifications received from a BGP peer and which are accepted
   in the respective Adj-RIB-In are used as input to the route selection
   process. Although the forwarding attributes of two routes for the
   same Flow Specification prefix may be the same, BGP is still required
   to perform its path selection algorithm in order to select the cor-
   rect set of attributes to advertise.

   The first step of the BGP Route Selection procedure [BGP-BASE] (sec-
   tion 9.1.2) is to exclude from the selection procedure routes that
   are considered non-feasible. In the context of IP routing information
   this step is used to validate that the NEXT_HOP attribute of a given
   route is resolvable.

   The concept can be extended, in the case of Flow Specification NLRI,
   to allow other validation procedures.

   A flow specification NLRI SHOULD be validated such that it is
   considered unfeasible if it contains a non-empty AS_PATH and that
   AS_PATH does not match the AS_PATH of the best match unicast route
   that includes the specified destination address prefix.

   The underlying concept is that the neighboring AS that advertises the
   best unicast route for a destination is allowed to advertise flow
   spec information that conveys a more or equally specific destination
   prefix.

   The neighboring AS is the immediate destination of the traffic
   described by the Flow Specification. If it requests these flows to be
   dropped that request can be honored without concern that it repre-
   sents a denial of service in itself. Supposedly, the traffic is being
   dropped by the downstream autonomous-system and there is no added
   value in carrying the traffic to it.
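
   The AS_PATH check described in this section can be sketched as
   follows. The function name and the representation of the unicast
   table as (prefix, AS_PATH) pairs are hypothetical. The sketch also
   makes two assumptions that the text leaves open: "match" is taken to
   mean an exact AS_PATH comparison, and a flow specification with no
   covering unicast route is treated as unfeasible.

```python
import ipaddress


def flow_spec_feasible(dest_prefix, flow_as_path, unicast_routes):
    """Validate a flow specification against unicast routing (IPv4 sketch).

    unicast_routes holds (prefix, as_path) pairs for the selected unicast
    routes; dest_prefix is the destination prefix of the flow specification.
    """
    if not flow_as_path:
        return True                       # the rule only covers non-empty paths
    dest = ipaddress.ip_network(dest_prefix)
    covering = []
    for prefix, as_path in unicast_routes:
        net = ipaddress.ip_network(prefix)
        if dest.subnet_of(net):           # route includes the destination
            covering.append((net, as_path))
    if not covering:
        return False                      # assumption: no route, not feasible
    # The best match is the covering route with the longest prefix.
    _, best_path = max(covering, key=lambda route: route[0].prefixlen)
    return list(flow_as_path) == list(best_path)
```

   With unicast routes 10.0.0.0/8 via AS_PATH [65001] and 10.0.1.0/24
   via [65002, 65010], a flow specification for 10.0.1.0/24 is feasible
   only when it carries the AS_PATH [65002, 65010].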

   BGP implementations MUST also enforce that the AS_PATH attribute of a
   route received via eBGP contains the neighboring AS in the left-most
   position of the AS_PATH attribute. While this rule is optional in the
   BGP specification, it becomes necessary to enforce it for security
   reasons.


6. Traffic Filtering Actions

   The default action for a traffic filtering flow specification is to
   accept IP traffic that matches that particular rule.

   The following extended community values can be used to specify par-
   ticular actions.


     type    extended community      encoding
     --------------------------------------------------------
     0x8006  traffic-rate            2-byte as#, 4-byte float
     0x8007  sample-rate             2-byte as#, 4-byte float
     0x8008  redirect                6-byte Route Target
     0x8009  marking                 1-byte DSCP value


   A traffic-rate of 0 should result in all traffic that matches the
   particular flow being discarded.

   The redirect extended community allows the traffic to be redirected
   to a VRF routing instance that lists the specified route-target in
   its import policy. If several local instances match this criterion,
   the choice between them is a local matter (for example, the instance
   with the lowest Route Distinguisher value can be elected).

   The traffic marking extended community instructs a system to modify
   the DSCP bits of a transiting IP packet to the corresponding value.
   This extended community is encoded as a sequence of 5 zero bytes
   followed by the DSCP value.


7. Monitoring

   Traffic filtering applications require monitoring and traffic statis-
   tics facilities. While this is an implementation specific choice,
   implementations SHOULD provide:

      - A mechanism to log the packet header of filtered traffic,

      - A mechanism to count the number of matches for a given Flow
        Specification rule.


8. Security considerations

   Inter-provider routing is based on a web of trust. Neighboring
   autonomous-systems are trusted to advertise valid reachability infor-
   mation. If this trust model is violated, a neighboring autonomous
   system may cause a denial of service attack by advertising reachabil-
   ity information for a given prefix for which it does not provide ser-
   vice.

   As long as traffic filtering rules are restricted to match the corre-
   sponding unicast routing paths for the relevant prefixes, the secu-
   rity characteristics of this proposal are equivalent to the existing
   security properties of BGP unicast routing.

   Were this not the case, it would open the door to further denial of
   service attacks.


9. Acknowledgments

   The authors would like to thank Yakov Rekhter, Dennis Ferguson and
   Chris Morrow for their comments.



10. References

   [BGP-BASE] Y. Rekhter, T. Li, S. Hares, "A Border Gateway Protocol 4
        (BGP-4)", draft-ietf-idr-bgp4-20.txt, 03/03

   [BGP-MP] T. Bates, R. Chandra, D. Katz, Y. Rekhter, "Multiprotocol
        Extensions for BGP-4", RFC2858.


11. Authors' Addresses

Pedro Marques
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: roque@juniper.net


Nischal Sheth
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089
Email: nsheth@juniper.net


Robert Raszuk
Cisco Systems, Inc.
Al. Jerozolimskie 146C
02-305 Warsaw, Poland
Email: rraszuk@cisco.com


Barry Greene
Cisco Systems, Inc.
Email: bgreene@cisco.com



Jared Mauch
NTT/VERIO
8285 Reese Lane
Ann Arbor, MI, 48103-9753
Email: jmauch@verio.net | jared@puck.nether.net


Danny McPherson
Arbor Networks
Email: danny@arbor.net



Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any assur-
   ances of licenses to be made available, or the result of an attempt
   made to obtain a general license or permission for the use of such
   proprietary rights by implementers or users of this specification can
   be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFOR-
   MATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES
   OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.