Network Working Group                                      Daniel Walton
Internet Draft                                                David Cook
Expiration Date: May 2003                                  Alvaro Retana
File name: draft-walton-bgp-add-paths-01.txt                John Scudder
                                                           Cisco Systems
                                                           November 2002

                 Advertisement of Multiple Paths in BGP
                   draft-walton-bgp-add-paths-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups. Note that other
   groups may also distribute working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   The BGP specification [BGP] defines an "Update-Send Process" to
   advertise the routes chosen by the Decision Process to other BGP
   speakers.  No provisions are made to facilitate the advertisement of
   multiple paths to the same destination.  In fact, a route with the
   same NLRI as a previously advertised route implicitly replaces the
   original advertisement.

   This document proposes a mechanism that will allow the advertisement
   of multiple paths for the same prefix without the new paths
   implicitly replacing any previous ones.  The essence of the mechanism
   is that each path is identified by an arbitrary identifier in
   addition to its prefix.






Walton, et al                                                   [Page 1]


INTERNET DRAFT           Multiple Paths in BGP             November 2002


1. Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].


2. Advertisement of Multiple Paths in BGP

   This section describes an alternate NLRI encoding that allows the
   advertisement of multiple paths in BGP.


2.1. Capability Advertisement

   This specification defines the capability [BGP_CAP] ADD_PATH.  The
   ADD_PATH capability has code TBD.  Its length is zero; it carries no
   data.

   Capability code 4 defined in [RFC3107] MUST NOT be advertised if
   ADD_PATH is advertised (see also the section below entitled
   'Modifications to "Carrying Label Information in BGP-4"').
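
   As an informal illustration only (not part of this specification),
   the zero-length capability could be serialized as the code/length
   pair of [BGP_CAP].  The code value 0xFF below is a placeholder,
   since the ADD_PATH code is TBD, and the function names are
   hypothetical.

```python
import struct

# Placeholder only: the ADD_PATH capability code is TBD in this document.
ADD_PATH_CODE = 0xFF
# Capability code 4, defined in [RFC3107].
RFC3107_MULTIPLE_ROUTES_CODE = 4


def encode_capability(code: int, value: bytes = b"") -> bytes:
    """Encode one capability as <code (1 octet), length (1 octet), value>."""
    if not 0 <= code <= 255:
        raise ValueError("capability code must fit in one octet")
    if len(value) > 255:
        raise ValueError("capability value must fit in 255 octets")
    return struct.pack("!BB", code, len(value)) + value


def check_advertised(codes) -> None:
    """Raise if ADD_PATH and capability code 4 would be advertised together."""
    codes = set(codes)
    if ADD_PATH_CODE in codes and RFC3107_MULTIPLE_ROUTES_CODE in codes:
        raise ValueError("capability code 4 MUST NOT accompany ADD_PATH")


# ADD_PATH carries no data, so the encoding is the code and a zero length.
add_path = encode_capability(ADD_PATH_CODE)
```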


2.2. NLRI Encoding

   If two BGP speakers advertise the ADD_PATH capability to each other,
   the NLRI encoding is modified to add two new fields at the beginning
   of the NLRI -- a "flags" field (described below), and an identifier
   to distinguish the NLRI from other NLRI with the same prefix but
   different path attributes and/or nexthop.

   We note that in many BGP operations, the prefix is used as a key for
   identifying a datum.  For example, when withdrawing a route using the
   procedures of [BGP], only the prefix needs to be specified in order
   to withdraw the entire route.  For such purposes, the identifier
   field introduced by this specification is treated as part of the key.
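
   The effect of treating the identifier as part of the key can be
   sketched as follows.  This is a minimal illustration, assuming a
   table held in a dictionary; the names path_key and rib, and the
   string stand-ins for path attributes, are hypothetical.

```python
from typing import Dict, Tuple

# Key used to look up a path: the identifier plus the prefix and its length.
PathKey = Tuple[int, bytes, int]


def path_key(identifier: int, prefix: bytes, length: int) -> PathKey:
    """Build the lookup key; the identifier is part of it."""
    return (identifier, prefix, length)


rib: Dict[PathKey, str] = {}
rib[path_key(1, bytes([10]), 8)] = "via 192.0.2.1"
rib[path_key(2, bytes([10]), 8)] = "via 192.0.2.2"  # same prefix, second path
rib[path_key(1, bytes([10]), 8)] = "via 192.0.2.3"  # same key: replaces first
```

   After these three assignments the table holds two paths for 10/8,
   because only the entry with the same identifier was replaced.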

   The following subsections specify the necessary modifications to
   existing encodings.  We recommend that future documents which specify
   NLRI encodings for BGP include an encoding (possibly the sole
   encoding) compatible with this specification.


2.2.1. Modifications to "BGP-4"

   In "BGP-4" [BGP], section 4.3, the sub-sections titled "Withdrawn
   Routes" and "Network Layer Reachability Information" are updated by
   the following:

      The Network Layer Reachability information is encoded as one or
      more 4-tuples of the form <flags, identifier, length, prefix>,
      whose fields are described below:

                        +---------------------------+
                        |   Flags (1 octet)         |
                        +---------------------------+
                        |   Identifier (2 octets)   |
                        +---------------------------+
                        |   Length (1 octet)        |
                        +---------------------------+
                        |   Prefix (variable)       |
                        +---------------------------+

      The use and the meaning of these fields are as follows:

         a) Flags:

            This is a one-octet bit-field and MUST NOT be used for
            identifying the path.  In other words, it does not form part
            of the key used to identify the path.  The following
            values are defined:


            BestPath (0x01)

                 If set to one, the bestpath bit indicates that the path
                 associated with the NLRI has been selected by the BGP
                 speaker for installation into its FIB.  If set to zero,
                 the path has not been selected.

                 If a route which was advertised with the bestpath bit
                 set to one is removed from the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to zero, or withdrawn.  Likewise, if a route which was
                 advertised with the bestpath bit set to zero is
                 selected for installation in the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to one, or withdrawn.


            FirstPath (0x02)

                 If set to one, the firstpath bit indicates the current
                 update contains the first of a series of paths for a
                 specific prefix.  Any paths received before this one
                 MUST be removed by the receiver. If set to zero, it
                 indicates that the current update is not the first in
                 the series.


            LastPath (0x04)

                 If set to one, the lastpath bit indicates that the
                 current update is the last one for the prefix.  If set
                 to zero, it indicates that more paths for the same
                 prefix MAY be advertised.

         b) Identifier:

            The Identifier field allows the address prefix and its
            associated path attributes ("path") to be distinguished from
            other paths for the same prefix.  The selection of
            identifier values is a local implementation decision.

            If the Identifier is set to 65535, then it MUST be
            interpreted as an explicit withdraw for all paths associated
            with the prefix.

         c) Length:

            The Length field indicates the length in bits of the address
            prefix.  A length of zero indicates a prefix that matches
            all (as specified by the address family) addresses (with
            prefix, itself, of zero octets).

         d) Prefix:

            The Prefix field contains an address prefix followed by
            enough trailing bits to make the end of the field fall on an
            octet boundary.  Note that the value of trailing bits is
            irrelevant.
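
   The 4-tuple above can be sketched as follows.  This is an
   illustrative encoder, not a normative one; the function name
   encode_nlri and the use of Python's ipaddress module to supply the
   prefix are assumptions made for the example.

```python
import ipaddress
import struct

# Flag values defined in this document.
BEST_PATH = 0x01
FIRST_PATH = 0x02
LAST_PATH = 0x04


def encode_nlri(flags: int, identifier: int, prefix: str) -> bytes:
    """Encode one <flags, identifier, length, prefix> tuple."""
    net = ipaddress.ip_network(prefix, strict=True)
    # The prefix is padded with trailing bits up to an octet boundary.
    n_octets = (net.prefixlen + 7) // 8
    # Flags (1 octet), Identifier (2 octets), Length in bits (1 octet).
    header = struct.pack("!BHB", flags, identifier, net.prefixlen)
    return header + net.network_address.packed[:n_octets]
```

   For instance, encode_nlri(BEST_PATH | FIRST_PATH, 7, "10.1.0.0/16")
   yields the six octets 03 00 07 10 0a 01, and a zero-length prefix
   such as "0.0.0.0/0" yields only the four header octets.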


2.2.2. Modifications to "Multiprotocol Extensions for BGP-4"

   "Multiprotocol Extensions for BGP-4" [MP_BGP], section 7 is replaced
   by the following:

      The Network Layer Reachability information is encoded as one or
      more 4-tuples of the form <flags, identifier, length, prefix>,
      whose fields are described below:

                        +---------------------------+
                        |   Flags (1 octet)         |
                        +---------------------------+
                        |   Identifier (2 octets)   |
                        +---------------------------+
                        |   Length (1 octet)        |
                        +---------------------------+
                        |   Prefix (variable)       |
                        +---------------------------+

      The use and the meaning of these fields are as follows:

         a) Flags:

            This is a one-octet bit-field and MUST NOT be used for
            identifying the path.  In other words, it does not form part
            of the key used to identify the path.  The following
            values are defined:


            BestPath (0x01)

                 If set to one, the bestpath bit indicates that the path
                 associated with the NLRI has been selected by the BGP
                 speaker for installation into its FIB.  If set to zero,
                 the path has not been selected.

                 If a route which was advertised with the bestpath bit
                 set to one is removed from the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to zero, or withdrawn.  Likewise, if a route which was
                 advertised with the bestpath bit set to zero is
                 selected for installation in the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to one, or withdrawn.


            FirstPath (0x02)

                 If set to one, the firstpath bit indicates the current
                 update contains the first of a series of paths for a
                 specific prefix.  Any paths received before this one
                 MUST be removed by the receiver. If set to zero, it
                 indicates that the current update is not the first in
                 the series.


            LastPath (0x04)

                 If set to one, the lastpath bit indicates that the
                 current update is the last one for the prefix.  If set
                 to zero, it indicates that more paths for the same
                 prefix MAY be advertised.

         b) Identifier:

            The Identifier field allows the address prefix and its
            associated path attributes ("path") to be distinguished from
            other paths for the same prefix.  The selection of
            identifier values is a local implementation decision.

            If the Identifier is set to 65535, then it MUST be
            interpreted as an explicit withdraw for all paths associated
            with the prefix.

         c) Length:

            The Length field indicates the length in bits of the address
            prefix.  A length of zero indicates a prefix that matches
            all (as specified by the address family) addresses (with
            prefix, itself, of zero octets).

         d) Prefix:

            The Prefix field contains an address prefix followed by
            enough trailing bits to make the end of the field fall on an
            octet boundary.  Note that the value of trailing bits is
            irrelevant.
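
   A receiver would walk an NLRI field carrying one or more of these
   tuples roughly as sketched below.  This is a minimal sketch under
   stated assumptions: the function name decode_nlri is hypothetical,
   the input is taken to be the NLRI octets only, and trailing pad bits
   are returned as received rather than masked.

```python
import struct
from typing import List, Tuple


def decode_nlri(data: bytes) -> List[Tuple[int, int, int, bytes]]:
    """Split an NLRI field into (flags, identifier, length, prefix) tuples."""
    tuples = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated NLRI header")
        # Flags (1 octet), Identifier (2 octets), Length in bits (1 octet).
        flags, identifier, length = struct.unpack_from("!BHB", data, offset)
        offset += 4
        # The prefix occupies the minimum number of whole octets.
        n_octets = (length + 7) // 8
        prefix = data[offset:offset + n_octets]
        if len(prefix) != n_octets:
            raise ValueError("truncated prefix")
        offset += n_octets
        tuples.append((flags, identifier, length, prefix))
    return tuples
```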


2.2.3. Modifications to "Carrying Label Information in BGP-4"

   "Carrying Label Information in BGP-4" [RFC3107] is modified as
   follows.  Section 4 ("Advertising Multiple Routes to a Destination")
   is deleted, as the procedures of this specification allow multiple
   routes to be advertised, so no other procedures are required.  For
   the same reason, the final paragraph of Section 5 (which specifies
   capability code 4) is deleted.  Section 3 is replaced by the
   following:

      Label mapping information is carried as part of the Network Layer
      Reachability Information (NLRI) in the Multiprotocol Extensions
      attributes.  The AFI indicates, as usual, the address family of
      the associated route.  The fact that the NLRI contains a label is
      indicated by using SAFI value 4.

      The Network Layer Reachability information is encoded as one or
      more 5-tuples of the form <flags, identifier, length, label,
      prefix>, whose fields are described below:

                        +---------------------------+
                        |   Flags (1 octet)         |
                        +---------------------------+
                        |   Identifier (2 octets)   |
                        +---------------------------+
                        |   Length (1 octet)        |
                        +---------------------------+
                        |   Label (3 octets)        |
                        +---------------------------+
                        |   Prefix (variable)       |
                        +---------------------------+

      The use and the meaning of these fields are as follows:

         a) Flags:

            This is a one-octet bit-field and MUST NOT be used for
            identifying the path.  In other words, it does not form part
            of the key used to identify the path.  The following
            values are defined:


            BestPath (0x01)

                 If set to one, the bestpath bit indicates that the path
                 associated with the NLRI has been selected by the BGP
                 speaker for installation into its FIB.  If set to zero,
                 the path has not been selected.

                 If a route which was advertised with the bestpath bit
                 set to one is removed from the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to zero, or withdrawn.  Likewise, if a route which was
                 advertised with the bestpath bit set to zero is
                 selected for installation in the advertiser's FIB, the
                 route MUST be re-advertised with the bestpath bit set
                 to one, or withdrawn.


            FirstPath (0x02)

                 If set to one, the firstpath bit indicates the current
                 update contains the first of a series of paths for a
                 specific prefix.  Any paths received before this one
                 MUST be removed by the receiver. If set to zero, it
                 indicates that the current update is not the first in
                 the series.


            LastPath (0x04)

                 If set to one, the lastpath bit indicates that the
                 current update is the last one for the prefix.  If set
                 to zero, it indicates that more paths for the same
                 prefix MAY be advertised.

         b) Identifier:

            The Identifier field allows the address prefix and its
            associated path attributes ("path") to be distinguished from
            other paths for the same prefix.  The selection of
            identifier values is a local implementation decision.

            If the Identifier is set to 65535, then it MUST be
            interpreted as an explicit withdraw for all paths associated
            with the prefix.

         c) Length:

            The Length field indicates the length in bits of the address
            prefix.  A length of zero indicates a prefix that matches
            all (as specified by the address family) addresses (with
            prefix, itself, of zero octets).

         d) Label:

            The Label field carries one or more labels (that correspond
            to the stack of labels [LABELS]).  Each label is encoded as
            3 octets, where the high-order 20 bits contain the label
            value, and the low-order bit contains "Bottom of Stack" (as
            defined in [LABELS]).

         e) Prefix:

            The Prefix field contains an address prefix followed by
            enough trailing bits to make the end of the field fall on an
            octet boundary.  Note that the value of trailing bits is
            irrelevant.
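
   The 3-octet label encoding described in d) can be sketched as
   follows.  This is an illustration only; the function names are
   hypothetical, and setting the three bits between the label value and
   the "Bottom of Stack" bit to zero is an assumption of the example.

```python
from typing import List, Tuple


def encode_label(label: int, bottom_of_stack: bool) -> bytes:
    """Pack a 20-bit label into 3 octets; the low-order bit marks the bottom."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label value must fit in 20 bits")
    # Label in the high-order 20 bits, Bottom of Stack in the low-order bit.
    return ((label << 4) | int(bottom_of_stack)).to_bytes(3, "big")


def encode_label_stack(labels: List[int]) -> bytes:
    """Encode a stack of labels; only the last one has Bottom of Stack set."""
    if not labels:
        raise ValueError("at least one label is required")
    last = len(labels) - 1
    return b"".join(encode_label(l, i == last) for i, l in enumerate(labels))


def decode_label(octets: bytes) -> Tuple[int, bool]:
    """Return (label value, bottom-of-stack flag) from 3 octets."""
    if len(octets) != 3:
        raise ValueError("a label is exactly 3 octets")
    value = int.from_bytes(octets, "big")
    return value >> 4, bool(value & 1)
```

   For instance, encode_label(100, True) yields the octets 00 06 41.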

      The label(s) specified for a particular route (and associated with
      its address prefix) must be assigned by the LSR which is
      identified by the value of the Next Hop attribute of the route.

      When a BGP speaker redistributes a route, the label(s) assigned to
      that route must not be changed (except by omission), unless the
      speaker changes the value of the Next Hop attribute of the route.

      A BGP speaker can withdraw a previously advertised route (as well
      as the binding between this route and a label) by either (a)
      advertising a new route (and, optionally, a label) with the same
      NLRI as the previously advertised route (keeping in mind that the
      identifier comprises part of the NLRI for this purpose), or (b)
      listing the NLRI (again keeping in mind the inclusion of the
      identifier as part of the NLRI for this purpose) of the previously
      advertised route in the Withdrawn Routes field of an Update
      message.  In the latter case, no label information need be
      included.


2.3. Operation

   Using the identifier specified in the previous subsection, the same
   prefix can be advertised multiple times without subsequent
   advertisements replacing previous ones.  Apart from the fact that
   this is possible, the route advertisement rules of [BGP] are not
   changed.  In particular, a new advertisement of a given NLRI
   (remembering that the identifier is part of the NLRI's definition)
   replaces a previous advertisement of the given NLRI.

   When two BGP speakers have advertised the ADD_PATH capability to each
   other, the NLRI encoding defined in this document MUST be used.
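
   One possible reading of the receiver-side behavior is sketched
   below.  It is a minimal illustration, not a definitive
   implementation, and rests on several assumptions: paths are held in
   a dictionary keyed by prefix and then by identifier, the FirstPath
   flush applies only to paths for the same prefix, the identifier
   65535 is handled only when it appears in a withdrawal, and path
   attributes are represented by plain strings.  All names are
   hypothetical.

```python
from typing import Dict, Tuple

FIRST_PATH = 0x02      # flag value defined in this document
WITHDRAW_ALL = 65535   # identifier meaning "withdraw all paths"

# (prefix octets, length in bits) -> {identifier: path attributes}
Rib = Dict[Tuple[bytes, int], Dict[int, str]]


def receive_advertisement(rib: Rib, flags: int, identifier: int,
                          prefix: bytes, length: int, attrs: str) -> None:
    """Store a path; the same (identifier, prefix) replaces the old one."""
    key = (prefix, length)
    if flags & FIRST_PATH:
        # First of a series: drop paths received earlier for this prefix.
        rib.pop(key, None)
    rib.setdefault(key, {})[identifier] = attrs


def receive_withdraw(rib: Rib, identifier: int,
                     prefix: bytes, length: int) -> None:
    """Remove one path, or every path for the prefix if identifier is 65535."""
    key = (prefix, length)
    if identifier == WITHDRAW_ALL:
        rib.pop(key, None)
        return
    paths = rib.get(key)
    if paths is not None:
        paths.pop(identifier, None)
        if not paths:
            del rib[key]
```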


3. Deployment Considerations

   The intent of this extension is to be used in a controlled fashion
   for applications that require only partial propagation of the routing
   information, or specific individual recipients.

   Care should be taken when deploying this enhancement.  If deployed
   improperly, the presence of extra paths in some parts of the AS and
   not in others can cause inconsistent routing.  One scenario of
   particular concern involves the IGP metric to the address depicted
   by the NEXT_HOP, and the MED attribute.  If this extension is used
   to advertise alternate paths, the best path [BGP] SHOULD also be
   advertised.  As long as the best path is still selected as best, the
   presence of additional paths in some parts of the AS and not others
   will not cause inconsistent routing.  However, if the IGP metric to
   the address depicted by the NEXT_HOP should change such that a
   non-best path is now preferred over the best path, then every router
   in the path to the address depicted by the NEXT_HOP should have the
   additional paths.

   Because the MED is only compared between routes from the same AS
   [BGP], it is possible that an additional path could be selected as
   the best path.  This may cause inconsistent routing if not all
   routers in the forwarding path of the affected routers have the
   additional paths.

   In a simple topology, it may be possible to anticipate these
   scenarios and avoid inconsistent routing while still enabling
   appropriate applications. Documents proposing applications of this
   extension SHOULD specify restrictions for propagating additional
   paths and should supply specific deployment guidelines.


4. Security Considerations

   This document introduces no new security concerns to BGP or other
   specifications referenced in this document.


5. Acknowledgments

   We would like to thank Dave Meyer, Srihari Ramachandra, Eric Rosen,
   Dan Tappan, Robert Raszuk, Mark Turner and Enke Chen for their
   comments and suggestions.


6. References

 [BGP]
      Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 (BGP-4)," RFC
      1771, March 1995.

 [RFC2119]
      Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels," RFC 2119, March 1997.

 [BGP_CAP]
      Chandra, R. and J. Scudder, "Capabilities Advertisement with BGP-
      4," RFC 2842, May 2000.

 [MP_BGP]
      Bates, T., R. Chandra, D. Katz and Y. Rekhter, "Multiprotocol
      Extensions for BGP-4," RFC 2858, June 2000.

 [RFC3107]
      Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4,"
      RFC 3107, May 2001.

 [LABELS]
      Rosen, E., D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci, T. Li
      and A. Conta, "MPLS Label Stack Encoding," RFC 3032, January
      2001.


7. Authors' Addresses

         Daniel Walton
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: dwalton@cisco.com

         Alvaro Retana
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: aretana@cisco.com

         David Cook
         Cisco Systems, Inc.
         7025 Kit Creek Rd.
         Research Triangle Park, NC 27709
         Email: dacook@cisco.com

         John G. Scudder
         Cisco Systems, Inc.
         100 S. Main Suite 200
         Ann Arbor, MI 48104
         Email: jgs@cisco.com