Internet Engineering Task Force                               David Ward
 Internet Draft                                      Internet Engineering
                                                               Group, LLC
 draft-ward-bgp4-ibb-00.txt
                                                             John Scudder
                                                     Internet Engineering
                                                               Group, LLC
                                                               June, 1999
 
 
                   BGP Notification Cease: I'll Be Back
 
 
                      <draft-ward-bgp4-ibb-00.txt>
 
 
 Status of this Memo
 
    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC2026.
 
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
    other groups may also distribute working documents as Internet-
    Drafts. Internet-Drafts are draft documents valid for a maximum of
    six months and may be updated, replaced, or obsoleted by other
    documents at any time. It is inappropriate to use Internet-Drafts
    as reference material or to cite them other than as "work in
    progress."
 
    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt
 
    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.
 
 
 1. Abstract
 
    Many recent router architectures decouple the routing engine from
    the forwarding engine, so that packet forwarding can continue even
    if routing software is not active.  The current definition of the
    BGP protocol does not support this.  We propose a new variety of
    CEASE NOTIFICATION message (IBB) which indicates to a peer that the
    router sending the notification expects to be able to continue
    forwarding traffic for a certain period of time without an
    established BGP peering session.  We also propose a new OPEN
    message (ICB) that if received during the HOLDTIME period, does not
    require conventional reestablishment of the BGP peering session.
    These capabilities are useful for orderly and non-intrusive routing
    software updates, operating system updates, AS number migration,
    redundancy and catastrophic event handling.
 
 Ward, Scudder   Internet Draft   June 1999                  page     1
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
 2. Conventions used in this document
 
    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
    this document are to be interpreted as described in RFC-2119.
 
  3.Introduction
 
    Goals:
 
    a.        Continued forwarding in the absence of an Established BGP
       peering session
    b.        Traffic shall continue to flow over the preferred path which
       would be used if the BGP speaker had not closed the session
    c.        Routes will not be flapped.
 
    Applications:
 
    a.        Support minimally intrusive upgrade of routing software,
       operating system, hardware, etc.
    b.        Support minimally intrusive AS, IP, interface, etc. renumbering
    c.        Support minimally intrusive catastrophic software events
 
 4. Operation
 
    IBB introduces a new OPEN option, a new CEASE NOTIFICATION option,
    and a new Capabilities Negotiation [BGP-CAP] option.
 
    BGP operation is modified as follows:
 
 4.1. Capability Negotiation
 
    IBB must be negotiated at session startup time using Capability
    Negotiation.  (See Section 5 for discussion of why this is
    necessary.)
 
    The capability encoding for IBB is as follows:
 
        Capability Code: TBD (1 octet)
        Capability Length: 6 (1 octet)
        Capability Value:
            Flags: reserved, must be transmitted as zero (2 octets)
            Maximum IBB timeout in seconds: (2 octets unsigned)
            Maximum route refresh timeout in seconds: (2 octets
            unsigned)
 
    The IBB and route refresh timeouts specify the maximum timeout
    values the BGP speaker is willing to accept.  The maximum timeout
    values are a matter of local configuration.  360 seconds is
    suggested as a reasonable default value for both maxima.  The
    actual timeouts which will be used are based on the timeouts
    proposed in the IBB CEASE and ICB OPEN; see below.
 
 Ward, Scudder   Internet Draft   June 1999                  page     2
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
 4.2. Closing a Session With IBB CEASE
 
    After IBB has been successfully negotiated, if a BGP speaker wants
    to temporarily disconnect the session but is capable of continuing
    to forward packets, it MAY close the session using a special CEASE
    NOTIFICATION message called the _I'll be back_, or IBB CEASE.  The
    IBB CEASE adds the following option to the standard CEASE
    NOTIFICATION message:
 
        Error code = 6 (Cease) (one octet)
        Error subcode = 1 (IBB) (one octet)
        Flags = Reserved, must be sent as zero (two octets unsigned)
        Data0 = IBB timeout in seconds (two octets unsigned)
        Data1 = not used (two octets unsigned)
 
    The semantics of the IBB CEASE are that the sender,
 
    a.        Will attempt to reestablish the session prior to the expiration
       of the IBB timeout, and
    b.        Will be able to continue forwarding packets in the interim.
 
    A BGP speaker MUST NOT send an IBB CEASE unless these criteria are
    met.  It MUST be possible for a router administrator to cause a BGP
    session to be closed with a conventional CEASE instead of an IBB
    CEASE.
 
    When a BGP speaker has multiple IBGP peers to which it will send an
    IBB CEASE, it MUST NOT set the IBB timeout as a value greater than
    the minimum of all maximum IBB timeout values negotiated by the
    IBGP peers.  A BGP speaker MUST NOT send an IBB CEASE to any IBGP
    peer unless all IBGP peers have successfully negotiated the IBB
    option. (See Section 5 for discussion of why this is necessary, and
    for a discussion of special considerations for route reflectors.)
 
    The IBB timeout selected SHOULD NOT greatly exceed the time needed
    for the BGP speaker to re-initiate its BGP connections; i.e. it has
    the sense of a _reboot time._  It MUST NOT exceed the maximum value
    established by the peer during capability negotiation.  (There are
    further restrictions for IBGP peers; see previous paragraph.)
 
    Upon receiving the IBB CEASE, the connection to the peer which sent
    the CEASE should be closed, just as with a normal CEASE. However,
    in place of marking the routes from the peer as invalid, as
    specified in section 6 of the BGP specification [BGP-4], the routes
    are scheduled for later cleanup as follows:
 
    a.        Create a timer scheduled to expire at the lesser of the IBB
       timeout received in the CEASE and the locally-configured
       maximum.  If the received IBB timeout exceeds the locally-
       configured maximum, an error SHOULD be logged.
    b.        Mark the routes from the peer which sent the CEASE to be deleted
       when the timer expires.
 
 Ward, Scudder   Internet Draft   June 1999                  page     3
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
    c.        If the IBB timeout expires, delete all marked routes
       immediately.
    d.        If a new session is opened with the peer without the ICB option
       (see below) being used, or if a session is attempted but fails
       (i.e., an error is detected before the session enters
       ESTABLISHED state) delete all marked routes immediately, and
       cancel the timer.
 
 4.3. Opening a Session With OPEN ICB
 
    When a peer which sent an IBB CEASE wishes to establish a new
    session, it must do so by negotiating IBB as specified in section
    4.1, with the addition of the _I Came Back_ (or ICB) OPEN
    parameter, which is encoded as follows:
 
        Parm. Type: TBD (one octet)
        Parm. Length: 3 (one octet)
        Parm. Value: Route refresh timeout in seconds (two octets
      unsigned)
        Flags: Reserved, must be sent as zero (one octet unsigned)
 
    An OPEN carrying the ICB parameter is known as an ICB OPEN.  The
    semantics of the ICB OPEN are that the sender,
 
    a.        Previously sent an IBB CEASE, or terminated the previous session
       without sending a CEASE (e.g., due to a crash),
    b.        Has preserved the forwarding table it had prior to sending the
       preceding IBB CEASE (the _old forwarding table_), and
    c.        Will not remove any NLRI from the old forwarding table prior to
       the expiration of the route refresh timeout.  (Note that it MAY
       update the NLRI, however.)
 
    A BGP speaker MUST NOT send an ICB OPEN unless these criteria are
    met.  A BGP speaker SHOULD NOT send an IBGP peer a route refresh
    timeout value which exceeds the minimum of the previously-
    negotiated route refresh timeouts for all IBGP peers.  Note that
    this MAY require writing route refresh timeout values to stable
    storage as they are negotiated.  (See Section 5 for discussion of
    why this is advisable.)
 
    The route refresh timeout value should be selected such that
    routing will typically have reconverged prior to its expiration.
    The exact means of selecting the value are implementation-specific,
    but MAY include manual configuration or heuristics based on the
    size of the Loc-RIB prior to session restart.  180 seconds MAY be
    used as a reasonable default value.
 
    When an ICB OPEN is received:
 
    a.        If there is a pending IBB timer, the timer is rescheduled to
       expire at the lesser of the route refresh timeout and the
       locally-configured maximum.
 
 Ward, Scudder   Internet Draft   June 1999                  page     4
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
    b.        If there is not a pending IBB timer, but there is already a
       session in ESTABLISHED state with the peer from which the ICB
       OPEN was received, and if that session had negotiated IBB, then
       the ESTABLISHED session should be terminated immediately, as if
       an IBB CEASE had been received.  (The effect will be to create a
       timer with a timeout value as given in (a), and to enqueue the
       peer's routes on that timer.)  This rule provides for, e.g.,
       non-intrusive transition from a primary to a backup route
       processor in the event of the failure of the primary in a router
       with redundant route processors.
 
    If a BGP session is begun with a peer whose previous session
    terminated with an IBB CEASE, if the new session does not begin
    with an ICB OPEN, then the pending IBB timer should immediately be
    expired, i.e. the peer's old routes should immediately be flushed.
    Likewise, if a session is begun which terminates with an error
    (i.e., a condition which causes the connection to be terminated
    with a NOTIFICATION code other than CEASE) before reaching
    ESTABLISHED state, the peer's old routes should be flushed.
 
    Under normal circumstances, the connection to the peer should be
    re-established in less than the IBB timeout period.  When new
    routes are received from the peer, they may either depict wholly
    new NLRI (in which case they are added to the Adj-RIB-In as per the
    BGP specification) or they may depict NLRI which are already
    present in the Adj-RIB-In waiting on the deletion timer.  In this
    case, the marked route is replaced by the refreshing route.  Such
    routes are said to have been refreshed, and are no longer
    candidates for deletion when the route refresh timer expires.
 
    A _previous session_ as discussed in this section is defined as a
    session with a BGP speaker whose IP address is the same as the IP
    address of the new session.  Note that router ID SHOULD NOT be used
    to determine if a session is the _previous session_; this
    facilitates using IBB to non-intrusively change the router ID of a
    BGP speaker.
 
 4.4. Route Reflectors
 
    Note that it is only necessary that all direct IBGP peers of the
    BGP speaker support IBB, not all IBGP speakers in the routing
    domain if route reflection is in use.  If route reflection is in
    use, then if an IBB cease is sent to a reflector which implements
    IBB, then the reflector simply won't propagate withdrawals until
    the timeout period expires.
 
    The reflector itself is a special case.  It MAY send an IBB notify
    to any subset of peers which all support IBB -- that is, if all the
    reflector's clients support IBB, an IBB cease MAY be sent to all
    the clients.  If all the regular peers support IBB, an IBB cease
    MAY be sent to those peers.
 
 
 Ward, Scudder   Internet Draft   June 1999                  page     5
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
 5. Deployment
 
    The IBB cease may be used with external BGP peers with impunity.
    In the IBGP case, it's only safe to use IBB if all IBGP neighbors
    of the BGP speaker understand the IBB cease.  To understand why
    this is the case, consider the following topology:
 
                  B
                 / \
                A   D
                 \ /
                  C
 
    The topology is fully IBGP meshed; the diagram shows physical
    topology.
 
              A injects prefix X with Localpref 200
              B injects prefix X with Localpref 100
              A and D support IBB
              B and C do not
              C's shortest path to B is through D.
              D's shortest path to A is through C.
 
    Suppose A sends a CEASE/IBB to B, C and D.  D will retain A's route
    to X, with a next hop of C.  C, however, will remove A's route to
    X, and will instead select B's route, with a next hop of D.  A
    routing loop ensues.
 
    To avoid this situation, the IBB cease must not be sent to an IBGP
    peer unless the capability has been negotiated (see BGP-CAP). The
    same scenario holds true if different IBB timers are used for the
    different peers.  For this reason, this specification mandates that
    the same IBB timer, which is known to be acceptable to all IBGP
    peers, be used for all IBGP peers when sending IBB CEASEs.
 
    A similar scenario holds true if different refresh timers are used
    by the different peers _- consider the case where A does not
    refresh prefix X, D has a refresh timer of 100 seconds, and C has a
    refresh timer of 50 seconds.  For this reason, this specification
    suggests that the same refresh timer, which is known to be
    acceptable to all IBGP peers, be used for all IBGP peers when
    sending ICB OPENs.
 
 6. References
 
    [BGP-4] "A Border Gateway Protocol 4 (BGP-4)", Y. Rekhter and T.
    Li, RFC1771, March 1995.
 
    [BGP-CAP] "Capabilities Negotiation with BGP-4", R. Chandra and J.
    Scudder, Internet Draft, April 1998.
 
 7. Acknowledgements
 
 Ward, Scudder   Internet Draft   June 1999                  page     6
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
    Many people have contributed valuable ideas to this draft.  Enke
    Chen, Yakov Rekhter, Paul Traina and Curtis Villamizer provided
    particularly valuable comments.  Special thanks are given to Wayne
    Mesard of Sun Microsysytems, Inc. Thanks to Matthew C. Jones and
    Ralph Jensen for their review comments.
 
 8. Security Considerations
 
    This extension to BGP has the same security considerations as [BGP-
    4].
 
 9. Author's Addresses
 
      David Ward
      Internet Engineering Group, LLC
      122 South Main Street, Suite 280
      Ann Arbor, MI 48104
      dward@ieng.com
 
      John Scudder
      Internet Engineering Group, LLC
      122 South Main Street, Suite 280
      Ann Arbor, MI 48104
      jgs@ieng.com
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 Ward, Scudder   Internet Draft   June 1999                  page     7
 
      <draft-ward-bgp4-ibb-00.txt>                           June, 1999
 
 
 Full Copyright Statement
 
    "Copyright (C) The Internet Society (date). All Rights Reserved.
    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain
    it or assist in its implementation may be prepared, copied,
    published and distributed, in whole or in part, without restriction
    of any kind, provided that the above copyright notice and this
    paragraph are included on all such copies and derivative works.
    However, this document itself may not be modified in any way, such
    as by removing the copyright notice or references to the Internet
    Society or other Internet organizations, except as needed for the
    purpose of developing Internet standards in which case the
    procedures for copyrights defined in the Internet Standards process
    must be followed, or as required to translate it into languages
    other than English.
 
    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.
 
    This document and the information contained herein is provided on
    an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
    ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
    IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
    THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 Ward, Scudder   Internet Draft   June 1999                  page     8