Network Working Group                  Toby Smith (Laurel Networks)
Internet Draft                    Andrew G. Malis (Vivace Networks)
Expiration Date: April 2002            Jack Shaio (Vivace Networks)


                Graceful Restart Mechanism for LDP

                draft-smith-mpls-ldp-restart-00.txt


1.  Status of this Memo

    This document is an Internet-Draft and is in full conformance
    with all provisions of Section 10 of RFC2026.

    Internet-Drafts are working documents of the Internet
    Engineering Task Force (IETF), its areas, and its working
    groups.  Note that other groups may also distribute working
    documents as Internet-Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or obsoleted by other
    documents at any time.  It is inappropriate to use
    Internet-Drafts as reference material or to cite them other
    than as ``work in progress.''

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed
    at http://www.ietf.org/shadow.html.


2.  Abstract

    This document proposes a lightweight mechanism that the LDP
    protocol may use to help minimize the impact introduced by
    transient interruptions to an LDP session's TCP connection,
    with a focus on connection preservation for signaled Layer 2
    circuits.  A new LDP Cork message request/response mechanism is
    specified. New message types are defined for the delivery of
    graceful restart events. Finally, procedures for utilizing this
    mechanism are detailed.


3.  Introduction

    The LDP protocol [1] provides label mapping information to its
    peer LSRs.  In addition to providing label mappings for IP
    prefixes, LDP has recently been adopted as a signaling
    mechanism for the establishment of Layer 2 circuits between two
    provider edge LSRs [2, 3].  Customers expect these circuits,
    like the physical circuits they emulate, to be highly available
    connections.

    Under some circumstances (planned outages, software upgrades),
    LDP may temporarily lose connectivity to its peer(s). In these
    circumstances, it is beneficial to the customer to maintain the
    LDP-established LSPs even in the (temporary) absence of an LDP
    session.

    This draft describes a proposal for a lightweight mechanism
    which allows LDP LSRs to retain their forwarding state, even
    when the connection to the peer LSR is temporarily lost.

    The procedure described in this draft has excellent scaling
    properties: the LDP state is preserved incrementally, such that
    after an unexpected restart of an LDP session, only the LDP
    activity not already acknowledged during the previous session
    needs to be resignaled.  In the case of provisioned Layer 2
    circuits, it is probable that no resignaling will be necessary.

    The procedure described in this draft is minimally invasive to
    the LDP state machine and requires no changes to the LDP
    message processing procedures.

    This mechanism may be used in conjunction with a mechanism for
    the preservation of IP forwarding state; when LDP is being used
    solely as a signaling mechanism for the establishment of Layer
    2 transports, however, such coordination is not required.

    The remainder of this document is organized as follows: A new
    LDP Cork message request/response mechanism is specified.  New
    message types are defined for the delivery of graceful restart
    events. Finally, procedures for utilizing this mechanism are
    detailed.


4.  Overview of Graceful Restart Mechanism

    LDP LSRs which support this graceful restart mechanism signal
    this capability with an additional Graceful Restart TLV sent as
    part of the session's Initialization messages.

    During normal session operation, each peer periodically issues
    a Cork message, defined below, which checkpoints the current
    label advertisement state between the peers.  Each Cork message
    is acknowledged by the far end.

    If an LDP peer is able to recognize that it needs to
    temporarily drop its connection to its peer, this LSR (termed
    the Originating Peer) will send a special, final Cork message
    to each of its peer LSRs (termed the Receiving Peer(s)).

    When the Receiving Peer receives a final Cork message, it
    responds with a corresponding final Cork message to the
    Originating Peer. Upon receiving the final Cork message
    response from each Receiving Peer, the Originating Peer may
    sever its TCP connection(s).  All forwarding state
    corresponding to the cached state of the LDP protocol is
    preserved over the loss of connectivity with the LDP peer.

    Once the Originating Peer is able to re-establish its LDP
    state, it reconnects to each of its Receiving Peers, following
    the standard procedures for establishing TCP connections as
    specified in [1].

    When the TCP session to the Receiving Peer(s) has been
    re-established, the LSRs exchange Graceful Restart TLVs as part
    of their Initialization messages.  This TLV contains the
    checkpoint information corresponding to the last exchanged Cork
    messages, which allows the LSRs to resume operation without
    readvertising any checkpointed label mapping information.

    The details of the steps outlined in this section may be found
    in the Procedures section, below.


5.  Message Formats

    This section describes the new LDP message and TLV formats used
    by this document.


5.1 Cork Message

    The LDP Cork message is sent periodically by each participating
    LSR.  The Cork message may be used to checkpoint currently sent
    information, to acknowledge the reception of a previously
    received Cork message, or both.

    The rate at which periodic Cork messages are sent is locally
    determined by each participating LSR, and is implementation
    dependent.  For example, Cork messages may be sent at regular
    intervals, or after a threshold of sent LDP messages has been
    exceeded. Cork updates are not necessary if the state of the
    LSR has not changed since the time the last Cork message was
    sent.
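
    As an illustration only, the local sending policy described
    above might be sketched as follows.  The class name, the
    30-second interval and the 100-message threshold are
    assumptions made for this example; the draft leaves all of
    them to the implementation.  "State has not changed" is
    approximated here as "no LDP messages sent since the last
    Cork message".

```python
from dataclasses import dataclass


@dataclass
class CorkScheduler:
    """Hypothetical local policy for sending periodic Cork messages."""

    interval_s: float = 30.0   # assumed timer value, not from the draft
    msg_threshold: int = 100   # assumed message-count threshold
    last_cork_time: float = 0.0
    msgs_since_cork: int = 0

    def note_ldp_message_sent(self) -> None:
        # Count label/address messages sent since the last Cork.
        self.msgs_since_cork += 1

    def should_send_cork(self, now: float) -> bool:
        if self.msgs_since_cork == 0:
            # Nothing new to checkpoint, so no Cork update is needed.
            return False
        if self.msgs_since_cork > self.msg_threshold:
            # Threshold of sent LDP messages has been exceeded.
            return True
        # Otherwise fall back to the regular interval.
        return now - self.last_cork_time >= self.interval_s

    def note_cork_sent(self, now: float) -> None:
        self.last_cork_time = now
        self.msgs_since_cork = 0
```

    Either trigger alone would satisfy the text; combining them is
    simply one possible design.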

    Cork messages with the Final Bit set are used to flush all
    currently pending label mapping and nexthop messages to the
    peer LSR, in anticipation of dropping the connection to the
    peer.

    The encoding for the Cork Message is:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |0|       Cork (0x3F00)         |      Message Length           |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                           Message ID                          |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                     Acknowledged Message ID                   |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |F|C|A|        Reserved         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Message ID
      32-bit value used to identify this message.

    Acknowledged Message ID
      A 32-bit value used to acknowledge the reception of a prior
      Cork message.  An LSR acknowledging a Cork message replies
      with a Cork message of its own, with this field set to the
      Message ID of the Cork message it is acknowledging.  If the
      Acknowledgement Bit is not set (see below), this field MUST
      be ignored.

    Final Bit
      A single bit denoting whether this message is the final
      checkpointing Cork message that the receiver should expect to
      receive from the sender.

    Checkpoint Bit
      A single bit denoting that this Cork message is being used
      by the sender to checkpoint its currently sent label and
      address information.  An LSR which receives a Cork message
      with the Checkpoint Bit set MUST acknowledge the reception
      of this message with a corresponding Cork message with the
      Acknowledgement Bit set (see below).  Cork messages with
      the Checkpoint Bit set MUST contain a non-zero Message ID.

    Acknowledgement Bit
      A single bit denoting that this Cork message is being used
      by the sender to acknowledge the reception of a previously
      received Cork message.  When the Acknowledgement Bit is
      set, the Acknowledged Message ID field MUST be set to the
      Message ID of the Cork message being acknowledged.

      A single Cork message may have both the Checkpoint and
      Acknowledgement Bits set, allowing a single message both to
      checkpoint recently sent information and to acknowledge
      recently received Cork messages.

    Reserved
      These 13 bits MUST be filled with zeroes.
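
    The following sketch shows one way the Cork message layout
    above could be packed and parsed.  It assumes that the fields
    appear on the wire exactly as drawn (not wrapped in TLVs) and
    that Message Length counts the octets following the length
    field (10 here), as for other LDP messages in [1].  The
    function names are hypothetical, and zeroing the Acknowledged
    Message ID when the Acknowledgement Bit is clear is a choice
    made for this example.

```python
import struct

CORK_MSG_TYPE = 0x3F00                       # U bit (top bit) is zero
FLAG_F, FLAG_C, FLAG_A = 0x8000, 0x4000, 0x2000
# type, length, Message ID, Acknowledged Message ID, flags+reserved
_CORK = struct.Struct("!HHIIH")
_CORK_BODY_LEN = 10                          # octets after the length field


def encode_cork(msg_id, acked_id=0, final=False, checkpoint=False, ack=False):
    """Pack a Cork message (illustrative helper, not from the draft)."""
    if checkpoint and msg_id == 0:
        raise ValueError("Checkpoint Cork requires a non-zero Message ID")
    flags = ((FLAG_F if final else 0)
             | (FLAG_C if checkpoint else 0)
             | (FLAG_A if ack else 0))       # 13 reserved bits stay zero
    return _CORK.pack(CORK_MSG_TYPE, _CORK_BODY_LEN, msg_id,
                      acked_id if ack else 0, flags)


def decode_cork(data):
    """Parse a Cork message into a dict of its fields."""
    if len(data) < _CORK.size:
        raise ValueError("truncated Cork message")
    msg_type, _length, msg_id, acked_id, flags = _CORK.unpack(data[:_CORK.size])
    if (msg_type & 0x7FFF) != CORK_MSG_TYPE:
        raise ValueError("not a Cork message")
    ack = bool(flags & FLAG_A)
    return {
        "msg_id": msg_id,
        # The field is ignored unless the Acknowledgement Bit is set.
        "acked_id": acked_id if ack else None,
        "final": bool(flags & FLAG_F),
        "checkpoint": bool(flags & FLAG_C),
        "ack": ack,
    }
```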


5.2 Graceful Restart TLV

    The Graceful Restart TLV is contained within both the
    Originating and Receiving Peers' Initialization messages to
    denote their participation in the graceful restart protocol.


    The encoding for the Graceful Restart TLV is:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |0|0| Graceful Restart (0x3F00) |           Length              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                    Acknowledged Message ID                    |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                        Restart Timeout                        |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Acknowledged Message ID
      If the LSR is establishing a connection to a peer for the
      first time, this field MUST be set to zero.

      If an LSR is re-establishing a session with a remote peer
      with which it had previously exchanged Cork messages, and if
      the local LSR's Restart Timeout time has not expired, this
      value MUST contain the Message ID of the last successfully
      acknowledged Cork message received from the remote peer. If
      the Restart Timeout time has expired, this value MUST be
      reset to zero.

    Restart Timeout
      A non-zero 32-bit unsigned integer that indicates the number
      of seconds that the sending LSR is willing to wait for
      re-establishment of the TCP connection between the peers
      after a restart has begun.  This timer is started when the
      current TCP connection is terminated.  The Restart Timeout
      in effect for the session MUST be the smaller of the value
      sent in the Graceful Restart TLV to the peer LSR and the
      value received in the Graceful Restart TLV from the peer
      LSR.
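
    A minimal sketch of the TLV encoding and of the timeout
    calculation follows.  It assumes the usual LDP TLV header from
    [1] (U and F bits, 14-bit type, and a Length counting only the
    value octets, 8 here).  The helper names are hypothetical.

```python
import struct

GR_TLV_TYPE = 0x3F00          # U = 0 and F = 0, as drawn above
# type, length, Acknowledged Message ID, Restart Timeout
_GR_TLV = struct.Struct("!HHII")
_GR_VALUE_LEN = 8             # two 32-bit fields


def encode_gr_tlv(acked_msg_id, restart_timeout_s):
    """Pack a Graceful Restart TLV (illustrative helper)."""
    if not 0 < restart_timeout_s <= 0xFFFFFFFF:
        raise ValueError("Restart Timeout must be a non-zero 32-bit value")
    return _GR_TLV.pack(GR_TLV_TYPE, _GR_VALUE_LEN,
                        acked_msg_id, restart_timeout_s)


def decode_gr_tlv(data):
    """Return (acked_msg_id, restart_timeout_s) from a packed TLV."""
    if len(data) < _GR_TLV.size:
        raise ValueError("truncated Graceful Restart TLV")
    tlv_type, length, acked, timeout = _GR_TLV.unpack(data[:_GR_TLV.size])
    if (tlv_type & 0x3FFF) != GR_TLV_TYPE or length != _GR_VALUE_LEN:
        raise ValueError("not a Graceful Restart TLV")
    return acked, timeout


def effective_restart_timeout(sent_s, received_s):
    """Both peers use the smaller of the two advertised values."""
    return min(sent_s, received_s)
```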


6.  Procedures

    This section describes in detail the procedures which must be
    implemented by participating LSRs.

    An LSR which is capable of participating in this mechanism
    includes a Graceful Restart TLV in the Initialization message
    it sends to its remote peer.

    If the Initialization message received from the remote peer
    does not contain a Graceful Restart TLV, or if the value
    contained in the Acknowledged Message ID field is not
    the value expected from that peer, then the graceful restart
    mechanism MUST NOT be employed, and no Cork messages may be
    sent to the remote peer.  In this case, if the local LSR has
    cached any state from a prior session to this peer, that cached
    state MUST be immediately discarded.

    For two LSRs which have successfully exchanged Graceful Restart
    TLVs, the Restart Timeout value used by both LSRs is calculated
    to be the lesser of the values exchanged by the peers.

    If this is the first time that the two LSRs have peered, or if
    the Restart Timeout time from a previous session has expired,
    the peering LSRs MUST include a value of zero in the
    Acknowledged Message ID field.
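
    The negotiation rules above could be summarised in code
    roughly as follows.  This is a sketch under stated
    assumptions: "expected_peer_ack" stands for whatever value the
    local LSR expects in the peer's Acknowledged Message ID field
    (taken here to be zero when no usable state exists), and the
    "resume_from_cache" flag marks the case in which both
    exchanged values are non-zero.  All names are hypothetical.

```python
from typing import NamedTuple, Optional


class GrTlv(NamedTuple):
    acked_msg_id: int
    restart_timeout_s: int


class Outcome(NamedTuple):
    enabled: bool                     # graceful restart may be employed
    resume_from_cache: bool           # both sides advertised non-zero IDs
    restart_timeout_s: Optional[int]  # lesser of the exchanged values


def negotiate(local: GrTlv, peer: Optional[GrTlv],
              expected_peer_ack: int) -> Outcome:
    """Evaluate the Initialization exchange (illustrative only)."""
    if peer is None or peer.acked_msg_id != expected_peer_ack:
        # No TLV, or an unexpected value: the mechanism is not used
        # and any cached state from a prior session is discarded.
        return Outcome(False, False, None)
    timeout = min(local.restart_timeout_s, peer.restart_timeout_s)
    resume = local.acked_msg_id != 0 and peer.acked_msg_id != 0
    return Outcome(True, resume, timeout)
```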

    When the exchanged Acknowledged Message ID values are
    non-zero, and neither LSR's Restart Timeout time has expired,
    both peers MUST resume operation of the LDP session as if all
    checkpointed sent and received information is still active.
    Upon returning to such a state, the first message sent by each
    LSR to its peer MUST be a Cork message with the Acknowledgement
    Bit set, and the Acknowledged Message ID set to the value
    contained in the LSR's Graceful Restart TLV Acknowledged
    Message ID field.  If the LSR is unable to restore
    its state for any reason, it MUST immediately send a Cork
    message with the Acknowledgement Bit set and containing an
    Acknowledged Message ID value of zero.  In either case, after
    exchanging Initialization messages with non-zero Acknowledged
    Message ID values, the first messages exchanged between the
    peers MUST be Cork messages.

    If an LSR which is re-establishing cached state after a restart
    receives an initial Cork message which does not match the value
    contained in the peer's Graceful Restart TLV, the receiving LSR
    MUST immediately discard any cached state, as the graceful
    restart has failed on the peer LSR.
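
    The handling of the first Cork message after re-establishment
    can be sketched as two small checks.  The function names are
    hypothetical, and the code only models the values involved; it
    does not send or receive anything.

```python
def first_cork_ack_id(tlv_acked_msg_id: int, state_restored: bool) -> int:
    """Acknowledged Message ID for the first Cork sent after restart.

    A restored LSR echoes the value from its own Graceful Restart
    TLV; an LSR that could not restore its state sends zero.
    """
    return tlv_acked_msg_id if state_restored else 0


def peer_restart_failed(initial_cork_ack_id: int,
                        peer_tlv_acked_msg_id: int) -> bool:
    """True if the peer's initial Cork does not match its TLV value,
    in which case locally cached state is discarded."""
    return initial_cork_ack_id != peer_tlv_acked_msg_id
```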

    After successfully negotiating the use of the graceful restart
    mechanism, and restoring cached state (if recovering from a
    prior restart), the peering LSRs resume normal LDP operation.
    Each LSR periodically checkpoints the label mapping and nexthop
    information that it has sent to its peer and issues an
    unsolicited Cork message with the Checkpoint Bit set to its
    peer.  The sending LSR MUST NOT cache the current state of the
    sent session information until the remote peer acknowledges the
    receipt of the current Cork message.

    If the local LSR knows a priori that it is about to restart, it
    may issue a Cork message with the Final Bit set.  After sending
    a Cork message with the Final Bit set, the sending LSR MUST NOT
    send any further Label Mapping, Label Withdraw, Address, or
    Address Withdraw messages to the receiving peer.

    An LSR which receives a Cork message from its peer with the
    Checkpoint Bit set MUST acknowledge the receipt of this message
    by responding to the sending peer with a Cork message with the
    Acknowledgement Bit set.  The receiving LSR MUST cache all
    received session information from the remote peer before
    acknowledging the reception of a Checkpoint Cork message.
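
    The ordering constraints on caching (the sender caches only
    after the acknowledgement arrives, the receiver caches before
    it acknowledges) might be modelled as below.  The classes and
    the use of a simple FEC-to-label dictionary are assumptions
    for illustration, and Message ID wrap-around is not handled.

```python
from typing import Dict


class CheckpointSender:
    """Tracks sent state; commits a checkpoint only once acknowledged."""

    def __init__(self) -> None:
        self.sent_state: Dict[str, int] = {}    # illustrative FEC -> label
        self.cached_state: Dict[str, int] = {}  # last acknowledged snapshot
        self.cached_msg_id = 0
        self._pending: Dict[int, Dict[str, int]] = {}
        self._next_id = 1                       # Message ID must be non-zero

    def advertise(self, fec: str, label: int) -> None:
        self.sent_state[fec] = label

    def send_checkpoint(self) -> int:
        # Snapshot what has been sent so far, but do not cache it yet.
        msg_id = self._next_id
        self._next_id += 1
        self._pending[msg_id] = dict(self.sent_state)
        return msg_id

    def on_ack(self, acked_msg_id: int) -> bool:
        snapshot = self._pending.pop(acked_msg_id, None)
        if snapshot is None:
            return False                        # unknown or duplicate ack
        # Older unacknowledged snapshots are superseded.
        for old in [m for m in self._pending if m < acked_msg_id]:
            del self._pending[old]
        self.cached_state = snapshot
        self.cached_msg_id = acked_msg_id
        return True


class CheckpointReceiver:
    """Caches received state before acknowledging a Checkpoint Cork."""

    def __init__(self) -> None:
        self.received_state: Dict[str, int] = {}
        self.cached_state: Dict[str, int] = {}
        self.last_acked_msg_id = 0

    def on_mapping(self, fec: str, label: int) -> None:
        self.received_state[fec] = label

    def on_checkpoint(self, msg_id: int) -> int:
        self.cached_state = dict(self.received_state)  # cache first
        self.last_acked_msg_id = msg_id
        return msg_id   # value for the ack's Acknowledged Message ID
```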

    If the received Cork message's Final Bit is set, the receiving
    peer immediately sends any pending Label Mapping, Label
    Withdraw, Address, and Address Withdraw messages to the sending
    peer, followed by a Cork message with the Final Bit set in
    response.  This Cork message may also serve to acknowledge
    receipt of the sending peer's Final Cork message.  After
    sending the Cork message, the receiving peer MUST NOT send any
    more Label Mapping, Label Withdraw, Address, or Address
    Withdraw messages to the sending peer.

    An LSR which is expecting to be restarted initiates the
    graceful restart by sending a Cork message with the Final Bit
    set to its peer.  This LSR may restart once it has received
    both a corresponding Final Cork message and an Acknowledgement
    Cork message from its peer.  The peer may consolidate these
    two messages into a single message with the Final, Checkpoint
    and Acknowledgement Bits set.
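
    From the point of view of the restarting LSR, the condition
    for severing the connection could be tracked as in the sketch
    below.  The class is hypothetical; it assumes the local Final
    Cork message carries a Message ID that the peer echoes in its
    acknowledgement.

```python
from typing import Optional


class FinalCorkHandshake:
    """Tracks whether a restarting LSR may drop its TCP connection."""

    def __init__(self) -> None:
        self.final_msg_id: Optional[int] = None  # ID of our Final Cork
        self.peer_final_seen = False
        self.final_acked = False

    def send_final(self, msg_id: int) -> None:
        self.final_msg_id = msg_id

    def can_send_label_messages(self) -> bool:
        # No Label Mapping/Withdraw or Address messages after the
        # Final Cork has been sent.
        return self.final_msg_id is None

    def on_cork(self, final: bool, ack: bool, acked_msg_id: int) -> None:
        if final:
            self.peer_final_seen = True
        if (ack and self.final_msg_id is not None
                and acked_msg_id == self.final_msg_id):
            self.final_acked = True

    def may_restart(self) -> bool:
        return (self.final_msg_id is not None
                and self.peer_final_seen
                and self.final_acked)
```

    The same object handles both the two-message case and the
    consolidated case, since a single call to on_cork() may set
    both conditions.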

    LSRs participating in this graceful restart mechanism do not
    expect to see a fatal Notification message from their remote
    peer before restarting.  If an LSR sends a fatal Notification
    message to its remote peer, or receives a fatal Notification
    from its remote peer, the LSR MUST discard any cached LDP state
    immediately.


7.  Operational Considerations

    This document describes a mechanism for the graceful
    re-establishment of LDP sessions, with a focus on providing a
    simple signaling recovery mechanism for Layer 2 transport
    LSPs. Given that the establishment of IP LSPs via LDP relies
    upon the existence of an underlying IGP to determine the
    network topology, a complete graceful restart mechanism
    requires a degree of coordination between LDP and its
    underlying IGP when restarting. This document does not address
    ways in which the IGP state may be preserved during a graceful
    restart.


8.  Security Considerations

    Given that this document describes a mechanism for preserving
    LDP session state during periods of lost connectivity, there
    may be concern that this proposal introduces new security
    risks. However, since the re-establishment of the LDP session
    is based upon the same mechanisms described in [1], and since
    the cached LDP session state is only eligible for use if an LDP
    session is re-established to a peer which had previously been
    peering with the LSR, the authors believe that this proposal
    does not impact the underlying security model of LDP.


9.  References

    [1] "LDP Specification", L. Andersson, P. Doolan, N. Feldman,
        A. Fredette, B. Thomas. RFC3036

    [2] "Transport of Layer 2 Frames Over MPLS",
        draft-martini-l2circuit-trans-mpls-08.txt (work in progress).

    [3] "MPLS-based Layer 2 VPNs", Kompella, et al.,
        draft-kompella-mpls-l2vpn-02.txt (work in progress).


10. Author Information

    Toby Smith
    Laurel Networks, Inc.
    1300 Omega Drive
    Pittsburgh, PA  15205
    Email: tob@laurelnetworks.com

    Andrew G. Malis
    Vivace Networks, Inc.
    2730 Orchard Parkway
    San Jose, CA 95134
    Phone: +1 408 383 7223
    Email: Andy.Malis@vivacenetworks.com

    Jack Shaio
    Vivace Networks, Inc.
    2730 Orchard Parkway
    San Jose, CA 95134
    Phone: +1 408 432 7623
    Email: Jack.Shaio@vivacenetworks.com