Network Working Group                                Tissa Senevirathne
 Internet Draft                                                (Force10)
 Document: draft-tsenevir-l2vpn-pmesh-00.txt
 Category: Informational

                                                              June, 2001


    Use of Partial meshed tunnels to achieve forwarding behavior of full
                              meshed tunnels.

 Status of this Memo


    This document is an Internet-Draft and is in full conformance with
       all provisions of Section 10 of RFC2026 [1].

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
    other groups may also distribute working documents as Internet-
    Drafts. Internet-Drafts are draft documents valid for a maximum of
    six months and may be updated, replaced, or obsoleted by other
    documents at any time. It is inappropriate to use Internet- Drafts
    as reference material or to cite them other than as "work in
    progress."
    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt
    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.


    For potential updates to the above required-text see:
    http://www.ietf.org/ietf/1id-guidelines.txt

 Placement of this Memo in Sub-IP Area

    RELATED DOCUMENTS:

      See reference.

    WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK

    The ID presented fits in to the PPVPN WG and/or CCAMP WG.

    WHY IS IT TARGETED AT THIS WG(s)

    Sub-IP and IP tunnels are becoming a popular method in carrying
    data transparently over the provider or the core network.

    Use of such tunnels are key component of PPVPN infrastructure. On
    the other hand CCAMP WG charter includes defining common control
    and measurement plane. Hence optimal use of tunnels is an integral
    part of the control infrastructure.


  Senevirathne       Informational - December 2001                   1
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


    JUSTIFICATION

    Increasing number of service providers are offering Ethernet
    services to the customer. In the core of the network IP or Sub-IP
    technologies are used. In general, Ethernet services provided to
    customers are VPN service having multi-points of service access (as
    opposed to point-to-point).

    Requirement to use fully meshed networks seriously affects the
    scalability of Layer 2 NBVPN. The methods presented in this
    document facilitate service providers to offer scalable Layer 2 VPN
    solutions.

 1. Abstract

    This document presents methods to achieve proper forwarding of
    Broadcast, Multicast and Unknown traffic over a set of partial mesh
    tunnels. In addition, the methods presented in this document may be
    used to achieve loop free topology.


 Senevirathne        Informational - December 2001                   2
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001



 2. Conventions used in this document

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
    this document are to be interpreted as described in RFC-2119 [2].

    Table of Content

    1. Abstract......................................................2
    2. Conventions used in this document.............................3
    3. Introduction..................................................3
    4.0 Deployment scenario for Layer 2 NBVPN........................4
    5.0 Objective and Methodology....................................5
    6.0 Building Blocks..............................................6
    7.0 Application of Minimum Spanning Tree Algorithm...............7
    7.1 Calculation of Minimum Spanning Tree.........................7
    7.2 Calculation of Source Mesh...................................7
    7.3 Calculation of Intermediate Mesh.............................8
    8.0 Application of Shortest Path First (SPF)algorithm:...........8
    8.1 Calculation of Source Mesh...................................8
    8.2 Calculation of Intermediate Mesh.............................9
    9.0 Comparison between SPF and STP algorithms....................9
    10.0 Interaction between various components.....................10
    11.0 Other Applications:........................................10
    12.0 Security Considerations....................................11
    13.0 References.................................................11
    14.0 Acknowledgments............................................11
    15.0 Author's Addresses.........................................11
    Full Copyright Statement........................................12


 3. Introduction

    Problem Definition

    Layer 2 NBVPN services use tunneling methods such as IPSec or MPLS
    to carry customer's Layer 2 traffic over the provider's network.
    Layer 2 NBVPN deployment, in general, requires many-to-many
    connectivity. Unknown unicast, Broadcast and Multicast traffic are
    required to be forwarded to all end-points of the Layer 2 NBVPN.
    When there are more than two end-points in the Layer 2 NBVPN, to
    achieve the required Layer 2 behavior, a set of fully meshed
    tunnels are required. However requirement of such fully meshed
    tunnels seriously affects the scalability of Layer 2 NBVPN that
    require many-to-many connectivity.


    In this document we provide methods to achieve required Layer 2
    behavior using set of partially meshed tunnels.


 Senevirathne        Informational - December 2001                   3
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


     In the first part of the document a typical deployment scenario is
    presented. In this section we also explain the proposed solution
    using an example.

    In the second part required building blocks for the proposed
    solution is discussed.

    Later in the document use of Spanning Tree and Shortest Path First
    Algorithms are discussed. Also discussed are the advantages, and
    disadvantages of using Spanning Tree Algorithms vs. Shortest Path
    First Algorithms for the purpose.

    Scope of the Document

    In this document word node and end-point is used to identify PE
    devices. The discussion in the document does not focus on devices
    in the core of the network. The "tunnel" in this discussion refer
    to a logical connection between two PE devices.


 4.0 Deployment scenario for Layer 2 NBVPN


    Consider the scenario where there are (n)end points in the Layer 2
    NBVPN. Now each PE device is required to maintain (n -1) tunnels to
    all other end points. As a result the network is required to
    maintain n*(n-1) set of tunnels.

    Diagram below depicts partial mesh deployment of a 4 end-point
    Layer 2 NBVPN deployment. Typically, some end-points may be
    connected to all other end-points and some may only be connected to
    a sub set of end-points.



     --------                               -------
    |  (PE)  |                             |  (PE)  |
    |   A    |---------                    |   D    |
    |        |         \                   |        |
     --------           \                   --------
         |               \                      |
         |                \                     |
         |                 \                    |
         |                  \                   |
         |                   \                  |
     --------                 \            --------
    |  (PE)  |                  --------- |  (PE)   |
    |   B    |----------------------------|   C     |
    |        |                            |         |
     --------                              ---------




 Senevirathne        Informational - December 2001                   4
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


    Fig: Partial Mesh connectivity

    In theory, a partial mesh graph G may be represented by union of
    set of fully mesh graphs g and set of partial mesh graphs g'. In
    the above diagram graph G {A, B, C, D} contain fully mesh graph g
    {A,B,C} and partial mesh graph g' {C,D}.

    Discussions in this document will focus on the forwarding at node A
    and C. Node A represents partial mesh node. Node C represents a
    fully meshed node. However, the discussion is equally applicable to
    all other nodes in other graphs with any arbitrarily number of
    nodes. Only restriction is that graph is required to be connected
    (ie no disconnected nodes).


 5.0 Objective and Methodology

    Objective

    Each node must be able to derive the mesh topology of the Layer 2
    NBVPN domain. The derived topology by each node must be identical.
    As a result each node is capable of deriving nodes that have
    already received traffic from the upstream. Forwarding policies
    deduce set of outgoing tunnels for each incoming tunnel. Note: for
    fully meshed nodes the outgoing tunnel set is NULL.

    Methodology

    Each end-point advertise its preference (by some means) to receive
    traffic over a given tunnel T that is terminating/starting at the
    end-point. A tunnel T is represented using end-point addresses
    (id). The preference value assigned must be coordinated via global
    policies. As an example, lower numerical value represents higher
    preference.

    We assume that there is control plane IP connectivity between all
    end-points. Intermediate devices in the IP plane do not modify the
    tunnel preference as they forward advertisements.

    Each source node chooses set of optimal tunnels, using the
    preference information received. If there is no direct tunnel the
    best intermediate end-point is selected. The set of tunnels that a
    source end-point use to forward unknown, multicast and broadcast
    traffic is called Source Mesh and denoted by Ms{A} where A is the
    end-point id. Ms{A} = [A-B,A-C].

    Each end-point also maintains set of tunnels for each incoming
    tunnel. These tunnels are called intermediate mesh and represented
    by receiving Tunnel and intermediate end-point address. The
    intermediate Mesh is denoted by Mi{k,t}; where j is the
    intermediate node and t is the tunnel that traffic is arriving. As
    an example intermediate mesh for tunnel A-C at end-point C is Mi{C,
    A-C}. Mi{C,A-C} = [D].

 Senevirathne        Informational - December 2001                   5
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001



    The above source and intermediate graphs(mesh) are derived using a
    Minimum Spanning Tree or a Shortest Path First Algorithm. Based on
    the methods presented in discussion the intermediate mesh derived
    by an end-point is a subset of source mesh of the corresponding
    source end-point. As an example Mi{C, A-C} is a subset of Ms{A}.
    Thus guaranteeing loop free forwarding.

    [3] has specified the requirement to maintain a separate Virtual
    Forwarding Instance (VFI) by each PE device for each Layer 2 NBVPN
    domain. [3] also specify the requirement for each VFI to contain a
    flooding scope. Flooding scope of VFI represents tunnels and local
    ports that any unknown, broadcast or multicast packets should be
    forwarded. We propose to use multiple flooding scopes; flooding
    scope for locally originating traffic is called source flooding and
    denoted by Fs{i} where i is the end-point identifier. At end-point
    A source flooding scope is denoted by Fs{A}. Flooding scope of non
    locally originating traffic is called intermediate flooding scope
    and denoted by Fi{j-t} where j is the intermediate endpoint id and
    t is the receiving tunnel id. As an example at end-point C there
    are three intermediate flooding scopes: Fi{C,A-C}, Fi{C, B-C},
    Fi{C, D-C}

    As a result; in theory each end-point for each Layer 2 NBVPN domain
    has a single source flooding scope and multiple intermediate
    flooding scopes (each for each tunnel).

    Although exact implementation details of multiple flooding scopes
    are beyond the scope of this document we would like to present a
    simple method to implement multiple flooding scopes. A CAM (Content
    Addressable Memory) lookup with ingress tunnel (port) id and Layer
    2 NBVPN domain Id may be used to obtain the appropriate flooding
    scope. Similarly, if MPLS is used as the tunneling method; incoming
    Label may be used to derive the corresponding flooding scope.


 6.0 Building Blocks

    The methods presented in this document can be broadly classified in
    to four major blocks. These building blocks collectively specify
    the implementation of Layer 2 NBVPN using partial meshed tunneling
    topologies.

    1. Global policy for reachability preference. It is important that
    all end-points use the same set of polices. Uses of such policies
    assure proper forwarding behavior. Reachability preference policies
    are used to derive the source and intermediate mesh.


 2. Advertisement protocol for advertisement of  reachability
 preferences. The protocol used for advertisement of preferences MAY be
 Link State. Use of such protocol guarantees faster convergence. OSPF
 Opaque LSA can be easily adapted for the purpose.

 Senevirathne        Informational - December 2001                   6
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001



    3. Tunneling methods. Tunneling protocols such as IPSec or MPLS can
    be used to implement required tunnels.


    4. Tree Algorithm. A Tree generation algorithm that is capable of
    building loop free graphs that use minimum cost concepts MUST be
    used. In this discussion we propose to use either a Minimum
    Spanning Tree Algorithm or Shortest Path First algorithm. Prim
    Algorithm [4] that is used in 802.1w [5] specification may be used
    for Minimum Spanning Tree. Dijkstra algorithm that is used in OSPF
    [6] may be used for Shortest Path First Algorithm.


 7.0 Application of Minimum Spanning Tree Algorithm

    When using ST there is a single tree for the entire Layer 2 NBVPN,
    rooted at some node. In order to identify the root node for Layer 2
    NBPVN, the participating nodes advertise the node priority. In
    addition the nodes are also required advertise the reachability
    preference for each tunnel that originate at the node.

    edge representation = {node-id, nexthop-node-id}

    node-id = IP address of the node.

    node-preference = [integer]

    Semantics of node-preference is a global policy. As an example
    numerically lower numbers may represent higher preference.

 7.1 Calculation of Minimum Spanning Tree

    Step 1: Select the node with the highest preference as the root
    node. In the event of a tie use the node-id as the tie barker.

    Step 2: Derive the Spanning Tree for the Layer 2 NBVPN using a
    minimum spanning tree algorithm. Here we propose to use Prim [4]
    Algorithm for the purpose.

    Step 3. Let the derived Spanning Tree is T.

 7.2 Calculation of Source Mesh

    Step 1. Select source node v.

    Step 2. Remove all the edges in the graph T except the edges that
    are directly connected to the node v. Let say this is graph T'.

    Step 3. Graph T' is the source Mesh Ms for node v.



 Senevirathne        Informational - December 2001                   7
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


    Step 4. The Graph T' represent the set of active tunnels to forward
    traffic (unknown, broadcast and multicast) originating from the
    node v.


 7.3 Calculation of Intermediate Mesh

    Let v is the local node.

    Let T' is the source Mesh derived for the Local node v.

    Let g is set of all nodes that have a directed edge with the local
    node v.

    for each node u in g

    Step 1: Remove corresponding edge in T'.

    Step 2: The resultant graph T" is (intermediate mesh) Mi{v, u-v}.
    Broadcast, unknown and multicast traffic arriving on tunnel {u-v}
    MUST be forwarded on T".

    Repeat the above process for each tunnel starting/terminating at
    node v.


 8.0 Application of Shortest Path First (SPF)algorithm:


    We propose to use Dijkstra algorithm for the purpose of calculating
    Source Mesh and Intermediate Mesh. Nodes (endpoints), edges
    (tunnels) are represented as below.

    edge representation = {node-id, nexthop-node-id}

    node-id = IP address of the node.

    Preference = [integer]; preference to receive traffic over a
    tunnel. Deduction of preference is a global policy that all nodes
    agrees.

    Treat each Layer 2 NBVPN domain as a single graph. In analogy to
    OSPF, that is single area.

 8.1 Calculation of Source Mesh

    Step 1. Calculate the SPF tree T for the source node v.

    Step 2. Remove all the edges in the graph T except the edges that
    are directly connected to the node v. Let say this is graph T'.

    Step 3. Graph T' is the source Mesh Ms for node v.


 Senevirathne        Informational - December 2001                   8
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001



    Step 4. The Graph T' represent the set of active tunnels to forward
    traffic (unknown, broadcast and multicast) originating from the
    node v.


 8.2 Calculation of Intermediate Mesh

    Let v is the local node.

    Let g is set of all nodes that have a directed edge with the local
    node v.

    for each node u in g

    Step 1: calculate the SPF tree T for node u such that node u is
    member of g.

    Step 2: Traverse from u to each node in T. Remove nodes that does
    not require traversing via local node v. Let the resultant graph
    T'.

    Step 3: select local node v. Remove all the edges in T' that do not
    have a direct edge with v. Remove edge {u-v}. Let T'' is the
    resultant graph.

    Step 4: The resultant graph T'' is the set of active tunnels for
    traffic arriving on tunnel {u-v}. Broadcast, unknown and multicast
    traffic arriving on tunnel {u-v} MUST be forwarded on T''.

    Repeat the above process for each tunnel starting/terminating at
    node v.


 9.0 Comparison between SPF and STP algorithms

    When using Spanning Tree algorithm, there is a single Spanning Tree
    for the given Layer 2 NBVPN domain (graph). The Tree is rooted at
    the root node that was selected based on some criteria. As a
    result, path taken by some nodes to reach some other nodes may not
    be optimal.

    When using Shortest Path First Algorithms, there is a separate
    Shortest Path Tree for each node. As a result path taken by traffic
    originating at the node is always assured to take the best path.

    However, SPF requires each node to derive SPF trees for each node.
    On the other hand Spanning Tree algorithm requires deriving only a
    single tree. All intermediate meshes and Source mesh can be derived
    from the Spanning Tree (there is only one tree for the network/VPN
    domain). Hence Spanning Tree requires less iteration of the
    algorithm.


 Senevirathne        Informational - December 2001                   9
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


    Dijkstra is a very popular algorithm used to derive Shortest Path
    Trees. Dijkstra has computational complexity of O(n^2) ; where n is
    number of nodes.

    Prim's algorithm is a popular algorithm used for Spanning Tree.
    Prim's Algorithm has computational complexity of O(n^2) ; where n
    is number of nodes. Kruskal's [7] algorithm is a variation of
    Prim's. Kruskal's Algorithm has a computational complexity of
    O(E*logE) where E is number of edges and E << N^2.

 10.0 Interaction between various components

     --------       ---------                   --------
    | NBVPN1 |     | NBVPN2  |                 |NBVPNn   |
    | MDB    |     | MDB     |                 |MDB      |
     --------       ---------                   --------
       |               |                             |
       |               |                             |
       |     ----------------                        |
       |    | OSPF Opaque/   |                       |
        - - | BGP-MP mux     |-----------------------
             ----------------
                     |
                     |
                     |
      --------------------------------------------
     |                                            |
     |       Core Protocol Engine (OSPF/BGP)      |
     |                                            |
      --------------------------------------------


    NBVPN(i)MDB - Represent reachability information received from
    other end-points. SPF or STP calculation for source Mesh and
    intermediate Mesh for the NBVPN domain is performed in this
    context.

    OSPF Opaque/BGP-MP Mux - This module performs multiplexing of
    tunnel reachability information received to the correct NBVPN
    instance.

    Core Protocol Engine - This module represent the protocol
    implementation.

 11.0 Other Applications:

    The methods presented in this document may be easily applicable to
    any other applications that require optimum path selection via
    transit node. Optical Lambda switching is such application that may
    use the methods presented in this document.

    Internet Exchange Points (IEP) are a new evolving concept. IEP use
    a shared network fabric to provide multi-party peering for customer

 Senevirathne        Informational - December 2001                  10
                   draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001


    sites. Methods presented in this discussion can be easily adapted
    to provide multi-party peering using partially meshed networks.

 12.0 Security Considerations

    A security analysis of the methods presented in this discussion has
    not yet been performed.

 13.0 References


    1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
       9, RFC 2026, October 1996.

    2  Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997

    3  Senevirathne, T., and et.al, "Requirements for Layer 2 Network
       Based VPN", Work in Progress, May 2001.

    4  Gross, J., and Yellen, J, "Graph Theory and its Applications",
       CRC Press, 1998.

    5  IEEE/ISO , "Amendment 2-Rapid Reconfiguration", IEEE802.1w,
       March 26, 2001.

    6  Moy, J., OSPF Version 2, RFC 1583, March 1994.

    7  Aho, A.V., and et.al., "Data Structures and Algorithms",
       Addison-Wesley 1983.





14.0 Acknowledgments

   Several people provided suggestions and comments and volunteered to
   review this draft. The suggestions and feedback received helped this
   document to evolve to a draft. Waldemar Augustyn and Andrew Smith
   provided valuable suggestions and comments.


15.0 Author's Addresses

   Tissa Senevirathne
   Force10 Networks
   1440 McCarthy Blvd
   Milipitas, CA 95035
   Phone: 408-965-5103
   Email: tissa@force10networks.com


 Senevirathne        Informational - December 2001                  11
                  draft-tsenevir-l2vpn-pmesh-00.txt        June, 2001



Full Copyright Statement

   "Copyright (C) The Internet Society (2001). All Rights Reserved.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implmentation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into


Senevirathne        Informational - December 2001                  12