MPLS Working Group                                         Zheng Wang
Internet Draft                                     Grenville Armitage
Expiration: Dec 1997                   Bell Labs, Lucent Technologies
                                                            July 1997


             Scalability Issues in Label Switching over ATM


Status of this Memo

   This document is an Internet Draft. Internet Drafts are working
   documents of the Internet Engineering Task Force (IETF), its Areas,
   and its Working Groups. Note that other groups may also distribute
   working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress."

   Please check the 1id-abstracts.txt listing contained in the
   internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
   nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the
   current status of any Internet Draft.


1.  Introduction

   The scalability of label switching over ATM is one of the
   fundamental issues in MPLS that have not been fully understood.
   Whether or not one should assume stream merging in ATM is a major
   design decision that has many implications for MPLS protocols and
   ATM hardware design.  The issues are also common to any proposal
   for setting up labels [1,2,3,4,5].

   In this document, we present an analysis of scalability of label
   switching over ATM, and examine some possible solutions. The document
   is intended to do two things:

   - Facilitate discussions in the MPLS WG that lead to realistic
     assessments of the label space issues,
   - Result in additional text for the Framework document that captures
     the refined assessments.





Wang & Armitage           Expiration: Dec 1997                  [Page 1]


Internet Draft            Scalability Issues                   July 1997


2.  Consequences of Conventional VPI/VCI Use


   In the absence of non-standard ATM switch hardware, the need to avoid
   interleaving of cells from different AAL5 PDUs on a single VCC makes
   it necessary to use a different label for each source/destination
   pair. Therefore the number of labels required is O(N**2) for N
   endpoints (sources and destinations) in a cloud.

   We now look at the worst-case label requirement, namely the maximum
   number of labels required on a single link in one direction. To set
   up switched paths based on destination-based routing tables for a
   network of N endpoints, the worst-case label requirement is as
   follows:

   (N**2)/4 if N is an even number

   (N**2 - 1)/4 if N is an odd number

   It should be emphasized that this is the worst-case scenario, which
   may never happen in real networks. However, the worst-case analysis
   gives us a very conservative estimate of the scalability of label
   switching over ATM.

   The worst-case scenario occurs only when the following two extreme
   conditions are met:

   1) The network is divided into two parts with N/2 endpoints each (or
   with (N+1)/2 and (N-1)/2 endpoints if N is an odd number), and the
   two parts are connected by a single link.

   2) Each endpoint in one part is simultaneously communicating with all
   endpoints in the other part.

   Note that it is the link between the two parts that will hit the
   worst-case label requirement.
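
   The bound can be checked numerically. The following Python sketch is
   illustrative only; the function names are hypothetical and not part
   of any MPLS specification. It assumes, as in the two conditions
   above, that a single link joins two parts of the network and that
   every endpoint in one part sends to every endpoint in the other, so
   that each source/destination pair crossing the link needs its own
   label.

```python
def labels_on_cut_link(n: int, a: int) -> int:
    """Labels needed in one direction on a single link that joins a
    part with `a` endpoints to a part with `n - a` endpoints, when every
    endpoint in the first part sends to every endpoint in the second."""
    return a * (n - a)


def worst_case_labels(n: int) -> int:
    """Closed form from the text: N**2/4 for even N, (N**2 - 1)/4 for
    odd N. Integer division covers both cases."""
    return n * n // 4


def brute_force_worst_case(n: int) -> int:
    """Maximum label requirement over every possible two-part split."""
    return max(labels_on_cut_link(n, a) for a in range(n + 1))
```

   Comparing brute_force_worst_case(n) with worst_case_labels(n) for a
   range of n confirms that the even split is the one that maximizes
   the label requirement on the connecting link.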

   Setting this worst-case requirement equal to the 2**M labels
   available with a label of M bits gives an upper limit of
   2**(0.5*M + 1) endpoints for which switched paths can be set up
   between all N endpoints based on destination-based routing.  For
   simplicity, we assume here that both N and M are even numbers.

   If we use the 28-bit VPI/VCI space in ATM for labels, the upper
   limit is 32K endpoints, and the maximum number of flows on a single
   link is 256M.
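
   The arithmetic behind these figures can be reproduced with a short
   Python sketch. It is illustrative only; the function names are
   hypothetical, and, like the text, it assumes an even label length M
   and an even number of endpoints N.

```python
def max_flows(label_bits: int) -> int:
    """Number of distinct labels, and hence flows, on a single link."""
    return 2 ** label_bits


def max_endpoints(label_bits: int) -> int:
    """Largest N for which the worst-case requirement N**2/4 still fits
    in 2**M labels, i.e. N = 2**(M/2 + 1)."""
    if label_bits % 2:
        raise ValueError("this sketch assumes an even label length")
    return 2 ** (label_bits // 2 + 1)


if __name__ == "__main__":
    m = 28  # combined VPI/VCI label length used in the text
    print(max_endpoints(m), max_flows(m))
```

   For M = 28 this gives 32768 endpoints (32K) and 268435456 labels
   (256M), matching the figures above.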

   These results have a number of implications for the way we deal
   with the scalability issues, which we discuss in the next few
   sections.



3.  Cloud Size

   Given that each endpoint represents an Edge LSR of an MPLS domain (an
   edge router of the overlying routing domain), 32K endpoints would
   seem to be a fairly large figure for the majority of current
   networks.  Furthermore, the worst-case scenario occurs when a
   network can be divided into two parts and there is only a single
   link between the two. However, in most real network topologies,
   there are usually multiple connections between any two parts of a
   network. Therefore the upper limit can be several times larger than
   32K.

   On the other hand, the results assume that we only have best-effort
   destination-based forwarding. Other types of traffic, such as
   multicast and RSVP/explicit routes, will also consume label space.
   However, it is difficult to quantify the level of such traffic in
   the future Internet, and it is likely that the associated switched
   paths will be established on an 'as needed' basis.

   If we wanted to pre-establish switched paths for a few different
   classes of traffic such as low delay, high throughput, high
   reliability, etc., the worst-case upper limit is then reduced by
   K*N, where K is the maximum number of classes and N is the number
   of endpoints.  This will reduce the scalability significantly. The
   implication is that for traffic other than best-effort,
   on-demand/on-request label setup is a more scalable approach, as
   the likelihood of all the flows for all possible classes being
   active on a single link is very small.

   Note that the theoretical limit imposed by the size of the VPI/VCI
   bit-space actually overstates the case by ignoring the practical
   limits imposed by the ATM NICs of Ingress and Egress LSRs. Typical
   NICs can support on the order of a few thousand simultaneous SAR
   instances. An Ingress LSR with a NIC that supports 4k SAR instances
   can have at most 4k labeled paths originating from and terminating
   on it.  Any MPLS domain built with Edge LSRs supporting Y SAR
   instances will have substantially fewer than Y edge LSRs. This has
   a consequential impact on the number of labels demanded through the
   core LSRs of the MPLS domain.
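
   A rough feel for this limit can be obtained with the Python sketch
   below. It is illustrative only and rests on assumptions that are
   not stated in the text: Edge LSRs are fully meshed, and each labeled
   path consumes one SAR instance at the Edge LSR where it originates
   and one at the Edge LSR where it terminates, so that an Edge LSR in
   a domain of N Edge LSRs needs 2*(N-1) SAR instances. Real NICs may
   account for SAR instances differently. The function names are
   hypothetical.

```python
def max_edge_lsrs(sar_instances: int) -> int:
    """Largest N with 2 * (N - 1) <= sar_instances, under the assumed
    accounting of one SAR instance per originating path and one per
    terminating path in a full mesh."""
    return sar_instances // 2 + 1


def core_worst_case_labels(sar_instances: int) -> int:
    """Worst-case label demand on a single core link, in one direction,
    for a domain limited by the SAR capacity of its Edge LSRs."""
    n = max_edge_lsrs(sar_instances)
    return n * n // 4  # N**2/4 (even N) or (N**2 - 1)/4 (odd N)


if __name__ == "__main__":
    print(max_edge_lsrs(4096), core_worst_case_labels(4096))
```

   Under these assumptions, a NIC with 4096 SAR instances bounds the
   domain at 2049 Edge LSRs and the worst-case demand on a core link
   at roughly one million labels, well below the 256M labels offered
   by the VPI/VCI space.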


4.  Setup On-Demand/On-Request

   Instead of pre-establishing switched paths among all endpoints, one
   can set up switched paths on-demand or on-request. Such setup is
   useful for the following reasons:

   1) For many traffic types such as multicast and QoS/explicit routes,
   pre-establishment of switched paths is not possible.

   2) On-demand/on-request setup can exploit the locality of the traffic
   flows and thus improve the scalability.

   With on-demand/on-request setup, the theoretical scalability issue
   becomes the probability of having 256M flows simultaneously active on
   a single link. Even on backbone links, and given that Ingress and
   Egress LSRs can source and sink only a few thousand labeled paths
   each, this number of independent and non-aggregatable flows is
   arguably unlikely.

   Even if this becomes a problem in the future, intra-LSR solutions are
   possible (e.g. the virtual label space, which is discussed in the
   next section).


5.  Virtual Label Space

   It is conceivable that an unusual topology could result in the
   worst-case label consumption predicted above (e.g. some hot spots in
   a backbone network connecting two large networks by a single link).
   However, since the worst-case label consumption is localized, it is
   arguably preferable to find a localized solution (rather than
   something that would affect all switches in an MPLS domain).

   One simple solution is to use a virtual label space. At such hot
   spots, we can have multiple parallel physical links instead of a
   single physical link. For example, if we have L smaller physical
   links distributed across L ports between two LSRs, the total usable
   label space (on the links, and in the port cards of the LSRs) is
   expanded by a factor of L relative to what a single link could
   support.
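
   The effect of parallel links on the endpoint limit can be sketched
   in Python as follows. The sketch is illustrative only; the function
   names are hypothetical, and it assumes that the flows crossing the
   hot spot can be spread across the L links so that all L label
   spaces are fully usable.

```python
import math


def usable_labels(label_bits: int, parallel_links: int) -> int:
    """Total label space across L parallel links of M-bit labels."""
    return parallel_links * 2 ** label_bits


def max_endpoints_with_links(label_bits: int, parallel_links: int) -> int:
    """Largest N whose worst-case demand floor(N**2 / 4) fits in the
    combined label space of the parallel links."""
    capacity = usable_labels(label_bits, parallel_links)
    # floor(N**2 / 4) <= capacity  <=>  N**2 <= 4 * capacity + 3
    return math.isqrt(4 * capacity + 3)
```

   With 28-bit labels, one link gives the 32768-endpoint limit derived
   earlier, and four parallel links raise it to 65536. The label space
   grows linearly in L, while the worst-case endpoint limit grows with
   the square root of L.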


6.  VP Merge

   VP merge allows multiple VPs to be merged into one and uses
   different VCIs to distinguish flows or packets within the merged
   VP. Each egress router can thus be represented by a single VPI, and
   packets from different ingress routers going to the same egress
   router simply use different VCIs at the merging point. With VP merge
   the total number of labels available is unchanged compared to
   simply using the whole VPI/VCI space as a single label. However,
   since VPIs are set up for forwarding and VCIs are allocated "as
   needed" to resolve cell interleaving, VP merge does improve the
   scalability by exploiting the locality of the flows. In this sense,
   it is similar to the On-Demand/On-Request setup discussed in
   section 4. The difference is that in VP merge, VPI space is
   pre-allocated while VCI space is allocated "as needed".  This
   feature seems to be a good trade-off between setting up all label
   switched paths in advance and allocating on a purely "as needed"
   basis. VP merge also reduces the number of labels that have to be
   managed by the switches. The downside of VP merge is that it
   requires collision detection and resolution when allocating VCIs.
   Another problem is that the VPI space is limited to 4096.
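
   The split between pre-allocated VPIs and "as needed" VCIs can be
   sketched in Python as follows. This is a minimal illustration, not
   a description of any actual MPLS or ATM signalling procedure: the
   class, its methods, and the linear-probing collision resolution are
   all hypothetical, and reserved VPI/VCI values are ignored. It
   assumes a 16-bit VCI field alongside the 4096-value VPI space
   mentioned above.

```python
class VpMergePoint:
    """Hypothetical label allocator at a VP merge point."""

    VPI_SPACE = 4096    # limit stated in the text
    VCI_SPACE = 65536   # 16-bit VCI field

    def __init__(self):
        self.vpi_for_egress = {}  # egress router -> VPI (pre-allocated)
        self.vci_in_use = {}      # VPI -> {VCI: ingress router}

    def add_egress(self, egress):
        """Pre-allocate one VPI per egress router."""
        if egress not in self.vpi_for_egress:
            if len(self.vpi_for_egress) >= self.VPI_SPACE:
                raise RuntimeError("VPI space exhausted")
            vpi = len(self.vpi_for_egress)
            self.vpi_for_egress[egress] = vpi
            self.vci_in_use[vpi] = {}
        return self.vpi_for_egress[egress]

    def allocate(self, ingress, egress, preferred_vci):
        """Allocate a VCI within the egress router's VP as needed.

        A collision is detected when another ingress router already
        holds the preferred VCI; it is resolved here by probing for the
        next free value (an assumption made for this sketch)."""
        vpi = self.vpi_for_egress[egress]
        used = self.vci_in_use[vpi]
        vci = preferred_vci % self.VCI_SPACE
        for _ in range(self.VCI_SPACE):
            owner = used.get(vci)
            if owner is None or owner == ingress:
                used[vci] = ingress
                return vpi, vci
            vci = (vci + 1) % self.VCI_SPACE
        raise RuntimeError("VCI space exhausted within this VP")
```

   For example, if two ingress routers both ask for VCI 100 towards the
   same egress router, the first keeps (VPI, 100) and the second is
   moved to (VPI, 101), so cells from the two sources remain
   distinguishable within the merged VP.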


7.  VC Merge

   VC merge can reduce the worst-case label requirement to N, where N is
   the number of endpoints. However, VC merge requires modifications to
   current ATM cell switching. In VC merge, a switch has to wait for
   the last cell of a packet to arrive before it can start to forward
   the cells of that packet. In effect, the switch operates in a
   frame-forwarding mode.  VC merge may introduce extra buffering,
   depending on whether cells from packets going to different
   destinations are interleaved.  With FIFO queuing, no such
   interleaving takes place, and a VC-merged network has the same
   performance as a frame-based network.  If we assume per-flow
   round-robin, cells from packets to different destinations may
   interleave, and at the next switch the cells have to be sorted out
   in the reassembly buffer. At the cell level, the switch then
   operates in a non-work-conserving mode, which introduces extra
   delay and buffering, particularly when the utilization is low.
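
   The store-and-forward behavior described above can be sketched in
   Python as follows. The sketch is illustrative only and does not
   model any particular switch: the function name and the cell
   representation are hypothetical, and the is_last flag stands in for
   the end-of-packet indication carried by the last cell of an AAL5
   PDU.

```python
from collections import defaultdict


def vc_merge(cells):
    """Merge cells from several incoming VCs onto one outgoing VC.

    `cells` is an iterable of (in_vc, payload, is_last) tuples in
    arrival order. Cells are held in a per-VC reassembly buffer until
    the last cell of the packet arrives, and only then forwarded as a
    contiguous run, so packets never interleave on the merged VC.
    Returns the payloads in forwarding order."""
    reassembly = defaultdict(list)  # in_vc -> buffered payloads
    forwarded = []
    for in_vc, payload, is_last in cells:
        reassembly[in_vc].append(payload)
        if is_last:
            forwarded.extend(reassembly.pop(in_vc))
    return forwarded


if __name__ == "__main__":
    arrivals = [
        ("a", "A1", False),
        ("b", "B1", False),
        ("a", "A2", True),
        ("b", "B2", True),
    ]
    print(vc_merge(arrivals))
```

   In this example the interleaved arrivals A1, B1, A2, B2 leave the
   merge point as A1, A2, B1, B2. Cell A1 cannot be forwarded until A2
   has arrived, which is the source of the extra delay and buffering
   noted above.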


8.  Security Issues


   Security issues are not discussed in this document.


9.  Conclusion


   Based on the above analysis, our conclusion is that the combined
   VPI/VCI space in ATM should be able to support networks of
   sufficient size, and even if label space is exhausted at some hot
   spots, simple solutions exist to extend the label space at such
   points.

10.  References


   [1] Y. Rekhter, B. Davie, D. Katz, E. Rosen, G. Swallow,
   "Cisco Systems' Tag Switching Architecture Overview",
   RFC 2105, Feb 1997.

   [2] A. Viswanathan, N. Feldman, R. Boivie, R. Woundy,
   "Aggregate Route-Based IP Switching",
   Internet Draft, Mar 1997.

   [3] Y. Katsube, K. Nagami, H. Esaki,
   "Cell Switch Router - Basic Concept and Migration Scenario",
   Networld+Interop'96 Engineer Conference, July 1996.

   [4] Peter Newman, Tom Lyon, Greg Minshall,
   "Flow Labelled IP: A Connectionless Approach to ATM",
   IEEE Infocom, March 1996.

   [5] Arup Acharya, Rajiv Dighe, Furquan Ansari,
   "IPSOFACTO: IP Switching Over Fast ATM Cell Transport",
   Internet Draft, July 1997.

   [6] Indra Widjaja, Anwar Elwalid,
   "Performance issues in VC-Merged Capable Switches for IP over ATM
   Networks", pre-print, 1997.

Authors' Address:

   Zheng Wang
   Bell Labs Lucent Technologies
   101 Crawfords Corner Road
   Holmdel, NJ 07733
   Email: zhwang@bell-labs.com

   Grenville Armitage
   Bell Labs Lucent Technologies
   101 Crawfords Corner Road
   Holmdel, NJ 07733
   Email: gja@lucent.com










