Internet Engineering Task Force                        Curtis Villamizar
INTERNET-DRAFT                                                       ANS
draft-villamizar-mpls-omp-00                          November 18, 1998


                   MPLS Optimized Multipath (MPLS--OMP)





Status of this Memo


  This document is an Internet-Draft.  Internet-Drafts are working doc-
  uments of the Internet Engineering Task Force (IETF), its areas, and
  its working groups.  Note that other groups may also distribute work-
  ing documents as Internet-Drafts.

  Internet-Drafts are draft documents valid for a maximum of six months
  and may be updated, replaced, or obsoleted by other documents at any
  time.  It is inappropriate to use Internet-Drafts as reference mate-
  rial or to cite them other than as ``work in progress.''

  To view the entire list of current Internet-Drafts, please check the
  ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
  Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Eu-
  rope), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim),
  ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

  This draft specifies a procedure for routing MPLS label switched paths
  and balancing load across these paths.  A portion of the algorithm
  has been verified in simulations; however, other portions have yet to
  be simulated.  Some details of the algorithms are likely to change as
  simulations progress.  This draft contains notes pointing out sections
  which are likely to change.


Abstract


  MPLS ingress routers may establish one or more explicit paths to a
  given egress of the MPLS domain.  Load can be balanced across a com-
  plex topology using MPLS and the technique referred to here as
  MPLS--OMP (Multiprotocol Label Switching -- Optimized Multipath).
  The technique requires the use of an interior gateway protocol (IGP)
  such as OSPF or IS--IS with extensions to flood loading information.
  It requires that the ingress router be capable of computing a hash
  over the IP source and destination and of selecting a forwarding
  entry, based on the outcome of the hash, with a sufficiently fine
  level of granularity.

  MPLS Optimized Multipath can be considered an extension to MPLS or a
  specific usage of MPLS. MPLS--OMP requires no additions or modifica-
  tions to the MPLS protocols.  MPLS--OMP does require that the IGP be
  capable of flooding loading information as described in the OSPF--OMP
  and ISIS--OMP documents.  At the MPLS ingress an algorithm is applied
  to select alternate paths where needed and adjust forwarding.  For-
  warding is adjusted gradually enough to ensure stability yet fast
  enough to track long term changes in loading.  The load adjustment
  algorithm itself is fully defined in a related document describing
  OSPF--OMP.  The application of this technique to MPLS is described
  here.



1  Overview


  Networks are often heavily loaded.  Topologies often evolve to include
  multiple paths.  Multiple paths may be initially designed to provide
  redundancy but also result from incremental addition of circuits to
  accommodate traffic growth.  The redundant paths provide a potential
  to distribute traffic loading and reduce congestion.  Optimized Mulit-
  path (OMP) provides a means to make better use of this potential to
  distribute loading.  Please refer to [11] for a more complete intro-
  duction to OMP.

  MPLS provides a forwarding technique.  The MPLS framework and ar-
  chitecture are defined by [1, 2].  The generic forwarding encapsula-
  tion is defined by [6].  MPLS--OMP involves the setup of MPLS Label
  Switched Paths (LSPs) with strictly routed Explicit Routes (ERLSPs).
  There are two methods of setting up LSPs, the Label Distribution Pro-
  tocol (LDP) [10], and use of RSVP extensions [5, 4].  Both methods of
  setting up LSPs support strict (and loose) routed ERLSPs.

  In an underutilized topology, criteria such as minimum delay dominate
  routing decisions.  Often IGP link costs (costs in OSPF, metrics in
  IS--IS) are set roughly according to link propagation delay.  If the
  entire topology is not underutilized, some consideration is given to
  lower capacity links or more heavily utilized links to reduce their
  utilization somewhat.

  When using MPLS--OMP, link costs may be set roughly according to link
  propagation delay without regard to link capacity or expected uti-
  lization.  An initial set of LSPs will be set up according to these
  costs as described in Section 4.  There will initially be one or more
  LSPs between any ingress and any egress, and these LSPs will take the
  path of least cost.  There will be exactly one LSP between any ingress
  and egress except where multiple paths exist with equal cost.  If
  metrics are set with sufficient precision, for example taking delay in
  tenths of a millisecond and using this as cost (such precision is
  possible with OSPF, but may not be possible in IS--IS, where link
  metrics are limited to values of 0-63), then few if any equal cost
  paths will exist.

  MPLS--OMP uses link loading and loss information flooded by OSPF--OMP or
  ISIS--OMP as described in Section 2.  Each link along a given path
  is examined to see if loading persistently exceeds a specific set of
  criteria.  An alternate path will be created according to criteria
  which depend on the severity of the loading, the amount of time the
  condition has persisted, and the relative contribution of the ingress
  to the loading.  This is described in Section 5.

  The distribution of load among a set of alternate paths is deter-
  mined by the amount of number space from an IP source and destination
  hash computation that is allocated to each path.  For example using
  a 16 bit hash, if there are two paths, one may be given 80% of the
  16 bit number space or the values 0-52428 and the other given 52429-
  65535 to accomplish a split of approximately 4:1.  The initial load
  applied to a newly created alternate LSP is small.  As loading on the
  alternate path persists in being lower than loading on other paths in
  use, more load is moved to the path with lower loading.  The increment
  of change is gradually increased.  At some point, what was among the
  less loaded paths becomes more heavily loaded than all other paths.
  At this point, this path is designated the most heavily loaded, the
  load increment toward the previous most heavily loaded path is at
  least halved, and load is moved away from the new most heavily loaded
  path.  At this point load will begin to converge such that loading on
  the most heavily loaded link on each path tends toward equality.  See
  Section 8 and Section 9 for details.
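
  As an illustration of the hash based splitting described above, the
  sketch below (in Python, not part of this specification) splits a 16
  bit hash number space between two LSPs at the boundary 52428 to
  approximate a 4:1 ratio and selects an LSP from the IP source and
  destination addresses.  The hash function used here is an arbitrary
  placeholder; this draft does not mandate a particular hash.


      import zlib

      HASH_SPACE_SIZE = 1 << 16            # 16 bit hash number space

      # (boundary, lsp) pairs; a hash value at or below the boundary
      # selects that LSP.  LSP-1 is given roughly 80% of the space.
      lsp_boundaries = [(52428, "LSP-1"), (HASH_SPACE_SIZE - 1, "LSP-2")]

      def select_lsp(src_ip, dst_ip):
          # Placeholder hash over source and destination addresses; any
          # hash with good dispersion over the 16 bit space would do.
          value = zlib.crc32((src_ip + dst_ip).encode()) % HASH_SPACE_SIZE
          for boundary, lsp in lsp_boundaries:
              if value <= boundary:
                  return lsp
          return lsp_boundaries[-1][1]

      print(select_lsp("192.0.2.1", "198.51.100.7"))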

  When more than one path between an ingress and an egress exists, after
  load balancing, there may still be an excessive load on the entire set
  of paths.  The same criteria in terms of severity of loading, persis-
  tence of the condition, and ingress contribution used when examining
  a single path to see if an alternate is needed are used to examine a
  set of paths.  For each path currently in use, the loading on the most
  heavily loaded link of the path is considered to be the loading of the
  path.  The lowest path loading is considered to be the loading on the
  set of paths in considering whether to add an additional path.  The
  algorithm described in Section 5 covers this case as well.

  If loading is reduced or failed links are restored, then sets of paths
  may be found to have excess capacity.  Paths which are less desir-
  able in terms of cost, and therefore also in terms of delay, can be
  deleted.  This is described in Section 6.

  The impact of link state changes on the path layouts is discussed in
  Section 7.  An algorithm is described that ensures that path selec-
  tion returns to an optimal state after arbitrary link state changes
  have occurred and the topology has then become quiescent.




  This overview provides a qualitative description of the MPLS--OMP
  algorithms.  The details of these algorithms are provided in the sec-
  tions that follow.



2  Use of ISIS--OMP and OSPF--OMP Flooding


  Both OSPF and IS--IS are link state protocols commonly used as in-
  terior gateway protocols (IGPs) [9, 7, 3].  MPLS relies on an IGP
  to determine the feasibility of paths and the relative preference of
  paths.  MPLS--OMP also relies upon the IGP to provide link loading
  information.  The flooding rate and encapsulation of link loading
  information provided by OSPF or IS--IS is defined in [11, 8].

  The OSPF or IS--IS flooding of link loading information must be en-
  abled.  The OSPF or IS--IS load adjustment should be disabled since
  the amount of traffic forwarded directly by IP (without MPLS encapsu-
  lation) will be minimal and have little or no impact on loading.


3  Link, Path, and Path Set Equivalent Load


  Link loading and link loss are determined from information flooded by
  OSPF or IS--IS as described in Section 2.  ``Link equivalent load''
  on each individual link is computed based on link utilization and
  link loss as described in [11].  A path is a series of links.  ``Path
  equivalent load'' is the maximum link equivalent load along the path.
  ``Path set equivalent load'' is the minimum path equivalent load among
  a set of paths.  Path and path set equivalent load are defined pre-
  cisely in Appendix B.2.
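
  The following small worked sketch (in Python, illustrative only) com-
  putes the three quantities for a hypothetical pair of paths, using the
  loss compensation defined in Appendix B.2.  The link utilization and
  loss figures are invented for the example.


      import math

      def link_equiv_load(utilization, loss):
          # Loss compensation as defined in Appendix B.2.
          if loss < 0.005:
              return utilization
          comp = 10 * math.sqrt(loss) if loss <= 0.09 else 3
          return utilization * comp

      # Two paths, each a series of links given as (utilization, loss).
      paths = {
          "path-A": [(0.60, 0.0), (0.85, 0.0), (0.40, 0.0)],
          "path-B": [(0.70, 0.0), (0.95, 0.01)],
      }

      # Path equivalent load: maximum link equivalent load on the path.
      path_load = {name: max(link_equiv_load(u, l) for u, l in links)
                   for name, links in paths.items()}

      # Path set equivalent load: minimum path equivalent load in the set.
      path_set_load = min(path_load.values())

      print(path_load)      # {'path-A': 0.85, 'path-B': 0.95}
      print(path_set_load)  # 0.85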



4  Initial Path and Initial Loading


  From the standpoint of a given ingress, a given egress at some point,
  either at some epoch or after an outage, makes a transition from
  completely unreachable to reachable.  When this transition occurs, a
  single initial LSP is created from the ingress to the egress without
  regard to traffic loading.  The shortest path based on IGP metrics is
  used.  If multiple paths have identical shortest metrics, then all of
  these paths may be used as an initial path set.  The best path cri-
  teria may be further relaxed as described and recommended in [11] to
  create a larger number of initial paths.  All traffic to the given
  egress is put on this path or set of paths.  Further details are found
  in Appendix B.3.




5  Creating Alternate Paths


  A path set may consist of one or more paths.  Many path sets will
  contain only the initial path created when the egress first became
  reachable as described in Section 4.  The path set loading is deter-
  mined as described in Section 3.  The path set loading is compared
  to a set of thresholds to determine if an alternate path should be
  created.

  The set of thresholds should be configurable as described in
  Appendix A.  The symbolic names MIN_NEW_PATH, MAX_NEW_PATH,
  NEW_PATH_INCR, and OVERLOAD_PERSIST will be used here.  MIN_NEW_PATH,
  MAX_NEW_PATH, and NEW_PATH_INCR are in units of equivalent load, where
  1.0 is full link utilization.  OVERLOAD_PERSIST is in units of time,
  generally seconds.

  An array of values will be maintained for each path set.  The array
  will be indexed such that there is an array entry for each increment
  of NEW_PATH_INCR above MIN_NEW_PATH, up to and including MAX_NEW_PATH.
  For example, if MIN_NEW_PATH is 0.80, MAX_NEW_PATH is 1.15, and
  NEW_PATH_INCR is 0.05, there would be array entries corresponding to
  0.85, 0.90, 0.95, 1.00, 1.05, 1.10, and 1.15.  Each array
  entry can contain a timestamp or a value illegal as a timestamp re-
  ferred to here as NEVER. Each array element is initialized to NEVER.

  If the path set loading is less than MIN_NEW_PATH then all array
  entries are set to NEVER. If the path set loading is greater than
  MIN_NEW_PATH, then each array element that corresponds to a loading
  at or below the path set loading and that contains the value NEVER
  is set to the current time.  Each array entry that corresponds to a
  value above the current path load is set to NEVER.

  When not all array values are set to NEVER, a conditional de-
  fined in Appendix B.4 is periodically checked for each array entry.
  If the conditional is true for any array entry, the decision process
  is terminated and the search for an alternate path is undertaken.

  To determine if an alternate path is available, and if so deter-
  mine the best alternate path, an SPF calculation is performed after
  temporarily removing all links from the OSPF or IS--IS link state
  database whose link equivalent load equals or exceeds the path set
  equivalent load.  This SPF may result in no path between the ingress
  and egress.  If so, no additional paths are available which can re-
  move load from the path set under consideration, and no action can be
  taken.  Otherwise, the SPF will yield one or more best paths between
  ingress and egress.  LSPs for these paths are set up and added to the
  path set.

  The algorithms described above are provided in a more concise and
  precise form in Appendix B.4.



  When a path is added to a path set, an initial loading and an initial
  rate at which this loading is increased are set.  These initial values
  and the loading adjustment algorithm are provided in Section 8.

  The arrays are updated when new loading information arrives or every
  PATH_CHECK seconds, and the arrays are checked every PATH_CHECK sec-
  onds.  This parameter is configurable as described in Appendix A.



6  Deleting from Underutilized Path Sets


  Please note that simulations of this portion of the algorithm have not
  been completed.  Some details may change.

  Normal time of day load fluctuations or restoral of link failures can
  result in path sets that are significantly underutilized.  When this
  occurs, one path at a time can be dropped from the path set.  Param-
  eters are defined in Appendix A which determine the conditions under
  which a path can be dropped.  The loading on the most heavily loaded
  path in the path set must be below PATH_DEL_SHARE. When this condi-
  tion becomes true a timestamp must be retained.  If the condition is
  no longer true, the timestamp must be invalidated.  If this condi-
  tion remains true for a period of time, referred to here as ``time to
  deletion'', then a path can be selected for deletion.

  The time to deletion must be dependent on loading contribution in or-
  der to avoid many ingress routers performing deletions at once.  The
  computation must favor deletions from path sets with small traffic
  contributions.  This favors eliminating paths with marginal ben-
  efit.  The time to deletion is computed as follows.  The constant
  PATH_DEL_PERSIST seconds is multiplied by one plus the ratio of the
  total absolute (bytes/second) traffic contributed by the path set to
  the sum of the absolute excess bandwidth on the paths.

  The best candidate for deletion is the path which has the highest IGP
  cost.  If a subset of paths containing more than one path has ex-
  actly equal IGP cost and is the highest, then the path among this
  subset with the lowest traffic share is chosen.  Only one path should
  be deleted at a time.

  The total contribution of traffic to the path set, as measured by the
  absolute LSP traffic levels, must not exceed PATH_DEL_EXCESS times the
  excess bandwidth on the remaining paths.  When a path is selected as
  the best path to delete, this criterion must be checked.  If this path
  cannot be deleted and there is no other path of equal IGP cost, then
  no deletion should take place.  Deleting a path of lower IGP cost and
  lower bandwidth might still be possible, but such a path should not be
  deleted, since a path of lower IGP cost is considered better than some
  of the remaining paths.

  Once a path is selected for deletion, its traffic share should be
  divided among the remaining paths in proportion to the absolute band-
  width available on each remaining path.  The LSP can then be deleted,
  after traffic is moved and the path is known to be idle.  After a
  path is deleted, the time stamp should be artificially advanced by
  PATH_DEL_WAIT. This will ensure that path deletions on this path set
  are spaced apart by at least this amount of time.

  The algorithms for deleting paths are described in Appendix B.5.



7  Handling Link State Changes


  Please note that simulations of this portion of the algorithm have not
  been completed.  Some details may change.  In particular, optimiza-
  tions done in the simulation code which dramatically reduce the number
  of SPF calculations have not been described here.

  Some initial paths may be placed prior to complete convergence of the
  link state protocol.  In addition, when a failed link is restored,
  there may not be an overload condition to force traffic back onto the
  restored link.  If this were not corrected the network could settle
  into a state where path layout and possibly also loading was subopti-
  mal.

  The adjustments that occur due to a link failure will occur immedi-
  ately or quickly.  In some cases, all paths from a given ingress to
  a given egress will be lost.  In this case, if connectivity is still
  feasible, the best remaining path in terms of cost is used, as de-
  scribed in Section 4.  If paths remain, load is shifted to the re-
  maining paths as if the infeasible paths had been deleted as described
  in Section 6.

  The result of a link loss on a well utilized topology will generally
  be a period of suboptimal loading and congestion.  As this condition
  persists, alternate paths will be added as described in Section 5.
  The most heavily congested links will be offloaded first, with the
  ingress contributing the largest share of traffic reacting first.  The
  less loaded links and lesser contributors will react if congestion
  continues.  In some cases, paths will be added faster than the balanc-
  ing makes use of some of the new paths.  As the balancing progresses,
  path sets which have acquired excess paths will begin to prune the
  less beneficial paths as described in Section 6.

  When a link is restored to service, the SPF may yield paths to some
  egresses which are better than paths currently in use and which may
  also be less loaded than the paths in use.  If the topology is not
  overloaded, these paths may not be immediately considered.



  Two conditions have been described which may cause better paths not
  to be selected and which may not be forced into consideration later by
  loading
  conditions.  The first is where an initial path is selected prior to
  link state database convergence.  The second is when a link is re-
  stored to service.  This condition should not be allowed to persist
  indefinitely.

  When an SPF is required due to a link transition to a full adjacency,
  those path sets for which the new best path is better than some paths
  in use and is not overloaded are marked.  Where the existing set of
  paths is heavily loaded the new best paths will be used very quickly.
  For the remainder, the better paths will be added to path sets based
  on an event timer.  This algorithm is described in detail in Ap-
  pendix B.7.



8  Load Balancing


  There is no load balancing done on a path set with only one path.
  When there is more than one path, fine adjustments can be made in
  the amount of traffic on each path by adjusting boundary values in a
  source/destination hash number space associated with each path.  This
  technique is described in detail in [11].

  The terms ``traffic share'' and ``move increment'' are defined in
  [11].  The loading on a particular path is determined by its traffic
  share.  The rate of increase in loading, when offloading traffic from
  other paths, is determined by the move increment.  Both of these vari-
  ables are in units of hash table space.  For example, if a 16 bit hash
  is used, these variables are in the range 0---65535.
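
  The sketch below (in Python, illustrative only; the full adjustment
  rules are defined in [11]) shows how a move increment expressed in
  units of 16 bit hash space shifts traffic share from one path to an-
  other, and how the hash boundaries follow directly from the shares.
  The share and increment values are invented for the example.


      HASH_SPACE_SIZE = 1 << 16

      # Traffic share per path in units of 16 bit hash space; the shares
      # sum to the full number space.  Figures are illustrative only.
      share = {"LSP-1": 52429, "LSP-2": 13107}
      move_increment = 1024            # also in units of hash space

      def move_load(from_path, to_path):
          # Shift one move increment of hash space from a more heavily
          # loaded path to a less loaded one, as outlined in [11].
          moved = min(move_increment, share[from_path])
          share[from_path] -= moved
          share[to_path] += moved

      def boundaries():
          # Upper hash boundary for each path, derived from the shares.
          bounds, upper = [], -1
          for path, s in share.items():
              upper += s
              bounds.append((upper, path))
          return bounds

      move_load("LSP-1", "LSP-2")
      print(share)         # {'LSP-1': 51405, 'LSP-2': 14131}
      print(boundaries())  # [(51404, 'LSP-1'), (65535, 'LSP-2')]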

  A path set is constructed according to the rules described in Sec-
  tion 4 and Section 5.  When a new path is added to an existing path
  set, the initial traffic share is set to a small value as described in
  Section 5.


9  MPLS--OMP Forwarding


  The load balance itself utilizes the ``next hop structures'' described
  in [11].  The next hop structures themselves will differ slightly in
  that an ingress to an MPLS domain will be adding a label to the label
  stack in addition to making a forwarding decision.  The algorithm for
  determining when to adjust load and for adjusting load is exactly as
  described in [11].  MPLS--OMP forwarding is described in detail in
  Appendix B.8.





Acknowledgments


  Simulations were performed using hypothetical cases and using UUNET's
  topology and loading.  Simulations involving UUNET's network would not
  have been possible without data gathered by Dave Cleary, Yong Xue, and
  Yong Zhue of UUNET.

  @@ To be completed.  Get your valuable comments in please.  @@



References

   [1]  Ross Callon, George Swallow, N. Feldman, A. Viswanathan,
        P. Doolan, and A. Fredette.  A framework for multiprotocol label
        switching.  Internet Draft (Work in Progress) draft-ietf-mpls-
        framework-02, Internet Engineering Task Force, November 1997.
        ftp://ftp.isi.edu/internet-drafts/draft-ietf-mpls-framework-
        02.txt.

   [2]  Ross Callon, A. Viswanathan, and E. Rosen.  Multiprotocol label
        switching architecture.  Internet Draft (Work in Progress)
        draft-ietf-mpls-arch-02, Internet Engineering Task Force, July
        1998.  ftp://ftp.isi.edu/internet-drafts/draft-ietf-mpls-arch-
        02.txt.

   [3]  R.W. Callon.  Use of OSI IS-IS for routing in TCP/IP and dual
        environments.  Technical Report RFC 1195, Internet Engineering
        Task Force, 1990.  ftp://ftp.isi.edu/in-notes/rfc1195.txt.

   [4]  Bruce Davie, Tony Li, Y. Rekhter, and E. Rosen.  Explicit route
        support in MPLS.  Internet Draft (Work in Progress) draft-davie-
        mpls-explicit-routes-00, Internet Engineering Task Force,
        November 1997.  ftp://ftp.isi.edu/internet-drafts/draft-davie-
        mpls-explicit-routes-00.txt.

   [5]  Bruce Davie, Y. Rekhter, A. Viswanathan, S. Blake, Vijay
        Srinivasan, and E. Rosen.  Use of label switching with RSVP.
        Internet Draft (Work in Progress) draft-ietf-mpls-rsvp-00,
        Internet Engineering Task Force, March 1998.
        ftp://ftp.isi.edu/internet-drafts/draft-ietf-mpls-rsvp-00.txt.

   [6]  D. Farinacci, Tony Li, A. Conta, Y. Rekhter, Dan Tappan,
        E. Rosen, and G. Fedorkow.  MPLS label stack encoding.  Internet
        Draft (Work in Progress) draft-ietf-mpls-label-encaps-03,
        Internet Engineering Task Force, September 1998.
        ftp://ftp.isi.edu/internet-drafts/draft-ietf-mpls-label-
        encaps-03.txt.

   [7]  ISO/IEC.  ISO/IEC 10589 - Information processing systems -
        Telecommunications and information exchange between systems -
        Intermediate system to intermediate system intra-domain routing
        information exchange protocol for use in conjunction with the
        protocol for providing the connectionless-mode network service
        (ISO 8473).  Technical report, International Organization for
        Standardization, 1992.  ftp://merit.edu/pub/iso/iso10589.ps.gz.

   [8]  Tony Li and Curtis Villamizar.  IS-IS optimized multipath
        (ISIS-OMP).  Internet Draft (Work in Progress) draft-villamizar-
        isis-omp-00, Internet Engineering Task Force, October 1998.
        ftp://ftp.isi.edu/internet-drafts/draft-villamizar-isis-
        omp-00.txt.

   [9]  J. Moy.  OSPF version 2.  Technical Report RFC 2328, Internet
        Engineering Task Force, 1998.  ftp://ftp.isi.edu/in-
        notes/rfc2328.txt.

  [10]  Bob Thomas, N. Feldman, P. Doolan, Loa Andersson, and A. Fre-
        dette.  LDP specification.  Internet Draft (Work in Progress)
        draft-ietf-mpls-ldp-02, Internet Engineering Task Force,
        November 1998.  ftp://ftp.isi.edu/internet-drafts/draft-ietf-
        mpls-ldp-02.txt.

  [11]  Curtis Villamizar.  OSPF optimized multipath (OSPF-OMP).
        Internet Draft (Work in Progress) draft-ietf-ospf-omp-01,
        Internet Engineering Task Force, October 1998.
        ftp://ftp.isi.edu/internet-drafts/draft-ietf-ospf-omp-01.txt.


Security Considerations


  MPLS--OMP is a method of setting up MPLS LSPs and balancing load
  across MPLS LSPs.  MPLS--OMP, like MPLS itself, depends on the in-
  tegrity of information obtained from the IGP. MPLS--OMP requires a
  link state IGP and relies upon loading information provided by the
  IGP in addition to the underlying routing functionality that is min-
  imally required for any use of MPLS. If the IGP is compromised such
  that integrity of the information provided by the IGP is no longer
  maintained, substantial denial of service is possible.  Use of MPLS--
  OMP does not add exposure to the IGP, nor does it provide any means to
  leverage attacks on the IGP beyond the existing potential for denial
  of service.

  MPLS--OMP relies upon the ability of LDP or MPLS RSVP extensions to
  set up explicit LSPs.  As a technique which uses these capabilities
  with no signaling mechanisms of its own, MPLS--OMP adds no further ex-
  posure to these signaling mechanisms.  If misconfiguration is severe
  enough that paths are repeatedly created and deleted, MPLS--OMP could
  add significant load to MPLS LSRs.



Author's Addresses


  Curtis Villamizar
  ANS
  <curtis@ans.net>



A  Configuration Options


  MPLS--OMP configurable parameters do not control the rate of traffic
  adjustment or the rate of change in traffic adjustment.  These rates
  should be kept as defined in OSPF--OMP [11] in order to ensure that
  stability is not impacted.

  The MPLS--OMP configurable parameters control the conditions for cre-
  ating new paths, the initial state of new paths, and the conditions
  for deleting paths.



          Parameters: Conditions for Creating New Paths

    Parameter Name         Units       Range        Default

    MIN_NEW_PATH           unitless    0.50-1.00     0.45
    MAX_NEW_PATH           unitless    see below     1.10
    NEW_PATH_INCR          unitless    0.01-0.25     0.05
    OVERLOAD_PERSIST       seconds     60-3600         60

      MAX_NEW_PATH >= MIN_NEW_PATH + NEW_PATH_INCR
      MAX_NEW_PATH <= 10.0

          Parameters: Initial State of New Paths

    Parameter Name         Units       Range        Default

    NEW_LOAD_FRACTION      unitless    0.01-1.00     0.10
    NEW_LOAD_EPSILON       unitless    0.00-1.00     0.00
    NEW_MOVE_FRACTION      unitless    0.01-1.00     0.25
    NEW_MOVE_EPSILON       unitless    0.00-1.00     0.01

          Parameters: Conditions for Deleting Paths

    Parameter Name         Units       Range        Default

    PATH_DEL_SHARE         hash space  0.00-1.00     0.35
    PATH_DEL_PERSIST       seconds     60-86400      1200
    PATH_DEL_EXCESS        hash space  0.00-1.00     0.75

      PATH_DEL_SHARE must be significantly less than
      MIN_NEW_PATH to ensure sufficient hysteresis.




          Parameters: Timer based processing

    Parameter Name         Units       Range        Default

    PATH_CHECK             seconds     15-300          30
    PATH_ADD_WAIT          seconds     15-3600        240
    PATH_DEL_WAIT          seconds     15-3600        240
    SPF_PERSIST            seconds     60-3600        300
    SPF_RECHECK            seconds     60-86400       900
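
  As a convenience, the sketch below (in Python, illustrative only) col-
  lects the parameters and defaults from the tables above into a single
  configuration structure and checks the stated constraints.  The
  grouping and the structure name are not part of this specification.


      from dataclasses import dataclass

      @dataclass
      class MplsOmpConfig:
          # Conditions for creating new paths (equivalent load, except
          # OVERLOAD_PERSIST which is in seconds).  Defaults are taken
          # from the tables above.
          MIN_NEW_PATH: float = 0.45
          MAX_NEW_PATH: float = 1.10
          NEW_PATH_INCR: float = 0.05
          OVERLOAD_PERSIST: float = 60
          # Initial state of new paths (unitless fractions).
          NEW_LOAD_FRACTION: float = 0.10
          NEW_LOAD_EPSILON: float = 0.00
          NEW_MOVE_FRACTION: float = 0.25
          NEW_MOVE_EPSILON: float = 0.01
          # Conditions for deleting paths.
          PATH_DEL_SHARE: float = 0.35
          PATH_DEL_PERSIST: float = 1200
          PATH_DEL_EXCESS: float = 0.75
          # Timer based processing (seconds).
          PATH_CHECK: float = 30
          PATH_ADD_WAIT: float = 240
          PATH_DEL_WAIT: float = 240
          SPF_PERSIST: float = 300
          SPF_RECHECK: float = 900

      def validate(cfg):
          # Constraints stated in Appendix A.
          assert cfg.MAX_NEW_PATH >= cfg.MIN_NEW_PATH + cfg.NEW_PATH_INCR
          assert cfg.MAX_NEW_PATH <= 10.0
          assert cfg.PATH_DEL_SHARE < cfg.MIN_NEW_PATH

      validate(MplsOmpConfig())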



B  Concise Statement of Algorithms


  MPLS--OMP uses link state information and link loading information
  flooded via a link state IGP. MPLS--OMP uses this information to de-
  termine where to place MPLS LSPs and to adjust load where multiple
  LSPs are established per traffic flow.  The traffic flows are assumed
  to be aggregates of very large numbers of end-to-end flows such that
  dividing traffic using a hash over the IP source and IP destination
  provides a means of distributing load.

  The MPLS--OMP algorithms are stated here in pseudocode with limited
  accompanying text.  All upper case variables are generally constants
  or parameters defined in Appendix A.  The pseudocode syntax is not
  formally defined but is sufficiently similar to the ``C'' or perl
  programming languages that it should be understandable.



B.1  Flooding Loading Information

  The flooding of loading information is the responsibility of the link
  state IGP. The data formats and flooding rate are provided in the OSPF-
  -OMP and ISIS--OMP documents [11, 8].  The MPLS routers are all as-
  sumed to participate in the IGP routing.  The ingress router MPLS
  implementation and IGP implementation must be sufficiently coupled
  such that link state changes and loading information are passed as
  received from the IGP to MPLS.



B.2  Link, Path, and Path Set Equivalent Load

  Link loading and link loss are determined from information flooded by
  OSPF or IS--IS. The variables ``Utilization'' and ``Loss'' below are
  the utilization and loss values flooded by the IGP. The utilization
  and loss directly yield a ``link equivalent load''.  The maximum link
  equivalent load of any link on a path yields the ``path equivalent
  load''.  The minimum path equivalent load among the set of paths
  associated with reaching a specific egress is the ``path set
  equivalent load''.



      EventHandler NewLoadingInfo(Link): {
          if (Loss{Link} < 0.005) {
              LinkEquivLoad{Link} = Utilization{Link};
          } else {
              if (Loss{Link} <= 0.09) {
                  LossComp = 10 * sqrt(Loss{Link});
              } else {
                  LossComp = 3;
              }
              LinkEquivLoad{Link} = Utilization{Link} * LossComp;
          }
      }



  Path equivalent load is the maximum link equivalent load on a given
  path.



      Function ComputePathLoad(Path) {
          PathEquivLoad = 0;
          foreach Link ( Links{Path} ) {
              Load = LinkEquivLoad{Link}
              if (Load > PathEquivLoad)
                  PathEquivLoad = Load;
          }
          PathEquivLoad{Path} = PathEquivLoad;
      }



  The path set equivalent load is the minimum path equivalent load of
  a set of LSPs between a given ingress and a given egress.  Some later
  computations also require the maximum loading of any path on a path
  set.  Both can be computed at the same time.



      Function ComputePathSetLoad(PathSet) {
          undefine PathSetEquivLoad;
          PathSetMaxLoad = 0;
          foreach Path ( Paths{PathSet} ) {
              ComputePathLoad(Path);
              Load = PathEquivLoad{Path};
              if (!defined(PathSetEquivLoad)) {
                  PathSetEquivLoad = Load;
              } else if (Load < PathSetEquivLoad) {
                  PathSetEquivLoad = Load;
              }
              if (Load > PathSetMaxLoad)
                  PathSetMaxLoad = Load;
          }
          PathSetEquivLoad{PathSet} = PathSetEquivLoad;
          PathSetMaxLoad{PathSet} = PathSetMaxLoad;
      }



  The pseudocode fragments above and elsewhere in this section do not
  implement any form of data structure threading to allow all parame-
  ters dependent on a given link to be updated when the link utilization
  changes.  This sort of data structure threading to reflect depen-
  dencies among computed parameters can save considerable time when
  processing the periodic timer events.  This type of implementation op-
  timization is not expressed in the pseudocode but may be necessary to
  produce an implementation that scales well with respect to the number
  of path sets, paths, and links.



B.3  Initial Path and Initial Loading

  At some point an egress that had been unreachable becomes reachable.
  When this occurs, an initial path is set according to the results of
  the SPF calculation.  The following pseudocode describes the initial-
  ization.



      EventHandler NewlyReachable(Egress): {
          new Paths = SPF(Egress);
          new PathSet;
          Egress{PathSet} = Egress;
          undefine Paths{PathSet};
          insert PathSets PathSet;
          PathSetBelowMax{PathSet} = NEVER;
          for (Util = MIN_NEW_PATH + NEW_PATH_INCR;
               Util <= MAX_NEW_PATH;
               Util += NEW_PATH_INCR) {
              PathSetArray{PathSet,Util} = NEVER;
          }
          foreach Path ( Paths ) {
              insert Paths{PathSet} Path;
          }
          ComputePathSetLoad(PathSet);
      }



  This initialization may also occur when a link failure makes all of
  the paths to a given egress currently in use unfeasible.  In this
  case, the old path set corresponding to this egress is deleted.  A
  new path must be selected and the egress is treated as if it had just
  become reachable.



B.4  Creating Alternate Paths

  Deciding whether to create an alternate path and then creating one can
  be broken down into a number of smaller procedures.

  Every PATH_CHECK seconds the set of arrays used to determine whether
  to add or delete a path are checked.



      Function AdjustPathArrays(PathSet) {
          Load = PathSetEquivLoad{PathSet};
          for (Util = MIN_NEW_PATH + NEW_PATH_INCR;
               Util <= MAX_NEW_PATH;
               Util += NEW_PATH_INCR) {
              if (Util > Load) {
                  if (PathSetArray{PathSet,Util} == NEVER)
                      break;
                  else
                      PathSetArray{PathSet,Util} = NEVER;
              } else if (PathSetArray{PathSet,Util} == NEVER) {
                  PathSetArray{PathSet,Util} = NOW;
              }
          }
      }



  The function below is used in pseudocode later in this section.  When
  loading is sufficient to require checking the path set's contribution
  to the loading on the available paths, the following computation is
  made.



      Function GetPathContribution(PathSet): {
          TotalTraffic = 0;
          TotalCapacity = 0;
          foreach Path ( Paths{PathSet} ) {
              TotalTraffic += LoadOnLSP{Path};
              undefine Capacity;
              foreach Link ( Links{Path} ) {
                  if (!defined(Capacity))
                      Capacity = LinkCapacity{Link};
                  else if (LinkCapacity{Link} < Capacity)
                      Capacity = LinkCapacity{Link};
              }
              TotalCapacity += Capacity;
          }
          return TotalTraffic / TotalCapacity;
      }



  Every PATH_CHECK seconds a check is made to determine whether to try
  to create a new path.  An elapsed time is computed.  The elapsed time
  is the difference between the current time and the time stored in the
  array entry.  A load compensation factor is computed.  It is the ratio
  of two differences, the difference between the loading correspond-
  ing to the array entry and MIN_NEW_PATH and the difference between
  MAX_NEW_PATH and MIN_NEW_PATH. A traffic contribution factor is com-
  puted.  This is the ratio of the current total path set traffic to the
  sum of the capacities of each path.  The capacity of an individual
  path is taken to be the capacity of the slowest link on that path.
  The OVERLOAD_PERSIST constant is compared to the product of the
  elapsed time, the load compensation factor, and the traffic
  contribution factor.  If the product is the greater of the two, the
  condition is true for this array entry and an attempt is made to
  create a new path.



      Function CheckForNewPath(PathSet) {
          undefine PathContribution;
          for (Util = MIN_NEW_PATH + NEW_PATH_INCR;
               Util <= MAX_NEW_PATH;
               Util += NEW_PATH_INCR) {
              Then = PathSetArray{PathSet,Util};
              if (Then == NEVER)
                  return 1;
              Elapsed = NOW - Then;
              LoadComp = (Util - MIN_NEW_PATH)
                  / (MAX_NEW_PATH - MIN_NEW_PATH);
              if (!defined(PathContribution))
                  PathContribution
                      = GetPathContribution(PathSet);
              if (Elapsed * LoadComp * PathContribution
                  > OVERLOAD_PERSIST) {
                  CreateNewPath(PathSet);
                  break;
              }
          }
      }





  When the decision has been made to attempt to create a new path,
  the first step is to search for a candidate path.  If a new path is
  available it is initialized.  The timestamps in the path set's array
  are advanced by PATH_ADD_WAIT whether a new path is found or not in
  order to defer repeating the check for this amount of time.



      Function CreateNewPath(PathSet) {
          NewPaths = FindAlternatePath(PathSet);
          if (defined(NewPaths)) {
              MostCapacity = 0;
              undefine BestNewPath;
              foreach NewPath ( NewPaths ) {
                  ComputePathLoad(NewPath);
                  undefine Capacity;
                  foreach Link ( Links{NewPath} ) {
                      ExcessCapacity = LinkCapacity{Link}
                          * (1 - PathEquivLoad{NewPath});
                      if (!defined(Capacity))
                          Capacity = ExcessCapacity;
                      else if (ExcessCapacity < Capacity)
                          Capacity = ExcessCapacity;
                  }
                  if (MostCapacity < Capacity) {
                      MostCapacity = Capacity;
                      BestNewPath = NewPath;
                  }
              }
              if (defined(BestNewPath))
                  InitializeNewPath(PathSet,BestNewPath);
          }
          for (Util = MIN_NEW_PATH + NEW_PATH_INCR;
               Util <= MAX_NEW_PATH;
               Util += NEW_PATH_INCR) {
              if (PathSetArray{PathSet,Util} == NEVER)
                  break;
              else
                  PathSetArray{PathSet,Util}
                      += PATH_ADD_WAIT;
          }
      }



  Finding an alternate path involves running an SPF calculation with
  some of the links disabled.  Note that the SPF can return more than
  one equal cost path.  There can also be enough links disabled that the
  SPF returns no path at all.





      Function FindAlternatePath(PathSet) {
          Load = PathSetEquivLoad{PathSet};
          foreach Link ( LinkStateDataBase ) {
              if (LinkEquivLoad{Link} >= Load)
                  TemporarilyDisableLink(Link);
          }
          NewPaths = SPF(Egress{PathSet});
          EnableAllLinks();
          return NewPaths;
      }



  Once an alternate path has been found, the traffic share and the move
  increment are initialized.  The new path is then added to the path
  set.

  Note that in simulations, the initial share was set to 0 and the ini-
  tial increment was set to about 0.01.  Either the simulation will be
  changed to reflect the algorithm below or the algorithm will be sim-
  plified.  The initial conditions do not seem to have much of an effect
  if set very conservatively as recommended or if set conservatively and
  fixed as in the current simulations.



      Function InitializeNewPath(PathSet,NewPath): {
          TrafficShare{PathSet,NewPath} = 0;
          PathExcessCapacity = ExcessCapacity(NewPath);
          TotalExcessCapacity = PathExcessCapacity;
          foreach Path ( Paths{PathSet} ) {
              Excess = ExcessCapacity(Path);
              if (Excess > 0)
                  TotalExcessCapacity += Excess;
          }
          TargetShare = HASH_SPACE_SIZE * NEW_LOAD_EPSILON;
          TargetShare2 = HASH_SPACE_SIZE * NEW_LOAD_FRACTION
              * (PathExcessCapacity / TotalExcessCapacity);
          if (TargetShare2 < TargetShare)
              TargetShare = TargetShare2;
          TargetIncr = HASH_SPACE_SIZE * NEW_MOVE_EPSILON;
          TargetIncr2 = HASH_SPACE_SIZE * NEW_MOVE_FRACTION
              * (PathExcessCapacity / TotalExcessCapacity);
          if (TargetIncr2 < TargetIncr)
              TargetIncr = TargetIncr2;
          MoveIncrement{PathSet,NewPath} = TargetIncr;
          TotalCapacity = 0;
          foreach Path ( Paths{PathSet} ) {
              TotalCapacity += Capacity{Path};
          }
          foreach Path ( Paths{PathSet} ) {
              Shares = TargetShare
                  * Capacity{Path} / TotalCapacity;
              if (TrafficShare{PathSet,Path} < Shares)
                  Shares = TrafficShare{PathSet,Path};
              TrafficShare{PathSet,NewPath} += Shares;
              TrafficShare{PathSet,Path} -= Shares;
          }
          insert Paths{PathSet} NewPath;
      }



B.5  Deleting Alternate Paths


  Please note that simulations of this portion of the algorithm have not
  been completed.  Some details may change.

  Every PATH_CHECK seconds each path set is checked to see if the set of
  paths contains sufficient excess bandwidth to drop one of the paths.
  A simple single threshold is used but the ratio of contributed traf-
  fic to excess capacity is used to determine the time delay before
  deletion.  This should avoid synchronized actions taken by multiple
  ingress routers.



      Function CheckPathExcess(PathSet) {
          MaxLoad = PathSetMaxLoad{PathSet};
          if (MaxLoad >= PATH_DEL_SHARE) {
              PathSetBelowMax{PathSet} = NEVER;
              return;
          }
          if (PathSetBelowMax{PathSet} == NEVER) {
              PathSetBelowMax{PathSet} = NOW;
              return;
          }
          Elapsed = NOW - PathSetBelowMax{PathSet};
          Traffic = 0;
          ExcessCapacity = 0;
          foreach Path ( Paths{PathSet} ) {
              Traffic += LoadOnLSP{Path};
              Excess = ExcessCapacity(Path);
              if (Excess > 0)
                  ExcessCapacity += Excess;
          }
          TimeToDeletion = PATH_DEL_PERSIST
              * (1 + (Traffic / ExcessCapacity));
          if (Elapsed >= TimeToDeletion) {
              DeletePath(PathSet);
              PathSetBelowMax{PathSet} += PATH_DEL_WAIT;
          }
      }



  Once the decision is made to drop a path, a path must be selected to
  drop.  The load must be moved to the remaining paths and storage and
  other resources recovered.



      Function DeletePath(PathSet): {
          HighCost = 0;
          foreach Path ( Paths{PathSet} ) {
              Cost = PathLinkCosts(Path);
              if ((Cost > HighCost)
                  || (Cost == HighCost
                      && TrafficShare{PathSet,Path} < LowShare)) {
                  PathToDelete = Path;
                  HighCost = Cost;
                  LowShare = TrafficShare{PathSet,Path};
              }
          }
          TotalCapacity = 0;
          foreach Path ( Paths{PathSet} ) {
              if (Path != PathToDelete)
                  TotalCapacity += Capacity{Path};
          }
          while (TrafficShare{PathSet,PathToDelete} != 0) {
              TargetShare = TrafficShare{PathSet,PathToDelete};
              foreach Path ( Paths{PathSet} ) {
                  if (Path == PathToDelete)
                      continue;
                  Shares = TargetShare
                      * Capacity{Path} / TotalCapacity;
                  if (Shares < 1)
                      Shares = 1;
                  if (TrafficShare{PathSet,PathToDelete} < Shares)
                      Shares = TrafficShare{PathSet,PathToDelete};
                  TrafficShare{PathSet,PathToDelete} -= Shares;
                  TrafficShare{PathSet,Path} += Shares;
                  if (TrafficShare{PathSet,PathToDelete} == 0)
                      break;
              }
          }
          delete Paths{PathSet} PathToDelete;
          RecoverStorage(PathToDelete);
      }







B.6  Load Balancing


  There are no differences in the algorithm to balance a set of paths
  in MPLS--OMP from the methods in OSPF--OMP. The path set in MPLS--OMP
  maps directly into a next hop structure as described in the OSPF--OMP
  document [11].  Behavior differs slightly in that in MPLS--OMP the
  decision made at the ingress determines the entire path of the traffic
  while in OSPF--OMP the decision at the ingress is affected by deci-
  sions at every downstream router.  Regardless, the algorithm to adjust
  the traffic share is identical.


B.7  Handling Link State Change


  Please note that simulations of this portion of the algorithm have not
  been completed.  Some details may change.  In particular, optimiza-
  tions done in the simulation code which dramatically reduce the number
  of SPF calculations have not been described here.

  Link state changes may make some paths infeasible and make new
  paths available.  An SPF calculation may involve one or many link
  state changes.

  When an SPF calculation is required, first the set of paths is checked
  to see if any of those paths have become infeasible.  These paths are
  deleted.  This may also render some egresses temporarily unreachable.
  If so, the path set structures are deleted as well.



      @@ pseudo-code to be converted from simulation perl code @@



  After processing the set of paths that must be deleted, the set of
  egresses that are now unreachable is checked.  If there are feasible
  paths, these paths are created.  The remaining egresses are checked to
  see if paths now exist that are less loaded than the path set and have
  a lower cost than some of the paths in use.  If so, these path sets
  are marked with a time stamp.



      @@ pseudo-code to be converted from simulation perl code @@



  The loading on some paths may be reduced by balancing done for other
  egresses or by balancing done at other routers.  For this reason, the
  SPF must be rechecked periodically to see if better paths than the
  ones being used exist.  This check is performed every SPF_RECHECK
  seconds.
  Path sets are marked as they would be if an SPF was just completed.



      @@ pseudo-code to be converted from simulation perl code @@



  Every PATH_CHECK seconds those path sets that are timestamped to in-
  dicate that a better path needs to be considered are checked to see
  if it is time to add the new path.  The constant SPF_PERSIST is used
  with a compensation factor for load and traffic contribution simi-
  lar to the compensations made in Appendix B.4.  If the time criterion
  is met, then the path is added.  The SPF must be rechecked to see if
  other paths are available which are better than some of the paths in
  use.  If a few such paths are added the path set can end up with some
  excess capacity and the worst paths can be dropped as described in
  Appendix B.5.



      @@ pseudo-code to be converted from simulation perl code @@



B.8  MPLS--OMP Forwarding


  MPLS--OMP requires that the ingress router perform a hash on the IP
  source and IP destination in order to spread traffic over a set of
  paths.  This is a simple operation, but access to the IP header is not
  something that can always be taken for granted.

  In the case where an MPLS domain receives only IP encapsulated traffic
  from neighboring domains, the IP header is accessible.  LSRs interior
  to the MPLS domain have no need to look at the IP header.

  The case where the IP header is not immediately accessible is where
  traffic that is already MPLS encapsulated is exchanged across MPLS
  domains.  The purpose of supporting a label stack is to allow the LSP
  for multiple MPLS domains to be indicated at the ingress of the first
  MPLS domain.  Supplying an MPLS label to an upstream MPLS domain that
  only applies to a continuously changing specific set of source and
  destination hash values is not likely to be feasible.  In that case,
  the top label must be assigned at the MPLS domain ingress.  Labels fur-
  ther down the stack might be present and relevant to downstream MPLS
  domains.  In such a scenario it may make sense to use the label pro-
  vided to the ingress to indicate traffic classification only, with
  the egress and hash determined by finding the IP header and looking
  at both IP source and IP destination.  This isolates the upstream MPLS
  domain from any change to internal routing.

  The use of MPLS across MPLS domains as a means of passing traffic
  classification is beyond the scope of this document.  The inability
  to quickly find the IP header in an MPLS stack of unknown depth is a
  cause for concern.  This may be an issue that has to be addressed in
  the MPLS encapsulation.  Alternately label numbers provided to the
  upstream MPLS domain may be assigned such that the label number can
  reliably indicate the label stack depth of incoming traffic.

  Given that the IP header is at the top of the stack or can be located
  quickly enough, a hash is performed on the IP source and destination.
  The hash number space is subdivided on arbitrary boundaries among the
  set of possible LSPs.  This is indicated in the figure below.



      +-------+-------+-------+-------+
      |   IP Source Address           |
      +-------+-------+-------+-------+
      |   IP Destination Address      |
      +-------+-------+-------+-------+
                      |
                      V
               hash function
                      |
                      V
              +-------+-------+
              |  hash value   |
              +-------+-------+
                      |
                      V
      +-------+-------+-------+-------+
      | hash boundary |  MPLS LSP #1  |
      +-------+-------+-------+-------+
      | hash boundary |  MPLS LSP #2  |
      +-------+-------+-------+-------+
           ...
      +-------+-------+-------+-------+
      | hash boundary |  MPLS LSP #N  |
      +-------+-------+-------+-------+
                      |
                      V
                  selection
                      |
                      V
                 MPLS label






  Conceptually the selection process involves walking through the set
  of hash boundary LSP pairs.  If the hash value exceeds the boundary
  value, the next LSP is tried.  If the last LSP in the data structure
  is reached, it is used regardless of the boundary.  The selection
  would yield an MPLS label to use which would in turn yield all of the
  necessary forwarding information.

  The actual algorithm used is unlikely to be this one, but would in-
  stead be an algorithm with the same net effect that is more efficient
  with a large number of possible LSPs.
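
  A minimal sketch of such a selection (in Python, illustrative only)
  is given below.  It assumes the per-path hash boundaries are kept
  sorted in ascending order so that a binary search over the boundaries
  has the same net effect as the conceptual walk described above.  The
  boundary values and labels are invented for the example.


      import bisect

      # Ascending upper hash boundary for each LSP; the last entry covers
      # the remainder of the 16 bit number space.
      boundaries = [13107, 39321, 65535]
      labels = ["label-A", "label-B", "label-C"]

      def select_label(hash_value):
          # A hash value at or below a boundary selects that LSP; binary
          # search replaces the linear walk through boundary/LSP pairs.
          index = bisect.bisect_left(boundaries, hash_value)
          if index >= len(labels):     # defensively clamp to the last LSP
              index = len(labels) - 1
          return labels[index]

      print(select_label(20000))       # label-B
      print(select_label(65535))       # label-C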



C  Comparison of MPLS--OMP and OSPF-OMP Dynamics


  ISIS--OMP and OSPF--OMP differ in fairly insignificant ways.  The
  behavior of MPLS--OMP and OSPF--OMP differ in more significant ways.

  In OSPF--OMP the paths are a direct result of the SPF calculation.  As
  a result, in OSPF--OMP there are limitations to the set of paths on
  which traffic can be split for a given ingress and egress that can
  result in an imperfect load balance.  This limitation can be overcome
  by carefully selecting the link costs.  The most serious difficulty
  arises when links fail and the OSPF costs are suboptimal for the re-
  maining topology.

  MPLS--OMP differs from OSPF--OMP in the way in which the candidate set
  of paths are created.  MPLS--OMP is capable of optimal load balancing
  in situations in which OSPF--OMP does not create a perfect balance
  across critical cutsets in the network topology.  The immediate ben-
  efit is relief from the tedious task (one of NP-complete time com-
  plexity) of selecting link costs.  A second benefit is an improvement
  in the behavior in the presence of link failure.


D  Simulations of MPLS--OMP Behavior


  @@ To be completed.  This work will be made available at
  http://engr.ans.net/mpls-omp/.  @@











