Network Working Group                                        K. Kompella
Internet-Draft                                          Juniper Networks
Updates: 3031 (if approved)                                    S. Amante
Intended status: Standards Track             Level 3 Communications, LLC
Expires: January 8, 2009                                    July 7, 2008


              The Use of Entropy Labels in MPLS Forwarding
                  draft-kompella-mpls-entropy-label-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 8, 2009.

















Kompella & Amante        Expires January 8, 2009                [Page 1]


Internet-Draft             MPLS Entropy Labels                 July 2008


Abstract

   Load balancing is a powerful tool for engineering traffic across a
   network.  This memo suggests ways of improving load balancing across
   MPLS networks using the notion of "entropy labels".  It defines the
   concept, describes why they are needed, suggests how they can be
   used, and enumerates properties of entropy labels that allow optimal
   benefit.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Motivation . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.2.  Conventions used . . . . . . . . . . . . . . . . . . . . .  6
   2.  Approaches . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   3.  Entropy Labels . . . . . . . . . . . . . . . . . . . . . . . .  8
   4.  Forwarding and Load Balancing Behaviors for Entropy Labels . .  9
     4.1.  Ingress LSR  . . . . . . . . . . . . . . . . . . . . . . .  9
     4.2.  Transit LSR  . . . . . . . . . . . . . . . . . . . . . . .  9
     4.3.  Egress LSRs  . . . . . . . . . . . . . . . . . . . . . . . 10
   5.  Signaling for Entropy Labels . . . . . . . . . . . . . . . . . 11
     5.1.  LDP Signaling  . . . . . . . . . . . . . . . . . . . . . . 11
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 12
   7.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 13
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 14
     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 14
     8.2.  Informative References . . . . . . . . . . . . . . . . . . 14
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15
   Intellectual Property and Copyright Statements . . . . . . . . . . 16


1.  Introduction

   Load balancing, or multi-pathing, is an attempt to balance traffic
   across a network by allowing the traffic to use several paths, not
   just a single shortest path.  Load balancing has several benefits: it
   eases capacity planning; it can help absorb traffic surges by
   spreading them across several links; it allows better resilience by
   offering alternate paths should a link or node fail.

   As providers scale their networks, they resort to a small number of
   techniques to achieve greater bandwidth between nodes and,
   subsequently, depend on load-balancing of traffic over those paths.
   Two widely used techniques are: Link Aggregation (LAG) and Equal-Cost
   Multi-Path (ECMP).  LAG is only used to bond together several
   physical circuits between two adjacent nodes so they appear to
   higher-layer protocols as a single, higher bandwidth "virtual" pipe.
   On the other hand, ECMP is used between two nodes, separated by one
   or more hops, to allow load-sharing over more than just the shortest
   path in the network -- this is typically obtained by arranging IGP
   metrics such that there are several equal cost paths between source-
   destination pairs.  In summary, both of these techniques may, and
   oftentimes do, co-exist in various parts of a given provider's
   network, depending on various choices made by the provider.

   A very important consideration when load balancing is that packets
   belonging to a given "flow" MUST be mapped to the same path, i.e.,
   the same exact sequence of links across the network.  This is to
   avoid jitter, latency and re-ordering issues for the flow.  However,
   what constitutes a flow varies considerably.  A common example of a
   flow is a TCP session.  Other examples are L2TP sessions
   corresponding to broadband users, or traffic within an ATM virtual
   circuit.  A flow is usually defined, for the purposes of forwarding
   and load balancing, by a hash computed on packet headers such that
   packets belonging to a given flow map to the same hash value.  The
   fields chosen for such a hash depend on the packet type; a typical
   set (for IP packets) is the IP source and destination address, the
   protocol type, and (for TCP and UDP traffic) the source and
   destination port numbers.  A conservative choice of fields leads to
   many flows mapping to the same hash value (and consequently poor load
   balancing); an overly aggressive choice may map a flow to multiple
   values, potentially causing the issues mentioned above.
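
   As an illustration only, the following Python sketch shows one way
   such a hash could be computed for IPv4 packets.  The function names,
   the use of SHA-256, and the modulo path selection are assumptions
   made for this example; real forwarding hardware typically uses much
   cheaper hash functions.  The sketch also shows how a conservative
   choice of fields (addresses only) maps distinct TCP sessions between
   the same hosts to the same value.

```python
import hashlib
import socket
import struct


def flow_hash(src_ip, dst_ip, protocol, src_port=0, dst_port=0):
    """Hash the typical IPv4 load-balancing fields to a 32-bit value."""
    key = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BHH", protocol, src_port, dst_port)
    )
    # Any well-mixed hash works; SHA-256 is used here only for clarity.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def coarse_hash(src_ip, dst_ip):
    """A conservative hash that ignores protocol and port numbers."""
    return flow_hash(src_ip, dst_ip, 0)


def pick_path(hash_value, num_paths):
    """Map a hash value onto one of num_paths equal-cost paths."""
    return hash_value % num_paths


TCP = 6
session_a = ("192.0.2.1", "198.51.100.7", TCP, 49152, 80)
session_b = ("192.0.2.1", "198.51.100.7", TCP, 49153, 80)

# Every packet of a session yields the same hash, hence the same path.
path_a = pick_path(flow_hash(*session_a), 4)

# The coarse hash cannot tell the two sessions apart.
same_bucket = coarse_hash(*session_a[:2]) == coarse_hash(*session_b[:2])
```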

   For MPLS networks, most of the same principles (and benefits) apply.
   However, finding useful fields in a packet for the purpose of load
   balancing can be more of a challenge.  In many cases, the extra
   encapsulation may require fairly deep inspection of packets to find
   these fields at every hop.  An idea for removing the need for this
   deep inspection is to extract this information *once*, at the ingress
   of an MPLS Label Switched Path (LSP), and encode, within the label
   stack itself, in addition to the forwarding semantics of the label
   stack, the load balancing information.  This information can then be
   used on all MPLS hops across the network.  There are three key
   reasons why this is beneficial:

   1.  at the ingress of the LSP, MPLS encapsulation hasn't yet
       occurred, so deep inspection is not necessary;

   2.  the ingress of an LSP has more context and information about
       incoming packets than transit nodes; and

   3.  ingress nodes usually operate at lower bandwidths than transit
       nodes, allowing them to do more work per packet.

   This memo describes a few approaches to solving this problem, and
   focuses on one method, which uses the notion of entropy labels.  This
   memo goes on to define entropy labels, and describes why they are
   needed, and the properties of entropy labels in the forwarding plane:
   how they are generated and received and what is expected of transit
   Label Switching Routers (LSRs).  Finally, it describes in general how
   signaling works and what needs to be signaled, as well as specifics
   for LDP.

1.1.  Motivation

   MPLS is a very successful generic forwarding substrate that may
   transport several dozen types of protocols, most notably IP, PWE3,
   VPLS and IP VPNs.  Each type of protocol typically has several
   variants that matter for load-sharing, e.g., IP: IPv4, IPv6, IPv6 in
   IPv4, etc.; PWE3: Ethernet, ATM, Frame-Relay, etc.  There are also
   several different types of Ethernet over PW encapsulation, ATM over
   PW encapsulation, etc.  Finally, given the popularity of MPLS, it is
   likely that it will continue to be extended to transport new
   protocols as the need arises.

   Currently, each MPLS LSR along a given path needs to individually
   infer the underlying protocol within an MPLS packet in order to then
   extract appropriate keys from the payload.  Those keys are then used
   as input into a hash algorithm to determine the specific output
   interface on an LSR that is used for that given "microflow".
   Unfortunately, if the MPLS LSR is unable to infer the MPLS packet's
   payload (as is often the case), it typically resorts to using the
   topmost MPLS labels in the MPLS stack as keys to the load-hashing
   algorithm.  The result is an extremely inequitable distribution of
   traffic across multiple equal-cost paths exiting that node, simply
   because the topmost MPLS labels are very coarse-grained forwarding
   labels that typically describe a next-hop, or provide some other type
   of mux/demux forwarding function, and do not describe the granularity
   of the underlying traffic.

   On the other hand, ingress MPLS LERs (PE routers) have detailed
   knowledge of an MPLS packet's contents, typically through a priori
   configuration of the encapsulation(s) expected at a given PE-CE
   interface (e.g., IPv4, IPv6, VPLS, etc.).  PE routers need this
   information to: a) discern the packet's CoS forwarding treatment; b)
   apply filters to forward or block traffic to/from the CE; c) forward
   routing/control traffic to an onboard management processor; or d)
   load-share the traffic on their uplinks to P routers.  By knowing the
   expected encapsulation types, an ingress PE router can apply a
   smaller subset of payload parsing routines to extract keys
   appropriate for the given protocol.  Ultimately, this should allow
   for significantly improved accuracy in determining the appropriate
   load-balancing behavior for each protocol.

   In addition, compared to MPLS LSRs, PE routers typically operate at
   lower forwarding rates and have more flexible forwarding hardware.
   As a result, a PE router can typically adapt much more quickly to
   new or emerging protocols and determine the appropriate keys for
   load-sharing that type of traffic through the network.

   An additional advantage of applying entropy labels only at the edge
   of the network, on PE routers, is that core/transit MPLS LSRs could
   once again be completely oblivious to the contents of each MPLS
   packet, and use only the outer MPLS labels to determine forwarding
   and forwarding treatment of MPLS packets.  Specifically, there would
   be no reason to duplicate the extremely complex packet/payload
   parsing functionality of MPLS LERs within MPLS LSRs and attempt to
   keep this functionality at parity across all network elements, i.e.,
   both MPLS LSRs and LERs.  Ultimately, this should result in less
   complexity within core LSRs, allowing them to scale more easily to
   higher forwarding rates and larger port density, consume less power,
   etc.  Finally, the approach discussed in this memo would allow for
   more rapid deployment of new protocols, since MPLS LSRs would not
   have to be developed or modified to understand how to properly
   extract keys to achieve good load-sharing of traffic throughout the
   network.

   In summary, MPLS LSRs are ill-equipped to infer the protocol within
   a packet's payload and choose appropriate keys within the payload to
   correctly identify a given "microflow", which is required to provide
   the most equitable load-sharing over multiple equal cost paths.  On
   the other hand, PE routers have both the knowledge and capabilities
   to more accurately determine the load-sharing treatment that should
   be applied to a given protocol encapsulated within MPLS by MPLS
   LSRs.

1.2.  Conventions used

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Label stacks are denoted <L1, L2, L3>, where L1 is the "outermost"
   label and L3 the innermost (closest to the payload).  Packet flows
   are depicted left to right, and signaling is shown right to left
   (unless otherwise indicated).


2.  Approaches

   There are two main approaches to encoding load balancing information
   in the label stack.  The first allocates multiple labels for a
   particular Forwarding Equivalence Class (FEC).  These labels are
   equivalent in terms of forwarding semantics, but having several
   allows flexibility in assigning labels to flows from the same FEC.
   The other approach encodes the load balancing information as a
   separate label in the label stack.  Here, there are two sub-
   approaches, based on whether this load-balancing label is signaled or
   not.

   The first approach has the advantage that the label stack stays the
   same depth whether using label-based load balancing or not; and so,
   consequently, do forwarding operations on transit and egress LSRs.
   However, it has a major drawback in that signaling and forwarding
   state are both increased significantly.  The number of independent
   choices for load balancing packets belonging to a FEC limits the
   effectiveness of load balancing, so one would like this number to be
   large.  However, the larger this number is, the greater the signaling
   and forwarding state in the network.

   The second approach increases the size of the label stack by one
   label.  This consequently affects operations on ingress, transit and
   egress LSRs.  The sub-approach of signaling the load-balancing labels
   increases signaling and forwarding state, and so suffers from some of
   the problems of the first approach.

   The approach advocated by this memo, and the only one described in
   detail, is the one where the load-balancing labels are not signaled.
   With this approach, there is minimal change to signaling state for a
   FEC; also, there is no change in forwarding operations in transit
   LSRs, and no increase of forwarding state in any LSR.  The only
   purpose of these labels is to increase the entropy in the label
   stack, so they are called "entropy labels".


3.  Entropy Labels

   An entropy label (as used here) is a label:

   1.  that is not used for forwarding;

   2.  that is not signaled; and

   3.  whose only purpose in the label stack is to provide "entropy" to
       improve load balancing.

   Entropy labels are generated by an ingress LSR, based entirely on
   load balancing information.  However, they MUST NOT have values in
   the reserved label space (0-15).  Entropy labels MUST be at the
   bottom of the label stack, and thus the "end-of-stack" bit in the
   label MUST be set.  To ensure that they are not used inadvertently
   for forwarding, entropy labels SHOULD have a TTL of 0.
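
   The constraints above can be sketched in Python as follows.  This is
   a minimal illustration, assuming the standard MPLS label stack entry
   layout (20-bit label value, 3-bit traffic class, 1-bit end-of-stack
   flag, 8-bit TTL); the function and constant names are hypothetical
   and not part of this memo.

```python
RESERVED_LABEL_MAX = 15          # label values 0-15 are reserved
LABEL_VALUE_MAX = (1 << 20) - 1  # label values are 20 bits wide


def encode_entry(label, tc=0, eos=False, ttl=0):
    """Pack one label stack entry into a 32-bit integer."""
    if not 0 <= label <= LABEL_VALUE_MAX:
        raise ValueError("label value does not fit in 20 bits")
    return (label << 12) | ((tc & 0x7) << 9) | (int(eos) << 8) | (ttl & 0xFF)


def decode_entry(word):
    """Unpack a 32-bit label stack entry into its four fields."""
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "eos": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


def make_entropy_entry(value):
    """Build an entropy label entry: non-reserved value, EOS set, TTL 0."""
    if value <= RESERVED_LABEL_MAX:
        raise ValueError("entropy labels must not use reserved values")
    return encode_entry(value, eos=True, ttl=0)


entry = make_entropy_entry(0x12345)
fields = decode_entry(entry)
```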

   Since entropy labels are generated by the ingress LSR, an egress LSR
   MUST be able to tell unambiguously that a given label is an entropy
   label.  This of course depends on the underlying application.  If any
   ambiguity is possible, the label above the entropy label MUST be an
   "entropy label indicator" (ELI), which says that the following label
   is an entropy label.  The ELI may be signaled, or may be a label
   reserved specifically for this purpose.  Fortunately, for many
   applications, the use of entropy labels is unambiguous, and does not
   need an ELI.

   Applications for MPLS entropy labels include pseudowires ([RFC4447],
   [I-D.bryant-filsfils-fat-pw]), Layer 3 VPNs ([RFC4364]), VPLS
   ([RFC4761], [RFC4762]) and tunnel LSPs.  This memo specifies general
   properties of entropy labels, and the signaling of entropy labels for
   LDP ([RFC3036]) tunnel LSPs.  Other memos will specify the signaling
   and use of entropy labels for specific applications.


4.  Forwarding and Load Balancing Behaviors for Entropy Labels

4.1.  Ingress LSR

   Suppose that for a particular application (or FEC), an ingress LSR X
   has to push label stack <TL, AL>, where TL is the "tunnel label" and
   AL is the application label.  (Note the use of the convention for
   label stacks described in Section 1.2.  The use of a two-label stack
   is just for illustrative purposes.)  Suppose furthermore that X is to
   use entropy labels for this application.  Thus, the resultant label
   stack will be <TL, AL, EL>, where EL is the entropy label.

   When a packet for this FEC arrives at X, X must first determine the
   fields that it will use for load balancing.  Typically, X will then
   generate a hash H over those fields.  X will then pick an outgoing
   label stack <TL, AL> to push on the packet.  However, X must also
   generate an entropy label EL (based either directly on the load
   balancing fields, or on the hash H).  EL is a "regular" 32-bit label,
   encoded in the usual way; however, the EOS bit MUST be 1 and the TTL
   field MUST be 0.  X then pushes <TL, AL, EL> on to the packet before
   forwarding it to the next LSR.  If X is told (via signaling) that it
   must use an entropy label indicator ELI, then X instead pushes <TL,
   AL, ELI, EL> on to the packet.

   Note that ingress LSR X MUST NOT include an entropy label unless the
   egress LSR for this FEC has indicated that it is ready to receive
   entropy labels.  Furthermore, if the egress LSR has signaled that an
   ELI is needed, then X MUST include the ELI with the entropy label;
   otherwise, X MUST NOT use entropy labels.
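
   The ingress procedure can be sketched as follows.  This is an
   illustrative Python model of label values only (the EOS bit and TTL
   are omitted); the function names, the use of CRC-32 as the hash H,
   and the mapping of H into the non-reserved label range are
   assumptions made for this example.

```python
import zlib

FIRST_UNRESERVED_LABEL = 16   # values 0-15 are reserved
LABEL_SPACE = 1 << 20         # label values are 20 bits wide


def entropy_label(flow_key):
    """Derive an entropy label value from the load-balancing fields."""
    h = zlib.crc32(flow_key)  # the hash H; CRC-32 is an arbitrary choice
    return FIRST_UNRESERVED_LABEL + h % (LABEL_SPACE - FIRST_UNRESERVED_LABEL)


def ingress_stack(tl, al, flow_key, egress_accepts_el, eli=None):
    """Return the label values X pushes, outermost first.

    eli is the indicator label if the egress signaled that one is
    needed, or None if entropy labels are unambiguous without it.
    """
    if not egress_accepts_el:
        return [tl, al]              # X MUST NOT send an entropy label
    el = entropy_label(flow_key)
    if eli is not None:
        return [tl, al, eli, el]     # <TL, AL, ELI, EL>
    return [tl, al, el]              # <TL, AL, EL>


key = b"192.0.2.1|198.51.100.7|6|49152|80"
plain = ingress_stack(1001, 2002, key, egress_accepts_el=False)
with_el = ingress_stack(1001, 2002, key, egress_accepts_el=True)
with_eli = ingress_stack(1001, 2002, key, egress_accepts_el=True, eli=3003)
```

   Because the entropy label is a pure function of the flow key, all
   packets of a flow carry the same label stack.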

4.2.  Transit LSR

   Transit LSRs have no change in forwarding behavior.  For load
   balancing, transit LSRs SHOULD use the whole label stack (e.g., for
   computing the load balance hash).  Transit LSRs MAY choose to look
   beyond the label stack for further load balancing information;
   however, if entropy labels are being used, this may not be very
   useful.  In a mixed environment (or for backward compatibility), this
   is the simplest approach.

   Thus, transit LSRs are almost unaffected by the use of entropy
   labels.  If transit LSRs were programmed to use a subset of the label
   stack, they may have to be reconfigured to use the full stack.  But
   otherwise, no changes are needed.
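
   As a rough illustration of why hashing the whole label stack helps,
   the Python sketch below selects a next hop from a hash over all
   label values.  The function names, the link names, and the use of
   CRC-32 are assumptions made for this example, not a description of
   any particular LSR.

```python
import struct
import zlib


def stack_hash(labels):
    """Hash every label value in the stack, outermost first."""
    data = b"".join(struct.pack("!I", label) for label in labels)
    return zlib.crc32(data)


def choose_next_hop(labels, next_hops):
    """Select one of several equal-cost next hops from the stack hash."""
    return next_hops[stack_hash(labels) % len(next_hops)]


next_hops = ["link-0", "link-1", "link-2", "link-3"]
TL, AL = 1001, 2002

# Without entropy labels, every flow of this FEC carries <TL, AL> and
# therefore lands on a single link.
without_el = {choose_next_hop([TL, AL], next_hops) for _ in range(64)}

# With entropy labels, each flow carries its own EL, so the same hash
# function can spread flows of the FEC over several links.
with_el = {choose_next_hop([TL, AL, el], next_hops) for el in range(16, 80)}
```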







4.3.  Egress LSRs

   An ingress LSR X MUST NOT send entropy labels to an egress LSR Y
   unless Y has signaled its readiness to receive such labels.  Y must
   also determine (for a particular application or FEC) whether it can
   tell if the ingress has added an entropy label; if Y cannot do so, Y
   MUST either request that an ELI be used for this FEC or require the
   use of entropy labels.  (See Section 5 for more details on
   signaling.)

   Suppose Y has signaled that it is prepared to receive entropy labels
   for a given FEC.  In this case, Y must be able to distinguish whether
   an ingress LSR has inserted an entropy label or not based solely on
   the 'end-of-stack' (EOS) bit on the application label for this FEC.
   When Y receives a packet with this application label, then Y looks to
   see if the EOS bit is set.  If not, Y assumes that the label below is
   an entropy label and pops it.  Y MAY choose to ensure that the
   entropy label has its EOS bit set and TTL=0.  Y then processes the
   packet as usual.  Implementations may choose the order in which
   they apply these operations, but the net result should be as
   specified.
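
   The egress check described above can be sketched in Python as
   follows.  The sketch assumes that the tunnel label has already been
   removed, so the stack starts at the application label, and that no
   ELI is in use; the Entry type and the function name are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    label: int
    eos: bool
    ttl: int


def egress_process(stack: List[Entry],
                   verify: bool = True) -> Tuple[int, Optional[int]]:
    """Return (application label, entropy label or None)."""
    al = stack[0]
    if al.eos:
        # EOS set on the application label: no entropy label follows.
        return al.label, None
    if len(stack) < 2:
        raise ValueError("EOS bit clear but no label follows")
    el = stack[1]
    # Optional sanity check: the entropy label has EOS set and TTL 0.
    if verify and not (el.eos and el.ttl == 0):
        raise ValueError("label below application label is not a valid EL")
    return al.label, el.label  # the entropy label is popped and ignored


no_el = egress_process([Entry(2002, True, 64)])
has_el = egress_process([Entry(2002, False, 64), Entry(70000, True, 0)])
```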






























5.  Signaling for Entropy Labels

   Signaling for entropy labels exchanges three types of information:

   1.  whether an LSR Y is prepared to receive entropy labels, or
       requires that entropy labels be used,

   2.  whether receiving LSR Y requires ELIs with entropy labels, and if
       so, what label to use as the ELI, and

   3.  whether an LSR X is able to send entropy labels.

   The uses of this information can be illustrated as follows.  If an
   LSR Y is prepared to receive entropy labels for an application (or
   FEC), it signals that to the ingress LSR(s).  That means that an
   ingress LSR for this application MAY send an entropy label for this
   application; Y MUST be able to distinguish whether or not an entropy
   label was sent based solely on the EOS bit on the application label.
   If this is not the case, Y can choose one of two approaches.  Y can
   signal that an ELI MUST be used for this FEC; Y may also signal what
   ELI to use.  In this case, an ingress LSR will either not send an
   entropy label, or push the ELI before the entropy label.  This makes
   the use/non-use of an entropy label unambiguous.  However, this also
   increases the size of the label stack.  An alternative approach is
   that Y signals that entropy labels MUST be used.  An ingress LSR MUST
   acknowledge that it will do so (via signaling); if an ingress LSR
   cannot do so, the signaling for this application MUST renegotiate to
   not use entropy labels (or fail).
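
   The outcomes described above can be summarized in a small Python
   sketch.  The EgressMode names and the negotiate function are
   hypothetical, and the sketch assumes that an ingress able to send
   entropy labels always chooses to do so when permitted; no protocol
   encoding is implied.

```python
from enum import Enum


class EgressMode(Enum):
    NOT_SUPPORTED = "egress is not prepared to receive entropy labels"
    OPTIONAL = "egress can tell from the EOS bit whether an EL is present"
    ELI_REQUIRED = "egress needs an ELI above any entropy label"
    EL_REQUIRED = "egress requires an entropy label on every packet"


def negotiate(egress_mode, ingress_can_send_el):
    """Return the labels the ingress adds below the application label,
    or None when the signaling must be renegotiated (or fail)."""
    if egress_mode is EgressMode.EL_REQUIRED:
        return ["EL"] if ingress_can_send_el else None
    if egress_mode is EgressMode.NOT_SUPPORTED or not ingress_can_send_el:
        return []                # no entropy label is sent
    if egress_mode is EgressMode.ELI_REQUIRED:
        return ["ELI", "EL"]     # the ELI is pushed before the EL
    return ["EL"]                # EOS bit alone disambiguates
```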

   The specific protocols and encoding details for the above will depend
   on the underlying application; see [I-D.bryant-filsfils-fat-pw] for
   an example for pseudowires.

5.1.  LDP Signaling

   TBD


6.  Security Considerations

   Having security is a Good Thing.


7.  Acknowledgments

   We wish to thank Ulrich Drafz for his contributions, as well as the
   entire "hash label" team for their valuable comments and discussion.


8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2.  Informative References

   [I-D.bryant-filsfils-fat-pw]
              Bryant, S., Filsfils, C., and U. Drafz, "Load Balancing
              Fat MPLS Pseudowires", draft-bryant-filsfils-fat-pw-01
              (work in progress), February 2008.

   [RFC3036]  Andersson, L., Doolan, P., Feldman, N., Fredette, A., and
              B. Thomas, "LDP Specification", RFC 3036, January 2001.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4447]  Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G.
              Heron, "Pseudowire Setup and Maintenance Using the Label
              Distribution Protocol (LDP)", RFC 4447, April 2006.

   [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
              (VPLS) Using BGP for Auto-Discovery and Signaling",
              RFC 4761, January 2007.

   [RFC4762]  Lasserre, M. and V. Kompella, "Virtual Private LAN Service
              (VPLS) Using Label Distribution Protocol (LDP) Signaling",
              RFC 4762, January 2007.


Authors' Addresses

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   US

   Email: kireeti@juniper.net


   Shane Amante
   Level 3 Communications, LLC
   1025 Eldorado Blvd
   Broomfield, CO
   US

   Email: shane@level3.net


Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.