Internet Draft Document                            Vach Kompella
       Category: Standards Track                              Joe Regan
       Expires: August 2008                              Alcatel-Lucent
    
                                                           Shane Amante
                                                 Level 3 Communications
    
    
                                                      February 18, 2008
    
                     Conversation Hashing for Pseudowires
                    draft-vkompella-pwe3-hash-label-00.txt
    
    Status of this Memo
    
       By submitting this Internet-Draft, each author represents that
       any applicable patent or other IPR claims of which he or she is
       aware have been or will be disclosed, and any of which he or she
       becomes aware will be disclosed, in accordance with Section 6 of
       BCP 79.
    
       Internet-Drafts are working documents of the Internet
       Engineering Task Force (IETF), its areas, and its working
       groups.  Note that other groups may also distribute working
       documents as Internet-Drafts.
    
       Internet-Drafts are draft documents valid for a maximum of six
       months and may be updated, replaced, or obsoleted by other
       documents at any time.  It is inappropriate to use Internet-
       Drafts as reference material or to cite them other than as "work
       in progress."
    
       The list of current Internet-Drafts can be accessed at
          http://www.ietf.org/ietf/1id-abstracts.txt.
    
       The list of Internet-Draft Shadow Directories can be accessed at
          http://www.ietf.org/shadow.html.
    
       This Internet-Draft will expire on August 21, 2008.
    
    Copyright Notice
    
          Copyright (C) The IETF Trust (2008).
    
    
    
    Abstract
    
       V. Kompella            Expires August 2008           [Page 1]


       Internet-Draft        Hashing on Pseudowires       February 2008
    
    
    
       This draft proposes a method for hashing traffic carried over
       pseudowires at a finer granularity.  Most forwarding engines can
       hash on label stacks, so the approach here is to introduce
       additional labels that do not affect the handling of packets but
       that identify a conversation, so that conversations within a
       pseudowire can be hashed separately.
    
    1. Introduction
    
       This draft proposes a method for hashing traffic carried over
       pseudowires at a finer granularity.  Typically, forwarding
       hardware can look at some fields in packets to construct hash
       buckets for conversations or flows.  The ingress node can look
       at the un-encapsulated packet and spread flows across its next
       hops.  At intermediate nodes, however, there is no information
       about which layer 2 protocol encapsulation a pseudowire packet
       carries, so the only thing the hardware can hash on is the label
       stack.  The granularity obtained this way is inadequate for real
       load-balancing, especially when the pseudowires emulate fat
       trunks.
    
    2. The Solution
    
       When two PEs open a targeted LDP session between them, the Hash
       Label TLV is exchanged as part of the Capability exchange
       between the two peers [LDP-Cap].  The Hash Label TLV specifies a
       set of labels for which the advertising PE, on receiving a
       packet that carries one of them, will POP the label and continue
       on to the next label in the stack.
    
       Since forwarding engines generate hash buckets based on the
       label stack, the Hash Label(s) can be used to provide some
       diversity in the conversations in a pseudowire.
    
       Suppose that an LDP session has been established between two
       peers, P and Q, and Q has signaled ten Hash Labels in the range
       101 through 110 (inclusive).  On receiving a packet from the
       attachment circuit, node P will hash the packet into one of ten
       buckets, one for each Hash Label received by P.  P will then
       encapsulate the packet with the PW label at the bottom of stack,
       add the appropriate Hash Label corresponding to the hash bucket,
       and finally add the tunnel encapsulation.  Assume for the moment
       that the tunnel encapsulation is another label.
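
       The ingress behavior in this example can be sketched as follows.
       This is only an illustration under stated assumptions, not part
       of the proposal: CRC-32 as the hash function, the byte-string
       flow key, and the names pick_hash_label and build_label_stack
       are all hypothetical, as are the tunnel and PW label values in
       the usage lines.

```python
import zlib

# The ten Hash Labels that Q signaled in the example: 101..110 inclusive.
HASH_LABELS = list(range(101, 111))


def pick_hash_label(flow_key: bytes, hash_labels=HASH_LABELS) -> int:
    """Hash a flow key into one bucket per advertised Hash Label."""
    bucket = zlib.crc32(flow_key) % len(hash_labels)   # assumed hash function
    return hash_labels[bucket]


def build_label_stack(tunnel_label: int, flow_key: bytes, pw_label: int) -> list:
    """Return the stack top-first: tunnel label, Hash Label, PW label."""
    return [tunnel_label, pick_hash_label(flow_key), pw_label]


if __name__ == "__main__":
    # Hypothetical values: tunnel label 3000, PW label 2001.
    stack = build_label_stack(3000, b"src-mac|dst-mac", 2001)
    print(stack)
```

       Packets of the same conversation always yield the same flow key
       and therefore the same Hash Label, which keeps a conversation on
       one path.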
    
       At P, the layer 2 fields are visible, so a next hop can be
       chosen from among the multiple (e.g., ECMP or LAG) next hops.
       At an LSR node those fields are not visible, but because the
       Hash Label differs from one conversation to the next, the label
       stack still provides variability among packets that belong to
       the same pseudowire.
    
       The same set of labels used for hashing can be used between Q
       and any other node with which it sets up a targeted LDP session,
       and the same set of labels can be used across different
       pseudowires.
    
       Note that this solution can be extended.  For example, if P is
       capable of imposing four labels, and if Q is capable of
       processing a four-label stack, then P can hash the flows into
       100 buckets by using two of the ten hash labels above (10 x 10
       combinations) for the conversation diversity.  This would also
       require that the intermediate nodes be capable of hashing a
       four-label stack.
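
       The arithmetic behind the 100 buckets: with ten Hash Labels and
       two hash label positions there are 10 x 10 = 100 ordered pairs.
       The draft does not say how a bucket maps to a pair of labels;
       the divmod mapping, the CRC-32 hash, and the function names
       below are assumptions made only for this sketch.

```python
import zlib

HASH_LABELS = list(range(101, 111))   # ten labels, 101..110 inclusive


def labels_for_bucket(bucket: int, hash_labels=HASH_LABELS):
    """Map a bucket index in [0, n*n) to an (upper, lower) Hash Label pair."""
    upper, lower = divmod(bucket, len(hash_labels))
    return hash_labels[upper], hash_labels[lower]


def pick_two_hash_labels(flow_key: bytes, hash_labels=HASH_LABELS):
    """Hash a flow key into one of n*n buckets and return its label pair."""
    n = len(hash_labels)
    bucket = zlib.crc32(flow_key) % (n * n)   # assumed hash function
    return labels_for_bucket(bucket, hash_labels)


if __name__ == "__main__":
    pairs = {labels_for_bucket(b) for b in range(100)}
    print(len(pairs))   # number of distinct label pairs
```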
    
       The order of the labels must be: the PW label at the bottom,
       then Router Alert (if present), then the Hash Label(s), and
       finally the tunnel encapsulation at the top of the stack.  The
       tunnel encapsulation may be a single label, or a pair of labels
       if the MPLS protocol imposes them (e.g., when using facility
       bypass protection [RFC4090] or inter-area LDP [LDP-Ext]).
    
    2.1. Protocol Format
    
       We introduce a new Hash Label TLV which has the following
       format.
    
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |U|F|  Hash Label TLV           |        Length                 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | NumPushLabels | NumPopLabels  | NumHashLabels |   AllocType   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        MBZ            |    Label 1                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        MBZ            |    Label 2                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |        MBZ            |      "                                |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    
       Hash Label TLV Type.
            The type of the Hash Label TLV (TBD from IANA).
    
       Length.
            The length of the TLV.
    
       NumPushLabels.
            The number of hash labels the node can push.
    
       NumPopLabels.
            The number of hash labels the node can pop.
    
       NumHashLabels.
            The number of hash labels provided for use.
    
       AllocType.
            The type of allocation scheme.  If AllocType = 0, then the
       labels following the AllocType are a list of labels.  If
       AllocType = 1, then exactly two labels must follow the
       AllocType, and they provide the lower and upper bound of a range
       of labels (inclusive).
    
       Label 1, Label 2, etc.
            If AllocType = 0, these are actual labels that may be used
       as hash labels.  If AllocType = 1, then they are the lower and
       upper bound of a range of hash labels that may be used.
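
       The layout above can be exercised with a short encoder and
       decoder.  This is a sketch under several assumptions that the
       draft does not settle: the type value 0x3F00 is a placeholder
       (the real value is TBD from IANA), Length is taken to count the
       octets that follow the Length field, NumHashLabels for a range
       is taken to be the size of the range, and the function names
       are hypothetical.

```python
import struct

HASH_LABEL_TLV_TYPE = 0x3F00   # placeholder only; the real type is TBD
ALLOC_LIST, ALLOC_RANGE = 0, 1


def encode_hash_label_tlv(num_push, num_pop, alloc_type, labels, u=0, f=0):
    """Build the TLV bytes; `labels` is a list, or [low, high] for a range."""
    if alloc_type == ALLOC_RANGE:
        if len(labels) != 2:
            raise ValueError("AllocType 1 requires exactly two labels")
        num_hash = labels[1] - labels[0] + 1   # assumed meaning for a range
    else:
        num_hash = len(labels)
    value = struct.pack("!BBBB", num_push, num_pop, num_hash, alloc_type)
    for label in labels:
        # One 32-bit word per label: 12 MBZ bits, then the 20-bit label.
        value += struct.pack("!I", label & 0xFFFFF)
    type_field = (u << 15) | (f << 14) | (HASH_LABEL_TLV_TYPE & 0x3FFF)
    # Assumption: Length counts the octets after the Length field.
    return struct.pack("!HH", type_field, len(value)) + value


def decode_hash_label_tlv(data):
    """Parse the TLV and return its fields, expanding a range into a list."""
    type_field, length = struct.unpack_from("!HH", data, 0)
    num_push, num_pop, num_hash, alloc_type = struct.unpack_from("!BBBB", data, 4)
    count = (length - 4) // 4
    words = struct.unpack_from("!%dI" % count, data, 8)
    labels = [w & 0xFFFFF for w in words]
    if alloc_type == ALLOC_RANGE:
        labels = list(range(labels[0], labels[1] + 1))
    return {
        "type": type_field & 0x3FFF,
        "num_push": num_push,
        "num_pop": num_pop,
        "num_hash": num_hash,
        "alloc_type": alloc_type,
        "labels": labels,
    }


if __name__ == "__main__":
    tlv = encode_hash_label_tlv(1, 1, ALLOC_RANGE, [101, 110])
    print(tlv.hex(), decode_hash_label_tlv(tlv)["labels"])
```

       With AllocType = 1 the ten labels 101 through 110 occupy only
       two label words, which is why a range is the more compact form
       for contiguous allocations.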
    
    3. Packet format with PW hash labels
    
       The following is an example of what could happen if hash labels
       are exchanged between two nodes P and Q, where P sends Q the
       Hash Label TLV with 10 labels between 101 and 110.
    
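       The figure below shows the PW and tunnel labels.  Each arrow
       points in the direction in which the label is advertised;
       packets that carry the label travel in the opposite direction.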
    
                               PW label 2001
                       ------------------------------
                       |         -----              |
                       |  ------>| C |-------       |
                       |  | 4000 ----- 7000 |       |
                       |  |                 v       v
                       -----    -----    -----    -----
               AC1-----| P |----| A |----| B |----| Q |-----AC2
                       -----    -----    -----    -----
                        |       ^   |    ^   |     ^
                        |       |   |    |   |     |
                        ---------   ------   -------
                          3000      5000      6000
       Tunnel Labels:
       P->A: 3000
       P->C: 4000
       A->B: 5000
       B->Q: 6000
       C->B: 7000
    
    
       Q hashes a packet from attachment circuit AC2, on whatever
       relevant fields define a conversation or flow, and comes up with
       an index between 1 and 10, say 5.  Then Q constructs the packet
       to P to look like:
    
    
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         6000  (Tunnel Label)  |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         105 (Hash Label)      |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         2001 (PW Label) (BOS) |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         Payload               |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
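
       For readers who want to see the stack as octets, the sketch
       below encodes it using the standard MPLS label stack entry
       layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack
       flag, 8-bit TTL).  The traffic class of 0, the TTL of 255, and
       the function name are assumptions for illustration only.

```python
import struct


def encode_label_stack(labels, ttl=255, traffic_class=0):
    """Encode labels (top-first) as 32-bit MPLS label stack entries."""
    out = b""
    for index, label in enumerate(labels):
        # Only the last entry (the PW label here) has the S bit set.
        bottom = 1 if index == len(labels) - 1 else 0
        entry = (label << 12) | (traffic_class << 9) | (bottom << 8) | ttl
        out += struct.pack("!I", entry)
    return out


if __name__ == "__main__":
    # Tunnel label 6000, Hash Label 105, PW label 2001 (bottom of stack).
    print(encode_label_stack([6000, 105, 2001]).hex())
```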
    
       When B receives the packet, it will hash the label stack {6000,
       105, 2001} and come up with one of the next-hops A or C.  Say
       the result is A.  The packet from B to A will look like:
    
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         5000  (Tunnel Label)  |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         105 (Hash Label)      |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         2001 (PW Label) (BOS) |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         Payload               |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
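
       B's choice of next hop can be sketched as a hash over the whole
       label stack.  Real forwarding engines use their own hash
       functions and field selections; CRC-32 and the name
       choose_next_hop are assumptions here.  The point is only that
       stacks differing in the Hash Label alone can land on different
       next hops, while a given stack always maps to the same one.

```python
import struct
import zlib


def choose_next_hop(label_stack, next_hops):
    """Pick a next hop by hashing every label in the stack (top-first)."""
    key = b"".join(struct.pack("!I", label) for label in label_stack)
    return next_hops[zlib.crc32(key) % len(next_hops)]   # assumed hash


if __name__ == "__main__":
    # Same tunnel and PW labels; only the Hash Label varies per conversation.
    for hash_label in range(101, 111):
        stack = [6000, hash_label, 2001]
        print(stack, choose_next_hop(stack, ["A", "C"]))
```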
    
       P would then receive the following packet from A:
    
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         3000  (Tunnel Label)  |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         105 (Hash Label)      |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         2001 (PW Label) (BOS) |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                       |         Payload               |
                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    
       P will pop 3000, find the hash label 105 (action pop), and then
       process 2001 as the PW label to forward the packet out AC1 with
       whatever encapsulation that attachment circuit requires.
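
       The egress steps above can be sketched as follows.  The table
       contents and the name egress_process are hypothetical, and the
       sketch ignores Router Alert and assumes the tunnel label is
       still present when the packet arrives at P.

```python
# Labels P advertised in its Hash Label TLV: 101..110 inclusive.
HASH_LABELS = set(range(101, 111))
# Hypothetical PW label table: PW label -> attachment circuit.
PW_TABLE = {2001: "AC1"}


def egress_process(label_stack):
    """Process a top-first label stack and return the outgoing AC."""
    stack = list(label_stack)
    stack.pop(0)                      # pop the tunnel label
    while stack and stack[0] in HASH_LABELS:
        stack.pop(0)                  # Hash Label: pop and continue
    if len(stack) != 1 or stack[0] not in PW_TABLE:
        raise ValueError("no PW label at the bottom of the stack")
    return PW_TABLE[stack[0]]


if __name__ == "__main__":
    print(egress_process([3000, 105, 2001]))
```

       The loop also handles the two-hash-label case, since every
       label found in the advertised set is popped before the PW label
       is looked up.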
    
    
       The rationale for putting the hash label between the PSN tunnel
       encapsulation and the PW label is that labels are processed from
       the top of the stack down.  If the hash labels sat below the PW
       label, the forwarding engine would have to process the PW label,
       take the appropriate action, and then remember that context
       while it processed the hash labels.
    
    4. Future considerations
    
       One future application of this method would be to create a basis
       for hash diversity without having to peek below the label stack
       for IP traffic carried over LDP LSPs.
    
    5. References
    
       Normative References
    
    
       Informative References
    
       [LDP-Cap] "LDP Capabilities," R. Thomas et al,
       draft-ietf-mpls-ldp-capabilities-01.txt, work in progress,
       February 2008.
    
       [RFC4090] "Fast Reroute Extensions to RSVP-TE for LSP Tunnels,"
       P. Pan, RFC 4090, May 2005.
    
       [LDP-Ext] "LDP extension for Inter-Area LSP," B. Decraene et al,
       draft-ietf-mpls-ldp-interarea-02.txt, work in progress, February
       2008.
    
    
    6. Security Considerations
    
       The extensions proposed here raise no security issues beyond
       those that already exist in the base PWE3 standards.
    
    7. IANA Considerations
    
       No IANA allocations have been specified yet (but a new TLV type
       will be forthcoming, as well as changes to the LDP Capability
       FEC TLV).
    
    8. Authors' Addresses
    
       Vach Kompella
       Alcatel-Lucent
       vach.kompella@alcatel-lucent.com
    
    
       Joe Regan
       Alcatel-Lucent
       joe.regan@alcatel-lucent.com
    
       Shane Amante
       Level 3 Communications
       shane@castlepoint.net
    
    9. Full Copyright Statement
    
       Copyright (C) The IETF Trust (2008).
    
       This document is subject to the rights, licenses and
       restrictions contained in BCP 78, and except as set forth
       therein, the authors retain all their rights.
    
       This document and the information contained herein are provided
       on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
       REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY,
       THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM
       ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
       ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
       INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY
       OR FITNESS FOR A PARTICULAR PURPOSE.
    
    10. Intellectual Property
    
       The IETF takes no position regarding the validity or scope of
       any Intellectual Property Rights or other rights that might be
       claimed to pertain to the implementation or use of the
       technology described in this document or the extent to which any
       license under such rights might or might not be available; nor
       does it represent that it has made any independent effort to
       identify any such rights.  Information on the procedures with
       respect to rights in RFC documents can be found in BCP 78 and
       BCP 79.
    
       Copies of IPR disclosures made to the IETF Secretariat and any
       assurances of licenses to be made available, or the result of an
       attempt made to obtain a general license or permission for the
       use of such proprietary rights by implementers or users of this
       specification can be obtained from the IETF on-line IPR
       repository at http://www.ietf.org/ipr.
    
       The IETF invites any interested party to bring to its attention
       any copyrights, patents or patent applications, or other
       proprietary rights that may cover technology that may be
       required to implement this standard.  Please address the
       information to the IETF at ietf-ipr@ietf.org.
    
    
    11. Acknowledgments
    
       Funding for the RFC Editor function is provided by the IETF
       Administrative Support Activity (IASA).