PWE3 S. Bryant, Ed.
Internet-Draft C. Filsfils
Intended status: Standards Track Cisco Systems
Expires: July 31, 2010 U. Drafz
Deutsche Telekom
V. Kompella
J. Regan
Alcatel-Lucent
S. Amante
Level 3 Communications
January 27, 2010
Flow Aware Transport of Pseudowires over an MPLS PSN
draft-ietf-pwe3-fat-pw-03
Abstract
Where the payload carried over a pseudowire carries a number of
identifiable flows it can in some circumstances be desirable to carry
those flows over the equal cost multiple paths (ECMPs) that exist in
the packet switched network. Most forwarding engines are able to
hash based on label stacks and use this to balance flows over ECMPs.
This draft describes a method of identifying the flows, or flow
groups, to the label switched routers by including an additional
label in the label stack.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [RFC2119].
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Bryant, et al. Expires July 31, 2010 [Page 1]
Internet-Draft FAT-PW January 2010
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on July 31, 2010.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
Bryant, et al. Expires July 31, 2010 [Page 2]
Internet-Draft FAT-PW January 2010
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. ECMP in Label Switched Routers . . . . . . . . . . . . . . 5
1.2. Flow Label . . . . . . . . . . . . . . . . . . . . . . . . 5
2. Native Service Processing Function . . . . . . . . . . . . . . 6
3. Pseudowire Forwarder . . . . . . . . . . . . . . . . . . . . . 6
3.1. Encapsulation . . . . . . . . . . . . . . . . . . . . . . 7
4. Signaling the Presence of the Flow Label . . . . . . . . . . . 8
4.1. Structure of Flow Label Sub-TLV . . . . . . . . . . . . . 9
5. Multi-Segment Pseudowires . . . . . . . . . . . . . . . . . . 9
6. OAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7. Applicability of FAT PWs . . . . . . . . . . . . . . . . . . . 11
7.1. Equal Cost Multiple Paths . . . . . . . . . . . . . . . . 12
7.2. Link Aggregation Groups . . . . . . . . . . . . . . . . . 13
7.3. The Single Large Flow Case . . . . . . . . . . . . . . . . 13
7.4. MPLS-TP . . . . . . . . . . . . . . . . . . . . . . . . . 14
8. Applicability to MPLS . . . . . . . . . . . . . . . . . . . . 15
9. Security Considerations . . . . . . . . . . . . . . . . . . . 15
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15
11. Congestion Considerations . . . . . . . . . . . . . . . . . . 16
12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 16
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
13.1. Normative References . . . . . . . . . . . . . . . . . . . 16
13.2. Informative References . . . . . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18
Bryant, et al. Expires July 31, 2010 [Page 3]
Internet-Draft FAT-PW January 2010
1. Introduction
A pseudowire (PW) [RFC3985] is normally transported over one single
network path, even if multiple Equal Cost Multiple Paths (ECMP) exit
between the ingress and egress PW provider edge (PE)
equipments[RFC4385] [RFC4928]. This is required to preserve the
characteristics of the emulated service (e.g. to avoid misordering
SAToP pseudowire packets [RFC4553] or subjecting the packets to
unusable inter-arrival times ). The use of a single path to preserve
order remains the default mode of operation of a pseudowire (PW).
The new capability proposed in this document is an OPTIONAL mode
which may be used when the use of ECMP paths for is known to be
beneficial (and not harmful) to the operation of the PW.
Some pseudowires are used to transport large volumes of IP traffic
between routers at two locations. One example of this is the use of
an Ethernet pseudowire to create a virtual direct link between a pair
of routers. Such pseudowire's may carry from hundred's of Mbps to
Gbps of traffic. Such pseudowire's do not require strict ordering to
be preserved between packets of the pseudowire. They only require
ordering to be preserved within the context of each individual
transported IP flow. Some operators have requested the ability to
explicitly configure such a pseudowire to leverage the availability
of multiple ECMP paths. This allows for better capacity planning as
the statistical multiplexing of a larger number of smaller flows is
more efficient than with a smaller set of larger flows. Although
Ethernet is used as an example above, the mechanisms described in
this draft are general mechanisms that may be applied to any
pseudowire type in which there are identifiable flows, and in which
there is no requirement to preserve the order between those flows.
Typically, forwarding hardware can deduce that an IP payload is being
directly carried by an MPLS label stack, and is capable of looking at
some fields in packets to construct hash buckets for conversations or
flows. However, an intermediate node has no information on the type
pseudowire being carried in the packet. This limits the forwarder at
the intermediate node to only being able to make an ECMP choice based
on a hash of the label stack. In the case of a pseudowire emulating
a high bandwidth trunk, the granularity obtained by hashing the
default label stack is inadequate for satisfactory load-balancing.
The ingress node, however, is in the special position of being able
to look at the un-encapsulated packet and spread flows amongst any
available ECMP paths, or even any Loop-Free Alternates [RFC5286] .
This draft proposes a method to introduce granularity on the hashing
of traffic running over pseudowires by introducing an additional
label, chosen by the ingress node, and placed at the bottom of the
label stack.
Bryant, et al. Expires July 31, 2010 [Page 4]
Internet-Draft FAT-PW January 2010
In addition to providing an indication of the flow structure for use
in ECMP forwarding decisions, the mechanism described in the document
may also be used to select flows for distribution over an 802.1ad
link aggregation group that has been used in an MPLS network.
1.1. ECMP in Label Switched Routers
Label switched routers commonly hash the label stack or some elements
of the label stack as a method of discriminating between flows, in
order to distribute those flows over the available equal cost
multiple paths that exist in the network. Since the label at the
bottom of stack is usually the label most closely associated with the
flow, this normally provides the greatest entropy, and hence is
usually included in the hash. This draft describes a method of
adding an additional label at the bottom of stack in order to
facilitate the load balancing of the flows within a pseudowire over
the available ECMPs. A similar design for general MPLS use has also
been proposed [I-D.kompella-mpls-entropy-label], however that is
outside the scope of this draft.
An alternative method of load balancing by creating a number of
pseudowires and distributing the flows amongst them was considered,
but was rejected because:
o It did not introduce as much entropy as the load balance label
method.
o It required additional pseudowires to be set up and maintained.
1.2. Flow Label
An additional label is interposed between the pseudowire label and
the control word, or if the control word is not present, between the
pseudowire label and the pseudowire payload. This additional label
is called the flow label. Indivisible flows within the pseudowire
MUST be mapped to the same flow label by the ingress PE. The flow
label stimulates the correct ECMP load balancing behaviour in the
packet switched network (PSN). On receipt of the pseudowire packet
at the egress PE (which knows this additional label is present) the
flow label is discarded without processing.
Note that the flow label MUST NOT be an MPLS reserved label (values
in the range 0..15) [RFC3032], but is otherwise unconstrained by the
protocol.
Considerations of the TTL value are described in the Security section
of this document. The flow label can never become the top label in
normal operation, and hence the TTL in the flow label is never used
Bryant, et al. Expires July 31, 2010 [Page 5]
Internet-Draft FAT-PW January 2010
to determine whether the packet should be discarded due to TTL
expiry. Therefore there are no lower restrictions on the TTL value.
2. Native Service Processing Function
The Native Service Processing (NSP) function [RFC3985] is a component
of a PE that has knowledge of the structure of the emulated service
and is able to take action on the service outside the scope of the
pseudowire. In this case it is required that the NSP in the ingress
PE identify flows, or groups of flows within the service, and
indicate the flow (group) identity of each packet as it is passed to
the pseudowire forwarder. As an example, where the PW type is an
Ethernet, the NSP might parse the ingress Ethernet traffic and
consider all of the IP traffic. This traffic could then be
categorised into flows by considering all traffic with the same
source and destination address pair to be a single indivisible flow.
Since this is an NSP function, by definition, the method used to
identify a flow is outside the scope of the pseudowire design.
Similarly, since the NSP is internal to the PE, the method of flow
indication to the pseudowire forwarder is outside the scope of this
document.
3. Pseudowire Forwarder
The pseudowire forwarder must be provided with a method of mapping
flows to load balanced paths.
The forwarder must generate a label for the flow or group of flows.
How the load balance label values are determined is outside the scope
of this document, however the load balance label allocated to a flow
MUST NOT be an MPLS reserved label and SHOULD remain constant for the
life of the flow. It is recommended that the method chosen to
generate the load balancing labels introduces a high degree of
entropy in their values, to maximise the entropy presented to the
ECMP path selection mechanism in the LSRs in the PSN, and hence
distribute the flows as evenly as possible over the available PSN
ECMP paths. The forwarder at the ingress PE prepends the pseudowire
control word (if applicable), and then pushes the flow label,
followed by the pseudowire label.
The forwarder at the egress PE uses the pseudowire label to identify
the pseudowire. From the context associated with the pseudowire
label, the egress PE can determine whether a flow label is present.
If a flow label is present, the label is discarded.
All other pseudowire forwarding operations are unmodified by the
Bryant, et al. Expires July 31, 2010 [Page 6]
Internet-Draft FAT-PW January 2010
inclusion of the flow label.
3.1. Encapsulation
The PWE3 Protocol Stack Reference Model modified to include flow
label is shown in Figure 1 below
+-------------+ +-------------+
| Emulated | | Emulated |
| Ethernet | | Ethernet |
| (including | Emulated Service | (including |
| VLAN) |<==============================>| VLAN) |
| Services | | Services |
+-------------+ +-------------+
| Flow | | Flow |
+-------------+ Pseudowire +-------------+
|Demultiplexer|<==============================>|Demultiplexer|
+-------------+ +-------------+
| PSN | PSN Tunnel | PSN |
| MPLS |<==============================>| MPLS |
+-------------+ +-------------+
| Physical | | Physical |
+-----+-------+ +-----+-------+
Figure 1: PWE3 Protocol Stack Reference Model
The encapsulation of a pseudowire with a flow label is shown in
Figure 2 below
Bryant, et al. Expires July 31, 2010 [Page 7]
Internet-Draft FAT-PW January 2010
+-------------------------------+
| |
| Payload |
| | n octets
| |
+-------------------------------+
| Optional Control Word | 4 octets
+-------------------------------+
| Flow label | 4 octets
+-------------------------------+
| PW label | 4 octets
+-------------------------------+
| MPLS Tunnel label(s) | n*4 octets (four octets per label)
+-------------------------------+
Figure 2: Encapsulation of a pseudowire with a pseudowire load
balancing label
4. Signaling the Presence of the Flow Label
When using the signalling procedures in [RFC4447], a Pseudowire
Interface Parameter Flow Label Sub-TLV (FL Sub-TLV) type is used to
synchronise the flow label states between the ingress and egress PEs.
The presence of an FL Sub-TLV in the interface parameters indicates
to the ingress PE that the egress PE can correctly process a flow
label.
A PE that wishes to use a flow label includes in its label mapping
message a Flow Label Sub-TLV (FL Sub-TLV) with F = 1 (see
Section 4.1). A PE that can correctly process a flow label, and is
willing to receive one, but does not wish to send a flow label,
includes an FL Sub-TLV with F = 0.
If a PE has sent an FL Sub-TLV with F = 1, and has received an FL
Sub-TLV it MUST include a flow lablel in the label stack.
If a PE has sent an FL Sub-TLV with F = 1 and does not receive an FL
Sub-TLV it MUST send a new label mapping using an FL Sub-TLV with F =
0.
A PE that has sent an FL Sub-TLV with F = 0 MUST NOT include a flow
lablel in the label stack.
If a PE that previously did not received a label binding without a FL
Bryant, et al. Expires July 31, 2010 [Page 8]
Internet-Draft FAT-PW January 2010
Sub-TLV receives a new a label mapping with one included, it MAY send
a new label mapping including an FL Sub-TLV with F = 1.
The signalling procedures in [RFC4447] state that "Processing of the
interface parameters should continue when unknown interface
parameters are encountered, and they MUST be silently ignored." The
signalling proceedure described here is therefore backwards
compatible with existing implementations.
If PWE3 signalling [RFC4447] is not in use for a pseudowire, then
whether the flow label is used MUST be identically provisioned in
both PEs at the pseudowire endpoints. If there is no provisioning
support for this option, the default behaviour is not to include the
flow label.
Note that what is signalled is the desire to include the flow label
in the label stack. The value of the label is a local matter for the
ingress PE, and the label value itself is not signalled.
4.1. Structure of Flow Label Sub-TLV
The structure of the flow label TLV is shown in Figure 3.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| FL | Length |F| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Flow Label Sub-TLV
Where:
o FL is the flow label sub-TLV identifier assigned by IANA.
o Length is the length of the TLV in octets and is 4.
o When F=1 a flow label MUST be pushed. When F=0 a flow label MUST
NOT be pushed.
o Reserved bits MUST be zero on transmit and MUST be ignored on
receive.
5. Multi-Segment Pseudowires
The flow label mechanism described in this document works on multi-
segment PWs without requiring modification to the Switching PEs
Bryant, et al. Expires July 31, 2010 [Page 9]
Internet-Draft FAT-PW January 2010
(S-PEs). This is because the flow label is transparent to the label
swap operation, and because interface parameter Sub-TLV signalling is
transitive.
6. OAM
The following OAM considerations apply to this method of load
balancing.
Where the OAM is only to be used to perform a basic test that the
pseudowires have been configured at the PEs, VCCV [RFC5085] messages
may be sent using any load balance pseudowire path, i.e. using any
value for the flow label.
Where it is required to verify that a pseudowire is fully functional
for all flows, VCCV [RFC5085] connection verification message MUST be
sent over each ECMP path to the pseudowire egress PE. This problem
is difficult to solve and scales poorly. We believe that this
problem is addressed by the following two methods:
1. If a failure occurs within the PSN, this failure will normally be
detected by the PSN's Interior Gateway protocol (IGP) link/node
failure detection mechanism (loss of light, bidirectional
forwarding detection [I-D.ietf-bfd-base] or IGP hello detection),
and the IGP convergence will naturally modify the ECMP set of
network paths between the Ingress and Egress PE's. Hence the PW
is only impacted during the normal IGP convergence time.
2. If the failure is related to the individual corruption of an
Label Forwarding Information dataBase (LFIB) entry in a router,
then only the network path using that specific entry is impacted.
If the PW is load balanced over multiple network paths, then this
failure can only be detected if, by chance, the transported OAM
flow is mapped onto the impacted network path, or all paths are
tested. This type of error may be better solved be solved by
other means such as LSP self test [I-D.ietf-mpls-lsr-self-test].
To troubleshoot the MPLS PSN, including multiple paths, the
techniques described in [RFC4378] and [RFC4379] can be used.
Where the pseudowire OAM is carried out of band (VCCV Type 2)
[RFC5085] it is necessary to insert an "MPLS Router Alert Label" in
the label stack. The resultant label stack is a follows:
Bryant, et al. Expires July 31, 2010 [Page 10]
Internet-Draft FAT-PW January 2010
+-------------------------------+
| |
| Payload |
| | n octets
| |
+-------------------------------+
| Optional Control Word | 4 octets
+-------------------------------+
| Flow label | 4 octets
+-------------------------------+
| PW label | 4 octets
+-------------------------------+
| Router Alert label | 4 octets
+-------------------------------+
| MPLS Tunnel label(s) | n*4 octets (four octets per label)
+-------------------------------+
Figure 4: Use of Router Alert LAbel
7. Applicability of FAT PWs
A node within the PSN is not able to perform deep-packet-inspection
(DPI) of the PW as the PW technology is not self-describing: the
structure of the PW payload is only known to the ingress and egress
PE devices. The method proposed in this document provides a
statistical mitigation of the problem of load balance in those cased
where a PE is able to discern flows embedded in the traffic received
on the attachment circuit.
The methods describe in this document are transparent to the PSN and
as such do not require any new capability from the PSN.
The requirement to load-balance over multiple PSN paths occurs when
the ratio between the PW access speed and the PSN's core link
bandwidth is large (e.g. >= 10%). ATM and FR are unlikely to meet
this property. Ethernet may have this property, and for that reason
this document focuses on Ethernet. Applications for other high-
access-bandwidth PW's (e.g. Fibre Channel) may be defined in the
future.
This design applies to MPLS pseudowires where it is meaningful to de-
construct the packets presented to the ingress PE into flows. The
mechanism described in this document promotes the distribution of
flows within the pseudowire over different network paths. This in
turn means that whilst packets within a flow are delivered in order
(subject to normal IP delivery perturbations due to topology
Bryant, et al. Expires July 31, 2010 [Page 11]
Internet-Draft FAT-PW January 2010
variation), order is not maintained amongst packets of different
flows. It is not proposed to associate a different sequence number
with each flow. If sequence number support is required this
mechanism is not applicable.
Where it is known that the traffic carried by the Ethernet pseudowire
is IP the method of identifying the flows are well known and can be
applied. Such methods typically include hashing on the source and
destination addresses, the protocol ID and higher-layer flow-
dependent fields such as TCP/UDP ports, L2TPv3 Session ID's etc.
Where it is known that the traffic carried by the Ethernet pseudowire
is non-IP, techniques used for link bundling between Ethernet
switches may be reused. In this case however the latency
distribution would be larger than is found in the link bundle case.
The acceptability of the increased latency is for further study. Of
particular importance the Ethernet control frames SHOULD always be
mapped to the same PSN path to ensure in-order delivery.
7.1. Equal Cost Multiple Paths
ECMP in packet switched networks is statistical in nature. The
mapping of flows to a particular path does not take into account the
bandwidth of the flow being mapped or the current bandwidth usage of
the members of the ECMP set. This simplification works well when the
distribution of flows is evenly spread over the ECMP set and there
are a large number of flows that have low bandwidth relative to the
paths. The random allocation of a flow to a path provides a good
approximation to an even spread of flows, provided that polarisation
effects are avoided. The method proposed in this document has the
same statistical properties as an IP PSN.
ECMP is a load-sharing mechanism that is based on sharing the load
over a number of layer 3 paths through the PSN. Often however
multiple links exist between a pair of LSRs that are considered by
the IGP to be a single link. These are known as link bundles. The
mechanism described in this document can also be used to distribute
the flows within a pseudowire over the members of the link bundle by
using the flow label value to identify candidate flows. How that
mapping takes place is outside the scope of this specification.
Similar considerations apply to link aggregation groups.
In the ECMP case and the link bundling case the NSP may attempt to
take bandwidth into consideration when allocating groups of flows to
a common path. That is permitted, but it must be borne in mind that
the semantics of a label stack entry (LSE) as defined by [RFC3032]
cannot be modified, the value of the flow label cannot be modified at
any point on the LSP, and the interpretation of bit patterns in, or
Bryant, et al. Expires July 31, 2010 [Page 12]
Internet-Draft FAT-PW January 2010
values of, the flow label by an LSR are undefined.
A different type of load balancing is the desire to carry a
pseudowire over a set of PSN links in which the bandwidth of members
of the link set is less than the bandwidth of the pseudowire. This
problem is addressed in [I-D.stein-pwe3-pwbonding]. Such a mechanism
can be considered complementary to this mechanism.
7.2. Link Aggregation Groups
A Link Aggregation Group (LAG) is used to bond together several
physical circuits between two adjacent nodes so they appear to
higher-layer protocols as a single, higher bandwidth "virtual" pipe.
These may co-exist in various parts of a given network. An advantage
of LAGs is that they reduce the number of routing and signalling
protocol adjacencies between devices, reducing control plane
processing overhead. As with ECMP, the key problem related to LAGs
is that due to inefficiencies in LAG load-distribution algorithms, a
particular component of a LAG may experience congestion. The
mechanism proposed here may be able to assist in producing a more
uniform flow distribution.
The same considerations requiring a flow to go over a single member
of an ECMP path set apply to a member of a LAG.
7.3. The Single Large Flow Case
Clearly the operator should make sure that the service offered using
PW technology and the method described in this document does not
exceed the maximum planned link capacity, unless it can be guaranteed
that it conforms to the Internet traffic profile of a very large
number of small flows.
If the payload on a PW is made of a single inner flow (i.e. an
encrypted connection between two routers), or the flow identifiers
are too deeply buried in the packet, then the functionality described
in this document does not give any benefits, though neither does it
cause harm relative to the existing situation. The most common case
where a single flow dominated the traffic on a PW is when it is used
to transport enterprise traffic. Enterprise traffic may well consist
of a large single TCP flows, or encrypted flows that cannot be
handled by the methods described in this document.
An operator has six options under these circumstances:
1. The operator can do nothing and the system will work as it does
without the flow label.
Bryant, et al. Expires July 31, 2010 [Page 13]
Internet-Draft FAT-PW January 2010
2. The operator can make the customer aware that the service
offering has a restriction on flow bandwidth and police flows to
that restriction. This would allow customers offering multiple
flows to use a larger fraction their access bandwidth, whilst
preventing an single flow from consuming a fraction of internal
link bandwidth that the operator considered excessive.
3. The operator could configure the ingress PE to assign a constant
flow label to all high bandwidth flows so that only one path was
affected by these flows,
4. The operator could configure the ingress PE to assign a random
flow label to all high bandwidth flows so as to minimise the
disruption to the network as a cost of out of order traffic to
the user.
5. The operator could configure the ingress to assign a label of
special significance (such as a reserved label) to all high
bandwidth flows so that some other action (not specified in this
document) could be taken on the flow.
The issues described above are mitigated by the following two
factors:
o Firstly, the customer of a high-bandwidth PW service has an
incentive to get the best transport service because an inefficient
use of the PSN leads to jitter and eventually to loss to the PW's
payload.
o Secondly, the customer is usually able to tailor their
applications to generate many flows in the PSN. A well-known
example is massive data transport between servers which use many
parallel TCP sessions. This same technique can be used by any
transport protocol: multiple UDP ports, multiple L2TPv3 Session
ID's, multiple GRE keys may be used to decompose a large flow into
smaller components. This approach may be applied to IPsec
[RFC4301] where multiple Security Parameters Indexes (SPI's) may
be allocated to the same security association.
7.4. MPLS-TP
The MPLS Transport Profile (MPLS-TP) [RFC5654] requirement 44 states
that "MPLS-TP SHOULD support mechanisms to enable the reserved
bandwidth of a transport path to be decreased without impacting the
existing traffic on that transport path, provided that the level of
existing traffic is smaller than the reserved bandwidth following the
decrease." The flow aware transport of a PW reorders packets (albeit
in an application friendly way), therefore SHOULD NOT be deployed in
Bryant, et al. Expires July 31, 2010 [Page 14]
Internet-Draft FAT-PW January 2010
a network conforming to the MPLS-TP.
8. Applicability to MPLS
A further application of this technique would be to create a basis
for hash diversity without having to peek below the label stack for
IP traffic carried over LDP LSPs. Work on the generalisation of this
to MPLS has been described in [I-D.kompella-mpls-entropy-label].
This is can be regarded as a complementary, but distinct, approach
since although similar consideration may apply to the identification
of flows and the allocation of flow label values, the flow labels are
imposed by different network components, and the associated
signalling mechanisms are different.
9. Security Considerations
The pseudowire generic security considerations described in [RFC3985]
and the security considerations applicable to a specific pseudowire
type (for example, in the case of an Ethernet pseudowire [RFC4448]
apply.
The ingress PE SHOULD take steps to ensure that the load-balance
label is not used as a covert channel.
It is useful to give consideration to the choice of TTL value in the
flow label stack entry [RFC3032]. The flow label is at the bottom of
label stack. Therefore, even when penultimate hop popping is
employed, it will always be will preceded by the PW label on arrival
at the PE. The flow label TTL should therefore never be considered
by the forwarder, and hence SHOULD be set to a value of 1. This will
prevent the packet being inadvertently forwarded based on the value
of the flow label. Note that this may be a departure from
considerations that apply to the general MPLS case.
10. IANA Considerations
IANA is requested to allocate the next available values from the IETF
Consensus range in the Pseudowire Interface Parameters Sub-TLV type
Registry as a Flow Label indicator.
Bryant, et al. Expires July 31, 2010 [Page 15]
Internet-Draft FAT-PW January 2010
Parameter Length Description
ID
TBD 4 Flow Label
11. Congestion Considerations
The congestion considerations applicable to pseudowires as described
in [RFC3985] and any additional congestion considerations developed
at the time of publication apply to this design.
The ability to explicitly configure a PW to leverage the availability
of multiple ECMP paths is beneficial to capacity planning as, all
other parameters being constant, the statistical multiplexing of a
larger number of smaller flows is more efficient than with a smaller
number of larger flows.
Note that if the classification into flows is only performed on IP
packets the behaviour of those flows in the face of congestion will
be as already defined by the IETF for packets of that type and no
additional congestion processing is required.
Where flows that are not IP are classified pseudowire congestion
avoidance must be applied to each non-IP load balance group.
12. Acknowledgements
The authors wish to thank Eric Grey, Kireeti Kompella, Joerg
Kuechemann, Wilfried Maas, Luca Martini, Mark Townsley, and Lucy Yong
for valuable comments on this document.
13. References
13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
Encoding", RFC 3032, January 2001.
[RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol
Label Switched (MPLS) Data Plane Failures", RFC 4379,
February 2006.
Bryant, et al. Expires July 31, 2010 [Page 16]
Internet-Draft FAT-PW January 2010
[RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson,
"Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
Use over an MPLS PSN", RFC 4385, February 2006.
[RFC4447] Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G.
Heron, "Pseudowire Setup and Maintenance Using the Label
Distribution Protocol (LDP)", RFC 4447, April 2006.
[RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
"Encapsulation Methods for Transport of Ethernet over MPLS
Networks", RFC 4448, April 2006.
[RFC4553] Vainshtein, A. and YJ. Stein, "Structure-Agnostic Time
Division Multiplexing (TDM) over Packet (SAToP)",
RFC 4553, June 2006.
[RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
Cost Multipath Treatment in MPLS Networks", BCP 128,
RFC 4928, June 2007.
[RFC5085] Nadeau, T. and C. Pignataro, "Pseudowire Virtual Circuit
Connectivity Verification (VCCV): A Control Channel for
Pseudowires", RFC 5085, December 2007.
13.2. Informative References
[I-D.ietf-bfd-base]
Katz, D. and D. Ward, "Bidirectional Forwarding
Detection", draft-ietf-bfd-base-11 (work in progress),
January 2010.
[I-D.ietf-mpls-lsr-self-test]
Swallow, G., "Label Switching Router Self-Test",
draft-ietf-mpls-lsr-self-test-07 (work in progress),
May 2007.
[I-D.kompella-mpls-entropy-label]
Kompella, K. and S. Amante, "The Use of Entropy Labels in
MPLS Forwarding", draft-kompella-mpls-entropy-label-00
(work in progress), July 2008.
[I-D.stein-pwe3-pwbonding]
Stein, Y., Mendelsohn, I., and R. Insler, "PW Bonding",
draft-stein-pwe3-pwbonding-01 (work in progress),
November 2008.
[RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-
Edge (PWE3) Architecture", RFC 3985, March 2005.
Bryant, et al. Expires July 31, 2010 [Page 17]
Internet-Draft FAT-PW January 2010
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, December 2005.
[RFC4378] Allan, D. and T. Nadeau, "A Framework for Multi-Protocol
Label Switching (MPLS) Operations and Management (OAM)",
RFC 4378, February 2006.
[RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast
Reroute: Loop-Free Alternates", RFC 5286, September 2008.
[RFC5654] Niven-Jenkins, B., Brungard, D., Betts, M., Sprecher, N.,
and S. Ueno, "Requirements of an MPLS Transport Profile",
RFC 5654, September 2009.
Authors' Addresses
Stewart Bryant (editor)
Cisco Systems
250 Longwater Ave
Reading RG2 6GB
United Kingdom
Phone: +44-208-824-8828
Email: stbryant@cisco.com
Clarence Filsfils
Cisco Systems
Brussels
Belgium
Email: cfilsfil@cisco.com
Ulrich Drafz
Deutsche Telekom
Muenster
Germany
Email: Ulrich.Drafz@t-com.net
Vach Kompella
Alcatel-Lucent
Email: Alcatel-Lucent vach.kompella@alcatel-lucent.com
Bryant, et al. Expires July 31, 2010 [Page 18]
Internet-Draft FAT-PW January 2010
Joe Regan
Alcatel-Lucent
Email: joe.regan@alcatel-lucent.comRegan
Shane Amante
Level 3 Communications
Email: shane@castlepoint.net
Bryant, et al. Expires July 31, 2010 [Page 19]