INTERNET-DRAFT Mingui Zhang
Intended Status: Proposed Standard Xudong Zhang
Expires: May 3, 2012 Donald Eastlake
Huawei
October 31, 2011
TRILL IS-IS MTU Negotiation
draft-zhang-trill-mtu-negotiation-01.txt
Abstract
The IETF TRILL protocol provides least cost pair-wise layer 2 data
forwarding by using IS-IS link state routing. This document defines a
new link MTU size negotiation mechanism to update the TRILL documents
"Routing Bridges (RBridges): Base Protocol Specification" and
"Routing Bridges (RBridges): Adjacency".
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Copyright and License Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Mingui Zhang Expires May 3, 2012 [Page 1]
INTERNET-DRAFT MTU Negotiation October 31, 2011
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Content . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Issues of Link MTU Testing . . . . . . . . . . . . . . . . . . 3
2.1. Global Dependence . . . . . . . . . . . . . . . . . . . . . 4
2.2. Concealing Wrong Configuration . . . . . . . . . . . . . . 4
3. TRILL IS-IS MTU Negotiation . . . . . . . . . . . . . . . . . . 5
3.1. Determination of Lz . . . . . . . . . . . . . . . . . . . . 5
3.3. Link MTU Size Testing Algorithm . . . . . . . . . . . . . . 6
3.4. Re-determining Campus-Wide Sz . . . . . . . . . . . . . . . 7
3.5. Relationship between Port MTU and Sz . . . . . . . . . . . 8
3.6. LSP Synchronization . . . . . . . . . . . . . . . . . . . . 8
4. Determining Link Traffic MTU Size . . . . . . . . . . . . . . . 8
5. Security Considerations . . . . . . . . . . . . . . . . . . . . 9
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 9
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.1. Normative References . . . . . . . . . . . . . . . . . . . 9
7.2. Informative References . . . . . . . . . . . . . . . . . . 9
Author's Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10
Mingui Zhang Expires May 3, 2012 [Page 2]
INTERNET-DRAFT MTU Negotiation October 31, 2011
1. Introduction
The base TRILL protocol includes the way how RBridges determine the
minimum inter-RBridge link size for the whole campus (campus-wide
Sz), for the proper operation of TRILL IS-IS. According to [RFC6325],
RBridges need to know the campus-wide Sz before they do the link MTU
size testing. The link MTU size testing therefore depends on the
campus-wide Sz collection.
[RFC6327] defines the diagram of state transitions of an adjacency.
The "link MTU size is successfully tested (A6)" is an articulate
transition between "2-way" state and "Report" state of an adjacency.
It is not clear, in this draft, when an adjacency should start to
synchronize LSP database.
This document analyzes the possible issues caused by the definition
that link MTU size testing depends on campus-wide Sz collection. A
new link MTU size negotiation mechanism is provided to solve the
above problems.
1.1. Content
Section 2 analyzes the issues caused by the dependence on campus-wide
Sz for link MTU size testing.
Section 3 defines a new IS-IS MTU negotiation mechanism to update
[RFC6325].
Section 4 provides a method for link traffic MTU determination.
1.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Issues of Link MTU Testing
Link MTU size testing is defined in Section 4.3.2 of [RFC6325]. If
the link MTU size is smaller than campus-wide value of Sz, which is
the smallest value of Sz advertised by any RBridge in its LSP
[RFC6325], the link is not included in the global topology. If the
link MTU size X of an adjacency is successfully tested (X >= campus-
wide Sz), its state will move from 2-way to Report, which is defined
in [RFC6327]. The link MTU size testing depends on the value of
campus-wide Sz, which can be problematic. The issues causes by this
dependence are given in the following subsections.
Mingui Zhang Expires May 3, 2012 [Page 3]
INTERNET-DRAFT MTU Negotiation October 31, 2011
2.1. Global Dependence
Sz:1800 Sz:1800 Sz:1800
+---+ +---+ +--+ +---+
|RB1|(2000)-(2000)|RB2| (2000)-(1700)|B1|(1700)-(2000)|RB3|
+---+ ^ +---+ +--+ +---+
(2000) | ^
|<---- Report |
(2000) Report
+---+
|RB4|
+---+ |
Sz:1600 v
Sz:1800 Sz:1800 Sz:1800
+---+ +---+ +--+ +---+
|RB1|(2000)-(2000)|RB2| (2000)-(1700)|B1|(1700)-(2000)|RB3|
+---+ ^ +---+ +--+ +---+
| ^
Report |
2-way
Figure 2.1: Adjacency global dependence
Take Figure 2.1 as an example, all the adjacencies are in report
states. After RB4 leaves the campus, RB2 and RB3 find the campus-wide
Sz grows. They test the MTU according to campus-wide Sz 1800. Since
RB2 and RB3 is connected by a low-end bridge whose port MTU is 1700.
The test will not be successful. This adjacency has to return to 2-
way state. The state of an adjacency can be determined by another
remote adjacency. The stability of the campus Sz can be terrible
resulting in maintenance problems.
2.2. Concealing Wrong Configuration
Take Figure 2.2 as an example, the Sz value of RB3 is falsely
configured to be greater than its port MTU. The link MTU testing is
successful because the campus-wide Sz 1600 is smaller than the two
port MTUs of the adjacency between RB2 and RB3. The adjacency will be
in "Report" state. However, when RB4 leaves the campus and the
campus-wide Sz is updated to 1800, the link MTU test of link RB2-RB3
cannot be successful.
Mingui Zhang Expires May 3, 2012 [Page 4]
INTERNET-DRAFT MTU Negotiation October 31, 2011
Sz:1600 Sz:1800 Sz:1800 Sz:1800
+---+ +---+ +---+ +---+
|RB4|(2000)-(2000)|RB1|(2000)-(2000)|RB2|(2000)-(1700)|RB3|
+---+ ^ +---+ ^ +---+ ^ +---+
| | |
Report Report Report
|
v
Sz:1800 Sz:1800 Sz:1800
+---+ +---+ +---+
|RB1|(2000)-(2000)|RB2|(2000)-(1700)|RB3|
+---+ ^ +---+ ^ +---+
| |
Report 2-way
Figure 2.2: Concealing wrong configuration
3. TRILL IS-IS MTU Negotiation
It is improper to use campus-wide Sz in link MTU testing and LSP
database synchronization. In order to solved the problems depicted in
Section 2, this draft introduces a new value "Lz" which is the
minimum acceptable inter-RBridge link size required by RBridges on a
specific LAN link. Lz is used in link MTU size testing and LSP
database synchronization to replace the role of campus-wide Sz. After
link MTU size is successfully tested, the adjacency is changed to
"Report" state.
3.1. Determination of Lz
RBridges on a LAN link should exchange their local Sz through LSPs
using the originatingLSPBufferSize, TLV #14. The smallest value of
these Sz is Lz. Therefore, Lz is actually a "link-wide Sz". It is
different from the campus-wide Sz which is determined by having each
RBridge in the campus advertise its own assumption of the value of Sz
in LSPs as defined in Section 4.3.1 of [RFC6325].
The maximum size of some types of PDUs should be confined by Lz
rather than campus-wide Sz because they are only exchanged between
neighbors instead of the whole campus. CSNPs and PSNPs are such kind
of PDUs. They are exchanged just on the link after a DRB is selected
on the link.
Mingui Zhang Expires May 3, 2012 [Page 5]
INTERNET-DRAFT MTU Negotiation October 31, 2011
Sz:1800 Sz:1800
+---+ | +---+
|RB1|(2000)-|-(2000)|RB2|
+---+ | +---+
|
Sz:1800 |
+---+ +--+
|RB3|(2000)-(1700)|B1|
+---+ +--+
|
Figure 3.1: Link MTU has to be negotiated
Even all RBridges on a specific LAN link have reached consensus on
the value of Lz, it does not mean that these RBridges can safely
exchange PDUs between each other. Take Figure 3.1 as an example. RB1,
RB2 and RB3 are three RBridges on the same LAN link and their Sz are
1800, so the link-wide Sz of this LAN link is 1800. There is a bridge
(say B1) between RB2 and RB3 whose port MTU size is 1700. If RB2
sends PDUs formatted in the size of 1800, it will be discarded by B1.
Therefore the link MTU size has to be tested. Only after the link MTU
size of an adjacency is successfully tested, these CSNP and PSNP PDUs
will be formatted no greater than the tested link MTU size and will
be safely transmitted on this link.
3.3. Link MTU Size Testing Algorithm
The link MTU size testing method given by the last paragraph of
Section 4.3.2 of [RFC6325] is updated by the following Binary Search
algorithm in which Lz is used in the testing instead of campus-wide
Sz.
Step 0: RB1 sends an MTU-probe padded to the size of Lz.
1) If RB1 successfully receives the MTU-ACK to the probe of size Lz
from RB2, then link MTU size is set to the size of Lz and stop.
2) RB1 tries to send an MTU-probe padded to the size 1470.
a) If RB1 fails to receive an MTU-ACK from RB2 after k tries
(where k is a configurable parameter whose default is 3), RB1
sets the "failed minimum MTU test" flag for RB2 in RB1's Hello
and stop.
b) Link MTU size <-- 1470, X1 <-- 1470, X2 <-- Lz, X <-- [(X1 +
X2)/2] (Operation "[...]" returns the fraction-rounded-up
integer.). Repeat Step 1.
Mingui Zhang Expires May 3, 2012 [Page 6]
INTERNET-DRAFT MTU Negotiation October 31, 2011
Step 1: RB1 tries to send an MTU-probe padded to the size X.
1) If RB1 fails to receive an MTU-ACK from RB2 after k tries, then:
X2 <-- X and X <-- [(X1 + X2)/2]
2) If RB1 receives an MTU-ACK to a probe of size X from RB2 then:
link MTU size <-- X, X1 <-- X and X <-- [(X1 + X2)/2]
3) If X1 >= X2 or Step 1 has been repeated n times (where n is a
configurable parameter whose default is 5), stop. Else go to Step
1.
Since the execution of the above algorithm can be resource consuming,
it is recommended that the DRB takes the responsibility to do the
testing. If the testing is finished and the tested link MTU size is
smaller than the original Lz and the minimum Sz that has been
advertised to the DRB, the DRB should send the tested link MTU size
as its local originatingLSPBufferSize in LSP number zero (shorted as
LSP0). This will trigger other RBridges on the link to update their
Lz to be the size of the tested link MTU. Then CSNPs, PSNPs and LSPs
used for synchronization can be rightly resized and successfully
exchanged on the link.
3.4. Re-determining Campus-Wide Sz
RBridges may join in or leave the campus from time to time. The
campus-wide Sz can become outdated. Section 4.3.1 of [RFC6325] does
not define when to re-determine the campus-wide Sz. The following
suggestions are given for campus-wide Sz re-determination.
1) When a new RB whose Sz is smaller than current campus-wide Sz
joins in the campus, it MUST report its Sz in an LSP which will
cause other RBridges update their campus-wide Sz. The LSPs in the
campus will be resized to be no greater than the new campus-wide
Sz.
2) When an RB whose Sz is right the campus-wide Sz leaves the campus,
and the LSPs generated by this RBridge are purged from the
remaining campus after reaching MaxAge [ISO10589]. The campus-wide
Sz ought to be resized as well. Frequent LSP "resizing" is harmful
to the stability of the whole campus, so it should be dampened.
Within the two kinds of resizing actions, only the upward resizing
will be dampened. When an upward resizing event happens, a timer
is set (this is a configurable parameter whose default value is
300 seconds). Before this timer expires, all subsequent upward
resizing will be dampened.
Mingui Zhang Expires May 3, 2012 [Page 7]
INTERNET-DRAFT MTU Negotiation October 31, 2011
3) An RBridge may generate multiple LSPs. It is recommended that each
RBridge carries its Sz in LSP0 [ISO10589]. Otherwise, if Sz is
absent in LSP0, the campus-wide Sz will be set to a small value
1470 at the receiver RBridge [RFC6325]. When subsequent LSPs
carrying Sz arrives, the campus-wide Sz will be resized again.
3.5. Relationship between Port MTU and Sz
When port MTU size is smaller than the local Sz of an RBridge, this
port should be explicitly disabled from the TRILL campus. On the
other hand, when an RBridge receives an LSP with size greater than
its local Sz or the campus-wide Sz, this LSP should be normally
processed rather than discarded. If an LSP is larger than the MTU
size of a port over which it is to be propagated, no attempt shall be
made to propagate this LSP over the port and an
LSPTooLargeToPropagate alarm shall be generated [ISO10589].
3.6. LSP Synchronization
The DRB of a LAN link is elected as early as in the "Detect" state of
an adjacency. When a DRB is elected, it begins to send out CSNP to
synchronize the LSP database of the RBridges attached to this LAN
link when the adjacency between this RBridge and the DRB moves to 2-
way state. If a non-DRB RBridge receives this CSNP and finds that
LSPx is not in its LSP database, it will send out PSNP to request
LSPx from the DRB. If a non-DRB receives this CSNP and finds that
LSPx is not in the LSP database of the DRB, it will also send out
LSPx to the DRB.
DRB and non-DRB on a link should start to synchronize LSP database
using CSNPs and PSNPs with a neighbor when the adjacency between them
moves to the 2-way state [RBclr]. The CSNPs and PSNPs should be
formatted in chunks of size at most Lz. Since the link MTU size has
not been tested, Lz may be greater than the actually the link MTU
size. In that case, an CSNP or PSNP may be discarded if its size is
greater than the link MTU size. After the link MTU size is
successfully tested, the adjacencies will begin to formatted these
PDUs in the size no greater than it, therefore these LSPs will
successfully get through.
4. Determining Link Traffic MTU Size
Campus-wide Sz is used to confine the size of the TRILL link state
information messages (LSPs). This value is different from the MTU
size that restricting the size of TRILL data frames. TRILL data frame
forwarded by an RBridge can be greater than the campus-wide Sz or Lz.
They are restricted by the physical links and devices.
Mingui Zhang Expires May 3, 2012 [Page 8]
INTERNET-DRAFT MTU Negotiation October 31, 2011
The algorithm defined in link MTU size testing can also be used in
TRILL traffic MTU size testing, only that Lz used in that algorithm
should be replaced with the port MTU of the RBridge sending MTU
probes. The successfully tested size X can be advertised as an
attribute of this link using MTU sub-TLV defined in section 2.4 of
[RBisis]. An end station may collect these values by TRILL ping or
traceroute. Path MTU is the smallest tested link MTU on this path.
5. Security Considerations
This document raises no new security issues for IS-IS.
6. IANA Considerations
No new registry is requested to be assigned by IANA.
7. References
7.1. Normative References
[RFC6325] R. Perlman, D. Eastlake, et al, "RBridges: Base Protocol
Specification", RFC 6325, July 2011.
[RBaf] R. Perlman, D. Eastlake, et al, "RBridges: Appointed
Forwarders", draft-ietf-trill-rbridge-af-05.txt, working
in progress.
[RFC6327] D. Eastlake, R. Perlman, et al, "Routing Bridges
(RBridges): Adjacency", RFC 6327, July 2011.
[RBisis] D. Eastlake, A. Banerjee, et al, "Transparent
Interconnection of Lots of Links (TRILL) Use of IS-IS",
RFC 6326, July 2011.
[RBclr] D. Eastlake, M. Zhang, et al, "RBridges: Clarifications
and Corrections", draft-eastlake-trill-rbridge-clear-
correct-00.txt, working in progress.
7.2. Informative References
[ISO10589] ISO, "Intermediate system to Intermediate system routeing
information exchange protocol for use in conjunction with
the Protocol for providing the Connectionless-mode Network
Service (ISO 8473)," ISO/IEC 10589:2002.
Mingui Zhang Expires May 3, 2012 [Page 9]
INTERNET-DRAFT MTU Negotiation October 31, 2011
Author's Addresses
Mingui Zhang
Huawei Technologies Co.,Ltd
Huawei Building, No.156 Beiqing Rd.
Z-park ,Shi-Chuang-Ke-Ji-Shi-Fan-Yuan,Hai-Dian District,
Beijing 100095 P.R. China
Email: zhangmingui@huawei.com
Xudong Zhang
Huawei Technologies Co.,Ltd
Huawei Building, No.156 Beiqing Rd.
Z-park ,Shi-Chuang-Ke-Ji-Shi-Fan-Yuan,Hai-Dian District,
Beijing 100095 P.R. China
Email: zhangxudong@huawei.com
Donald E. Eastlake, 3rd
Huawei Technologies
155 Beaver Street
Milford, MA 01757 USA
Phone: +1-508-333-2270
EMail: d3e3e3@gmail.com
Mingui Zhang Expires May 3, 2012 [Page 10]