Internet Engineering Task Force                    Ramesh Bhandari
Internet-Draft                                     Siva Sankaranarayanan
                                                   Eve Varma
                                                   Lucent Technologies

Expiration Date: May 2001

                                              November, 2000



  High Level Requirements for Optical Shared Mesh Restoration

            draft-bhandari-optical-restoration-00.txt


Status of this Memo

This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.


1. Abstract

In this draft, we provide the high level requirements for optical shared
mesh restoration within the optical transport network.


2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as
described in RFC 2119.


3. Introduction

Because of the sheer volume of traffic that optical networks are
expected to carry, resulting from the continued explosive growth of
data-oriented applications, optical network survivability has become an
issue of paramount importance.  At the same time, there is a continuing
drive to maximize efficiency and minimize costs in large networks.
Very fast restoration mechanisms such as 1+1 schemes (with restoration
times on the order of tens of milliseconds) exist, but given the
network resources they consume, alternative options are
essential.  With the availability of large optical cross-connects,
shared mesh network restoration at the optical layer is a versatile
approach that should be considered. Simulations [1] have shown that
shared mesh networks require much less additional capacity than rings.
The trade-off for this lower resource consumption has been service
restoration time. However, mesh-based restoration is not inherently
"slow"; if appropriate architectural requirements are established in a
timely manner, it should be possible to achieve fast restoration times
(e.g., restoration times comparable to those provided by SONET
ring-based infrastructures). In this draft, we provide
architectural requirements that enable fast and efficient optical mesh
restoration.

4. Optical Mesh Network Architecture

This draft focuses on next-generation optical networks based on
ITU-T Recommendations G.872 [2] and G.709 [3]. Optical mesh networks
consist of optical cross-connects (OXCs) interconnected by
DWDM links. Associated with these OXCs are controllers that facilitate
communications among them. (Note that these controllers may be internal
or external to the controlled OXCs, and a one-to-one relationship is
not assumed.) An optical channel (OCh) connection through the optical
transport network (OTN) is established along a route having capacity
(wavelength availability) between its designated ingress and egress
points. The OCh connection between the source and the destination OXCs
comprises a series of OXCs interconnected by OCh link connections,
and a signaling mechanism is used to appropriately configure each OXC
during OCh network connection establishment.

Note that an optical channel transparently carries a variety of client
signals (e.g., IP, SONET/SDH, ATM, GbE), and provides OAM capabilities
such as tandem connection monitoring (TCM) and end-to-end signal
integrity checking. Thus, an optical channel traversing a series of
optical subnetworks can be monitored at various points along the route,
typically at subnetwork boundaries, as well as at the OCh termination
points (end-points). When an OCh network connection breaks down because
of one or more failed OCh link connections or a failed OXC node, the
affected traffic needs to be restored using an alternate route. There
are two ways in which this restoration may be performed:

1) reroute around the point of failure, e.g., a failed link
connection

2) reroute from the tandem connection monitoring (TCM) or OCh
termination points.

The first method requires fault localization before restoration
actions can be initiated. That is, it is necessary to pinpoint
the precise location of the fault along the OCh network connection so
that rerouting can be performed around it.  Relatively quick fault
isolation might be provided by digitally monitoring OCh overhead at
every optical NE (ONE); however, this builds a dependency upon digital
processing throughout the entire OTN. This dependency adds cost through
the proliferation of OEO (optical-electrical-optical) conversion
throughout the network solely for maintenance
reasons (vs. impairment mitigation), not to mention additional digital
monitoring equipment to determine performance degradation.
Alternatively, controlling the expense by sharing the monitoring
equipment over many optical channels leads to an unacceptably large
fault detection time [4]. More significantly, this method inhibits
evolution towards transparent optical networks. Further, fault
localization in transparent optical networks may be complicated by the
non-linear interactions typical of such networks. This can result in
time-consuming correlations to identify the root cause of signal
impairments.

The second method of restoration involves rerouting from the OCh
terminations or the TCM points, and therefore does not require fault
isolation to occur before initiation of restoration actions. It is
expected to be fast because it utilizes the ability to accurately
detect loss of signal from TCM and OCh termination points, from which
signaling may subsequently be initiated to restore the traffic on an
alternate path. Since the exact location of the fault along the primary
path is unknown, the alternate path has to be "physically-disjoint" from
the primary path. We further note that this approach is conducive to
evolution toward increasingly large transparent (all-optical, no
OEO) subnetworks in two ways: it avoids embedding dependencies on
digital processing within the OTN, and it is tailored to the needs of
all-optical networks. In what follows, we assume restoration is path-based,
i.e., it takes place from the TCM or OCh termination points, and that
the alternate path is physically-disjoint. It is important to mention
that, if the primary path traverses multiple subnetworks or operator
domains, then due to monitoring at the edge of each domain, restoration
may be performed within that domain. This would also avoid the need for
signaling inter-working between multiple domains.
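
As an illustration of the path-based assumption above, the following
Python fragment is a minimal sketch of choosing an alternate route that
shares no intermediate node and no link with the primary route. The
topology, the node names, and the function names (shortest_path,
disjoint_alternate) are hypothetical and are not part of this draft. The
sketch uses a simple "remove the primary, then reroute" step; this can
fail to find an alternate on some topologies even when a disjoint pair
of routes exists, which is one reason dedicated diverse routing
algorithms such as those in [5] are of interest. Span (shared-risk)
disjointness is not checked here.

```python
from collections import deque

def shortest_path(adj, src, dst, banned_nodes=frozenset(), banned_links=frozenset()):
    """Breadth-first search for a minimum-hop route, skipping banned nodes and links."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in adj[node]:
            link = frozenset((node, nbr))
            if nbr in prev or nbr in banned_nodes or link in banned_links:
                continue
            prev[nbr] = node
            queue.append(nbr)
    return None  # no route available

def disjoint_alternate(adj, primary):
    """Alternate route that avoids the primary's intermediate nodes and its links."""
    src, dst = primary[0], primary[-1]
    banned_nodes = frozenset(primary[1:-1])
    banned_links = frozenset(frozenset(pair) for pair in zip(primary, primary[1:]))
    return shortest_path(adj, src, dst, banned_nodes, banned_links)

# Hypothetical five-node mesh (assumption, for illustration only).
adj = {
    "A": ["B", "D"],
    "B": ["A", "C", "E"],
    "C": ["B", "E"],
    "D": ["A", "E"],
    "E": ["B", "C", "D"],
}
primary = shortest_path(adj, "A", "C")        # ['A', 'B', 'C']
alternate = disjoint_alternate(adj, primary)  # ['A', 'D', 'E', 'C']
```

In this example the termination points A and C are common to both
routes, while every intermediate node and every link differs.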

Clearly, to effect restoration on these alternate disjoint paths, spare
capacity must be reserved on each link of the path. For the network to
be efficient, this spare capacity must be shared for restoration of
other working paths as well. For fast and scalable optical network
restoration, it is also desirable to maintain the network-state in a
distributed manner. Below we point out some high-level requirements for
restoration at the optical layer.
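
The benefit of sharing spare capacity can be made concrete with a small
calculation. The Python fragment below is a sketch under stated
assumptions, not a method defined by this draft: only one link fails at
a time, each demand needs one channel, and the link names (L1..L5) and
the function name spare_capacity are hypothetical. With sharing, the
spare channels needed on a link equal the largest number of alternate
paths that any single failure activates on that link; without sharing,
every alternate path needs its own channel.

```python
from collections import defaultdict

def spare_capacity(demands):
    """Return (shared, dedicated) spare channel counts per link.

    demands: list of (primary_links, alternate_links) pairs, each a list
    of link names. Assumes a single link failure at a time.
    """
    # activated[e][f] = alternate channels needed on link e when link f fails
    activated = defaultdict(lambda: defaultdict(int))
    dedicated = defaultdict(int)
    for primary_links, alternate_links in demands:
        for e in alternate_links:
            dedicated[e] += 1
            for f in primary_links:
                activated[e][f] += 1
    # Shared spare capacity is sized for the worst single failure.
    shared = {e: max(per_failure.values()) for e, per_failure in activated.items()}
    return shared, dict(dedicated)

# Two hypothetical demands whose primary paths do not share a link,
# but whose alternate paths both use link L4.
demands = [
    (["L1"], ["L3", "L4"]),
    (["L2"], ["L4", "L5"]),
]
shared, dedicated = spare_capacity(demands)
# shared["L4"] is 1, dedicated["L4"] is 2
```

Because the two primary paths cannot fail together under the
single-failure assumption, one spare channel on L4 protects both
demands. If failures are modeled as groups of links that fail together,
the same calculation would be keyed by group rather than by link.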


5. Requirements for Fast Optical Mesh Restoration

Any optical mesh restoration scheme must

- Be independent of OCh client (e.g., IP, ATM, SDH/SONET, GbE).

- Avoid making the initiation of restoration actions dependent on
non-time-critical functions. In particular, it should not require fault
localization to occur before initiating restoration actions.

  -> Restoration must be triggered from the TCM or OCh termination
     points.

  -> The alternate path must be physically disjoint; by physically
     disjoint, we mean not only node and link disjoint, but also
     span-disjoint.

- Scale in the event of catastrophic failures such as fiber
cable cuts.

  -> Appropriate mechanisms must be utilized that can restore the
     (expected) large amount of affected traffic rapidly, and in a
     cost-effective manner; e.g., core network application domain
     encompassing up to a few hundred nodes per subnetwork, and
     thousands of point-to-point demands.

- Utilize a robust and efficient signaling mechanism.

  -> The signaling network must remain functional after a failure in
     the transport and/or signaling network infrastructure.

Clearly, for restoration to be carried out effectively, it is necessary
for the connection controllers to have information on the network
topology (such as link state and wavelength availability) as well as on
physical aspects of the transport network such as fiber spans and
span-sharing links. Appropriate algorithms are needed to determine
physically disjoint paths for restoration (see, e.g., [5]), since
restoration must take place from the TCM or OCh termination points. To
ensure that paths are actually physically disjoint (i.e., node, link,
and span disjoint), span-sharing link topologies or Shared Risk Link
Groups (SRLGs) [5-6] of the actual physical fiber network must be
understood.
For special high quality services [7], another key consideration
involves regions of failure, specified by the corresponding radii of
failure.  This is because, for such services, diverse routes should not
pass through a region where there is the risk of both the primary and
alternate paths failing simultaneously due to catastrophic disasters
such as earthquakes or floods.
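
The disjointness conditions discussed above can be sketched as simple
checks in Python. Everything named here is an assumption made for
illustration: the function names, the representation of shared-risk
information as a mapping from a link to a set of group identifiers, and
the use of node coordinates with a single radius as a crude stand-in for
a region of failure. A real check would need the actual fiber routes.

```python
import math

def links_of(path):
    """Set of undirected links traversed by a node sequence."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def physically_disjoint(primary, alternate, srlgs):
    """Node, link, and span (shared-risk) disjointness of two routes.

    srlgs maps frozenset({u, v}) to the set of shared-risk group ids
    (e.g., a common duct) that the link u-v belongs to.
    """
    if set(primary[1:-1]) & set(alternate[1:-1]):
        return False  # a shared intermediate node
    p_links, a_links = links_of(primary), links_of(alternate)
    if p_links & a_links:
        return False  # a shared link
    p_risks = set().union(*(srlgs.get(link, set()) for link in p_links))
    a_risks = set().union(*(srlgs.get(link, set()) for link in a_links))
    return not (p_risks & a_risks)  # no shared risk group

def region_diverse(primary, alternate, position, radius):
    """True if no intermediate node of one route lies within `radius`
    of an intermediate node of the other (node positions only)."""
    return all(
        math.dist(position[p], position[a]) > radius
        for p in primary[1:-1]
        for a in alternate[1:-1]
    )

# Hypothetical routes, risk groups, and coordinates.
primary = ["A", "B", "C"]
alternate = ["A", "D", "E", "C"]
srlgs = {
    frozenset(("A", "B")): {"duct-1"},
    frozenset(("A", "D")): {"duct-1"},  # leaves A in the same duct as A-B
}
position = {"B": (0, 10), "D": (0, -10), "E": (5, -10)}

ok_without_ducts = physically_disjoint(primary, alternate, {})   # True
ok_with_ducts = physically_disjoint(primary, alternate, srlgs)   # False
ok_region = region_diverse(primary, alternate, position, 15)     # True
```

The two routes are node and link disjoint, yet they fail the check once
the shared duct is taken into account, which is why span-sharing
information has to be available to the path computation.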

Appropriate mechanisms (see, e.g., [5]) and algorithms may need to be
constructed to expedite the restoration process and to make the
restorable mesh network cost-effective by sharing spare capacity.
Approaches to gathering information on network topology are currently
under consideration within various fora, e.g., via appropriate
extensions to OSPF (see, e.g., [8]).


6. References


[1] S. Baroni et al., Proc. Conference on Optical Fiber Communications,
Paper TuK2, March 2000.
[2] Agreed revisions to Version 2 of G.872 per October 1999 Q19/13
Meeting, provided to T1X1.5 for information,
ftp://ftp.t1.org/pub/t1x1/2000x15/0x150500.pdf
[3] Draft ITU-T Recommendation G.709, Oct. 2000 version submitted for
approval at the Feb. 2001 SG 15 meeting, provided to T1X1.5 for
information, ftp://ftp.t1.org/pub/t1x1/x1.5/0x152460.doc
[4] G. Newsome, "Maintenance Philosophy for the OTN", T1X1.5/99-108R1
[5] R. Bhandari, "Survivable Networks: Algorithms for Diverse Routing",
Kluwer Academic Publishers (1999)
[6] S. Chaudhuri et al., "Control of Lightpaths in an Optical Network",
Internet Draft <draft-chaudhuri-ip-olxc-control-00.txt>, February 2000
[7] H. Ishimatsu et al., "Carrier Needs Regarding Survivability and
Maintenance for Switched Optical Networks",
<draft-hayata-ipo-carrier-needs-01.txt>, submitted in this meeting.
[8] G. Wang et al., "Extensions to OSPF/IS-IS for Optical Networking",
Internet Draft <draft-wang-ospf-isis-lamda-te-routing-00.txt>, March 2000

7. Authors' Contact Information

Ramesh Bhandari
Lucent Technologies
bhandari1@lucent.com

Sivakumar Sankaranarayanan
Lucent Technologies
ssnarayanan@lucent.com

Eve Varma
Lucent Technologies
evarma@lucent.com


