Internet Engineering Task Force                    Hirokazu Ishimatsu
Internet-Draft                                     Yoshihiro Hayata
                                                   Susumu Yoneda
                                                   Japan Telecom Co., LTD.

Expiration Date: May 2001                          Ramesh Bhandari
                                                   George Newsome
                                                   Eve Varma
                                                   Lucent Technologies

                                                   November, 2000



  Carrier Needs Regarding Survivability and Maintenance for
                  Switched Optical Networks

            draft-hayata-ipo-carrier-needs-00.txt


Status of this Memo

This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of  RFC2026.

Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.


1.  Abstract

As discussed in [1], the need for survivable optical networks is
critical, and introducing capabilities that further enhance network
survivability continues to be an essential objective.   This is
particularly important for operators with stringent requirements for
network resilience and service survivability.  However, disruption of
service can result not only from faults, but also from scheduled
maintenance procedures. This draft introduces some additional
considerations and carrier needs related to failure recovery and
scheduled maintenance work in switched optical networks. These are of
critical importance for serving business customers who require super
high quality service assurance and pay correspondingly high tariffs in
order to guarantee this level of QoS.


2.  Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119.

3.  Introduction

The explosion of data services is increasingly imposing challenging
network infrastructure requirements at the same time that wavelength
services are emerging in the marketplace. Next generation optical
networking solutions must enable scalable, flexible, and reliable
networks as well as increased responsiveness to client network needs.
An optical layer service framework has been discussed in [2] in terms
of the service considerations important to inter-city network
operators.  As described there, key objectives include service
functionality, a workable business model, and evolvability in a
heterogeneous network environment.

Key service functionality cited in [2] has included rapid provisioning
and restoration.   Automated provisioning of optical layer resources in
support of scheduled and demand-based customer/client needs offers
opportunities for supporting new services as well as handling routine
maintenance activities in a non-service disrupting manner (e.g.,
scheduled or predictable maintenance-related churn).

Assuring support for a workable business model that can adapt to change,
e.g., arbitrage, is important.  It has become clear that there is a
range of reasonable business models that might be utilized in an
operator's network, depending upon the scope and objectives of the
enterprise.  As discussed in [4], such models might be used in various
ways, and for various purposes, even by different organizations within
the same network operator domain.

Evolvability is an important consideration as it is essential for
service providers to have a smooth network evolution path for addressing
the unique problems inherent in simultaneously supporting an existing
network while deploying a new multi-service infrastructure.  Clearly, it
is also necessary to enable emergent service providers to optimally
tailor their networks for their targeted market and service offerings;
however, emergent providers quickly need to deal with an embedded base
as soon as initial deployment of resources has occurred.

Within the remainder of this draft, we will focus upon service
functionality and business model objectives in relation to service
survivability and maintenance considerations for highly reliable
services such as the super high quality services discussed below.

4.  Switched Optical Services

The basic requirement of a switched optical service is that a channel is
established via an appropriate signaling mechanism before data can be
transferred and that this establishment is achieved in the following
manner:
- a real-time client specifies its traffic characteristics and its end-
  to-end performance requirements to the server
- the most suitable route for a channel that meets these requirements is
  determined
- the end-to-end parameters are translated into local parameters at each
  NE, and an attempt is made to reserve resources via signaling.

The service abstraction defines a contractual relationship between
client and server, agreed before data transfer.  Once the connection is
established, the server guarantees that, in the absence of a failure, it
will meet its contractual obligations.  For the server to honor the
contract when a failure does occur, several actions have to be taken;
these are addressed in Sec. 4.2.

4.1 Super High Quality Services Characteristics

Super high quality services (also known as private line services)
offered by a carrier currently have the following characteristics:

- The exact physical and logical location of a private line user's path
  in the network (i.e., the optical fiber cable, fiber, optical channel,
  SDH logical path, port of transmission equipment/router, etc.) is
  known to the network operator and uniquely identifiable.
- When a logical path or port is switched to an alternate route (i.e.,
  a back-up path) due to an unexpected event, after the event or
  failure is repaired, the carrier switches traffic back from the
  alternate path to the original path.
- For scheduled maintenance, the carrier always asks customers having
  super high quality services (that may be affected by this
  maintenance work) for their preference as to when the work may be
  carried out.  The carrier then carries out the scheduled maintenance
  work according to customer preference regarding date and time, as it
  is essential that important customers not be adversely impacted in
  any way by scheduled maintenance work.
- The carrier provides for guaranteed service survivability in the
  event of failures.  It does so by providing alternate paths for
  carrying services, with the service and alternate paths being
  physically and topologically diverse.


4.2 Service Survivability Considerations

As discussed in [3], there is a range of failures that can occur
within a network, and high reliability applications will require a
variety of failures to be taken into account.  Examples that have been
considered include office outages, failures arising from diverse
circuits traversing shared protection facilities such as rings, and
natural disasters.  It is essential to prepare fully for natural
disasters such as earthquakes, volcanoes and typhoons.  Further,
super high quality services are extremely sensitive to service
interruptions.  Thus, it is important that the service and alternate
paths do not share any Shared Risk Link Group (SRLG) [3] and do not
pass through the same "region of failure".  Additionally, in order to
assure an optimized survivable network architecture, it is desirable
that traffic can be switched back from the alternate path to the
original service path once the failure is repaired (note that not all
carriers may choose to revert).  The following grades of service may be
defined, together with the actions to be taken in the event of a
failure:

- Standard service, which is provided from a given source to a given
  destination over a path computed in accordance with normal network
  capacity constraints; when the customer loses connection on account
  of a fault, the customer may request the same connection which the
  network will then try to establish on a newly computed path.

- Medium High Quality Service which, at the customer's request,
  provides a connection over a path that avoids a certain set of cities
  or regions, which are prone to damage due to natural disasters such
  as earthquakes, volcanoes, typhoons, etc. These "regions of failure"
  may each be ascribed a "radius of failure" determined from a study of
  the past history of the spatial extent and severity of damage in
  those regions; in the event of a failure of this service, the
  customer may request reestablishment of a connection, which the
  network will attempt to provide over a new path.

- High Quality Service, which is provided with a physically disjoint
  back-up path in case of failure of the primary path; there are no
  requirements on city avoidance, etc.; as a result, the back-up
  guarantees continuity of service only in the event of link or
  equipment failure.

- Super High Quality Service, which is provided with a physically
  disjoint back-up path, constrained to have no "region of failure" in
  common with the original path. This type of service may be requested
  by large business customers who want continuity of service at all
  times. In fact, since the downtime of the primary path may be long in
  major catastrophes such as earthquakes, floods, etc., a carrier may
  offer to provide a second back-up for the back-up path to which the
  guaranteed services were switched upon failure of the primary path.

The four types of service above are summarized in the table below:

        Service Type    Physically disjoint     Avoid a Region of
                        protection path         Failure

        Standard        No                      No
        Medium          No                      Yes
        High            Yes                     No
        Super High      Yes                     Yes
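
As an illustration only, the table above could be held as a small
lookup structure.  The following Python sketch is not part of any
specification; the class name ServiceGrade, the dictionary keys, and
the helper grade_for() are hypothetical names introduced here.  The
first entry corresponds to the Standard service of the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceGrade:
    name: str
    disjoint_backup: bool  # physically disjoint protection path
    avoid_region: bool     # path avoids "regions of failure"


# Hypothetical encoding of the four grades; keys are illustrative.
GRADES = {
    "standard": ServiceGrade("Standard", False, False),
    "medium": ServiceGrade("Medium High Quality", False, True),
    "high": ServiceGrade("High Quality", True, False),
    "super_high": ServiceGrade("Super High Quality", True, True),
}


def grade_for(disjoint_backup: bool, avoid_region: bool) -> ServiceGrade:
    """Return the grade whose two attributes match the request."""
    for grade in GRADES.values():
        if (grade.disjoint_backup, grade.avoid_region) == (
            disjoint_backup,
            avoid_region,
        ):
            return grade
    raise ValueError("no matching service grade")
```

For instance, grade_for(True, True) selects the Super High Quality
entry.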

In the event the constraints for the above high quality services can
only be met partially (e.g., 100% physical diversity cannot be provided
because it does not exist for the particular source-destination pair),
then the customer, instead of being refused the desired service, may
be offered service with a correspondingly reduced level of service
protection.  For example, if the percentage of fiber overlap between
the primary and secondary routes is x, then the customer may be offered
the service with the service continuity guarantee reduced by x%, and
thus also at a correspondingly reduced cost.  Furthermore, where the
customer does not want to pay the full cost of the above high quality
services, even when such service exists, service may still be provided,
but with correspondingly reduced quality guarantees within the class of
service under consideration.
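
The overlap-based reduction can be made concrete with a short Python
sketch.  This draft does not define how the overlap percentage x is
measured; the sketch assumes it is the shared fiber length divided by
the length of the primary route, and it reads "reduced by x%" as a
multiplicative reduction.  Both readings, and the function names, are
assumptions made for illustration.

```python
def overlap_percent(primary, backup):
    """Percentage of the primary route's fiber length shared with backup.

    primary, backup: dict mapping a fiber span id to its length.
    Assumption: overlap is measured relative to the primary route.
    """
    total = sum(primary.values())
    if total <= 0:
        raise ValueError("primary route has no length")
    shared = sum(length for span, length in primary.items() if span in backup)
    return 100.0 * shared / total


def reduced_guarantee(full_guarantee, x):
    """Scale a guarantee level down by x percent (assumed multiplicative)."""
    return full_guarantee * (1 - x / 100.0)


# Hypothetical routes: span "s2" is common to both.
primary = {"s1": 300, "s2": 100}
backup = {"s2": 100, "s3": 250}
x = overlap_percent(primary, backup)
offered = reduced_guarantee(100.0, x)
```

With these hypothetical routes a quarter of the primary fiber length is
shared, so the offered guarantee is three quarters of the full one.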

4.3 Data Bases and Algorithms

Because natural disasters such as earthquakes, typhoons, etc. can damage
a large area in one instance, it is important to ascertain the regions
within the service provider's network prone to damage by such
calamities. Normally, such areas have a history of damage, and it should
be possible to construct a data base on the location, intensity of
disaster, its frequency, and the size of the area affected; the area
affected may be expressed as a "radius of failure". It may also be
possible to use the information on the intensity of disaster and the
frequency of occurrence to assign probabilities of failure to the
offered services. For path computation, the following data bases are
needed:

- Nodes, links, and their fiber span content, or alternatively, nodes,
  fiber spans, and the links riding each individual span (such groups
  of links are also called Shared Risk Link Groups, or SRLGs); clearly,
  if a link or node is not in service, it is not included in path
  computation.

- Regions of failure, corresponding radii of failure and locations
  within the service provider's network; these should be taken into
  account before computing paths for the medium high and super high
  quality services.
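
A minimal sketch of how the regions-of-failure data base might be
applied before path computation is given below.  It assumes, purely
for illustration, that node locations are planar coordinates, that each
fiber span runs in a straight line between its end nodes, and that a
region of failure is a circle.  The names Region, span_hits_region, and
usable_spans are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    x: float
    y: float
    radius: float  # "radius of failure", same unit as the coordinates


def dist_point_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of P onto AB, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def span_hits_region(a, b, region):
    """True if the straight span a-b comes within the region's radius."""
    d = dist_point_segment(region.x, region.y, a[0], a[1], b[0], b[1])
    return d <= region.radius


def usable_spans(nodes, spans, regions):
    """Ids of spans that avoid every region of failure.

    nodes: node name -> (x, y); spans: span id -> (node, node).
    """
    return {
        sid
        for sid, (u, v) in spans.items()
        if not any(span_hits_region(nodes[u], nodes[v], r) for r in regions)
    }
```

Only the spans returned by usable_spans() would then be offered to the
path computation for the medium high and super high quality services.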

For highly reliable services such as the super high quality services,
physically disjoint paths are required in real-life networks, which
involve span-sharing links (SRLGs).  Ref. [5] describes algorithms for
such real-life networks.  The algorithms emphasize optimality to save
network costs.  Depending upon the span-sharing topologies of a given
network, these optimal algorithms can be very fast, and thus suitable
for running in a real-time environment.  For networks with very
complicated span-sharing topologies, exact algorithms do exist [5], but
they are slow for large networks, since the problem is NP-complete in
such cases.  In such situations, fast heuristics may be developed [5]
(see also [2] for a discussion on diversity).
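
To illustrate what a fast heuristic for SRLG-disjoint routing can look
like, the Python sketch below uses a simple two-step approach: compute
a shortest primary path, ban every link that shares a fiber span with
it, and compute a shortest path again.  This is NOT one of the
algorithms of [5]; it is not optimal and can fail to find a pair in
some topologies even when a disjoint pair exists.  The data layout and
function names are assumptions made here.

```python
import heapq


def shortest_path(links, src, dst, banned=frozenset()):
    """Dijkstra over undirected links; returns a list of link ids or None.

    links: link id -> (node_u, node_v, cost, set of fiber span ids).
    """
    adj = {}
    for lid, (u, v, cost, _spans) in links.items():
        if lid in banned:
            continue
        adj.setdefault(u, []).append((v, cost, lid))
        adj.setdefault(v, []).append((u, cost, lid))
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost, lid in adj.get(node, []):
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = (node, lid)
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        node, lid = prev[node]
        path.append(lid)
    return path[::-1]


def srlg_disjoint_pair(links, src, dst):
    """Two-step heuristic: primary path, then a path sharing no span."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None
    used_spans = set().union(*(links[lid][3] for lid in primary))
    banned = {lid for lid, link in links.items() if link[3] & used_spans}
    backup = shortest_path(links, src, dst, banned)
    if backup is None:
        return None
    return primary, backup
```

In a hypothetical network where link L5 rides the same span as link L1,
choosing L1 for the primary path removes L5 from consideration for the
back-up path, even though the two links are topologically distinct.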

4.4  Business Model Considerations

As described in [4], there are several business models that may be
applicable for network operators: ISP owning all Layer 1 infrastructure
and only delivering IP-based services, ISP owning or leasing Layer 1
infrastructure and only delivering IP-based services, retailer or
wholesaler for multi-services, and a carrier's carrier or bandwidth
broker.  A carrier owns the Layer 1 infrastructure and sells multiple
service types to customers, which may include other operator networks.
This bandwidth brokering, or reseller, role takes on a new meaning in
the context of service resilience.  For many years, in Japan, operators
have collaborated to handle traffic in the event of natural disasters,
so that bandwidth can be borrowed from each other.  Thus, if an operator
doesn't have the capacity, it can borrow capacity from another
network.  Accommodating the unexpected is a key factor in this case.
Indeed, it seems to be a common pattern in industry that businesses that
provide service and operate their own infrastructure tend to separate
into two businesses.  This makes it likely that even though
infrastructure may be wholly owned today, it may well not be tomorrow.
It is therefore important to take account of fully separated business
models (cases 3 and 4 of [4]) even if these do not seem to represent the
majority of today's businesses.

5.  Implications for switched optical networks

Considering the discussion in Subsections 4.1 - 4.4, switched optical
networks must minimally:
- Support the various grades of high quality services, including the
  Super High Quality Service described in Sec. 4.1.
- Support survivability considerations related to diverse routing,
  tailored to the unique characteristics of Japan's geography and
  routing of fibers.
- Enable "bandwidth borrowing on demand" from other carriers as well as
  support for multiple service types.

Examples of necessary functionality are provided in more detail below,
as well as some related connection setup operations.

5.1     Functions

Referring to Section 4, we can see that the following functions need to
be supported:
- Ability for network operator to manually set the date and time that a
  path switching function should take place, and have that occur
  automatically.  (The guarantee that the switch occurs as scheduled is
  closely linked to resource allocation policies; see T1X1.5/2000-194
  for further discussion on scheduled connections.)
- Ability to specify switching to a physically/topologically disjoint
  path from the service path.
- Ability to maintain and update the data bases in a timely manner so
  that a connection request is supported with the most current
  knowledge of the network.
- Ability for operator to support a survivability policy that enables
  the capability for switch-back to the original service path.
- Ability to support an operator policy to prioritize service requests
  so that, in the event of a fault, customers with super high quality
  services have first priority in being switched to disjoint paths.
- Ability to enable key customers to request constraints on the
  connection path (e.g., avoid City X because an earthquake has just
  occurred, or simply because the city is very much prone to damage
  from natural disasters such as earthquakes, volcanoes and typhoons).
  This involves the ability to express geographic constraints, as
  opposed to just physical (equipment) or topological constraints.
- Ability to prevent new customers from being added to a particular
  link for a certain amount of time (e.g., because of a failure,
  natural disaster, scheduled maintenance).  This requires the ability
  to mark particular resources as out of service.
- Ability for the operator to query the service management function to
  establish the exact location and characteristics of service paths for
  key customers.
- Ability for the operator to view information regarding which
  customer/user is associated with which service path(s).
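
Two of the functions listed above, prioritizing switching after a fault
and temporarily preventing new customers from being added to a link,
can be sketched as follows.  The priority ranks, the record layout, and
the function names are hypothetical; an operator's actual policy may
differ.

```python
from datetime import datetime

# Hypothetical priority ranks; a lower value is switched first.
PRIORITY = {"super_high": 0, "high": 1, "medium": 2, "standard": 3}


def restoration_order(connections, failed_link):
    """Ids of connections using failed_link, highest grade first.

    connections: list of dicts with keys "id", "grade", "links".
    """
    affected = [c for c in connections if failed_link in c["links"]]
    # sort() is stable, so equal grades keep their original order.
    affected.sort(key=lambda c: PRIORITY[c["grade"]])
    return [c["id"] for c in affected]


def link_accepts_new_customers(blocked, link, when):
    """False while the link is marked out of service.

    blocked: link id -> (start, end) datetimes; end is exclusive.
    """
    window = blocked.get(link)
    if window is None:
        return True
    start, end = window
    return not (start <= when < end)


# Hypothetical maintenance window on link "L2".
blocked = {"L2": (datetime(2000, 11, 1), datetime(2000, 11, 2))}
can_add = link_accepts_new_customers(blocked, "L2", datetime(2000, 11, 1, 12))
```

Here can_add is False, because the requested time falls inside the
window during which link L2 is marked out of service.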

5.2  Connection Setup Operation

Referring to [4], some relevant connection setup parameters include:

1) Scheduled service - ability to request the connection to be made at
some specified time in the future (see T1X1.5/2000-194 for further
discussion).
2) Scheduled duration - ability to specify a duration for the
connection.
3) Resilience - ability to request resilience against server layer
faults, and specify a particular degree of risk (see Sec. 4.2).
4) Connection constraints - ability to specify the constraints as in the
three levels of high quality service described in Sec. 4.2.
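
The four parameters above could be carried in a single request record,
as in the following Python sketch.  It does not reproduce the client
API of [4]; the class ConnectionRequest, its field names, and the
mapping to the service grades of Sec. 4.2 are assumptions made for
illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ConnectionRequest:
    source: str
    destination: str
    # 1) Scheduled service: None means "set up immediately".
    start_time: Optional[datetime] = None
    # 2) Scheduled duration: None means "until released".
    duration: Optional[timedelta] = None
    # 3) Resilience: request a physically disjoint back-up path.
    disjoint_backup: bool = False
    # 4) Connection constraints: regions of failure to avoid.
    avoid_regions: frozenset = frozenset()

    def end_time(self, now: datetime) -> Optional[datetime]:
        """Scheduled teardown time, or None for an open-ended request."""
        if self.duration is None:
            return None
        return (self.start_time or now) + self.duration

    def grade(self) -> str:
        """Map the request onto the service grades of Sec. 4.2."""
        avoid = bool(self.avoid_regions)
        if self.disjoint_backup:
            return "Super High Quality" if avoid else "High Quality"
        return "Medium High Quality" if avoid else "Standard"


# Hypothetical request: four hours starting at 02:00, avoiding City X.
req = ConnectionRequest(
    "A",
    "B",
    start_time=datetime(2000, 12, 1, 2, 0),
    duration=timedelta(hours=4),
    disjoint_backup=True,
    avoid_regions=frozenset({"City X"}),
)
```

A request that asks for both a disjoint back-up path and region
avoidance, as req does, corresponds to the Super High Quality Service.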

6.  References

[1] J. Luciani, B. Rajagopalan, D. Awduche, B. Cain, B. Jamoussi, "IP
over Optical Networks - A Framework", <draft-ip-optical-framework-
00.txt>, March 2000
[2] John Strand, "Optical Layer Services Framework", T1X1.5/2000-142
[3] Monica Lazer, John Strand, "Some Routing Constraints", T1X1.5/2000-
143
[4] George Newsome, "ASON - Requirements at the Client API",
T1X1.5/2000-158
[5] Ramesh Bhandari, "Survivable Networks - Algorithms for Diverse
Routing", Kluwer Academic Publishers, 1999.

7. Authors' Contact Information

Hirokazu Ishimatsu
Japan Telecom
hirokazu@japan-telecom.co.jp

Yoshihiro Hayata
Japan Telecom
hayata@japan-telecom.co.jp

Susumu Yoneda
Japan Telecom
yone@japan-telecom.co.jp

Ramesh Bhandari
Lucent Technologies
bhandari1@lucent.com

George Newsome
Lucent Technologies
gnewsome@lucent.com

Eve Varma
Lucent Technologies
evarma@lucent.com

