Operations and Management Area Working Group Q. Wu
Internet-Draft M. Wexler
Intended status: Standards Track Huawei
Expires: December 7, 2014 P. Jain
Nuage Networks
June 5, 2014
Problem Statement and Architecture for Transport Independent OAM in the
multiple layer network
draft-ww-opsawg-multi-layer-oam-00.txt
Abstract
Operations, Administration, and Maintenance (OAM) mechanisms
[RFC6291] are basic building blocks for every communication layer and
technology. Currently, many technologies and layers have their own
OAM protocols, with little or no re-use of software and hardware
among them. As a result, vendors and operators repeat design,
development, and training effort across the whole OAM life-cycle
whenever a new technology is introduced, and integration of OAM
across multiple technologies is extremely difficult. In many cases
it is desirable to have a generic OAM to cover heterogeneous
networking technologies. An example of this generic approach is the
Bidirectional Forwarding Detection [BFD] mechanism, which offers a
way to monitor, troubleshoot, and maintain the network and services
in support of multi-layer OAM independent of media, data protocols,
and routing protocols. Generic OAM tools can be deployed over
various encapsulating protocols and medium types.
An example of an environment in which a generic and integrated OAM
protocol would be valuable is Service Function Chaining. A Service
Function Chain is composed of a series of service functions that can
act at different layers but provide an end-to-end chain or path from
a source to a destination in a given order
[I.D-ietf-sfc-problem-statement]. In a service function chaining
environment, it is necessary to provide end-to-end OAM across some or
all entities, involving many layers. OAM information should be
exchanged between service functions at different layers while using
various encapsulating protocols. In some cases, OAM should cross
different administrative and/or maintenance domains.
This document sets out the problem statement and architecture for
Generic OAM in Service Layer Routing. It covers at least the basic
OAM functions and information necessary to monitor and maintain the
network: Connectivity Verification (CV), Path Verification and
Continuity Checks (CC), Path Discovery / Fault Localization, and
Performance Monitoring.
Wu, et al. Expires December 7, 2014 [Page 1]
Internet-Draft Service Layer OAM June 2014
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 7, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. What is Generic OAM in the multi-layer network?
2. Terminology
2.1. Standards Language
3. Overview of Use Cases
3.1. Fault localization in multi-layer network
3.2. Multi-layer OAM in support of service function chain
4. Problem Statement
4.1. Use of Existing Protocol Mechanisms
4.2. Strong Technology dependency
4.3. Weakness of cross-layer OAM
4.4. Lack of OAM above Layer 3
4.5. Issues of Abstraction
4.6. Issue of OAM information gathering from Service Function
5. Existing Work
6. Architectural Consideration
6.1. Basic Components
6.1.1. Interconnect OAM at different layers
6.1.2. Interconnect OAM at the same shim layer above layer 3
6.2. OAM Functions in Data Plane
6.2.1. Continuity Check
6.2.2. Connectivity Verification
6.2.3. Path Discovery
6.2.4. Performance measurement
6.2.5. Protection Switching
6.2.6. Alarm/defect indication
6.2.7. Maintenance commands
6.3. OAM in Management plane
7. Building on Existing Protocols
8. Scoping Future Work
9. Manageability Considerations
10. Security Considerations
11. Summary
12. Acknowledgements
13. References
13.1. Normative References
13.2. Informative References
Authors' Addresses
1. Introduction
Operations, Administration, and Maintenance (OAM) mechanisms
[RFC6291] are basic building blocks for every communication layer and
technology. The basic concepts of OAM and its functional roles in
monitoring and diagnosing the behavior of telecommunications networks
have long been studied at the Layer 1, Layer 2, and Layer 3 levels.
Certain OAM functions are used in many management applications for
(i) defect and failure detection, (ii) reporting the defect/failure
information, (iii) defect/failure localization, (iv) performance
monitoring, and (v) service recovery.
The current practice is that many technologies and layers have their
own OAM protocols. There is little or no re-use of software and
hardware across these OAM protocols. Vendors and operators repeat
considerable effort through the whole OAM life-cycle whenever a new
technology is introduced. Integration of OAM across multiple
technologies is extremely difficult. In networks with more than one
technology, maintenance and troubleshooting are done per technology
and per layer, which makes the operational process cumbersome. In
many cases it is desirable to have a generic OAM to cover
heterogeneous networking technologies. Generic OAM tools should be
deployable over various encapsulating protocols and medium types.
An example of
this generic approach is the Bidirectional Forwarding Detection [BFD]
mechanism, which offers a way to monitor, troubleshoot, and maintain
the network and services in support of multi-layer OAM independent of
media, data protocols, and routing protocols.
An example of an environment in which a generic and integrated OAM
protocol would be valuable is Service Function Chaining. A Service
Function Chain is composed of a series of service functions that can
act at different layers but provide an end-to-end chain or path from
a source to a destination in a given order
[I.D-ietf-sfc-problem-statement]. In a service function chaining
environment, it is necessary to provide end-to-end OAM across some or
all entities, involving many layers. OAM information should be
exchanged between service functions at different layers while using
various encapsulating protocols. In some cases, OAM should cross
different administrative and/or maintenance domains.
This document sets out the problem statement and architecture for
Generic OAM in the multi-layer network and outlines the problems
caused by the variety of existing OAM protocols and their impact on
the introduction of new technologies. The scope of this document
covers at least the basic OAM functions and information necessary to
monitor and diagnose the network: Connectivity Verification (CV),
Path Verification and Continuity Checks (CC), Path Discovery / Fault
Localization, and Performance Monitoring.
1.1. What is Generic OAM in the multi-layer network?
In a multi-layer network, generic OAM is the ability to exchange OAM
information across layers between nodes along the forwarding path,
and to gather that information and provide it to the management
application through a unified interface. OAM information includes
OAM configuration and operational data abstracted from various
network technologies, protocols, and layers.
2. Terminology
2.1. Standards Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Overview of Use Cases
3.1. Fault localization in multi-layer network
A user who wishes to issue a Ping or Traceroute command, or to
initiate session monitoring, should be able to do so in the same
manner regardless of the underlying protocol or technology. Consider
a scenario where an IP ping from device A to device B fails. Between
devices A and B there are IEEE 802.1 bridges a, b, and c. Assume that
a, b, and c are using [8021Q] CFM. Upon detecting the IP-layer ping
failure, the user may wish to "go down" to the Ethernet layer and
issue the corresponding fault verification (LBM) and fault isolation
(LTM) tools, using the same API.
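The scenario above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: GenericOam,
make_backend, and OamResult are hypothetical names, and each backend
looks the target up in a table of reachable peers instead of sending
a real ICMP Echo or CFM LBM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class OamResult:
    layer: str    # layer that ran the check, e.g. "ip" or "ethernet"
    tool: str     # layer-specific tool behind the common API
    target: str
    ok: bool


Backend = Callable[[str], OamResult]


def make_backend(layer: str, tool: str, reachable: Iterable[str]) -> Backend:
    """Hypothetical backend: a target is 'verified' if it is in `reachable`."""
    reachable = set(reachable)

    def verify(target: str) -> OamResult:
        return OamResult(layer, tool, target, target in reachable)

    return verify


class GenericOam:
    """One entry point for fault verification at any registered layer."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}

    def register(self, layer: str, backend: Backend) -> None:
        self._backends[layer] = backend

    def verify(self, layer: str, target: str) -> OamResult:
        return self._backends[layer](target)


oam = GenericOam()
oam.register("ip", make_backend("ip", "ping", reachable=[]))
oam.register("ethernet", make_backend("ethernet", "LBM", reachable=["a", "b"]))

# The IP-layer check to device B fails, so "go down" to the Ethernet
# layer and check bridges a, b, and c through the same call.
ip_result = oam.verify("ip", "B")
if not ip_result.ok:
    hops = [oam.verify("ethernet", bridge) for bridge in ("a", "b", "c")]
    first_failure = next(h.target for h in hops if not h.ok)
```

With the tables assumed here, first_failure is "c". The caller never
changes tools, only the layer argument.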
3.2. Multi-layer OAM in support of service function chain
In a service function chain, service packets are steered through a
set of service nodes distributed in the network. When a service
packet enters the network, OAM information needs to be imposed on the
packet by the ingress node and carried through the network along the
same route as the service nodes. When several service functions (SFs)
are co-located in the same service node, the packet is processed by
all SFs in that service node: once the packet is successfully handled
by one SF, it is forwarded to the next SF in the same service node.
When the packet leaves the network, the OAM information needs to be
stripped from the packet. To provide a unified view, this OAM
information needs to be gathered from the various layers, which use
different encapsulation and tunneling techniques, abstracted, and
provided to the management application via the unified management
interface.
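A minimal sketch of this impose / process / strip sequence follows.
The Packet type, the oam_trace field, and the node and SF names are
assumptions made for illustration; this document does not define an
encoding for OAM information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    payload: bytes
    # Hypothetical OAM metadata carried with the packet.
    oam_trace: Optional[List[str]] = None


def ingress(packet: Packet) -> Packet:
    """Impose OAM information when the packet enters the network."""
    packet.oam_trace = []
    return packet


def service_node(packet: Packet, node: str, sfs: List[str]) -> Packet:
    """Pass the packet through every co-located SF, in order."""
    for sf in sfs:
        if packet.oam_trace is not None:
            packet.oam_trace.append(f"{node}/{sf}")
    return packet


def egress(packet: Packet) -> List[str]:
    """Strip the OAM information and hand it to the management side."""
    trace, packet.oam_trace = packet.oam_trace or [], None
    return trace


pkt = ingress(Packet(b"data"))
pkt = service_node(pkt, "sn1", ["firewall", "nat"])   # two co-located SFs
pkt = service_node(pkt, "sn2", ["dpi"])
report = egress(pkt)
```

After egress, the packet carries no OAM information and report lists
the SFs visited in chain order.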
4. Problem Statement
OAM mechanisms are usually oriented to a single network technology or
a single layer. Each technology or layer has its best-suited OAM
tools. Some provide rich functionality in one protocol, while others
provide each function with a different protocol, and each technology
is developed independently. In the current situation, there is
little or no re-use of software and hardware across OAM protocols.
Integration of OAM across multiple technologies is extremely
difficult. Vendors and operators repeat effort through the whole OAM
life-cycle when a new technology is introduced:
(1) Design and development: Every new protocol requires investment in
the design and development of the data, control, and management
planes. In some cases, even adding a single OAM function requires
this whole life-cycle.
(2) Operation and maintenance: Operations staff need to be trained
for every new technology or feature.
Together, these cause slow time-to-market and wasted time and effort
for any new technology and/or OAM function.
Specifically, in a service function chaining environment, every
function may operate at a different layer and may use different
encapsulation and tunneling techniques. When virtualization-related
technologies are taken into account, the number of encapsulation and
tunneling options is very high. Still, end-to-end service OAM
mechanisms and information exchange between functions should be
provided to operate and maintain the network as a whole. This
requires a generic tool-set that can provide all standard tools in
the context of multi-technology, multi-layer, physical, and virtual
environments.
An interesting aspect of this problem is how the OAM information at
different layers is made available to, and learned by, the management
application via the unified management interface. For example, in a
multi-layer network, OAM information needs to be imposed on the
packet, injected into the network, and finally abstracted from the
various layers and provided to the management application.
4.1. Use of Existing Protocol Mechanisms
OAM information relies on network technology at each layer and may
currently be exchanged at each network layer in a domain by using
various encapsulation technologies at the Layer 2 & Layer 3 levels.
OAM information may be gathered and exported from a domain (for
example, northbound) using SNMP, I2RS, or NETCONF/YANG.
It is desirable that a solution to the problem described in this
document does not require the implementation of a new, network-wide
protocol or the introduction of a shim layer to carry OAM information.
Instead, it would be advantageous to make use of an existing protocol
or functionality that is commonly implemented on routers and is
currently deployed. This has many benefits in network stability,
time to deployment, and operator training.
It is recognized, however, that existing protocols or functionalities
are unlikely to be immediately suitable to this problem space without
some protocol extensions. Extending protocols must be done with care
and with consideration for the stability of existing deployments. In
extreme cases, where functionality is lacking even though similar
mechanisms exist in other technologies, a new protocol can be
preferable to a messy hack of an existing protocol.
4.2. Strong Technology dependency
OAM protocols rely heavily on the specific technology they are
associated with. The addressing scheme is a good example of an issue
where being non-generic carries a high price. Ping for IPv4 and IPv6
looks different in the addressing scheme as well as in the ICMP
message type, yet provides the same OAM functionality.
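As a small illustration of hiding this difference behind one call,
the sketch below derives the family-specific fields from the target
address alone. EchoRequest and build_echo_request are hypothetical
names. The Echo Request type values are those of ICMPv4 (8) and
ICMPv6 (128). No packet is sent.

```python
import ipaddress
from dataclasses import dataclass

# Echo Request message type per IP version: ICMPv4 uses 8, ICMPv6 uses 128.
ECHO_REQUEST_TYPE = {4: 8, 6: 128}


@dataclass(frozen=True)
class EchoRequest:
    target: str
    ip_version: int
    icmp_type: int


def build_echo_request(target: str) -> EchoRequest:
    """Describe an echo request; the caller never names the address family."""
    addr = ipaddress.ip_address(target)
    return EchoRequest(str(addr), addr.version, ECHO_REQUEST_TYPE[addr.version])


v4 = build_echo_request("192.0.2.1")      # documentation prefix, IPv4
v6 = build_echo_request("2001:db8::1")    # documentation prefix, IPv6
```

The caller uses the same function for both targets; only the
resulting ip_version and icmp_type fields differ.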
4.3. Weakness of cross-layer OAM
Troubleshooting is cumbersome due to protocol variety and lack of
multi-layer OAM. Usually OAM messages should not cross layer
boundaries. Each of the service, network, and transport layers
possesses its own well-discernible, native OAM stream. In addition,
OAM messages should not be leaked outside of a management domain
within a layer, where a management domain is governed by a single
business organization. In networks with more than one technology,
maintenance and troubleshooting are done per technology and per
layer.
This can in some cases make it easier to understand in which
technology the operation is performed or the fault is located. In
some cases, when OAM at one layer fails, it would be desirable to
drop down to another layer and issue the corresponding OAM command
using the same API, if OAM in multiple layers can be supported.
However, in most cases switching tools and layers within the same
operational process is cumbersome and does not serve the main goal,
which is to locate the root cause. It would be very helpful to have
a generic end-to-end mechanism that can, for example, ping an IPv4
host from an IPv6 source, or a single tool to troubleshoot a combined
IP, MPLS, Ethernet, GRE, and VXLAN network.
4.4. Lack of OAM above Layer 3
The Layer 2/3 protocols are quite rich in their functionality, well
defined, standardized, and heavily used. In recent years, a lot of
work has been done on maintenance domains and levels in order to
better handle issues that span technology, vendor, and operator
domains and to provide smooth interoperability and domain separation.
These mechanisms are not defined for technologies above Layer 3.
Therefore, no standard exists as a reference for OAM in the SFC
environment, where service packets are steered through a set of
service nodes distributed in the network and each service node works
at different layers above Layer 3.
4.5. Issues of Abstraction
In a multi-layer network, OAM functions are enabled at different
layers and various OAM information needs to be gathered from each of
them. Without multi-layer OAM in place, it is hard for the management
application to understand what the information at each layer stands
for. One possible solution to this issue is to abstract the
OAM information shared across layers, i.e., using the same tool or
API to activate the OAM functions at different layers and retrieve
the results.
The trick to this multi-layer problem is to abstract in a way that
retains as much useful information as possible while removing the
data that is not needed. An important part of this trick is a clear
understanding of what information is actually needed.
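One way to picture such an abstraction is a projection of each
layer-specific record onto a small common set of fields. The field
names and sample records below are assumptions chosen for
illustration; deciding which fields are actually needed is exactly
the open question raised above.

```python
from typing import Any, Dict

# Fields a (hypothetical) management application is assumed to need.
COMMON_FIELDS = ("layer", "endpoint", "status", "delay_ms")


def abstract(record: Dict[str, Any]) -> Dict[str, Any]:
    """Keep the common fields and drop everything layer-specific.

    A common field that the layer did not report is returned as None.
    """
    return {name: record.get(name) for name in COMMON_FIELDS}


ethernet_record = {
    "layer": "ethernet", "endpoint": "bridge-b", "status": "up",
    "delay_ms": 0.4,
    "md_level": 3, "vlan": 100,          # layer-specific detail
}
ip_record = {
    "layer": "ip", "endpoint": "192.0.2.1", "status": "down",
    "ttl": 64,                           # layer-specific detail
}

view = [abstract(ethernet_record), abstract(ip_record)]
```

Both entries in view have the same keys, so the management
application can compare layers without knowing their native formats.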
4.6. Issue of OAM information gathering from Service Function
When service packets are steered through a set of service nodes
distributed in the network, each service node works at different
layers above Layer 3 and may host several co-located SFs. When an
OAM mechanism is applied, OAM packets need to be exchanged between
these service nodes or service functions at different layers. When
a service function involved in the SFC does not support OAM (e.g.,
it is an SFC-unaware service function), the service node should be
responsible for monitoring and diagnosing that service function and
checking its service availability. It is desirable to allow a
service function to register with its service node; then either the
service function reports its status to the service node, or the
service node performs a live check on the service function.
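The two options in the paragraph above (status report by the service
function, or live check by the service node) can be sketched as
follows. ServiceNode and its methods are hypothetical names, and the
probe is an arbitrary callable standing in for whatever liveness test
a real node would run.

```python
from typing import Callable, Dict, Optional

Probe = Callable[[], bool]


class ServiceNode:
    """Tracks the availability of service functions (SFs) on this node."""

    def __init__(self) -> None:
        self._probes: Dict[str, Optional[Probe]] = {}
        self._status: Dict[str, bool] = {}

    def register(self, sf: str, probe: Optional[Probe] = None) -> None:
        """Register an SF; OAM-unaware SFs come with a probe for the node."""
        self._probes[sf] = probe
        self._status[sf] = False    # unknown until a report or check arrives

    def report(self, sf: str, alive: bool) -> None:
        """Option 1: the SF reports its own status to the service node."""
        self._status[sf] = alive

    def live_check(self) -> None:
        """Option 2: the node checks every SF that registered a probe."""
        for sf, probe in self._probes.items():
            if probe is not None:
                self._status[sf] = probe()

    def available(self, sf: str) -> bool:
        return self._status.get(sf, False)


node = ServiceNode()
node.register("sf-aware")                         # reports for itself
node.register("sf-unaware", probe=lambda: True)   # checked by the node
node.report("sf-aware", True)
node.live_check()
```

Either path ends in the same status table, which is what the service
node would expose to OAM at the service layer.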
In addition, service functions usually do not have Layer 2-3
switching/routing capability and are therefore not aware of any OAM
function at Layers 2-3. Also, when there are no OAM functions at the
service layers above Layer 3, it is hard to identify a layer that can
be used to gather OAM information when a fault or performance
degradation occurs. For example, suppose two service functions are
embedded in two different service nodes, and a data packet
transmitted from one to the other is lost between them or discarded
by either of them. How can the fault between them be detected, and
how can the problem be isolated to a specific layer?
5. Existing Work
The following items discuss related IETF work and are provided for
reference. This section is not exhaustive; rather, it provides an
overview of a few initiatives tackling the pain points of OAM.
1. An important piece of work, [I-D.tissa-netmod-oam], creates a
unified YANG data model for OAM that is based on the IEEE CFM
model. This model can also be used for IP OAM functionality.
That work focuses on the management plane of OAM and should be
complemented by accompanying data-plane and/or control-plane
work. It may also require some extensions to address a wider
variety of functions and technologies.
2. Several efforts in recent years have tried to address new
technologies using existing mechanisms. [I-D.jain-nvo3-overlay-oam]
and the MPLS-TP OAM documents are examples of such efforts.
6. Architectural Consideration
6.1. Basic Components
6.1.1. Interconnect OAM at different layers
6.1.2. Interconnect OAM at the same shim layer above layer 3
6.2. OAM Functions in Data Plane
6.2.1. Continuity Check
This type of mechanism checks that the monitored layer and/or entity
is alive and providing connectivity from specific point(s) to other
point(s). Examples are BFD and ETH CC.
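When a continuity check runs proactively, messages are expected
periodically and a defect is declared once a certain number of them
have not been received. The sketch below shows only that counting
rule. ContinuityMonitor is a hypothetical name, and the 1-second
interval and threshold of 3 are assumed values, not ones taken from
BFD or ETH CC.

```python
class ContinuityMonitor:
    """Declares a defect after `threshold` consecutive missed messages."""

    def __init__(self, interval_s: float, threshold: int = 3) -> None:
        self.interval_s = interval_s    # expected spacing between messages
        self.threshold = threshold      # missed messages tolerated before defect
        self.last_rx_s = 0.0

    def on_message(self, now_s: float) -> None:
        """Record the arrival time of a continuity-check message."""
        self.last_rx_s = now_s

    def missed(self, now_s: float) -> int:
        """Whole intervals elapsed since the last received message."""
        return int((now_s - self.last_rx_s) // self.interval_s)

    def defect(self, now_s: float) -> bool:
        return self.missed(now_s) >= self.threshold


mon = ContinuityMonitor(interval_s=1.0, threshold=3)
mon.on_message(10.0)
still_ok = mon.defect(12.5)     # 2 intervals missed, below the threshold
lost = mon.defect(13.0)         # 3 intervals missed, defect declared
```

Timestamps are passed in explicitly so that the rule can be examined
without a real clock or network.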
6.2.2. Connectivity Verification
A function that verifies that the actual connection is consistent
with the required connection and that no misconnection has occurred.
Examples are IP Ping, VCCV, and ETH loopback.
6.2.3. Path Discovery
A function that discovers the path that a specific service traverses
in the network. Examples are LSP Trace, IP Traceroute, and Ethernet
Trace.
6.2.4. Performance measurement
A function that monitors the performance parameters of a network
entity. Such parameters include delay, delay variation, loss,
service availability, and class of service. Examples are
TWAMP/OWAMP and Y.1731.
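As an illustration of how such parameters might be derived from raw
probe results, the sketch below summarizes a list of delay samples in
which None marks a lost probe. The function name and the particular
definition of delay variation (mean absolute difference between
consecutive received samples) are assumptions; TWAMP/OWAMP and Y.1731
each define their own metrics.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize(delays_ms: List[Optional[float]]) -> Dict[str, float]:
    """Summarize delay samples in milliseconds; None marks a lost probe."""
    if not delays_ms:
        raise ValueError("at least one probe result is required")
    received = [d for d in delays_ms if d is not None]
    # Assumed definition: mean absolute difference of consecutive samples.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "loss_ratio": 1 - len(received) / len(delays_ms),
        "mean_delay_ms": mean(received) if received else float("nan"),
        "delay_variation_ms": mean(diffs) if diffs else 0.0,
    }


# Four probes, the third one lost.
stats = summarize([10.0, 12.0, None, 14.0])
```

For these four samples the loss ratio is 0.25, the mean delay is
12.0 ms, and the delay variation is 2.0 ms.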
6.2.5. Protection Switching
A function that is used to signal protection switching states and
commands. Examples are ETH APS messages.
6.2.6. Alarm/defect indication
A function that is used to indicate that a failure occurred
downstream or upstream within a connection/service. It is also used
to trigger fast protection or to suppress alarms. Examples are ETH
AIS and ETH RDI.
6.2.7. Maintenance commands
A function that is used to signal a maintenance state or command
within a connection/service. An example is ETH Lockout.
6.3. OAM in Management plane
Management systems play an important role in configuring or
provisioning OAM functionality consistently across all devices in the
network, and for automating the monitoring and troubleshooting of
network faults. However, OAM is not provisioning. In general,
provisioning is used to configure the network to provide new
services, whereas OAM is used to keep the network in a state in which
it can support already existing services.
OAM provisioning has two phases. The first phase is the network
provisioning phase, which sets up Maintenance Domains (MD) and
Maintenance Intermediate Points (MIP) and enables basic OAM
functionality (e.g., Connectivity Fault Management (CFM)) on the
devices.
The second phase is the service activation phase, which enables the
origination of ping and trace packets and configures the
continuity-check and cross-check functionality.
The different OAM tools may be used in one of two basic types of
activation:
o Proactive activation - indicates that the tool is activated on a
continual basis, where messages are sent periodically, and errors
are detected when a certain number of expected messages are not
received.
o On-demand activation - indicates that the tool is activated
"manually" to detect a specific anomaly.
7. Building on Existing Protocols
8. Scoping Future Work
9. Manageability Considerations
10. Security Considerations
This document is only a problem statement and does not itself
address security considerations. Given the scope of OAM and its
implications for the data and control planes, security considerations
are clearly important and will be addressed in the specific protocol
and deployment documents.
11. Summary
This document highlights problems associated with OAM in packet
technologies today. It details the problem scope and identifies the
main OAM functions that should be addressed, based on the current
aggregated functions.
12. Acknowledgements
The authors would like to thank Dan Romascanu and Tissa Senevirathne
for their valuable reviews and suggestions on this document.
13. References
13.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
13.2. Informative References
[I-D.ietf-nsc-problem-statement]
Quinn, P., Guichard, J., and S. Surendra, "Network Service
Chaining Problem Statement", ID draft-quinn-nsc-problem-
statement, August 2013.
[I-D.jain-nvo3-overlay-oam]
Jain, P., "Generic Overlay OAM and Datapath Failure
Detection", ID draft-jain-nvo3-overlay-oam-01, February
2014.
[I-D.tissa-netmod-oam]
Senevirathne, T., Finn, N., Kumar, D., and S. Salam,
"YANG Data Model for Operations Administration and
Maintenance (OAM)", ID draft-tissa-netmod-oam-00, March
2014.
Authors' Addresses
Qin Wu
Huawei
101 Software Avenue, Yuhua District
Nanjing, Jiangsu 210012
China
Email: bill.wu@huawei.com
Mishael Wexler
Huawei
Email: mishael.wexler@huawei.com
Pradeep Jain
Nuage Networks
755 Ravendale Drive
Mountain View, CA 94043
USA
Email: pradeep@nuagenetworks.net