An Overview of Operations, Administration, and Maintenance (OAM) Tools
draft-ietf-opsawg-oam-overview-16

Note: This ballot was opened for revision 05 and is now closed.

(Stewart Bryant) Discuss

Discuss (2011-08-10 for -)
The following is an edited version of the comments received 
from the Routing Directorate review. 

Summary:

There are significant concerns that need to be addressed 
before publication. 

The document needs to clearly identify the target 
audience. Since a document written as a tutorial for a beginner 
has different requirements from one written for a subject 
matter expert, this clarification is important for setting 
expectations about the depth and precision of the text. A 
tutorial document for the beginner would be most welcome 
considering the extent of OAM discussions that have taken 
place in the IETF and it is assumed by the reviewer that 
this is the intent of the document.

To that end the document needs to:

Include a “Historical Background” section that goes 
beyond the single sentence in Section 1 (“OAM was originally 
used in the world of telephony, and has been adopted in packet 
based networks”)

Provide a clear view of OAM functionality and its relationship 
to various “planes” of networking (data plane, control plane, 
management plane). In particular, the importance of 
fate-sharing of OAM and user traffic flows in packet networks 
should be explained.

Explicitly map the ideas, terms and methods that have been 
adopted from technologies owned by ITU-T and/or IEEE to 
IETF-owned technologies. If such a mapping is not possible, 
this should be explicitly stated.

Explain in a neutral way points of contention regarding 
various OAM-related issues.

The draft as written is a partial annotated list of 
references to IETF and non-IETF protocols and mechanisms that 
deal with certain aspects of OAM in IP, IP/MPLS, MPLS-TP 
and Ethernet networks. The draft does not describe the underlying
reasons for selecting particular protocols for description.

It is not clear why the now obsolete ITU-T Y.1711 is considered 
in detail. The reviewer proposed giving consideration to I.610 
as a protocol, although I am not sufficiently familiar with 
I.610 to determine its relevance. It should however be examined. 
Similarly it may be useful to introduce the reader to E-LMI 
(defined by MEF).

In terms of MPLS-TP, why is there no discussion of MPLS-TP 
fault management OAM (draft-ietf-mpls-tp-fault-05)?

There are a number of readability issues that arise from the 
terms and concepts taken from the referenced documents 
having different meanings in these documents. E.g., in Section 4.1
the draft states that ICMP ping provides “connectivity 
verification for Internet Protocol”. However, in Section 3.2.4 
the draft says that “connectivity verification function allows 
an MP to check whether it is connected to a peer MP or not”. 
Since MPs are not mentioned with regard to ICMP, it is not 
clear whether “connectivity verification” means the same thing 
in these two cases.

In some cases the text is detailed beyond the needs of the 
beginner, whilst other important concepts are not detailed 
sufficiently, for example:

- The OWAMP TCP port information is not needed, whilst the IPPM 
 
- In Section 3.2.3 the draft defines the term “Maintenance 
Entity” (ME), whilst “Maintenance Entity Group” (MEG), a.k.a. 
“Maintenance Association” (MA), is only defined by reference

- In Section 4.5.2 the draft mentions security aspects of 
IPPM protocols. However, these aspects are not even 
mentioned in Section 4.2, which discusses BFD.

The document therefore needs another pass to ensure 
consistency of detail.

 

Major Issues:

 
 The concepts of data plane, control plane and management 
plane are not well explored in the draft and need to be 
explained in their OAM context.

=======

The relationship between OAM functionality and network 
management as presented in the draft is unclear.  
For example

a. (Section 1) Other aspects associated with the OAM 
acronym, such as management, are outside the scope of 
this  document  <<Management is out of scope>>

b. (Section 4.6.4)   The FDI function is used by 
an LSR to report a defect to affected client layers, 
allowing them to suppress alarms  about this defect 
<< Alarms are arguably part of management >>

c. (Section 4.7.2) When the ETH-CC function detects 
a defect, it reports one of the following defect conditions:

i. Loss of continuity (LOC): Occurs when at least when 
no CCM messages have been received from a peer MEP during 
a period of 3.5 times the configured transmission period 

iii. Unexpected period: Occurs when the transmission 
period field in the CCM does not match the expected 
transmission period value  << Since transmission period 
field in ETH-CC is defined by management, this defect 
reports a management issue>>

d. (Section 4.7.6)   The Alarm Indication Signal 
indicates that a MEG should suppress alarms about 
a defect condition at a lower MEG level, i.e., since 
a defect has occurred in a lower hierarchy in the 
network, it should   not be reported by the current node  
<<Alarms’ suppression again…>>

e. (Section 4.7.9)   The Y.1731 standard defines the 
frame format for Automatic Protection   Switching 
frames. The protection switching operations are defined 
in  other ITU-T standards. <<Whether PS is part of
OAM seems to depend on which SDO is considering the 
problem and this needs to be made clear to the reader>>
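
To make the point in item (c) concrete, the two ETH-CC conditions 
quoted there can be sketched as below. This is only an illustrative 
sketch based on the quoted text: the class, field and function names 
are hypothetical and are not taken from the draft or from Y.1731. 
Note that the expected period is a value set by configuration, 
which is why the second check looks like a management issue.

```python
from dataclasses import dataclass

# From the quoted text: LOC is declared after 3.5 times the configured period.
LOC_MULTIPLIER = 3.5


@dataclass
class PeerState:
    """Per-peer state kept by a MEP (hypothetical field names)."""
    configured_period_s: float  # transmission period set by management
    last_ccm_rx_time_s: float   # arrival time of the most recent CCM


def loss_of_continuity(peer: PeerState, now_s: float) -> bool:
    """True when no CCM has arrived for 3.5 configured periods."""
    silent_for_s = now_s - peer.last_ccm_rx_time_s
    return silent_for_s >= LOC_MULTIPLIER * peer.configured_period_s


def unexpected_period(peer: PeerState, received_period_s: float) -> bool:
    """True when the period carried in a CCM differs from the configured one."""
    return received_period_s != peer.configured_period_s
```

The first check depends on what is observed in the data plane, 
whilst the second compares two configured values.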

3.   OAM in connectionless vs. connection-oriented networks:

a. (2a) above suggests that OAM is applicable only 
to connection-oriented networks (if you do not have 
connections, connection problems do not exist by definition)

b. At  the same time, the draft discusses ICMP Ping 
(Section 4.1) operating in connectionless IP networks, 
and Ethernet OAM (Sections  4.7 and 4.8) operating 
in connectionless Ethernet networks.

 The authors should define the scope of OAM explicitly 
and clearly - and then remove the sections dealing with 
protocols and mechanisms that happen to be out of this 
scope. In particular, they should explain the relationship of 
each specific defect to a specific networking plane.

 

 MEs, MPs, MEPs and MIPs

 
Caveat: It may well be that the problem is not with the 
draft but with the concept itself (or at least with the 
attempts to extend it to IP, IP/MPLS and MPLS-TP networks)

Consider the following statements:

 
1. (Section 3.2.2)   A Maintenance Entity (ME) 
is a point-to-point relationship between two Maintenance 
Points (MP). The connectivity between these Maintenance 
Points is managed and monitored by the OAM protocol.  
A pair of MPs engaged in an ME are connected by a Communication Link

2. (Section 3.2.3) A Maintenance Point (MP) is a 
functional entity that is defined at a node in the 
network, and either initiates or reacts to OAM messages. 
A Maintenance End Point (MEP) is one of the end points of an ME, 
and  can initiate OAM messages and respond to them. A Maintenance 
Intermediate Point (MIP) is an intermediate point between two MEPs, 
that does not initiate OAM frames, but is able to respond to 
OAM  frames that are destined to it, and to forward others.

3. (Section 3.2.3)  The 802.1ag defines a finer distinction 
between Up MPs and Down MPs. An MP is a bridge 
interface, that is monitored by an OAM protocol…

4. (Section 4.1)   ICMP provides a connectivity 
verification function for the Internet Protocol… ICMP is 
also used in Traceroute for path discovery.

An OAM beginner would not be able to answer the following 
questions:

1.      Can a communication link exist without any 
MPs on it?

2.      Suppose that I have defined a P2P bidirectional 
communication link with two MEPs  forming an ME.  What 
would happen to this ME if I add a MIP between the two MEPs?

3.      What is the relationship (if any) between MEPs 
and interfaces? Or is it just something specific to Ethernet bridges?

4.      Does a MIP really forward OAM frames that 
are not destined to it? 

5.      Operation of ICMP Ping does not require creation of 
MPs. How does it provide a connectivity verification function for IP?

The authors need to remove conflicting definitions, to fix typos 
(e.g., the definition of ME would be less problematic if it referred 
to a pair of MEPs and not to a pair of MPs) and inaccurate statements 
(in IP, IP/MPLS and MPLS-TP MIPs (as a component) do NOT forward 
OAM packets that are not destined to them – but they do 
that in Ethernet OAM).

 
Minor Issues:

Connectivity Check vs. Continuity Check

 The draft mainly uses the term “Continuity Check”. However, 
in some places the term “Connectivity Check” is used as well, e.g.:

1.      (Section 4.12) A key element in some of the OAM standards 
that are analyzed in this document is the continuity check. It is 
thus important  to present a more detailed comparison of the 
connectivity check mechanisms defined in OAM standards.

2.      (Section 4.3)  LSP Ping extends the basic ICMP Ping 
operation (of data-plane connectivity and continuity check)…

Please look at the use of the terms and ensure they are 
applied consistently.
 

Caveat: Similar inconsistency in IEEE 802.1ag (but not in ITU-T Y.1731).


Continuity Check vs. Connectivity Verification

In Section 3.2.4. the draft refers to  RFC 5860 as the ultimate
 source of information about the difference between Continuity 
Check and Connectivity Verification. Looking up RFC 5860 (Section 2.2.3),
 I’ve learned that connectivity verification is a function that 
allows an End Point to find out whether it is connected to a specific 
End Point(s) by means of an expected PW, LSP or Section. At the same 
time, the draft says (in the same Section 3.2.4) that “A connectivity 
verification function allows an MP to check whether it is connected 
to a peer MP or not”.  The omitted  words from RFC 5860 “by means of…” 
make such a definition unclear; also it is unclear whether End Points 
(of Section, LSP or PW) which, presumably, are MEPs, can be 
extended to be MEPs or MIPs (the draft uses the term MPs).

It is also not clear whether the draft considers LSP Ping (see 
Section 4.3.) functionality “to verify data-plane vs. control-plane 
consistency for a Forwarding Equivalence Class (FEC)”  as  related to 
Connectivity Verification. This is especially strange since the 
draft also states (in the same section)  that “LSP Ping extends the 
basic ICMP Ping operation”  while Section 4.1 states that “ICMP 
provides a connectivity verification function for the Internet   Protocol”.  

 
Another problem is the statement (in Section 4.2.3) that “BFD 
Echo provides a connectivity verification function”, especially 
since draft-ietf-mpls-tp-cc-cv-rdi-05 in Section 3.5 extends the 
format of the BFD control packets in order to provide a CV function, 
while BFD Echo is not even mentioned in that document. It might be 
worth noting that we are not considering BFD Echo mode for MPLS-TP.

Finally, the draft does not explain whether there is any 
correlation between the defects detected by the continuity 
check and those detected by connectivity verification 
(Section 4.10.3.1 looks like a logical place for this).

 

Inaccurate Representation of IEEE 802.1ag


In Section 3.2.3 of the draft there is the following text:

“The 802.1ag defines a finer distinction between Up MPs and Down 
MPs. An MP is a bridge interface, that is monitored by an OAM 
protocol either in the direction facing the network, or in the 
direction    facing the bridge. A Down MP is an MP that receives 
OAM packets from,    and transmits them to the direction of the 
network. An Up MP receives OAM packets from, and transmits 
them to the direction of the bridging entity”.

 
However IEEE 802.1ag states (see Section 22.1.3 of that 
document ) that: “All Up MEPs belonging to MAs that are attached 
to specific VIDs are placed between the Frame filtering entity 
(8.6.3) and the Port filtering entities (8.6.1, 8.6.2, and 8.6.4). 
Separately for each VLAN, there can be from zero to eight Up 
MEPs, ordered by increasing MD Level, from Frame filtering 
towards Port filtering”.

 
That seems to imply that 802.1ag MEPs are NOT bridge interfaces 
(since there can be multiple MEPs per VLAN and multiple 
VLANs per bridge interface).


Defects, Faults and Failures

In Section 3.2.5 the draft discusses the terms Defect, Fault 
and Failure. However, these terms seem to apply to the 
“communication link”. That term needs to be clarified to 
indicate that this is a data plane entity, or the term “data 
plane” should be used in its place.

At the same time, “Unexpected Period” and “Unexpected MEP” 
are mentioned as defects detected by ETH-CC in Section 4.7.2 
even if, to the best of my understanding, these conditions 
are side effects of mis-configuration i.e., a management plane problem.

 

VCCV: An OAM Mechanism or a Control Channel?


In Section 4.4. the draft states that VCCV “provides 
end-to-end fault detection and diagnostics for PWs”. 
This seems to indicate that VCCV is an OAM mechanism/protocol.

However, later in the same section it states that “The 
VCCV switching function provides a control channel associated 
with each PW… and allows sending OAM packets in-band with PW data”.  
And on the next line it explains that “VCCV currently supports 
the following OAM mechanisms: ICMP Ping, LSP Ping, and BFD” 
(which are all mentioned as OAM mechanisms providing 
continuity check and/or connectivity verification in the draft). 
So it remains completely unclear whether VCCV is an OAM 
mechanism or just a channel for separating user data from OAM flows.

The issue here may well be historic because VCCV predates 
the modern ACH mechanism. This should be clarified in the text.
 

MEs, MEGs and MEG levels

The draft explicitly defines a Maintenance Entity (ME) in 
Section 3.2.2, but defers to MPLS-TP OAM Framework for 
the definition of the Maintenance Entity Group (MEG). The 
text defining ME in the draft differs from that in the 
MPLS-TP OAM Framework document  
(see http://datatracker.ietf.org/doc/draft-ietf-mpls-tp-oam-framework/?include_text=1, Section 2.2). 
At the same time, it resembles the definition of ME in 
Section 3.1 of this document.

MEG level is mentioned a couple of times in the draft, 
but the only explanation given (in Section 4.7.2) is 
“The MEG level is a 3-bit number that defines the level 
of hierarchy of the MEG”; and this seems to be the only 
text in the draft that deals with MEG hierarchy. A more 
detailed description should be provided.
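
As an illustration of how little the quoted sentence conveys, a 3-bit 
number gives eight possible levels (0 to 7) and nothing more. The 
sketch below assumes, purely for illustration, that the level sits in 
the three high-order bits of an octet; the function names and that 
bit position are assumptions, not text from the draft.

```python
MEG_LEVEL_BITS = 3  # from the quoted text: "a 3-bit number"


def valid_meg_levels() -> range:
    """All values a 3-bit level can take: 0 through 7."""
    return range(2 ** MEG_LEVEL_BITS)


def meg_level_from_octet(octet: int) -> int:
    """Extract the level, ASSUMING it is the three high-order bits of the octet."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return octet >> (8 - MEG_LEVEL_BITS)
```

The encoding alone says nothing about how MEGs at different levels 
nest or interact, which is the part the draft leaves unexplained.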


Differences between Approaches to Packet/Frame Loss Measurement

 There is no description of the fundamental difference 
between the two approaches to measuring packet loss – 
that of the IPPM WG (based on counting synthetic packets) 
and that of Y.1731 (based on counting the user packets), 
even though both are mentioned in the draft. MPLS-TP BTW 
provides a tool for doing loss measurement and notes 
that the instrumentation technique is independent of 
the method of making the measurement.
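
The difference between the two approaches can be sketched roughly as 
follows. This is a simplified illustration only: the function and 
parameter names are hypothetical, counter wrap-around is ignored, and 
neither function reproduces the exact procedure of any IPPM or 
Y.1731 specification.

```python
def synthetic_loss_ratio(probes_sent: int, probes_received: int) -> float:
    """Loss estimated from injected test packets (the synthetic approach)."""
    if probes_sent <= 0:
        raise ValueError("at least one probe must have been sent")
    return (probes_sent - probes_received) / probes_sent


def counter_based_loss(tx_prev: int, tx_curr: int,
                       rx_prev: int, rx_curr: int) -> int:
    """Loss computed from user-packet counters sampled at two instants.

    tx_* are the sender's transmit counters and rx_* are the receiver's
    receive counters, both counting user packets (not test packets).
    """
    transmitted = tx_curr - tx_prev
    received = rx_curr - rx_prev
    return transmitted - received
```

The first function yields an estimate that depends on how well the 
probes sample the path, whilst the second counts loss of the user 
traffic itself over the interval between the two samples.
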
Comment (2011-08-10 for -)
Unidirectional/Bidirectional OAM vs. One-way/Two-way OAM - 
Both pairs of terms are used in the draft (One-way/Two-way 
in Section 4.5.1, Unidirectional in Section 4.12, Bidirectional 
in Section 4.2.2). Neither the terms nor their 
equivalence is explained in the draft.


In section 4.10.3.1: “Continuity Check and Connectivity 
Verification (CC-V) are OAM operations generally used in 
tandem, and compliment each other.” – probably should 
be “complement”?

“There are a few differences between the two standards in 
terms of terminology” – do you mean: “There are a few differences 
in terminology between the two standards”?

(Ron Bonica) Yes

(Benoît Claise) Yes

(Gonzalo Camarillo) No Objection

(Ralph Droms) No Objection

Comment (2011-08-10 for -)
Minor editorial suggestions...


In section 3.2.5, the word "intermittently" doesn't seem right.
Perhaps "interchangeably"?

I was OK with this sentence in section 3.2.5:

   The terms Failure, Fault, and Defect are intermittently used in the
   standards, [...]

until I read in the next paragraph that ITU-T differentiates among the
three terms.  Perhaps the quoted sentence should specify which
standards?  Also in the title of 3.2.6?

(Wesley Eddy) No Objection

Comment (2011-08-06 for -)
IPPM has defined other metrics that aren't mentioned here (e.g. duplication and reordering) ... is there a reason why those aren't included?

It was also unclear if psamp, netflow, and ipfix were excluded for a reason.

(Adrian Farrel) (was Discuss) No Objection

Comment (2011-08-08 for -06)
The Ballot Text write-up seems to be missing the Technical Summary.

---

I'm nervous of a document that makes a comparative analysis of OAM
mechanisms developed in another SDO without seeking input from that
SDO.

---

idnits warns about the unnecessary 2119 boilerplate and the unresolved
references. There is no reason for an I-D to reach this stage with 
these warnings. Please clean up before passing to the RFC Editor.
                                     
---

You say:
                                     
   o ICMP Echo request, also known as Ping, as defined in [ICMPv4], and
      [ICMPv6]. ICMP Ping is a very simple and basic mechanism in
      failure diagnosis, and is not traditionally associated with OAM,

"Traditionally" gives me an image of my great grandfather hand-crafting
packets from kiln-dried apple wood.

You might want to find out which tools are most commonly used by network
operators to diagnose their networks. According to that research and 
your definition of OAM, you will possibly find that ICMP Ping is very
much associated with OAM.

---

Odd that Section 1 calls out MPLS-TP and RFC 5860, but does not call out
RFCs 4377 and 4378.

---

Table 1 seems confused about whether it needs to make citations (in
square brackets). It does not need to state "work in progress" for
I-Ds that are referenced and marked as such in the references section.

---

Table 1 seems to be missing some of the references used in the text.
For example for p2mp LSP ping. Can you do a cross-check with the text?

Actually, the table seems a bit mixed. Some protocols are listed, while
in other areas you just list the requirements and frameworks.

---

Did you consider discussing performance metrics at other layers as
part of the diagnostic toolset? You certainly seem open to OAM at 
"various layers." Have a look at draft-ietf-pmol-metrics-framework
and maybe think about RFC 6076.

---

Section 3.1

Add ACH, ETH, FEC, GAL, LDP, LOC, LOCV, MC, MTU, UC
LSP is a Label Switched Path
I thought the 'M' in ME and MIP stood for MEG

---

Section 3.2.6

The table shows "System" for BFD Maintenance Point Terminology. It is
not clear to me what that word means.

---

Section 4.12

   |BFD        |BFD    |Negotiat|UC |My Discr| Control Detection Time |
   |           |Control|ed durin|   |iminator| Expired                |

"My Discriminator"? Who are you?

---

I should have liked Section 5 to have included a discussion of the 
security considerations of OAM in general, and the security provisions
available for the various OAM mechanisms discussed.

---

Should you include RFC 4950?

---

NEW COMMENT

I wonder if you need to also consider draft-ietf-trill-rbridge-channel

(Russ Housley) (was Discuss) No Objection

(Pete Resnick) No Objection

Comment (2011-08-11 for -)
RFC Editor note addresses my comment.

And now, some snark and sarcasm for the amusement of my fellow ADs and anyone else who cares:

<sarcasm>My ballot notwithstanding, I hereby object to the fact that this document (a) defines OAM and (b) does not normatively reference RFC 6291/BCP 161. (*snort*)</sarcasm>

(Peter Saint-Andre) No Objection

Comment (2011-08-08 for -)
1. The use of the term "localization" in the Abstract is potentially confusing, since localization in application protocols refers to presenting textual strings that are appropriate for a given locale. Perhaps the term "isolation" might be more appropriate?

2. This paragraph is confusing:

   o IP Performance Metrics (IPPM) is a working group in the IETF that
      defined common metrics for performance measurement, as well as a
      protocol for measuring delay and packet loss in IP networks.
      Alternative protocols for performance measurement are defined, for
      example, in MPLS-TP OAM [MPLS-TP OAM], and in Ethernet OAM [ITU-T
      Y.1731].

As far as I can see, MPLS-TP OAM and Ethernet OAM were not developed in the IETF's IPPM WG; I suggest moving the second sentence to a separate paragraph.

(Robert Sparks) No Objection

(Sean Turner) (was Discuss) No Objection