IESG Narrative Minutes
Narrative Minutes of the IESG Teleconference on 2011-08-11. These are not an official record of the meeting.
Narrative scribe: John Leslie (The scribe was sometimes uncertain who was speaking.)
Corrections from: Dan
1. Administrivia
2. Protocol Actions
2.1 WG Submissions
2.1.1 New Items
2.1.2 Returning Items
2.2 Individual Submissions
2.2.1 New Items
2.2.2 Returning Items
3. Document Actions
3.1 WG Submissions
3.1.1 New Items
[Garbled table excerpt; legible column labels include "My Discriminator" and "Control Detection Time Expired".]
"My Discriminator"? Who are you?
3.1.2 Returning Items
3.2 Individual Submissions Via AD
3.2.1 New Items
3.2.2 Returning Items
3.3 IRTF and Independent Submission Stream Documents
3.3.1 New Items
3.3.2 Returning Items
1246 EDT break
1251 EDT back
4. Working Group Actions
4.1 WG Creation
4.1.1 Proposed for IETF Review
4.1.2 Proposed for Approval
4.2 WG Rechartering
4.2.1 Under evaluation for IETF Review
4.2.2 Proposed for Approval
5. IAB News We can use
6. Management Issues
7. Agenda Working Group News
1358 EDT Entered Executive Session
(at 2011-08-11 07:29:28 PDT)
draft-ietf-mpls-p2mp-lsp-ping
Very clear and readable document. Thank you.
I have a few minor editorial comments.
If I understand the formats of the P2MP Sub-TLVs in sections 3.1.1.1
and 3.1.1.2 correctly, the P2MP ID in both sub-TLVs is 32 bits long.
It is potentially confusing to show them as different sizes in the two
diagrams in those sections.
I was surprised not to find diagrams for the formats of the Egress
Address and Node Address Responder Identifier sub-TLVs in section 3.
If the authors think the format will be obvious to an implementor
based on the description of the sub-TLVs - they each carry a single
IPv4 or IPv6 address - there's no need for the diagrams.
I'm not sure how I would interpret this text from section 4.2.1.2
(similar text is used elsewhere in the document):
MAY be set to value 3 ('Replying
router is an egress for the FEC at stack-depth <RSC>') or any other
error value as needed
Should this be interpreted as "the Return Code MUST be set to either
the value 3 or an error value" or "the Return Code MAY be set; if it
is set, it MUST be set to the value 3 or an error value"?
(1) 4.2.1.3 gives "guidelines" as to what to do and has a list of Return Codes nodes that "MAY" be used but with no MUSTs. I found that a bit puzzling so maybe saying why these are guidelines and whether guideline == SHOULD would be good.
(2) 4.3.3 says that when you get no answer, you SHOULD retry with a larger TTL. It doesn't say how long to wait though, nor when the SHOULD doesn't apply. Maybe that SHOULD would be better as a "could" and explicitly say that you're just leaving it all to the implementer?
(3) This is probably a dumb question but it wasn't clear to me how the replies get back to the node sending the ping. 4.1.2 says that the P2MP LSPs are unidirectional and that replies "are often sent back through the control plane" but that, by itself, doesn't make it clear to me. I assume that this would be clear to implementers but just wanted to check.
I found this document well-written and easy to read. I support the
standardization of ping and traceroute for this environment.
However, the
standard defined in this document needs additional work, especially the RFC2119
keyword usage, TBD usage, and the control of exceptional conditions. Hopefully,
the authors can point out how these are already addressed somehow, that I simply
overlooked. If not, I think these should be fixed before this standard is
approved.
in 3.2, why SHOULD ignore rather than MUST ignore? Does this present any
security vulnerabilities, such as covert channels?
in 3.2.1, why SHOULD NOT send or receive
in 3.2.1, "MUST respond only if" - the only here is ambiguous; if this is a node
on the path, it MUST respond; it is not stated that a node not on the path MUST
NOT respond. Note that "MUST respond only if" is used in multiple
places, and is ambiguous; this can be interpreted as "the MUST only applies if",
and when the condition is false, there is no standardized behavior.
throughout section 3, the requirements language needs to be tighter, or the
valid exceptions to the SHOULDs need to be documented. for example "The address
in this Sub-TLV SHOULD be of any transit, branch, bud or egress node for that
P2MP LSP." Why not MUST? What should a receiver do with such an address? What
are the security and operational impact of using an address that is not one of
these? (for example, if the receiver can ignore this value, could this field be
used to create a covert channel?)
in 3.3, why should wait, and should ignore, rather than must?
in 3.4, why should and not must? why should not rather than must not? why should
ignore and not must?
in 3.5, this document, which updates 4379, relies on a work-in-progress to
deprecate a TLV in 4379. So will both these documents update 4379? or will DDMT
update this document? or will this document update DDMT?
If there are multiple documents doing updating, then changes to fields, such as
the global flags field should probably have a registry to coordinate the flag
field (re-)definition from multiple documents.
in 4.1, "If the Echo Jitter TLV is present in an echo request for any other type
of LSPs, the responding egress MAY apply the jitter behavior as described here."
what effect might this have on other types of LSPs, which are not necessarily
expecting jitter tlvs? Is processing the new jitter TLV in a non-p2mp tlv
consistent with 4379 processing?
in 4.2.1, why should and not must? what happens if a transit node DOES generate
an echo reply?
in 4.2.1, "As mentioned previously, the Return Code might change based on the
presence of Responder Identifier TLV or Downstream Detailed Mapping TLV." is
this behavior deterministic? Can an operator depend on certain return codes
being reported when DDMT or Responder ID TLVs are present?
TBD appears to have multiple values; these should probably be differentiated so
IANA and the RFC Editor know how to substitute them easily, such as TBD1, TBD2,
etc. This will help avoid mistakes.
this seems to be especially needed in
4.2.1.1, where the TBD is apparently meant to be assigned for the DDMT
document.
transit and branch and bud should have references on first usage, or a
terminology section should identify their meanings. I got through to 4.2.1.1
without knowing the distinction, but here knowing the distinction is important.
I could not locate a description of transit and branch in this document or in
rfc4379, and nothing pointed me to the document that defines the distinction.
in 4.2.1.2, "Egress nodes do not put in any Downstream Detailed Mapping TLV in
the echo reply." should this be MUST NOT?
in 4.2.1.3, there is a case structure to determine behavior. But the
descriptions are not written in normative terminology. Please specify the
appropriate responses using RFC2119 keywords, and clarify "the Return Code MAY
be set to value X ('Label switched at stack-depth <RSC>') or any other error
value as needed."
in 4.3.1, "Therefore it is RECOMMENDED that traceroute operations provide for a
configurable upper limit on TTL values. Hence the user can choose the depth to
which the tree will be probed." In the absence of such a configurable limit,
this could be detrimental to the network operation, possibly to the point of
denial of service. I think this should be a MUST. (As an SNMP guy, I remember
when SNMP autodiscovery in some products tried to discover the whole Internet
because there was no configurable limit. Not a good thing.)
in 4.3.2, the same issue - this should be REQUIRED, not RECOMMENDED.
What are the security and operational impact in the PHP scenario?
in 4.3.3, if a node fails to respond, the ingress should send another with a
greater TTL. What happens if the next node fails, and the next node? and the
next node? what if the first failed node provides the connectivity to the
subsequent nodes? Then the nodes prior to the failure will be
peppered with requests, none of which are succeeding. At what point should the
ingress stop this behavior? how will it know when it has reached that point?
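The stopping questions raised for 4.3.1 and 4.3.3 can be illustrated with a minimal sketch. It assumes a hypothetical `send_probe(ttl)` callable that returns a reply or `None` on timeout, a configurable `max_ttl`, and a hypothetical `max_consecutive_failures` cutoff; none of these names come from the draft, and the draft does not define such a cutoff.

```python
from typing import Callable, List, Optional


def probe_tree(send_probe: Callable[[int], Optional[str]],
               max_ttl: int,
               max_consecutive_failures: int) -> List[Optional[str]]:
    """Probe with increasing TTL until a limit is reached.

    send_probe is a hypothetical callable: it returns a reply string,
    or None when no reply arrived for that TTL.
    """
    if max_ttl < 1 or max_consecutive_failures < 1:
        raise ValueError("limits must be positive")
    replies: List[Optional[str]] = []
    failures = 0
    for ttl in range(1, max_ttl + 1):  # configurable upper limit on TTL
        reply = send_probe(ttl)
        replies.append(reply)
        if reply is None:
            failures += 1
            # Assumed policy: give up after a run of silent hops so the
            # nodes before the failure are not probed indefinitely.
            if failures >= max_consecutive_failures:
                break
        else:
            failures = 0
    return replies
```

With both limits in place the ingress has a defined point at which to stop, whichever condition is met first.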
in 4.3.4, it discusses the processing if an egress address responder identifier
sub-TLV is present, which has a limited effect on the response. However, it does
not discuss the processing when that sub-TLV is not present, and I have concerns
about the processing load that such processing might impose. Can you describe
the behavior when the egress ID is not present?
in 4.3.5, the reply SHOULD carry a limited response. Why is this not a MUST, and
what is the security/operational impact of a SHOULD-compliant implementation
choosing to do otherwise?
The sentence "Due to this restriction, the cross-over node will not duplicate"
is not really accurate; "will not" overstates it, since this is not a MUST (and of
course, there may be non-compliant implementations).
in 5, a non-compliant router does not respond. Does this cause the type of
failure included in 4.3.3?
in 1.2, "The terminology for MPLS OAM can be found in [RFC4379]." However, RFC
4379 never mentions the term OAM. You should probably reference the MPLS OAM
framework draft-ietf-mpls-tp-oam-framework-11 and/or RFC4378.
in 6, there are strong reasons presented that justify moving many of the SHOULDs
in this document to MUSTs.
in 6, "Such an interface SHOULD also provide the ability to disable all active
LSP Ping operations to provide a quick escape if the network becomes congested."
There is no discussion of how this can be accomplished. Obviously, the ingress
node can stop sending new requests, but how can the processing of already sent
requests be disabled throughout the network? Maybe there should be a "kill"
command that could be sent to disable continued processing of previously sent
requests that require time-consuming processing. (or maybe such a kill command
would just add to the congestion)
In 7, the early allocation process has already assigned values; why do we still
have TBDs in the text?
In 8, "This document does not introduce security concerns over and above those
described in [RFC4379]." I strongly disagree. The scalability issues can create
a vulnerability to denial of service attacks, and the control of exceptional
conditions could be improved.
in 3.4, s/on the packet/in the packet/
in 4.2, s/is RECOMMENDED to/SHOULD/
in 4.2.1.1, there is a reference to <RSC>. Your terminology section mentions <RSC>, but it made no sense to me at 1.2, and by the time I reached 4.2.1.1, I had forgotten it was mentioned in 1.2. I think this would be more effective if you provide the reference at first usage.
Please consider the comments from the Gen-ART Review by Alexey Melnikov on 31-May-2011. The document has been updated since the review was posted, but it looks like these comments were not addressed. None of the comments are showstoppers. The review can be found here: http://www.ietf.org/mail-archive/web/gen-art/current/msg06355.html
Two nits: 1) Maybe indent the bit that's copied from 4379: The motivations listed in [RFC4379] are reproduced here for completeness: When an LSP fails to deliver user traffic, the failure cannot always be detected by the MPLS control plane. There is a need to provide a tool that enables users to detect such traffic "black holes" or misrouting within a reasonable period of time. A mechanism to isolate faults is also required. 2) To line up with RFC 4875 r/Must/MUST in the packet figures in sections 3.1.1.1 and 3.1.1.2 (x4).
draft-ietf-mpls-lsp-ping-enhanced-dsmap
Very minor editorial suggestions: Section 1: This documents describes methods for performing LSP-Ping (specified in [RFC4379] traceroute over MPLS tunnels. change to "This document [...]"; is there a right paren missing? In the next sentence "in case where" sounds wrong; perhaps use "in the case where" for both parts of the sentence? Section 3.3.1: MAY be include [...] change to "MAY be included"
I see that this defines a "NIL FEC" that can be used to hide information, which is just fine. That made me wonder though if this is only needed here or maybe also elsewhere (e.g. in draft-ietf-mpls-p2mp-lsp-ping)? Or, perhaps other MPLS ping functions need some equivalent? (Just checking.)
This document looks good overall.
I have a few questions, especially about a few
places where ambiguity exists about what is required in a compliant
implementation. I think these will all be easy to resolve.
1) in 3.2.2, should there be a RECOMMENDED predictable return code when FEC
hiding is used, rather than leaving it implementation-dependent?
2) in 3.3, shouldn't some of these SHOULDs be MUSTs to improve interoperability?
If SHOULD, then what are the acceptable MAY conditions?
3) in 3.3.1.3, Figure 6 shows that an address can be 0 length, but the
description of a remote peer address does not specify how to specify an
Unspecified address. The table under address type seems to indicate that when
type-0, length MUST be 0, when type=1, length MUST be 4, and when type=2 length
MUST be 16, but these requirements are not stated explicitly in RFC2119
language.
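The type/length pairing described in point 3 could be checked as in the sketch below. The mapping is taken from the comment above (type 0 with length 0, type 1 with length 4, type 2 with length 16); reading types 1 and 2 as IPv4 and IPv6 is inferred from those lengths, and the function name is hypothetical.

```python
# Address type -> required address length in octets, as read from the
# comment on Figure 6. Labels in the comments are inferred, not quoted.
EXPECTED_LENGTH = {
    0: 0,   # Unspecified
    1: 4,   # assumed IPv4
    2: 16,  # assumed IPv6
}


def remote_peer_address_is_valid(addr_type: int, address: bytes) -> bool:
    """Return True when the address length matches its declared type."""
    expected = EXPECTED_LENGTH.get(addr_type)
    if expected is None:
        return False  # unknown address type
    return len(address) == expected
```

Stating the three pairings as explicit requirements would let every implementation apply the same check.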
4) in 4.1.1, "the origination point of a new tunnel" is the tunnel always new?
or is this the origination point of a tunnel that already exists? Aren't we
tracing existing tunnels? If so, I recommend "the origination point of a
tunnel". If this is a **new** tunnel just being created, can you describe when
this circumstance would occur? or does "new" mean it is new to the traceroute
operation? If this is what new means, can you describe how the traceroute
functionality on this node would know that?
5) in 4.1.1, "If the transit node does not know the address of the remote peer,
it MUST leave it as Unspecified." The term 'it' is ambiguous; does this mean the
address type must be set to 0 per 3.3.1.3? what if the node knows the address
type of the tunnel, but not the address, or knows the address type but chooses
to not reveal the address? could it specify the address type as IPv4 or IPv6 but
leave the address 0-length? and would this be useful for the ingress (operator)
to know?
6) in 4.1.1, the last paragraph starts with a conditional, but contains MUST
language, "The transit node SHOULD add 1 FEC Stack change sub-TLV of operation
type PUSH, per new tunnel being originated at the transit node." Is this
normative language only applicable to nodes that wish to hide the nature of the
tunnel? or should this start a new paragraph?
7) in 4.1.2, I might just be confused. You have multiple levels of return codes;
which return code is intended here - "Nodes C and D SHOULD set the Return Code
to "Label switched with FEC change" (Section 6.3) to indicate change in FEC
being traced."
8) in 4.1.2, the description following figure 8 would be easier to read, and
less ambiguous, if it used sentences rather than clauses. For example,
"Downstream information for node E when echo request contains RSVP-B as top of
FEC stack and an appropriate Return Code." If this means what I think it means,
it would be much easier to parse if it said "When the echo request contains
RSVP-B as top of FEC stack, then the response should contain Downstream
information for node E and an appropriate Return Code." But I can parse this
sentence in other ways to get different results, including "When the echo
request contains RSVP-B as top of FEC stack and an appropriate Return Code...,
Downstream information for node E " (which means that I determine whether to
include downstream data based on the return code in the request.)
The number of
conditions makes this section difficult to read, and writing in non-sentences
makes it unnecessarily more difficult.
in 3.2.1, s/the node needs to return a different Return Code/Return SubCode for each downstream. / the node needs to return a Return Code/Return SubCode for each downstream./ (I suspect that each return code may not need to be different) in 3.4, "The Downstream Mapping TLV has been deprecated." should this be "This document deprecates the Downstream Mapping TLV"? If another document specified the deprecation, please provide a reference.
draft-ietf-behave-ftp64
This is a well-written and well-motivated document.
#1) It's probably worth expanding ALG in the title. It's not expanded in the abstract and as a security guy I thought you were going to be talking about an Algorithm ;)
#2) Section 4 includes: As such, it is recommended to update FTP clients and servers as required for IPv6-to-IPv4 translation support where possible, to allow proper operation of the FTP protocol without the need for ALGs. r/recommended/RECOMMENDED?
#3) Section 5: missing right parenthesis: ([RFC4217]
#4) I think you need a normative reference for UTF-8.
draft-ietf-ccamp-asymm-bw-bidir-lsps-bis
I am moving from a discuss to a comment on the basis of the following proposal by Adrian:
1. Revise this I-D to keep the same code points as were used before.
2. Lou and I will undertake a review of the RSVP registries in order to sort them out and make sure this problem won't arise again.
3. We will produce a revision of RFC 3936 that updates the main registries to allow for experimentation and applies "what we know now" to partitioning the codespace. This will be done in a backward compatible way.
Original Discuss text: This document proposes no change to RFC5467 other than to change the codepoints. This will invalidate any existing implementations (should they exist), and hazard an application that uses the reclaimed codepoints. The motivation seems to be to free up the expert review codepoints used by RFC5467 for other experiments. If that is the only motivation, then it would seem safer and less disruptive to reassign three standards action codepoints to expert review and confirm the RFC5467 codepoints as now in use by this IETF standards track document.
Normally there'd be a section that highlights changes between 5467 and this draft. Are the only changes really those two listed in the last para of Section 1? What about the erratum on 5467 (I realize it's an editorial erratum and I think it's adopted, but it'd be better to call it out explicitly)?
draft-ietf-roll-of0
"It is important that devices deployed in a particular network or
environment use the same OF to build and operate DODAGs. If they do
not, it is likely that sub-optimal paths will be selected. "
Surely it's rather more serious than that. Incompatible OFs may cause loops
(look back 3 paragraphs), meaning that it is essential that the same OF be used
to ensure LF operation.
Nit "An implementation SHOULD allow to configure a... " Do you mean "An implementation SHOULD allow the operator to configure a..." ?
1. I would like to discuss the interoperability and deployability of
nodes in a RPL Instance that uses OF0. If I understand section 4.1
correctly, a node is free to compute step-of-rank using any method it
chooses. There is a suggestion that using link properties for the
computation is preferred, but the specific link properties and the
method of computation are not specified. Later, section 4.1 defines
two parameters, rank_factor and stretch_factor, that modify
step_of_rank to generate a "stretched step_of_rank". The latter value
is then multiplied by the MinHopRankIncrease to compute the node's
rank_increase.
My concern is the degrees of freedom provided to the implementation in
the computation of rank_increase. I'm hoping I can get an explanation
of some reason to believe that independent implementations, using
different methods to compute step_of_rank and potentially different
configured values for rank_factor and stretch_factor, will
interoperate successfully to form an operational network.
More specifically, if a simple metric that leads to a hop-count
computation for path lengths and route computation is known to yield
unusable performance, why is it not required that a node use some sort
of computation based on link characteristics? And, if the computation
of the step_factor is entirely up to the node, why are the rank_factor
and stretch_factor needed? Why not just leave the entire computation
to the implementation, with limitations that the resulting step_factor
lie in the range MINIMUM_STEP_OF_RANK and MAXIMUM_STEP_OF_RANK?
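The degrees of freedom in point 1 can be seen in a small sketch. The way rank_factor and stretch_factor combine with step_of_rank is an assumption made here for illustration (multiply, then add); the draft's exact formula is not quoted in this comment. The inclusive range check and the numeric values used with it are likewise illustrative placeholders.

```python
def rank_increase(step_of_rank: int,
                  rank_factor: int,
                  stretch_factor: int,
                  min_hop_rank_increase: int,
                  min_step: int,
                  max_step: int) -> int:
    """Compute a node's rank_increase from its locally chosen inputs."""
    # ASSUMPTION: this combination rule is illustrative only.
    stretched = rank_factor * step_of_rank + stretch_factor
    # ASSUMPTION: limits treated as inclusive bounds on the stretched value.
    if not (min_step <= stretched <= max_step):
        raise ValueError("stretched step_of_rank out of range")
    # Stated in the comment above: the stretched value is multiplied by
    # MinHopRankIncrease to give rank_increase.
    return stretched * min_hop_rank_increase
```

Two nodes on the same link that pick different step_of_rank methods or factors, for example `rank_increase(1, 1, 0, 256, 1, 9)` and `rank_increase(3, 2, 0, 256, 1, 9)`, advertise different rank increases even though both stay within the permitted range.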
2. I see that the WG last call was conducted on draft-ietf-roll-of0-05,
while draft-ietf-roll-of0-12 was submitted for publication and we are
reviewing draft-ietf-roll-of0-15. There is at least one major change
between the specification as reviewed in WG last call and the version
of the spec we are reviewing: OF0 is stated in draft-ietf-roll-of0-15
to be applicable to wired and wireless networks, while rev -05
explicitly constrains OF0 to wired networks. As a point of
clarification, I think what's really intended here is that an OF based
on a hop count is only applicable to a network composed of fixed cost
links, and will be sub-optimal for a network composed of variable cost
links.
I'd like to know if the WG considered this major technical change and
my conclusion in point 1 that OF0 may bypass the RPL specification
requirement for a consistent objective function computation throughout
a RPL Instance by allowing nodes to choose any method for computing
the rank increase for that node, with the only constraint being the
range of values for the rank increase.
3. I would also like to discuss the requirement for a
mandatory-to-implement OF, whether OF0 is that mandatory-to-implement
OF and where the requirement to implement OF0 is specified.
4. In section 7.1, why is support of the DODAG Configuration option
only a SHOULD?
An OF0 implementation SHOULD support the DODAG Configuration option
as specified in section 6.7.6 of [I-D.ietf-roll-rpl] and apply the
parameters contained therein.
5. This point is more editorial than completely technical, but I list
it as a Discuss because of the potential for confusion among
implementors. I found the use of "DODAG Version" a little confusing,
because it appears sometimes to mean one of the successive Versions of
one DODAG and at other times to mean a specific DODAG chosen from
several DODAGs. For example, from the Introduction:
RPL forms Directed Acyclic Graphs (DAGs) as collections of
Destination Oriented DAGs (DODAGs) within instances of the protocol.
Each instance is associated with a specialized Objective Function. A
DODAG is periodically reconstructed as a new DODAG Version to enable
a global reoptimization of the graph.
An instance of RPL running on a device uses an Objective Function to
help it determine which DODAG Version it should join. The OF is also
used by the RPL instance to select a number of routers within the
DODAG Version to serve as parents or as feasible successors.
In my opinion, the first use of "DODAG Version" refers to a successive
Version of one DODAG, while the second use refers to a selection of
one DODAG from many DODAGs. Would it be possible to just use "DODAG"
for the second intended use? Or am I confused about the intended
meanings?
6. In section 4.2.1, what does it mean to "validate a router"? Why
would a router that passes validation ("succeeded that validation
process") only be "preferable"?
1. I consider the following comment to be a technical rather than an editorial suggestion because of the redundancy and potential conflict between the text in the Introduction of this document and the contents of draft-ietf-roll-rpl-19. It would improve the document to edit the Introduction down to a paragraph that combines the following sentence from the first paragraph with the content from the last paragraph:
An Objective Function defines how a RPL node selects and optimizes routes within a RPL Instance based on the information objects available.
Editing out the text describing RPL would also presumably address Stewart's Discuss about the use of multiple OFs in one RPL Instance.
2. I don't understand this phrase from the last paragraph of the Introduction:
[...] OF0 enforces normalized values for the rank_increase of a normal link and its acceptable range
Do I have it right that a node uses OF0 to determine the rank_increase for a successor within the range:
MINIMUM_STEP_OF_RANK < stretched step_of_rank < MAXIMUM_STEP_OF_RANK
Unless I'm missing something, these limits on the stretched step_of_rank enforce (indirectly) a range on rank_increase, but don't normalize rank_increase by some normalizing factor.
3. Is there a reason to switch between "successor" and "parent" throughout the document? For example, the title of section 4.2 is "Feasible Successors Selection" and the title of the immediately following section 4.2.1 is "Selection Of The Preferred Parent". Are "successor" and "parent" synonymous in this context? Would it be asking for foolish consistency to choose one or the other?
4. In section 7.1, what is "build-time"?
At build-time [...]
Also in section 7.1, what is a "fixed constant" (as opposed to any other kind of "constant")? Why are overridden values (I assume these are overridden by some administrative method like CLI configuration or some management protocol) only applied at the next Version of the DODAG?
5. In section 3, how is "good enough" defined? From reading the rest of the spec, I don't see where any assessment of quality is applied to ensure that "good enough" is more than basic connectivity. I suggest just dropping "good enough".
I reviewed this document, but need to trust others with more background in routing to have reviewed it for its impact on forwarding. I reviewed the management considerations, and am pleased that they provided an information model to monitor the parameters in use. It would have been nice to have a mandatory-to-implement management protocol and data model to ensure interoperable management.
I now understand the history of the document between WGLC and now. However, there is scant little evidence of what happened in either the document writeup or the proto writeup. That would be useful in the future.
I have the same questions Ralph asks in his Discuss points
concerning interoperability and what is mandatory to implement.
As specified in -15, the variable behavior of of0
could be entirely determined by each element's (non-standard)
provisioning. With everything else being equal, swapping an
element from Vendor A with one from Vendor B can, and likely
will, result in substantially different network behavior, and
that's likely to be observable through the network's performance.
For someone to deploy this, they will have to ask each vendor
"What does your of0 actually do, and what knobs can I turn in it?".
If that's what the group expects the effect of this codepoint to be,
that interoperability consequence should be called out more clearly.
from the nits-checker:
== The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
does not include the phrase in its RFC 2119 key words list.
== Unused Reference: 'I-D.ietf-roll-security-framework' is defined on line
569, but no explicit reference was found in the text
Note that for the 1st one you can keep 'NOT RECOMMENDED' in the text you just
need to add it to the 2119 paragraph immediately after RECOMMENDED.
draft-ietf-mpls-mldp-recurs-fec
draft-ietf-softwire-dslite-radius-ext
The last paragraph of Section 3 is too strict. It will cause IPv6-only
clients to not work.
draft-ietf-softwire-dslite-radius-ext-04: The PROTO writeup says this draft "passes nits". However, running the ID nits tool on the draft yields an error related to a downref. Adrian already has a Discuss on it because the downref did not seem to be called out during the IETF LC. That Discuss needs to be resolved before moving this draft forward. Also, acronyms need to be expanded on their first use.
Notes from the dns-directorate review: o In the introduction, AFTR should be expanded once, before first use o I'm not too familiar with RADIUS folklore, so this might be obvious to others, but section 4 does not clearly state which format is used for DS-Lite-Tunnel-Name. Section 5 later states that "The data type of DS-Lite-Tunnel-Name is a string", but earlier it is suggested that DS-Lite-Tunnel-Name be fed with the data obtained through DHCP in OPTION_AFTR_NAME, which is clearly in DNS wire format. If this option uses text (presentation) format instead, it would need to say whether it's all ASCII (A-Label) or not (where it is unlikely anybody intended to use U-Labels). The picture at the top of page 9 suggests the DS-Lite-Tunnel-Name has a fixed length (of 6 octets, for that matter), so a more open ended graph as used in the RADIUS RFCs might be advised.
Thanks for addressing my Discuss and comment in the new revision
1) in 4.1,"This attribute MAY be used in Access-Accept packets as a hint to the
RADIUS server; for example if the NAS is pre-configured with a default tunnel
name, this name MAY be inserted in the attribute. "
Is the hint to the server
inserted in the Access-Accept packet (which comes FROM the server), or the
Access-Request (which is sent to the server) packet?
2) in 4.1, "If the NAS is
pre-provisioned with a default AFTR tunnel name and the AFTR tunnel name
received in Access-Accept is different from the configured default, then the
AFTR tunnel name received from the AAA server MUST overwrite the pre-configured
default on the NAS. " Why does this overwrite the preconfigured default? An
administrator may have pre-configured the default; so the RADIUS server is
asserting itself OVER the administrator? Apparently AAA is asserting itself as
the master of configuration OVER both DHCP and any human administrators. What
happens when the next person logs into the same NAS? Do they get redirected into
the same tunnel, even though the default would have been appropriate for them?
Operators have been pretty clear in the past, notably in the COPS-PR discussions:
automated provisioning should NOT overwrite manual configuration because the
operators may know something that an AAA server or policy server does not. For
example, an admin might modify the default tunnel because they are
troubleshooting a reported problem with a given default tunnel; if somebody
requests a session while the admin is troubleshooting, the AAA server could
change the default out from under the operator. That is a bad thing; you could
use a compromise, where there is a setting on the device (or a heuristic) that
says the configured default should NOT be changed by action of the AAA server.
We had a similar debate in ISMS, and determined that it could create security
vulnerabilities and create operations problems to overwrite an
administratively-set VACM configuration. See RFC6065 7.2.4.
Maybe there should be a mechanism for
an operator to determine that AAA modified the device configuration, and
possibly a mechanism to identify WHICH AAA server made the change.
At a minimum,
the security and operational considerations should be documented on this design
decision.
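The compromise suggested in point 2 might look like the following minimal sketch: a device-side flag that blocks AAA from changing the configured default, plus a record of which AAA server made any change that is allowed. The class, field, and method names are hypothetical and do not come from the draft or from any RADIUS implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TunnelConfig:
    """Hypothetical NAS-side holder for the default AFTR tunnel name."""
    default_tunnel_name: str
    locked_by_operator: bool = False  # assumed operator-controlled setting
    # Each entry: (aaa_server, old_name, new_name)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def apply_aaa_tunnel_name(self, new_name: str, aaa_server: str) -> bool:
        """Apply a name received from AAA; return True if the default changed."""
        if new_name == self.default_tunnel_name:
            return False  # nothing to change
        if self.locked_by_operator:
            return False  # manual configuration wins over AAA
        # Record who changed what, so an operator can trace it later.
        self.audit_log.append((aaa_server, self.default_tunnel_name, new_name))
        self.default_tunnel_name = new_name
        return True
```

An operator who is troubleshooting would set `locked_by_operator` first; without the lock, the audit entry at least identifies which AAA server modified the default.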
3) This is a DISCUSS-DISCUSS related to point 2 - I think AAA
proposals are drifting from **authorizing sessions** - REQUIRING specific
settings be used for a specific user session - into **provisioning sessions** -
MODIFYING the settings for functionality used in a specific user session - and
into configuration - modifying the DEVICE CONFIGURATION (such as defaults). This
could affect all future AAA sessions, but also other functionalities. The
device may have been purposefully configured to have settings different than the
session-specific setting required for a particular user. The configuration may
have been set by other means, such as DHCP, Netconf, SNMP, CLI, and so on.
Functions other than the service/session being authorized/provisioned by AAA may
have dependencies on the existing defaults or other non-session-specific
configuration.
Based on my experience with SNMP, Netconf, COPS-PR, and the
"Operators' World Tour", I think AAA-overwriting of a **non-session-specific**
device configuration is a mistake.
(I'm not thrilled with AAA expanding from
authorizing to provisioning, but as long as provisioning is session-specific, I
can live with that. But modifying device configuration, especially without
addressing the issues related to coexistence with SNMP/Netconf/DHCP and other
configuration protocols and their data models such as MIBs that might specify
very specific behaviors, is a serious problem.)
4) in 4.1, "The Change-of-
Authorization (CoA) message [RFC5176] can be used to modify the current
established DS-Lite tunnel." Should this be MUST be used, to ensure
interoperability? or maybe RECOMMENDED, with mention of possible alternative
approaches?
5) in 4.1, [...] "Upon receiving the new AFTR tunnel name the B4
MUST terminate the current DS-Lite tunnel and the B4 MUST establish a new DS-
LITE tunnel with specified AFTR."
This normative text (MUST terminate) is within
a paragraph that starts with CoA "can be used" (which is obviously optional). So
does the MUST only apply when CoA is used, i.e., the trigger is the receipt of
the CoA message not just a new name, or does the MUST apply regardless of the
method used to change the tunnel name? Should "Upon receiving" be the start of a
separate paragraph?
6) in 4.1, "The DS-Lite-Tunnel-Name RADIUS attribute and
MUST NOT appear more than once in a message." I cannot parse this sentence
(remove the 'and'?). If it is not just an extra 'and' then I don't understand
the MUST requirement.
7) in 6, I disagree this attribute has no security impact
beyond the basic RADIUS security considerations. Could a MITM send multiple
Access-Requests with different names, resulting in a denial-of-service attack?
Especially since this attribute forces all existing tunnels to be terminated and
re-established, this could generate a lot of traffic, a lot of processing
overhead, and a lot of interruption of users' work.
8) in 4.1, is left-to-right a traditional RADIUS way to describe network-order? would saying network-order be less subject to interpretation differences?
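One reading of point 6 is that the attribute may appear at most once per message. A minimal Python sketch of that reading follows; the list-of-pairs message representation and the helper name are assumptions made for illustration, not anything defined by the draft.

```python
from collections import Counter

# Attribute name as quoted in point 6 above.
DS_LITE_TUNNEL_NAME = "DS-Lite-Tunnel-Name"


def duplicated_attributes(attributes, single_instance=(DS_LITE_TUNNEL_NAME,)):
    """Return the single-instance attribute names that occur more than once.

    `attributes` is a list of (name, value) pairs in message order
    (a hypothetical representation chosen for this sketch).
    """
    counts = Counter(name for name, _ in attributes)
    return sorted(name for name in single_instance if counts[name] > 1)
```

A receiver following this reading would treat a non-empty result as a malformed message; what it should then do is exactly what the draft text leaves unclear.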
Updated DISCUSS - taking out one issue that was resolved in draft-05.
The DISCUSS and COMMENT are in part based on the reviews and discussions on the
AAA-Doctors list.
1. Figure 2 shows a RADIUS exchange with no NAS present, or user authentication
occurring.
RFC 5080 Section 2.1.1 describes the requirements for authorization-only
Access-Requests:
   Access-Request packets that contain a Service-Type attribute with the
   value Authorize Only (17) MUST contain a State attribute.
   Access-Request packets that contain a Service-Type attribute with value
   Call Check (10) SHOULD NOT contain a State attribute. Any other
   Access-Request packet that performs authorization checks MUST contain a
   State attribute. This last requirement often means that an Access-Accept
   needs to contain a State attribute, which can then be used in a later
   Access-Request that performs authorization checks.
The document does not describe the contents of the Access-Request in enough
detail to understand whether it is compliant with RFC 5080, 2865 or other RADIUS
protocol documents. So either this is a protocol violation, or the exchange
described is under-specified.
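The RFC 5080 text quoted in point 1 can be sketched as a small Python check. The dict-based request representation, the function name, and the `performs_authorization_check` flag are assumptions made for illustration; only the Service-Type values 17 and 10 come from the quoted text.

```python
AUTHORIZE_ONLY = 17  # Service-Type value named in the quoted RFC 5080 text
CALL_CHECK = 10      # Service-Type value named in the quoted RFC 5080 text


def state_attribute_problem(attrs, performs_authorization_check):
    """Return a description of the problem, or None if none is found.

    `attrs` maps attribute names to values (hypothetical representation).
    """
    service_type = attrs.get("Service-Type")
    has_state = "State" in attrs
    if service_type == AUTHORIZE_ONLY:
        return None if has_state else "Authorize Only request MUST contain State"
    if service_type == CALL_CHECK:
        return "Call Check request SHOULD NOT contain State" if has_state else None
    # Any other Access-Request that performs authorization checks.
    if performs_authorization_check and not has_state:
        return "authorization-check request MUST contain State"
    return None
```

Applying a check like this to the exchange in Figure 2 requires knowing which attributes the Access-Request carries, which is the detail the document does not give.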
2. RFC 5176 requires that session identification attributes not be used to
request authorization changes. I am not clear whether the DSLITE-Tunnel-Name
Attribute would be classified as a session identification attribute, but RFC
5176 does classify other IPv6-related configuration attributes (e.g. Framed-
IPv6-Prefix) as session-identification attributes which cannot be changed by
CoA-Request packets. The reasoning is that changing a host's address without
notifying the host is a bad idea, so that it is better to notify the host first
then initiate another Access-Request/Accept sequence than to send a CoA-Request.
   (Note 1) Where NAS or session identification attributes are included in
   Disconnect-Request or CoA-Request packets, they are used for
   identification purposes only. These attributes MUST NOT be used for
   purposes other than identification (e.g., within CoA-Request packets to
   request authorization changes).
1. The use of keywords in section 3 seems inconsistent. Why are the 'may' and 'shall' in the second paragraph of the section non-capitalized, while in the rest of the section the keywords are capitalized?
> This list may also contain the AFTR Tunnel Name. When the NAS receives a DHCPv6 client request containing the DS-Lite tunnel Option, the NAS shall use the name returned in the RADIUS DS-Lite-Tunnel-Name attribute to populate the DHCPv6 OPTION_AFTR_NAME option in the DHCPv6 reply message.
2. I support item #2 in David's DISCUSS about the need to document the operational considerations of the NAS configuration and device configuration changes.
3. Two of the AAA-Doctors commented in early reviews that it was not clear why the authors needed a new AVP when the RADIUS tunnel attributes (RFC2868) could probably be reused here. One of them even proposed an alternative based on RFC2868, but the authors argued that their approach was better. That may be true, but it would have been good to document the decision and explain why the alternative was rejected.
draft-ietf-msec-gdoi-update
The draft says it describes an "updated" version of GDOI. However, the draft seems to obsolete the previous spec, not to update it. In any case, this has already been captured in Adrian's discuss.
You resolved my Discuss. Thanks.
(1) Shouldn't DES and other algorithms be deprecated more obviously? E.g. 5.3.2.1 doesn't say not to do this. Same question for MD5 in SIG_HASH_MD5 etc. (2) Has anything been learned about authorizing GKCS's since rfc 3547? If so, wouldn't it be good to include something about that, even if it doesn't provide a fool-proof way to authorize a host as a GKCS? (Text like that would go nicely in 3.1 I think.)
draft-gutmann-cms-hmac-enc
draft-gutmann-cms-hmac-enc-05.txt
This draft, whose intended status is PS, has a downref to RFC 2898, which is an
Informational RFC. The IETF LC text below seems to have called this downref
incorrectly:
This specification contains one normative references to a proposed
standard: RFC 2898.
Do we need to re-run the IETF LC, or is this just considered a small error in
the IETF LC?
Sorry, added comment for wrong document.
I am balloting No Objection based on a light read and the support of the Sponsoring AD. I have several minor observations about the document.
---
I prefer Individual Submissions to have a third-party document shepherd as a sanity check on the IETF demand for the work. But that is not something to think about fixing at this time.
---
It would be helpful if some more of the acronyms were expanded on first use, even if they are familiar to the normal security technogeek. For example, you expand CEK but not KEK.
---
Using a fixed-length key rather than making it a user-selectable parameter is done for the same reason as AES' quantised key lengths: there's no benefit to allowing, say, 137-bit keys over basic 128- and 256-bit lengths, it adds unnecessary complexity, and if the lengths are user-defined then there'll always be someone who wants keys that go up to 12.
"keys that go up to 12" had no meaning to me! Lengths up to 12 words?
---
If there is any ambiguity over which key size should be used then it's recommended that the size be specified explicitly in the macAlgorithm AlgorithmIdentifier.
Is that "RECOMMENDED"?
(1) The AES-256 justification is counterproductive. People have computers
waste cycles this way; encouraging them on the basis of mistaken user
impressions is not a good approach.
(2) I assume the use of PBKDF2 is because you claim it's better supported. If so
say so. I've seen programmers use PB* functions because they know what a
password is when those are enormously wasteful in CPU. I'd be more comfortable
with something with no iteration count temptation. (Your own argument based on
users' flawed logic would imply that someone will want to use 1000 rather than
1 since that is 1000-times-better.) With PBKDF2, why not say that you MUST
use an iteration count of 1 unless the underlying key has low entropy in which
case you MUST use 1000 (or whatever)?
(3) A 3DES example when AES is what you defined is a bad idea.
Why not generate an AES-128 example?
(4) There is no MTI encryption alg (or if there is I missed it on the
plane where I read this). IMO AES-128 should be MTI - why am I wrong?
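The iteration-count suggestion in point (2) can be sketched in Python as follows. The function name, the `low_entropy` flag, and the choice of HMAC-SHA-1 as the PRF are assumptions made for illustration, and 1000 is the "or whatever" placeholder from the comment rather than a value taken from the draft.

```python
import hashlib


def derive_key(secret: bytes, salt: bytes, low_entropy: bool, length: int = 16) -> bytes:
    """Derive a key with PBKDF2, picking the iteration count as point (2) suggests.

    One iteration when the input already has high entropy, 1000 when it
    does not (for example, when it is a password).
    """
    iterations = 1000 if low_entropy else 1
    # HMAC-SHA-1 is an assumed PRF for this sketch.
    return hashlib.pbkdf2_hmac("sha1", secret, salt, iterations, dklen=length)
```

With only two permitted values the caller never chooses an iteration count directly, which removes the temptation the comment describes.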
(1) "extensively analysed" as a claim requires some references.
(2) "keys that go to 11" will not be understood by many readers; keeping the colourful antipodean twist on US phrasing is fine but you also need to make what you mean clear to the non-native English reader.
(3) "various analyses" requires a reference.
(4) It's fine for an RFC author to write amusingly, and especially fine to do so when criticising stuff in emails (I really do appreciate that), but showing off is less desirable in RFC text since it leads to more confusion. IMO this text tries to show off too much. I can't be bothered to make so much of this as to do a blow-by-blow analysis, but the author might consider whether less of that would improve overall clarity. (And Peter - you can feel free to post an apparently outraged mail about how frusty people get within months of getting on the IESG:-)
(5) Most other CMS docs provide an ASN.1 module for those that like those. Is there a reason to not do that here?
Please consider the comments from the Gen-ART Review by Alexey Melnikov on 8 July 2011. The review can be found here: http://www.ietf.org/mail-archive/web/gen-art/current/msg06501.html
With tongue slightly in cheek, I note that this paragraph is a bit breezy: Providing an option for keys that go to 11 avoids potential user acceptance problems when someone notices that the authEnc pseudo-key has "only" 128 bits when they expect their AES keys to be 256 bits long. I'm not sure that a phrase from Spinal Tap belongs in an RFC, but at least if it's included can we put "go to 11" in scare quotes and include a citation for the movie? (As for "go up to 12", we all know that amplifiers don't have that setting!)
draft-ietf-tsvwg-rsvp-security-groupkeying
I have updated my Comment to include some points raised by Dimitri Papadimitriou in his Routing Directorate review. Please consider a new revision to pick up all of these points.
I support the publication of this document as a useful contribution to the problem of securing RSVP, RSVP-TE, and GMPLS signaling in a scalable manner. There are a few small points that you might like to look at as part of the general document polishing.
---
Nit: Section 2
The trust an RSVP node has to another RSVP node has an explicit and an implicit component. Explicitly the node trusts the other node to maintain the RSVP messages intact or confidential, depending on whether authentication or encryption (or both) is used. This means only that the message has not been altered or seen by another, non-trusted node. Implicitly each node trusts each other node with which it has a trust relationship established via the mechanisms here to adhere to the protocol specifications laid out by the various standards.
Another explicit element of this trust model is that the peer will have applied at least the same level of authentication with its next hop peer. That is, a message that is received by C from B using authentication, was originated at B, was received by B from A using authentication, or was triggered at B because of the receipt at B of a message from A that used authentication. Without this element of the trust model, the authentication between B and C is not particularly useful. I think your definition of security domain (in Section 1.1) could be used in new text to make this point.
---
WIBNI
I might have put the final paragraph of Section 2 into Section 3.2
---
Nit: Section 3.1
Most current RSVP authentication implementations support per interface RSVP keys. When the interface is point-to-point (and therefore an RSVP router has only a single RSVP neighbor on each interface), this is equivalent to per neighbor keys in the sense that a different key is used for each neighbor.
This slightly neglects the possibility of parallel interfaces to the same neighbor.
---
Wrinkle
In Figure 4, isn't it the case that one node, say R3, could be in both security domains and apply the group key on a per interface / per neighbor / per IP next hop basis?
---
WIBNI
Section 5.2 is largely dedicated to FRR. It might be nice to give FRR its own section and leave just the last two paragraphs (P2MP and hierarchy) in 5.2, perhaps renaming it "Other RSVP-TE and GMPLS Functions"
---
Smile
Section 10.1
A subverted node is defined here as an untrusted node, for example because an intruder has gained control over it.
If we knew that a subverted node was subverted, it would be untrusted. And that would be the end of the story. But the problem is that we *do* trust the subverted node!
---
I wouldn't like to have to argue in a court of law that some of the references are not normative (e.g. 2205).
===
Comments from Dimitri's review as follows:
A general comment: I've noticed that only "may" (in small letters) is used and the phrasing is sometimes not prescriptive; is it intentional (due to informational status)?
o) Introduction
. Suggest to add reference after "It is however often necessary to regularly change keys due to network operational requirements." or summarize the "operational requirements"
o) Section 2
. Spell out acronym GDOI
o) Section 3.1
. It may be appropriate to explain where/how the boundaries of RSVP security domains are defined.
. States "As discussed in the previous section, per neighbor and per interface keys can not be used in the presence of non-RSVP hops." while that Section states "This means that per interface and per neighbor keys cannot easily be used in the presence of non-RSVP routers on the path between senders and receivers." Is there a different interpretation one can assume from the term "easily"?
. Any specific scenario where a Single Group Key Server across security domains could be applicable? How do R3 and R4 determine they are in different domains? afaik the INTEGRITY object doesn't provide such information.
. States "Because a group key may be used to verify messages from different peers, monotonically increasing sequence number methods are not appropriate." Make clear what is actually recommended, e.g. Sequence Numbers Based on a Real Time Clock (Section 3.2 of RFC 2747)
o) Section 4.1
. States "Since it is not feasible to carry out a key change at the exact same time in communicating RSVP nodes, some grace period needs to be implemented during which an RSVP node will accept both the old and the new key." What is recommended/proposed so that this dual-key temporary state doesn't become permanent and one ends up with a list of keys?
. States "In this solution, a key server authenticates each of the RSVP nodes independently". Explain independently from what?
o) Section 5.1
Explain use of group keying for Notify messages between "domains" (referring to Fig.4).
o) Section 5.2
. The P2MP case deserves a bit more explanation. In particular, RFC 4875 mentions "An administration may wish to limit the domain over which P2MP TE tunnels can be established. This can be accomplished by setting filters on various ports to deny action on a RSVP path message with a SESSION object of type P2MP_LSP_IPv4 or P2MP_LSP_IPv6." Does the present document overrule "filter setting" or complement it when reaching security domain boundaries?
. For RFC 4206, clarify that non hop-by-hop signaling is used to signal LSP tunneled over an FA.
o) Section 6.1
Spell out the acronyms SPD and SPI
o) Section 6.3
Is the tunnel mode also excluded for hosts attached to the network by a non-RSVP host?
o) Section 7
What is actually referred to as "the network" in the following sentence: "If the end systems are part of the same security domain as the network itself, group keying can be extended to include the end systems." i.e. is the network the set of nodes enabling host attachment to the network?
o) Section 8
Mentions "From the viewpoint of securing end-to-end RSVP, ..." Is it securing RSVP end-to-end or edge-to-edge (edge being the end-points of the MPLS-TE tunnels, the edges of the PCN domain, etc.)?
(1) If RSVP supported asymmetric authentication of messages then a lot of this could be easier, or at least different, e.g. for inter-domain, public key based schemes might be better. It might be good to mention this possibility here since currently rfc 2747 mandates hmac-md5 and one would assume that that may be revised in future in which case a signature based scheme could be a good addition. So, I'd say that a paragraph mentioning that key management for a putative signature based RSVP integrity scheme could be relatively simple (compared to group key management) would be a fine addition. Or, if there are reasons why key management for a putative signature based RSVP authentication scheme would be equally hard, that'd be good to add as well. (2) Last para of 10.1 - presumably group keying means a subverted node might be less easily detected/traced if it sends fake RSVP messages to a non-neighbour (maybe with a fake source address). Per-neighbour or interface keying would mean that the subverted node has to send the fake stuff to a nearby non-subverted node and will generally therefore be more easily traced.
1) Should the reference to 3547 be swapped out for a reference to draft-ietf-msec-gdoi-update? It's on the same telechat and draft-ietf-msec-gdoi-update obsoletes RFC 3547. They'd both be published at around the same time assuming all goes smoothly.
2) Expand GKS (group key server) in Section 3. It'd be best to explain what it is before Figure 2.
3) In Section 6.2, is it worth pointing out that an additional difference is that AH has some kind of algorithm agility while the INTEGRITY object mechanism does not?
draft-ietf-decade-survey
The draft references the DECADE problem statement draft for further discussions on use cases. Additionally, it might be useful to provide a brief description (e.g., a one-liner) of the prototypical use case for DECADE. For example the problem statement draft provides this example: "As a simple example, a peer of a P2P application may upload to other peers through its in-network storage, saving its usage of last-mile uplink bandwidth." In the section on Amazon S3, an additional example of a third-party service that uses it (and is very popular nowadays) would be Dropbox.
Minor editorial suggestions. In section 4.5.2: The content provider can access network edge servers and store content on them. Or edge servers can retrieve content from content providers. Combine these two sentences into one with a comma. In section 4.7: A key aspect of NDN is that router have the capability s/router/routers/
This is a "please discuss" DISCUSS ... I don't intend to block things past the
telechat, but just want to make sure we think about one thing with regards to
DECADE now that the documents are starting to come out.
I find the systemic use of "in-network storage" to be misleading and inaccurate
for DECADE, which is not generically about in-network storage but about
provider-supplied upstream-of-access-network storage.
I think the work that's been done in DECADE so far is on-track and I think the
other DECADE documents (problem statement and requirements) along with the
introduction text in this document actually describe the intended scope quite
well; it's just the specific (and heavily used) "in-network storage" that bugs
me ... if there's no better solution, I understand and will be happy to clear
during the telechat, but would appreciate folks to think about it a little and
see if they're similarly bothered.
I keep thinking of the 90s, when in-network storage meant NFS on a LAN of Sun
machines where "the network is the computer", but the DECADE work has nothing to
do with that. Since DECADE is *not* generically about storage within the
network, but totally about storage in specific places within the network,
provided by certain parties, it just seems like there has to be a better way to
describe it that will be more accurate.
Thanks for bearing with me ... I'm interested in hearing what the authors,
chairs, and IESG think.
Thanks for a readable and informative document. I have just a few very petty comments.
---
The Abstract says...
This document surveys deployed and experimental in-network storage systems and describes their applicability for DECADE.
This may be too terse. Replacing "DECADE" with the name of the working group doesn't really help. Can you spell out what you mean by "DECADE"?
---
Section 1
High-capacity and low-cost in-network storage devices introduces
s/introduces/introduce/
---
Deploying a CDN for publicly available content is expensive.
Is the price really a function of whether the content is publicly available?
---
Second, applications may benefit from explicit control of in-network storage, which P2P caches do not provided.
s/provided/provide/
---
4.4.2
CDMI-specific operations are specified in which data objects are embedded as fields inside of a JavaScript Object Notation (JSON) object, but the protocol also defines interfaces in which the contents of data objects can be written via simple HTTP GET/PUT operations.
I can't parse the start of the sentence.
---
Section 4.5
A Content Delivery Network (CDN) provides services that improve network performance by maximizing bandwidth, improving accessibility and maintaining correctness through content replication.
Technically they don't change the bandwidth. I'm not sure they maximize the utilized bandwidth either. But I'm being pedantic.
---
4.7
A key aspect of NDN is that router have the capability to cache the
s/router/routers/
---
4.12
In addition, making use of P2P caches do not require changes to P2P protocols and can be deployed transparently from clients.
s/do not/does not/
---
4.12
P2P caches operate similarly to web caches, in that they temporarily
Could use a forward pointer to 4.14
Please consider the comments from the Gen-ART Review by Kathleen Moriarty on 18-July-2011. The review can be found here: http://www.ietf.org/mail-archive/web/gen-art/current/msg06526.html
draft-ietf-opsawg-oam-overview
The following is an edited version of the comments received
from the Routing Directorate review.
Summary:
There are significant concerns that need to be addressed
before publication.
The document needs to clearly identify the target
audience. Since a document written as a tutorial for a beginner
has different requirements from one written for a subject
matter expert, this clarification is important in setting
expectations for the depth and precision of the text. A
tutorial document for the beginner would be most welcome
considering the extent of OAM discussions that have taken
place in the IETF and it is assumed by the reviewer that
this is the intent of the document.
To that end the document needs to -
Include a “Historical Background” section that goes
beyond the single sentence in Section 1 (“OAM was originally
used in the world of telephony, and has been adopted in packet
based networks”)
Provide a clear view of OAM functionality and its relationship
to various “planes” of networking (data plane, control plane,
management plane). In particular, the importance of
fate-sharing of OAM and user traffic flows in packet networks
should be explained.
Explicitly map the ideas, terms and methods that have been
adopted from technologies owned by ITU-T and/or IEEE to
IETF-owned technologies. If such a mapping is not possible,
it should be explicitly stated.
Explain in a neutral way points of contention regarding
various OAM-related issues.
The draft as written is a partial annotated list of
references to IETF and non-IETF protocols and mechanisms that
deal with certain aspects of OAM in IP, IP/MPLS, MPLS-TP
and Ethernet networks. The draft does not describe the underlying
reasons for selecting particular protocols for description.
It is not clear why the now obsolete ITU-T Y.1711 is considered
in detail. The reviewer proposed giving consideration to I.610
as a protocol, although I am not sufficiently familiar with
I.610 to determine its relevance. It should however be examined.
Similarly it may be useful to introduce the reader to E-LMI
(defined by MEF).
In terms of MPLS-TP, why is there no discussion of MPLS-TP
fault management OAM (draft-ietf-mpls-tp-fault-05)?
There are a number of readability issues that arise from the
terms and concepts taken from the referenced documents
having different meanings in these documents. E.g., in Section 4.1
the draft states that ICMP ping provides “connectivity
verification for Internet Protocol”. However, in Section 3.2.4
the draft says that “connectivity verification function allows
an MP to check whether it is connected to a peer MP or not”.
Since MPs are not mentioned with regard to ICMP, it is not
clear whether “connectivity verification” means the same thing
in these two cases.
In some cases the text is detailed beyond the needs of the
beginner, whilst other important concepts are not detailed
sufficiently, for example:
- The OWAMP TCP port information is not needed, whilst the IPPM
- In Section 3.2.3 the draft defines the term “Maintenance
Entity” (ME), whilst “Maintenance Entity Group” (MEG), a.k.a.
“Maintenance Association” (MA), is only defined by reference
- In Section 4.5.2 the draft mentions security aspects of
IPPM protocols. However, these aspects are not even
mentioned in Section 4.2, which discusses BFD.
The document therefore needs another pass to ensure
consistency of detail.
Major Issues:
The concepts of data plane, control plane and management
plane are not well explored in the draft and need to be
explained in their OAM context.
=======
The relationship between OAM functionality and network
management as presented in the draft is unclear.
For example
a. (Section 1) Other aspects associated with the OAM
acronym, such as management, are outside the scope of
this document <<Management is out of scope>>
b. (Section 4.6.4) The FDI function is used by
an LSR to report a defect to affected client layers,
allowing them to suppress alarms about this defect
<< Alarms are arguably part of management >>
c. (Section 4.7.2) When the ETH-CC function detects
a defect, it reports one of the following defect conditions:
i. Loss of continuity (LOC): Occurs when
no CCM messages have been received from a peer MEP during
a period of 3.5 times the configured transmission period
iii. Unexpected period: Occurs when the transmission
period field in the CCM does not match the expected
transmission period value << Since transmission period
field in ETH-CC is defined by management, this defect
reports a management issue>>
d. (Section 4.7.6) The Alarm Indication Signal
indicates that a MEG should suppress alarms about
a defect condition at a lower MEG level, i.e., since
a defect has occurred in a lower hierarchy in the
network, it should not be reported by the current node
<<Alarms’ suppression again…>>
e. (Section 4.7.9) The Y.1731 standard defines the
frame format for Automatic Protection Switching
frames. The protection switching operations are defined
in other ITU-T standards. <<Whether PS is part of
OAM seems to depend on which SDO is considering the
problem and this needs to be made clear to the reader>>
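The two ETH-CC defect conditions quoted in item c can be sketched in Python to show where management enters. The function names and the use of plain numeric timestamps in seconds are assumptions made for illustration; only the 3.5 multiplier and the period comparison come from the quoted text.

```python
LOC_MULTIPLIER = 3.5  # "3.5 times the configured transmission period"


def loss_of_continuity(last_ccm_time, now, configured_period):
    """True when no CCM has arrived for at least 3.5 configured periods."""
    return (now - last_ccm_time) >= LOC_MULTIPLIER * configured_period


def unexpected_period(received_period, expected_period):
    """True when the period field in a CCM differs from the expected value."""
    return received_period != expected_period
```

Both checks take a configured value as input, which illustrates the reviewer's point that these defects cannot be evaluated without state set by management.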
3. OAM in connectionless vs. connection-oriented networks:
a. (2a) above suggests that OAM is applicable only
to connection-oriented networks (if you do not have
connections, connection problems do not exist by definition)
b. At the same time, the draft discusses ICMP Ping
(Section 4.1) operating in connectionless IP networks,
and Ethernet OAM (Sections 4.7 and 4.8) operating
in connectionless Ethernet networks.
The authors should define the scope of OAM explicitly
and clearly - and then remove the sections dealing with
protocols and mechanisms that happen to be out of this
scope. In particular, they should explain the relationship of
each specific defect to a specific networking plane.
MEs, MPs, MEPs and MIPs
Caveat: It may well be that the problem is not with the
draft but with the concept itself (or at least with the
attempts to extend it to IP, IP/MPLS and MPLS-TP networks)
Consider the following statements:
1. (Section 3.2.2) A Maintenance Entity (ME)
is a point-to-point relationship between two Maintenance
Points (MP). The connectivity between these Maintenance
Points is managed and monitored by the OAM protocol.
A pair of MPs engaged in an ME are connected by a Communication Link
2. (Section 3.2.3) A Maintenance Point (MP) is a
functional entity that is defined at a node in the
network, and either initiates or reacts to OAM messages.
A Maintenance End Point (MEP) is one of the end points of an ME,
and can initiate OAM messages and respond to them. A Maintenance
Intermediate Point (MIP) is an intermediate point between two MEPs,
that does not initiate OAM frames, but is able to respond to
OAM frames that are destined to it, and to forward others.
3. (Section 3.2.3) The 802.1ag defines a finer distinction
between Up MPs and Down MPs. An MP is a bridge
interface, that is monitored by an OAM protocol…
4. (Section 4.1) ICMP provides a connectivity
verification function for the Internet Protocol… ICMP is
also used in Traceroute for path discovery.
An OAM beginner would not be able to answer the following
questions:
1. Can a communication link exist without any
MPs on it?
2. Suppose that I have defined a P2P bidirectional
communication link with two MEPs forming an ME. What
would happen to this ME if I add a MIP between the two MEPs?
3. What is the relationship (if any) between MEPs
and interfaces? Or is it just something specific to Ethernet bridges?
4. Does a MIP really forward OAM frames that
are not destined to it?
5. Operation of ICMP Ping does not require creation of
MPs. How does it provide a connectivity verification function for IP?
The authors need to remove conflicting definitions, to fix typos
(e.g., the definition of ME would be less problematic if it referred
to a pair of MEPs and not to a pair of MPs) and inaccurate statements
(in IP, IP/MPLS and MPLS-TP MIPs (as a component) do NOT forward
OAM packets that are not destined to them – but they do
that in Ethernet OAM).
Minor Issues:
Connectivity Check vs. Continuity Check
The draft mainly uses the term “Continuity Check”. However,
in some places the term “Connectivity Check” is used as well, e.g.:
1. (Section 4.12) A key element in some of the OAM standards
that are analyzed in this document is the continuity check. It is
thus important to present a more detailed comparison of the
connectivity check mechanisms defined in OAM standards.
2. (Section 4.3) LSP Ping extends the basic ICMP Ping
operation (of data-plane connectivity and continuity check)…
Please look at the use of the terms and ensure they are
applied consistently.
Caveat: Similar inconsistency in IEEE 802.1ag (but not in ITU-T Y.1731).
Continuity Check vs. Connectivity Verification
In Section 3.2.4. the draft refers to RFC 5860 as the ultimate
source of information about the difference between Continuity
Check and Connectivity Verification. Looking up RFC 5860 (Section 2.2.3),
I’ve learned that connectivity verification is a function that
allows an End Point to find out whether it is connected to a specific
End Point(s) by means of an expected PW, LSP or Section. At the same
time, the draft says (in the same Section 3.2.4) that “A connectivity
verification function allows an MP to check whether it is connected
to a peer MP or not”. The omitted words from RFC 5860 “by means of…”
make such a definition unclear; also it is unclear whether End Points
(of Section, LSP or PW) which, presumably, are MEPs, can be
extended to be MEPs or MIPs (the draft uses the term MPs).
It is also not clear whether the draft considers LSP Ping (see
Section 4.3.) functionality “to verify data-plane vs. control-plane
consistency for a Forwarding Equivalence Class (FEC)” as related to
Connectivity Verification. This is especially strange since the
draft also states (in the same section) that “LSP Ping extends the
basic ICMP Ping operation” while Section 4.1 states that “ICMP
provides a connectivity verification function for the Internet Protocol”.
Another problem is the statement (in Section 4.2.3) that “BFD
Echo provides a connectivity verification function”, especially
since draft-ietf-mpls-tp-cc-cv-rdi-05 in Section 3.5 expands
the format of the BFD control packets in order to provide CV function,
while BFD Echo is not even mentioned in this document. It might be
worth noting that we are not considering BFD Echo mode for MPLS-TP.
Finally, the draft does not explain whether there is any
correlation between the defects detected by the continuity
check and those detected by connectivity verification
(Section 4.10.3.1 looks a logical place for this).
Inaccurate Representation of IEEE 802.1ag
In Section 3.2.3 of the draft there is the following text:
“The 802.1ag defines a finer distinction between Up MPs and Down
MPs. An MP is a bridge interface, that is monitored by an OAM
protocol either in the direction facing the network, or in the
direction facing the bridge. A Down MP is an MP that receives
OAM packets from, and transmits them to the direction of the
network. An Up MP receives OAM packets from, and transmits
them to the direction of the bridging entity”.
However IEEE 802.1ag states (see Section 22.1.3 of that
document ) that: “All Up MEPs belonging to MAs that are attached
to specific VIDs are placed between the Frame filtering entity
(8.6.3) and the Port filtering entities (8.6.1, 8.6.2, and 8.6.4).
Separately for each VLAN, there can be from zero to eight Up
MEPs, ordered by increasing MD Level, from Frame filtering
towards Port filtering”.
That seems to imply that 802.1ag MEPs are NOT bridge interfaces
(since there can be multiple MEPs per VLAN and multiple
VLANs per bridge interface).
Defects, Faults and Failures
In Section 3.2.5 the draft discusses the terms Defect, Fault
and Failure. However, these terms seem to apply to the
“communication link”; that term needs to be clarified to
indicate that this is a data plane entity, or the term data
plane used in its place.
At the same time, “Unexpected Period” and “Unexpected MEP”
are mentioned as defects detected by ETH-CC in Section 4.7.2
even if, to the best of my understanding, these conditions
are side effects of mis-configuration i.e., a management plane problem.
VCCV: An OAM Mechanism or a Control Channel?
In Section 4.4. the draft states that VCCV “provides
end-to-end fault detection and diagnostics for PWs”.
This seems to indicate that VCCV is an OAM mechanism/protocol.
However, later in the same section it states that “The
VCCV switching function provides a control channel associated
with each PW… and allows sending OAM packets in-band with PW data”.
And on the next line it explains that “VCCV currently supports
the following OAM mechanisms: ICMP Ping, LSP Ping, and BFD”
(which are all mentioned as OAM mechanisms providing
continuity check and/or connectivity verification in the draft).
So it remains completely unclear whether VCCV is an OAM
mechanism or just a channel for separating user data from OAM flows.
The issue here may well be historic because VCCV predates
the modern ACH mechanism. This should be clarified in the text.
MEs, MEGs and MEG levels
The draft explicitly defines a Maintenance Entity (ME) in
Section 3.2.2, but defers to MPLS-TP OAM Framework for
the definition of the Maintenance Entity Group (MEG). The
text defining ME in the draft differs from that in the
MPLS-TP OAM Framework document (see
http://datatracker.ietf.org/doc/draft-ietf-mpls-tp-oam-framework/?include_text=1,
Section 2.2). At the same time, it resembles the definition
of ME in Section 3.1 of this document.
MEG level is mentioned a couple of times in the draft,
but the only explanation given (in Section 4.7.2) is
“The MEG level is a 3-bit number that defines the level
of hierarchy of the MEG”; and this seems to be the only
text in the draft that deals with MEG hierarchy. A more
detailed description should be provided.
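To illustrate what a 3-bit level field implies, here is a minimal sketch. It is not taken from the draft. It assumes the common CFM header layout used by IEEE 802.1ag and ITU-T Y.1731, in which the level occupies the three most significant bits of the first octet and the version occupies the remaining five. The function name is hypothetical.

```python
from typing import Tuple


def split_cfm_first_octet(octet: int) -> Tuple[int, int]:
    """Split the first octet of a CFM-style PDU into (level, version).

    Assumed layout (not stated in the draft under review):
    bits 7..5 carry the MEG/MD level, bits 4..0 carry the version.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    level = octet >> 5        # three most significant bits, so 0..7
    version = octet & 0x1F    # five least significant bits
    return level, version


# A 3-bit field can express exactly eight levels (0 through 7).
NUMBER_OF_LEVELS = 2 ** 3
```

Eight possible levels is consistent with the 802.1ag text quoted earlier, which allows from zero to eight Up MEPs per VLAN ordered by MD Level. The sketch says nothing about how levels nest or how a MEP treats PDUs from other levels, which is the part the draft still needs to describe.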
Differences between Approaches to Packet/Frame Loss Measurement
There is no description of the fundamental difference
between two approaches to measuring packet loss –
that of the IPPM WG (based on counting synthetic packets)
and that of Y.1731 (based on counting the user packets),
even if both are mentioned in the draft. MPLS-TP BTW
provides a tool for doing loss measurement and notes
that the instrumentation technique is independent of
the method of making the measurement.
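The difference between the two approaches can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the exact procedure of IPPM or Y.1731. The function and parameter names are hypothetical, and counter wrap-around is ignored.

```python
def synthetic_loss_ratio(probes_sent: int, probes_received: int) -> float:
    """Loss estimated from injected test packets only.

    The result describes what happened to the probes, which is taken
    as a sample of what user traffic experiences on the same path.
    """
    if probes_sent <= 0:
        raise ValueError("probes_sent must be positive")
    if not 0 <= probes_received <= probes_sent:
        raise ValueError("probes_received must be in 0..probes_sent")
    return (probes_sent - probes_received) / probes_sent


def counter_based_loss(tx_prev: int, tx_now: int,
                       rx_prev: int, rx_now: int) -> int:
    """Loss derived from counters of actual user packets.

    tx_* are the sender's transmit counter at two sampling instants,
    rx_* are the receiver's receive counter at the same two instants.
    The difference of the two deltas is the number of user packets
    lost in that interval.
    """
    return (tx_now - tx_prev) - (rx_now - rx_prev)
```

The first function can only estimate loss from a sample of probes, while the second counts the user packets themselves, which is why the draft should explain the distinction rather than list both as equivalent tools.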
Unidirectional/Bidirectional OAM vs. One-way/Two-way OAM
Both pairs of terms are used in the draft (One-way/Two-way in
Section 4.5.1, Unidirectional in Section 4.12, Bidirectional in
Section 4.2.2). Neither the terms nor their equivalence are
explained in the draft.
In Section 4.10.3.1: “Continuity Check and Connectivity Verification
(CC-V) are OAM operations generally used in tandem, and compliment
each other.” Probably should be “complement”?
“There are a few differences between the two standards in terms of
terminology”: do you mean “There are a few differences in
terminology between the two standards”?
Minor editorial suggestions... In section 3.2.5, the word "intermittently" doesn't seem right. Perhaps "interchangeably"? I was OK with this sentence in section 3.2.5: The terms Failure, Fault, and Defect are intermittently used in the standards, [...] until I read in the next paragraph that ITU-T differentiates among the three terms. Perhaps the quoted sentence should specify which standards? Also in the title of 3.2.6?
IPPM has defined other metrics that aren't mentioned here (e.g. duplication and reordering) ... is there a reason why those aren't included? It was also unclear if psamp, netflow, and ipfix were excluded for a reason.
In general, Informational documents are unlikely to do harm to the
Internet and so do not so easily attract Discusses. However, I find
a number of errors of omission that mean that the document may
result in confusion. You may be able to fix these by clarifying the
scope of the document, or by adding text.
---
I don't think your definition of OAM is tight enough to limit the
discussion to only the mechanisms that you describe. You refer to OAM
being scoped as:
this document refers to OAM in
the context of monitoring communication entities, e.g., nodes, paths,
physical links, or logical links.
And you define:
Operations, Administration, and Maintenance (OAM) is a general term
that refers to a toolset that can be used for fault detection and
localization, and for performance measurement.
You also talk about
OAM mechanisms are used in various
layers in the protocol stack, and are applied to a variety of
different protocols.
With these definitions:
- a control plane module is possibly a communication entity
- a node component (e.g., a CPU) is possibly a communication entity
- a protocol keep-alive for an out-of-band protocol is possibly an OAM
mechanism
- a protocol keep-alive for an in-band protocol is definitely an OAM
mechanism
- LoS and LoL mechanisms in optical networks are OAM
- Control plane error propagation and fault isolation mechanisms are OAM
- Management plane tools and protocols are OAM (notwithstanding that
you say:
Other aspects associated with the
OAM acronym, such as management, are outside the scope of this
document.
...because management tools may be used for fault detection and
localization, and for performance measurement.
You appear to have limited yourself to monitoring layer-3 and below
forwarding plane components of packet/frame networks. Please take the
time to scope this document more clearly.
---
You should include a discussion of multicast ping and multicast
traceroute (draft-ietf-mboned-ssmping and draft-ietf-mboned-mtrace-v2)
---
Section 4.2.1
You cite [MPLS-TP Ping BFD] as an example and give it as a reference,
but this work has been retired by the MPLS working group, and the
relevant material moved to another document.
The Ballot Text write-up seems to be missing the Technical Summary.
---
I'm nervous of a document that makes a comparative analysis of OAM
mechanisms developed in another SDO without seeking input from that
SDO.
---
idnits warns about the unnecessary 2119 boilerplate and the unresolved
references. There is no reason for an I-D to reach this stage with
these warnings. Please clean up before passing to the RFC Editor.
---
You say:
o ICMP Echo request, also known as Ping, as defined in [ICMPv4], and
[ICMPv6]. ICMP Ping is a very simple and basic mechanism in
failure diagnosis, and is not traditionally associated with OAM,
"Traditionally" gives me an image of my great grandfather hand-crafting
packets from kiln-dried apple wood.
You might want to find out which tools are most commonly used by network
operators to diagnose their networks. According to that research and
your definition of OAM, you will possibly find that ICMP Ping is very
much associated with OAM.
---
Odd that Section 1 calls out MPLS-TP and RFC 5860, but does not call out
RFCs 4377 and 4378.
---
Table 1 seems confused about whether it needs to make citations (in
square brackets). It does not need to state "work in progress" for
I-Ds that are referenced and marked as such in the references section.
---
Table 1 seems to be missing some of the references used in the text.
For example for p2mp LSP ping. Can you do a cross-check with the text?
Actually, the table seems a bit mixed. Some protocols are listed, while
in other areas you just list the requirements and frameworks.
---
Did you consider discussing performance metrics at other layers as
part of the diagnostic toolset? You certainly seem open to OAM at
"various layers." Have a look at draft-ietf-pmol-metrics-framework
and maybe think about RFC 6076.
---
Section 3.1
Add ACH, ETH, FEC, GAL, LDP, LOC, LOCV, MC, MTU, UC
LSP is a Label Switched Path
I thought the 'M' in ME and MIP stood for MEG
---
Section 3.2.6
The table shows "System" for BFD Maintenance Point Terminology. It is
not clear to me what that word means.
---
Section 4.12
|BFD |BFD |Negotiat|UC |My Discr| Control Detection Time |
| |Control|ed durin| |iminator| Expired |
"My Discriminator"? Who are you?
---
I should have liked Section 5 to have included a discussion of the
security considerations of OAM in general, and the security provisions
available for the various OAM mechanisms discussed.
---
Should you include RFC 4950?
---
NEW COMMENT
I wonder if you need to also consider draft-ietf-trill-rbridge-channel
Following the Gen-ART Review by Francis Dupont on 23-Jul-2011,
there was discussion that led me to expect changes to this
document where traceroute is discussed. The authors have not
made the changes yet.
Section 2 should be removed. This document does not use 2119 keywords, and the [KEYWORDS] citation is missing in References anyway. And now, some snark and sarcasm for the amusement of my fellow ADs and anyone else who cares: <sarcasm>My ballot notwithstanding, I hereby object to the fact that this document (a) defines OAM and (b) does not normatively reference RFC 6291/BCP 161. (*snort*)</sarcasm>
1. The use of the term "localization" in the Abstract is potentially confusing,
since localization in application protocols refers to presenting textual strings
that are appropriate for a given locale. Perhaps the term "isolation" might be
more appropriate?
2. This paragraph is confusing:
o IP Performance Metrics (IPPM) is a working group in the IETF that
defined common metrics for performance measurement, as well as a
protocol for measuring delay and packet loss in IP networks.
Alternative protocols for performance measurement are defined, for
example, in MPLS-TP OAM [MPLS-TP OAM], and in Ethernet OAM [ITU-T
Y.1731].
As far as I can see, MPLS-TP OAM and Ethernet OAM were not developed in the
IETF's IPPM WG; I suggest moving the second sentence to a separate paragraph.
I expect this one will be easily cleared (author has agreed to add some
information in the security considerations):
Tal: I think we can add a general description (1-2 paragraphs) of the security
threats in OAM protocols, and list the security mechanisms defined in the
protocols discussed in this document.
secdir reviewer (Paul Hoffman): A short list in the Security Considerations
section saying briefly what each OAM protocol currently has for security (even
if it is "well, um, none") should be fine for a "catalog" document such as this.
draft-shiomoto-ccamp-switch-programming
Please consider the comments from the Gen-ART Review by Ben Campbell on 14-July-2011. The review can be found here: http://www.ietf.org/mail-archive/web/gen-art/current/msg06524.html
draft-burgin-ipsec-suiteb-profile
I am balloting No Objection after a quick read and based on the support of the Responsible AD
(1) I'd like to be reassured that nothing here requires implementers to
add some suite-B-specific, but non interoperable code to a node not
trying to be suite-B conformant but otherwise doing all the right
algorithms at the right sizes. If we had that problem then suite-B
would no longer be a simple profile but would become something close to
a national algorithm. (In terms of the non-interoperable aspects that
would then exist.)
(2) Saying you MUST NOT use e.g. RSA is wrong. I think you need to
qualify that by saying "when operating in suite-B mode" or something
like that. The same applies elsewhere. I don't think we want to
encourage suite-B-only implementations but rather implementations that
can be operated in a suite-B-conformant mode so I think some generic
text saying that anything in this entire document only applies "when
operating in suite-B conformant mode" or something like that would fix
this.
(3) The SP* references look normative to me not informative.
(4) The IPR declaration refers to a "Standard" so I've no idea if
it's relevant for this document or not.
(1) MUST is used before section 2
(2) What does "having" an X.509 cert mean for interop? I think you
want to say use somewhere.
(3) Saying "using the curves with foo" is a little unclear - maybe say
"using the curves with foo specified in bar section baz" would be
clearer.
(4) "appear in the literature with different names" maybe give
references
(5) 4.2 says "each system MUST specify" but systems don't specify,
specifications do. Suggest rewording.
(I expect this will be cleared during the call)
Is it common practice to publish documents of this sort (with normative
requirements for IPSec) as "Informational"? It seems rather odd.
draft-law-rfc4869bis
Assuming the appropriate IANA Expert Review has been done.
My discuss point about the IPR declaration for draft-burgin-ipsec-suiteb-profile also applies here. It's probably ok for this to be a comment though.
draft-forte-lost-extensions
Just an observation - I found the example in section 5.2 a little
confusing, because it appears to conflate "user location" with "within
distance" in Figure 3. I would have thought the user location would
be just the "O" under "User" and the circular shape ("*" border) to
represent the "within distance" shape. If the user location really is
represented by the circular shape because of uncertainty, wouldn't the
"within distance" shape then be a larger circular shape?
Later on, I see that the "within distance" is represented by setting
the uncertainty in the user's location. Seems like an odd overloading
but it's workable now that I understand it.
I have no objection to the publication of this document, but I am surprised that there is no concept of reachability included in this work. Think bridges over rivers, borders without holding a passport, long and winding roads.
Strictly editorial: Section 5 is generally too verbose with unnecessary verbiage and unnecessary examples. This is a protocol specification, not an academic paper. It should be shortened.
I was back and forth on DISCUSS for the following, but I trust you will take the following into account and adjust as is reasonable.
5.4 - "Limit" is the wrong semantic for this element. "Limit" implies only a limited number, but this has the secondary meaning of an *order* of returned elements (that is, by distance), something not required by LoST. But I could see extensions having a different ordering (by travel time; by current wait time at the location; by price of product; etc.) where you would still want a limit, but not want the order returned to be by distance. I can also see a use for ordering by distance, but not having a limited number. I suggest introducing a different element for ordering of results independent of limiting the number.
5.5 says:
We introduce a new element, namely <serviceLocation>. The
<serviceLocation> element contains the location of a point of
service and SHOULD be used for all non-emergency services.
I don't understand the SHOULD there. I can see emergency cases (hospitals) where location is a good thing; I can see plenty of non-emergency cases (delivery services) where location is uninteresting. I see nothing harmed in interoperability by not providing location. There doesn't need to be a 2119 directive here.
6 - I don't understand why the extension SHOULD NOT be used for emergency services. Does RFC 5222 allow a LoST server to fail in the face of extensions it does not know?!? If so, I would agree with the SHOULD NOT here, but would think an update to 5222 is required /tout de suite/!
Here are suggestions regarding a few small points you might want to clarify or correct.
1. You might want to change "walk or drive" to "travel" (thus including bicycles, boats, horses, jetpacks, etc.).
2. You might want to note that neither "urn:service:food.pizza" nor "urn:service:local.pizza" has been registered with the IANA. (Also, is there a difference between those two services?)
3. Given that you are re-using the "xsd:boolean" datatype for the <region/> element, you might want to add an implementation note about the fact that W3C XML Schema has two different lexical representations for boolean: "1" or "true" vs. "0" or "false". [formerly a discuss topic, and I discussed it with the responsible AD...]
4. In Section 6, you might consider adding a few words to the effect that finding localized emergency services for purposes other than routing emergency sessions (e.g., fire station "safe sites") would need to use newly-defined service types other than "sos".
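The kind of implementation note suggested in point 3 could be backed by something like the following sketch. It is an illustration only: the function name is hypothetical, and it assumes the implementation receives the raw text of the boolean value and parses it itself rather than relying on a schema-aware library.

```python
def parse_xsd_boolean(text: str) -> bool:
    """Parse an xsd:boolean value, accepting both lexical forms.

    W3C XML Schema allows "true" or "1" for true and "false" or "0"
    for false. Surrounding whitespace is stripped before comparison.
    Any other spelling (for example "TRUE" or "yes") is rejected.
    """
    value = text.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError("not a valid xsd:boolean: %r" % text)
```

A receiver that compares the value only against the string "true" would misread a sender that emits "1", which is the interoperability risk the note would warn about.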
#1) Section 8: Isn't the framework in 5222 not 5582? 5582 is "Location-to-URL Mapping Architecture and Framework" and 5222 is the actual LoST protocol/architecture.
#2) Section 8: It's probably worth adding to the end of the 1st paragraph that these exchanges all normally happen over TLS (i.e., here's how it works and here's how it's protected).
#3) As noted in the secdir review, it might be worth noting somewhere that, though out of scope, the information is likely more volatile in a commercial context, since the issue probably does not arise (at least as frequently) in the RFC 5222 context.
#4) Also noted in the secdir review: it might be helpful to add that a server can adjust a request in order to provide service. Does there need to be some means for the server to indicate to the client that there may be some additional related responses that can't be retrieved using the provided request? If not, it seems like there may be some blind spots for some queries.
draft-simpson-isis-ppp-unique
This document proposes an extension to the ISIS protocol to allow an
intermediate system to automatically select a system identifier, to
automatically detect a system identifier conflict, and to negotiate a new
system identifier in the event of a conflict.
The document has been reviewed by a number of ISIS experts all of whom have
raised various issues with the document. The discussion and resolution of these
issues needs to take place within an IETF working group that is competent to
deal with the ISIS protocol. Ideally this would be the ISIS working group,
although with an appropriate change to title and document scope and
applicability text it might be dealt with in the TRILL WG with review in the
ISIS WG. Details of the email threads and reviews will be made available to the
ISE on request.
I have discussed the matter with the ISIS Working Group Chairs and they confirm
that they are not in receipt of any email from the author asking them to
consider the subject matter contained in the draft as an item for working group
discussion.
Given the above my recommended RFC5742 response from the IESG to the ISE is:
"The IESG has concluded that this document extends an IETF protocol
in a way that requires IETF review and should therefore not be
published without IETF review and IESG approval.
If the ISE decides to proceed with publication without the requested
redirection to the ISIS (or TRILL) WG, the IESG requests the
opportunity to add an IESG note to the document before publication."
ID Nits says the following:
-- Obsolete informational reference (is this intentional?): RFC 1220
(Obsoleted by RFC 1638)
-- Obsolete informational reference (is this intentional?): RFC 1717
(Obsoleted by RFC 1990)
ISIS WG seems like the right place for this document to be handled.
I see no reason not to support Stewart's 5742 action.