IESG Narrative Minutes
Narrative Minutes of the IESG Teleconference on 2013-03-28. These are not an official record of the meeting.
Narrative scribe: John Leslie and Susan Hares (The scribe was sometimes uncertain who was speaking.)
Corrections from: Barry
1. Administrivia
2. Protocol Actions
2.1 WG Submissions
2.1.1 New Items
Telechat:
Telechat:
Telechat:
Telechat:
2.1.2 Returning Items
2.2 Individual Submissions
2.2.1 New Items
2.2.2 Returning Items
3. Document Actions
3.1 WG Submissions
3.1.1 New Items
Telechat:
Telechat:
Telechat:
3.1.2 Returning Items
3.2 Individual Submissions Via AD
3.2.1 New Items
Telechat:
3.2.2 Returning Items
Telechat:
3.3 IRTF and Independent Submission Stream Documents
3.3.1 New Items
3.3.2 Returning Items
1257 EDT break
1302 EDT back
4. Working Group Actions
4.1 WG Creation
4.1.1 Proposed for IETF Review
4.1.2 Proposed for Approval
4.2 WG Rechartering
4.2.1 Under evaluation for IETF Review
4.2.2 Proposed for Approval
5. IAB News We can use
6. Management Issues
Telechat:
Telechat:
Telechat:
Telechat:
Telechat:
Telechat:
Telechat:
Telechat:
Telechat:
7. Working Group News
1402 EDT Adjourned
(at 2013-03-28 07:30:08 PDT)
draft-ietf-tsvwg-byte-pkt-congest
""" Bit-congestible vs. Packet-congestible: If the load on a resource depends on the rate at which packets arrive, it is called packet-congestible. If the load depends on the rate at which bits arrive it is called bit-congestible. """ It might be helpful to note that these are not mutually exclusive, i.e., that there are devices that have both of these properties. For example, a queue in a buffer that drains to a route engine. In such cases, the congestible property of the overall resource is the more constrained of the individual resources.
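The "more constrained resource" point can be made concrete with a small sketch; all rates and limits below are invented for illustration:

```python
def constrained_dimension(pkts_per_s, bits_per_s, max_pkts_per_s, max_bits_per_s):
    """Return the binding constraint for a resource that is BOTH
    packet-congestible (a per-packet processing limit, e.g. a route engine)
    and bit-congestible (a bit-rate limit, e.g. the queue draining into it):
    it congests on whichever dimension is more heavily utilized."""
    pkt_util = pkts_per_s / max_pkts_per_s
    bit_util = bits_per_s / max_bits_per_s
    return "packet-congestible" if pkt_util > bit_util else "bit-congestible"

# Many small packets: the per-packet limit binds first.
print(constrained_dimension(90_000, 45_000_000, 100_000, 1_000_000_000))
# Fewer, larger packets: the bit-rate limit binds first.
print(constrained_dimension(10_000, 900_000_000, 100_000, 1_000_000_000))
```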
Thanks for a really well written document! - Is the discussion about Non-malicious transports on p12 really about transport protocol designers or about implementers? Seemed more like the latter to me, but the text reads more like the former. - 3.2, are HTTP GETs really small these days? Many are not (e.g. thanks to cookies and other crappy headers). I'd also wonder about SIP with all its headers too. That might be an argument for your approach here too - for some protocols, messages that are important for performance may start out nice and small, but success and inevitable crudifying might well make those larger over time, so preferring smaller packets in the n/w might mean you get worse over time for exactly those protocols where it's important to not get worse over time (the ones that succeed).
I have done a basic review of the discussion and the document; I'm awaiting a possible Gen-ART review of this document, and if one arrives, it may update my position.
I think Sean is on to something regarding status and the Updates metadata, but I think it's even more interesting than he indicates. 2309 was an IRTF document, which explains why it is Informational. If we have now gotten to the point that the recommendations in 2309 really are all requirements (except for the preferential treatment of small packets), doesn't that justify bringing that document into the IETF stream in addition to this one, or incorporating the recommendations of 2309 into this document? If this is only updating the research in 2309, then it is appropriate to keep this document Informational as well. But if this document deserves a "higher" status, it sounds like (at least the content of) 2309 does too. Please explain. I'd also like to hear a bit about the status itself. BCP is usually for recommendations of policy and operational guidelines. But the things in this document sound more like protocol recommendations, and things that we'd get implementation experience with over time, not things that should instantly jump to the "done" level of a BCP. Is there a reason this isn't going for the standards track instead of BCP? It sounds like protocol to me. Finally, if this is to be a BCP, I wonder if it should be folded into BCP 41 and not made a new BCP all by itself. I'd like to hear if you think this is part of the set of overall "Congestion Control Principles" or has some reason to stand alone. An explanation of how BCP 41 (RFC 2914), RFC 2309, and this document fit together would be quite useful.
2.1, 2.2, and 2.3: The recommendations are all given in the form of RECOMMENDEDs, SHOULDs, and SHOULD NOTs, yet there is no indication when these choices might *not* be taken, and in fact 2.1 makes it sound like there is "no other choice". Is there a reason these are not put in the form of MUST, etc.?
I feel the flames at my heels, but I have to ask or maybe state what I think is going on with the updates/intended status mix (likely no action required by the authors): This document is obsoleting a particular portion of RFC 2309 and then specifying a new BCP around the new recommendation? The rest of RFC 2309 is staying at informational? Is there harm in publishing this without including the updates header or maybe with obsoletes 2309 instead?
draft-ietf-forces-lfb-lib
Minor XML typo: OLD: <load library="BaseTypeLibrary", location="..."/> NEW: <load library="BaseTypeLibrary" location="..."/>
Apologies in advance, I'm very very ignorant of ForCES. (That'll be clear as you read below;-) - I think the security considerations section should reference 5811 which I think specifies the MTI security for forces. That specifies SCTP/IPsec, which made me wonder how much use that actually gets. - Is it possible to brick an FE by loading in instances that create an infinite loop? If so, that'd be worth a mention in the security considerations maybe. - I was surprised not to see mention of WiFi/802.11 here and wondered if/how wireless ports might differ from wired and whether or not that ought be represented somewhere in this document. (Is ForCES just not for such devices? That's ok if so, I just wondered and haven't read other ForCES RFCs.) - p6, "LFB Class and LFB Instance - LFBs are categorized by LFB Classes. An LFB Instance represents an LFB Class (or Type) existence. There may be multiple instances of the same LFB Class (or Type) in an FE. An LFB Class is represented by an LFB Class ID, and an LFB Instance is represented by an LFB Instance ID. As a result, an LFB Class ID associated with an LFB Instance ID uniquely specifies an LFB existence." Huh? What's an "existence"? I found this definition unclear fwiw but I think I get that each instance has a class and that the instance and class identifiers together provide a way to uniquely identify an instance. - p6, definition of "ForCES Protocol" says: "This document defines the specifications for this ForCES protocol." but then at the end of the definitions you say "The LFB Class Library is defined by this document." which seems odd. Is the first one a cut'n'paste error? - s3, intro, 1st para, 1st sentence: 5810 isn't the framework, 3746 is, or else something has the wrong title;-) - 4.4: I didn't check the schema, I do hope that someone's checked it vs. their code etc. - section 5, last para before 5.1: does this mean that nodes (FEs) MUST ignore XML elements/attributes that they don't understand/expect? 
If so, saying it that way would be better IMO. - 5.1.1.1 introduces the term "singleton" without defining it, and it's not clear to me what it means. (That might be because that term has a specific meaning in DTNs, e.g. as defined in RFC 4838, that's different but not entirely different, and that confuses me:-) - 5.1.2.1 last para: "This document does not go into the details of how this is implemented; the reader may refer to some relevant references." That doesn't seem very helpful. - 5.1.2.2, typo: "The default value for is 'false'" (occurs more than once) - 5.1.2.2, 3rd last para: how does the FE generate an error for something it doesn't implement if the rule is that FEs ignore unknown XML elements? I'm confused now between this and the text just before the start of 5.1. - 5.3.3, shouldn't you add references for ECMP and RPF? (Good to do even if you're not fully doing the work here, since you do talk about 'em a bit.) - 5.3.3, I'm not clear how e.g. you'd need another LFB to support ECMP but yet 5.3.3.1 says "An ECMP flag is defined in the LPM table to enable the LFB to support ECMP." - 5.4.1, says "Note that all metadata visible to the LFB need to be global and IANA controlled." I'm not sure what you mean by that.
A nit: Section 3.1., paragraph 11: > * Fragments datagrams when necessary to fit into the MTU of the > next network. It is not the 'next network' but rather the MTU of the next link/interface, as the NE instance cannot know the MTU restrictions of the whole network path.
I'm holding a Discuss to ensure that questions from IANA are answered. Mail on
this topic came from Pearl Liang on Feb 8th and March 18th. Let me know if I
missed a response; I may not have been on all lists at the time.
I would also suggest that the document use RFC 5226 terms for the private
ranges. For instance, in Section 10.2:
Metadata ID 0x80000000-0xFFFFFFFF
Metadata IDs in this range are reserved for vendor private
extensions and are the responsibility of individuals.
=>
Metadata ID 0x80000000-0xFFFFFFFF
Metadata IDs in this range are reserved for vendor private
extensions and are the responsibility of individuals, i.e.,
used according to the Private Use [RFC5226] policy.
A Gen-ART review by Meral included one small editorial suggestion (below). Have the authors considered this change? --- Section 11, the following sentence can be rewritten: "The ForCES protocol document [RFC5810] includes a comprehensive set of security mechanisms and which implementations are required to support, and which deployments can use to meet these needs. " Suggestion: "The ForCES protocol document [RFC5810] includes a comprehensive set of security mechanisms that implementations are required to support to meet these needs."
No objection, but it is a pity that the example is IPv4 and not IPv6 given that IPv4 is in sunset.
draft-ietf-mpls-gach-adv
The authors said they would address the GenArt review from Martin Thomson a week before the telechat, but that seems to have fallen through. My assessment is that the changes will not be major, so there is no need to defer AD evaluation. This Discuss is a place-holder for the work.
The GenArt review by Martin Thomson raised some points that still need to be discussed. <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>
First, thanks for section 6. It was a nice surprise and is nearly but not quite there:-) I hope these discuss points will be easily resolved. (1) 6.1: Why is there no mandatory to implement HMAC hash function? Without that, you won't get interop. I suggest making HMAC-SHA256 MTI. HMAC-SHA1 would also be ok. (I'd also suggest just deleting the other options unless you've a specific reason to want them here.) (2) 9: How do you know that application specific data won't require confidentiality? I can imagine keys being sent using this, as has been done in Diameter. If you don't want to define a confidentiality service then I suggest that you say that application or GAP TLVs that require confidentiality (such as cryptographic keys) MUST NOT be sent in clear using the GAP protocol because someone will write a draft that does just that otherwise. (3) 6.1: If using KARP key tables, how do the SendNotBefore, SendNotAfter and similar values from those tables map to the Timestamp+Lifetime used here? I think you need to say, as implementers will make different assumptions otherwise. (I don't care much which of the reasonable options you choose.) (4) 6.2: you need to say what is in the V for the auth data TLV as input to HMAC. Options are to include L zeros as the value or for the V octets to be missing on input to HMAC. See also the secdir review [1] which makes the same point. [1] http://www.ietf.org/mail-archive/web/secdir/current/msg03795.html
- Only specifying how a sender sends but not saying what a receiver does on receipt seems to result in a fairly incomplete protocol. I don't see that the usual "it's MPLS, we manage these devices" response works for that. This isn't a DISCUSS though since I guess I could accept you saying "the receiver just puts the TLV into a local database", but it seems fairly weak really. - I assume that there are no congestion issues that are likely to arise with the G-ACh due to the use of this protocol? - section 2: What does the first sentence of the last para of section 2 mean? I can't parse it anyway. - section 3, p6: "An error MAY be logged" - does that mean locally or create some new n/w traffic? If the latter, isn't that a potential DoS vector? - s3, p7: MI definition - how is MI expiry handled? You don't say (here). If that uses the Lifetime from the application specific body, then what if there are many of those? - s3, p8: Why is an editor's note still present? - s3, p8: Lifetime of 16 bits means max is ~18 hours. That seems oddly short for configuration data for presumably heavily managed links and liable to make implementation more complex since the same stuff will have to be sent more than once every day. - s3, p8: The "source-channel-application tuple" seems ambiguous to me - I guess you mean expires-at is the higher level Timestamp+Lifetime, right? Why not say that? - 4.1: can a source IP address represent a multicast group or does it have to be "unique" to the sender? - 4.2: this seems like a nice potential DoS vector too. (Say if calculation of some TLV for which I ask consumes resources for the sender of a GAP message containing that TLV.) - 4.2: So length=0 means "send all" but app-id=0x0000 means "who's there"? That seems clunky to me fwiw. I also don't get how the last para of 4.2 is fully specified. - 5.1, Saying the MI is set at the "sender's discretion" seems wrong, given you earlier said it has to be unique within some (not that well) defined scope. 
- 6.1: "secret string" - please don't say that as it might be interpreted to mean something human readable/memorable which results in far weaker security. Please say "secret value" or "secret octet string" and for bonus points say that if this is human memorable then it's basically pointless and that implementations MUST be able to handle essentially random binary values of the appropriate length. - 6.3: Please refer to RFC 2104 which is where HMAC is defined. Following the bad practices from other RFCs that repeat the definition is really not a good plan. - I'd encourage you to say that use of GAP authentication is RECOMMENDED. That way, it may get used. Otherwise, I suspect that it may be ignored, and we may be sorry about that later if it turns out to just be another fig-leaf.
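To illustrate the "secret value, not secret string" point, here is a minimal sketch using HMAC-SHA-256 (chosen only because the DISCUSS above suggests it as MTI; the function name and key sizes are illustrative, not from the draft):

```python
import hashlib
import hmac
import secrets

def gap_auth_tag(key: bytes, message: bytes) -> bytes:
    # HMAC-SHA-256 over the message, per RFC 2104 / FIPS 198.
    return hmac.new(key, message, hashlib.sha256).digest()

# An essentially random binary key of appropriate length -- what the
# comment argues implementations MUST be able to handle.
good_key = secrets.token_bytes(32)

# A human-memorable "secret string" is syntactically also just bytes,
# but has far too little entropy to resist guessing.
weak_key = b"password1"

msg = b"example GAP message body"
tag = gap_auth_tag(good_key, msg)
assert len(tag) == 32
# Verification should use a constant-time comparison.
assert hmac.compare_digest(tag, gap_auth_tag(good_key, msg))
```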
No general objection about this document except that the impact of congestion is not mentioned and not handled. 1) Congestion Control issues: First a question: Can the GACH take all of the capacity of the link or at least of the share of the link of the associated data channel? Second question: What is the capacity of such a GACH in general? Now the issues in detail: Section 2., paragraph 11: > The rate at which GAP messages are transmitted is at the discretion > of the sender, and may fluctuate over time as well as differ per > application. Each message contains, for each application it > describes, a lifetime that informs the receiver how long to wait > before discarding the data for that application. This is error prone, as the sender can send GAP messages at a high rate, potentially congest the link and also cause damage to other applications. Especially, if multiple applications send their GAP messages at the same time, for whatever reason. Section 5.1., paragraph 8: > In some cases an application may desire additional reliability for > the delivery of some of its data. When this is the case, the > transmitter MAY send several (for example three) instances of the > message in succession, separated by a delay appropriate to, or > specified by, the application. For example this procedure might be > invoked when sending a flush instruction following device reset. The > expectation is that the receiver will detect duplicate messages using > the MI. This looks like a potential issue when multiple applications intend to send bursts of data. I can see the goal to have some sort of reliability in GAP message delivery, but sending a burst of data (i.e., three in a row is a burst) will cause trouble at some point. In general, it might look easy to avoid any congestion control in this setting, but I can see that this lack of congestion control is putting the GACH at risk due to an overload situation. 
2) Timing issue: Section 3., paragraph 13: > Message Identifier (MI): Unique identifier of this message. A > sender MUST NOT re-use an MI over a given channel until the > message lifetime has expired. The sole purpose of this field is > duplicate detection in the event of a message burst (Section 5.1). Does this mean that the sender can immediately reuse the MI after the timer has expired at the sender's side? If yes, there is the chance for a race condition, as the receiver side might not yet have expired the timer. I would propose to add some safety margin, i.e., so that the MI is expired on both sides.
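The proposed safety margin can be sketched from the receiver's side; the MITracker class and the 5-second grace value are invented for illustration and are not from the draft:

```python
import time

class MITracker:
    """Receiver-side duplicate detection for Message Identifiers (MIs).

    Illustrative sketch of the safety-margin proposal: an MI is treated
    as "in use" for its lifetime PLUS a grace period, so a sender that
    reuses an MI the instant its own timer expires is still caught as a
    duplicate even if the receiver's view of the expiry lags slightly.
    """
    GRACE = 5.0  # assumed safety margin in seconds (value is invented)

    def __init__(self):
        self._seen = {}  # mi -> expiry time on the monotonic clock

    def is_duplicate(self, mi, lifetime):
        now = time.monotonic()
        # Purge MIs whose lifetime + grace period has passed.
        self._seen = {k: v for k, v in self._seen.items() if v > now}
        if mi in self._seen:
            return True
        self._seen[mi] = now + lifetime + self.GRACE
        return False
```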
There was a Gen-ART review that has not received a response. (As far as I can tell. But I may not have been on all lists or private e-mail, so I could have missed it.) Can the authors respond so that we can be sure the issues and editorial comments from Martin have been taken into account?
Review by Martin:
I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
Please resolve these comments along with any other Last Call comments
you may receive.
Document: draft-ietf-mpls-gach-adv-06
Reviewer: Martin Thomson
Review Date: 2013-02-15
IETF LC End Date: 2013-02-27
IESG Telechat date: (if known)
Summary: The document is almost ready for publication as proposed
standard. There is a major issue that should be easy to resolve.
Major issues:
Section 6.3 duplicates the description of HMAC provided in RFC 2104.
That is likely to cause a bug.
If you reference RFC 2104, then the only requirement is a clear
specification for what message is input to the HMAC. Currently, this
is buried. It is unclear if the input includes the G-ACh header
defined in RFC 5586 (it doesn't need to, but this needs to be made
explicit).
Filling the authentication part with 0x878FE1F3 seems like unnecessary busy
work, but it's harmless as long as the hash function produces a
multiple of 32 bits of output.
Minor issues:
To avoid forward compatibility issues, reserved fields should come
with guidance that says: "Implementations of this protocol version
MUST set reserved fields to all zero bits when sending and ignore any
value when receiving messages."
In Section 4.4, how does the duration interact with the lifetime?
What happens when the duration is longer than lifetime such that the
TLV is expunged before the duration is up?
Section 5.2 states:
[...] If one
of the received TLV objects has the same Type as a previously
received TLV then the data from the new object SHALL replace the data
associated with that Type unless the X specification dictates a
different behavior.
This leads to different retention characteristics depending on some
arbitrary application-specific requirements. It also complicates
implementations. Is there a strong motivation for the "unless the X
specification dictates a different behavior" part of this statement?
If this behaviour is desirable, a note regarding what happens to the
composed TLV when some of the values that contribute to it might
expire might be necessary.
Regarding the last paragraph of Section 6.3:
This also means that the use of hash functions with larger
output sizes will increase the size of the GAP message as
transmitted on the wire.
If you want to prevent hash truncation, then use 'MUST'. Personally,
I see no reason to do so. It's a good way to get smaller messages,
with a corresponding reduction in the strength of the assurance
provided.
Nits:
Section 3 could use some subheadings to aid navigation (and referencing).
Section 3 describes the size of fields only through ASCII-art. It's a
fairly simple thing to add a bit count to the description of each
field. That includes the reserved fields, which have no descriptions.
I like the text in the editor's note on page 8. Why is it not the
actual text already?
Sections 4.2 and 4.3 could probably use a note stating that
retention of these TLVs doesn't make any sense. These could be sent
with zero lifetime, except that if these are sent along with the
Source Address TLV, that's not possible... unless you send multiple
application data blocks for the same application. Is that possible?
This DISCUSS overlaps a fair bit with Stephen's, but I think there are a few
differences.
Section 6 needs a fair bit of re-writing, but I think the net effect will be to
make it simpler. In particular, Section 6.3 should be eliminated entirely,
since all it does is re-specify HMAC. Instead, Section 6.2 should simply
specify the inputs to HMAC, and that the output of HMAC is set as the
Authentication Data payload. Alternatively, just replace Section 6.3 with
something like the following:
NEW:
"""
The contents of the Authentication Data TLV are the result of computing the
HMAC value for the Authentication Keystring and a modified version of the GAP
message. In the below, we denote by HMAC(K,T) the HMAC function for the
appropriate Key ID.
MAC computation procedure:
Input: Message without Authentication Data TLV, Key ID
1. Add an Authentication Data TLV to the GAP message, with payload of equal
   length to the HMAC output, filled with the value 0x878FE1F3, repeated
   enough times to fill the payload.
2. Compute HMAC(K, Message), where K is the Authentication Key String for
   the chosen Key ID.
3. Change the payload of the Authentication Data TLV to be the output of
   the HMAC.
Output: GAP message with Authentication Data TLV
MAC verification procedure:
1. Copy the value of the Authentication Data TLV to a temporary variable AD.
2. Overwrite the payload of the Authentication Data TLV with the value
   0x878FE1F3, repeated enough times to fill the payload.
3. Compute HMAC(K, Message), where K is the Authentication Key String for
   the chosen Key ID.
4. Verify that the output of the HMAC is equal to the transmitted value AD. """
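As a sanity check, the two procedures in the proposed text can be sketched in code. This is illustrative only: the sketch models the Authentication Data payload as the tail of the message and assumes HMAC-SHA-256; a real implementation would include the TLV header and follow the draft's exact framing.

```python
import hashlib
import hmac

FILL = bytes.fromhex("878FE1F3")  # placeholder pattern from the draft

def _placeholder(length: int) -> bytes:
    # 0x878FE1F3 repeated enough times to fill the payload.
    return (FILL * (length // 4 + 1))[:length]

def mac_compute(message_without_tlv: bytes, key: bytes) -> bytes:
    out_len = hashlib.sha256().digest_size
    # Step 1: append an Authentication Data payload filled with the pattern.
    padded = message_without_tlv + _placeholder(out_len)
    # Step 2: HMAC over the whole padded message.
    tag = hmac.new(key, padded, hashlib.sha256).digest()
    # Step 3: the HMAC output replaces the placeholder payload on the wire.
    return message_without_tlv + tag

def mac_verify(message_with_tlv: bytes, key: bytes) -> bool:
    out_len = hashlib.sha256().digest_size
    body, ad = message_with_tlv[:-out_len], message_with_tlv[-out_len:]
    # Steps 1-2: restore the placeholder, recompute the HMAC.
    padded = body + _placeholder(out_len)
    expected = hmac.new(key, padded, hashlib.sha256).digest()
    # Step 4: constant-time comparison with the transmitted value.
    return hmac.compare_digest(ad, expected)
```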
"""
This is accomplished by attaching a GAP
Authentication TLV and including, in the Authentication Data field,
the output of a cryptographic hash function,
"""
The phrase "hash function" is consistently used where "message authentication
code" or "MAC" should be.
"""
These parameters are not sent over the wire; they are
assumed to be associated, on each node, with the Key ID by external
means, such as via explicit operator configuration or a separate key-
exchange protocol.
"""
The document should either state that a given sender/receiver pair MUST have at
most one Key ID between them, or adapt the format of the Authentication Data
TLV to include the Key ID. In any case, Key IDs should be scoped to a
sender/receiver pair, to prevent a compromised key from one association being
used to spoof messages on another channel.
"""
At
present, the following values are possible: HMAC-SHA-1, HMAC-SHA-
224, HMAC-SHA- 256, HMAC-SHA-384, and HMAC-SHA-512.
"""
Does this mean that implementations are REQUIRED to support all of these
algorithms?
I'm concerned that there are some indeterminacies related to combinations of
the GAP TLVs in Section 4. For example, what should I do if I get both
Suppress and Request TLVs? Should I reply and then shut up, or just shut up?
Is it dependent on the order? It seems like the simplest way to address this
might be to require that Application-0x0000 TLVs MUST be processed in order.
So (Suppress, Request) would result in silence, while (Request, Suppress) would
result in the response being sent, then nothing.
""" The payload of a GAP message is a collection of Type-Length-Value (TLV) objects, organized on a per-application basis """
The use of the word "application" in this section is a little confusing. It would be helpful to note that this is an "application" in the sense of "something that uses GAP", not in the sense of "application layer".
""" where the numbers are specific Type values. """
Which "numbers"? The ones following "TLV"? Suggested: "where the numbered values "TLV#" represent specific Type values."
""" Upon receiving the second message, the receiver retains B-TLV1 from the first message and adds B-TLV7 to its B-database. """
This is the first mention of a "database", much less a per-application one, and there is no mention of databases anywhere else in the document, nor in RFC 5586. It seems there's an architectural assumption that needs to be stated here.
""" Version: Protocol version, currently set to 0 """
Should this be "MUST be set to 0"?
""" For TLVs not carrying static data the Lifetime is of no significance. """
How does the recipient know when a TLV carries static data? Application defined? Lifetime > 0? Suggest, "For TLVs not carrying static data the Lifetime is of no significance. The sender of a GAP message indicates this by setting the Lifetime field to 0.".
""" Otherwise, the Value field specifies the applications for which an update is requested, in the form of a sequence of Application IDs: """
This might benefit from some clarity on responder behavior. MAY the responder send a different list of apps? MUST all requested apps be present in the response?
It would be nice to have a checksum-like mode for the Authentication Data, which provides some degree of integrity without a need for key management. If you define the above application, you could also have a TLV that follows the same procedures as Authentication Data, but with a fixed set of parameters (HMAC function and key).
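The retain/add/replace behaviour described for the receiver can be sketched as a simple per-application keyed store. Names and structure here are illustrative; the draft leaves the storage model implicit, which is precisely the point being raised:

```python
# Per-application TLV database: {app_id: {tlv_type: value}}.
db = {}

def receive(app_id, tlvs):
    """Merge a GAP message's TLVs into the per-application database."""
    app_db = db.setdefault(app_id, {})
    for tlv_type, value in tlvs:
        # A TLV with the same Type as a previously received one replaces
        # the stored data (absent application-specific rules).
        app_db[tlv_type] = value

# First message for application B carries TLV1.
receive("B", [(1, b"old")])
# Second message carries TLV7: TLV1 is retained, TLV7 is added.
receive("B", [(7, b"new")])
assert db["B"] == {1: b"old", 7: b"new"}
# A repeated Type replaces the earlier value.
receive("B", [(1, b"updated")])
assert db["B"][1] == b"updated"
```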
- DISCUSS-DISCUSS This document specifies procedures for an MPLS Label Switching Router (LSR) to advertise its capabilities and configuration parameters, or other application-specific information, to its peers over LSPs, pseudowires, and sections. ... The data elements distributed by the GAP are application-specific and, except for those associated with the GAP itself, are outside the scope of this document. An IANA registry is created to allow GAP applications to be defined as needed. What's the point of this draft if you don't register the applications that can use this protocol? I was expecting a registry with the different capabilities.
- DISCUSS Below is Scott Bradner's OPS Directorate review (part 1): This is an OPS-DIR review of draft-ietf-mpls-gach-adv-06. The ID describes using the MPLS Generic Associated Channel (G-ACh) to permit an LSP endpoint to inform other LSP endpoints of information of its choosing. The specification is not all that detailed and leaves most of the actual functionality to the applications that wish to exchange information over the G-ACh. Unlike most IDs this one includes a "Manageability Considerations" section - having such a section is a great concept - but this example is not all that useful. The section provides a number of MUSTs but does not provide justification for them - it would help the implementers and operators if there were some additional information provided to say why the MUSTs are MUSTs. This type of thing is also a problem (a more important one imo) in section 3, which includes SHALL NOT and MUST NOT but does not say why such a prohibition is important. The authors know how to provide such context - they do so in section 5.1 - and it would help if they did the same wherever they insist that something MUST be done or MUST NOT be done.
- OAM: Operations, Administration, and Maintenance. Don't we have a definition somewhere in a recent RFC?
- Below is Scott Bradner's OPS Directorate review (part 2): Aside from the above I do not see any operations-specific issues, but I think there is a problem in section 2. The ID says: "The GAP itself provides no fragmentation and reassembly mechanisms. In the event that an application wishes to send larger chunks of data via GAP messages than fall within the limits of packet size, it is the responsibility of the application to fragment its data accordingly." I would think that some mention should be made here about not sending overly large chunks of data because of the risk of congestion. A large chunk of information, fragmented into multiple non-congestion-responsive fragments, does not sound like a good idea and some warning would seem to be in order.
Agree with Richard's and Stephen's discusses.
draft-ietf-appsawg-acct-uri
I'm concerned that this appears to be a URN, not just any old URI, and (I think!) we expect URNs to uniquely identify things. But these identifiers really don't uniquely identify things: mellon@fugue.com, mellon@toccata.fugue.com, etc., are all in some sense the same identifier, even though the host part varies. Also, mellon@204.152.186.142 definitely identifies the exact same user as mellon@toccata.fugue.com. So this seems like neither a URL nor a URN. Really it's just a string with some syntax; why does it need to be a URI?
I have no objection to the publication of this document. I have one small observation that those with more clue than I might consider. It is probably a trivial concern, but isn't it the case that, where previously knowledge of access via an http scheme did not mean that I could guess the access via a mailto scheme, the use of a common 'acct' scheme removes a level of guesswork? While such obfuscation of access information can hardly be considered strong security, it did offer a tiny bit of additional security that is now removed.
Support Stephen's DISCUSS on this. There should be an algorithm for comparing "acct:" URIs. Otherwise, this looks very good.
The word "discussants" may not be familiar to some readers. "Participants in the discussion" might be a little clearer. """ Protocols that make use of ’acct’ URIs are responsible for defining security considerations related to such usage, e.g., the risks involved in dereferencing an ’acct’ URI and the authentication and authorization methods that could be used to control access to personally identifying information associated with a user’s account at a service. """ Alongside "authentication and authorization", it might be good to call out "confidentiality" as well.
I'll clear this when you tell me any one of the reasonable answers:-) How do you compare acct URIs? Are the host and userpart parts case sensitive? What about punycode? Is fred@example.com the same as %72red@example.com or %46red@example.com? Who defines that? I think maybe you need to do it here but if not then I think you need to say who does it where.
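For illustration, here is one plausible comparison rule sketched in code. It is emphatically not authoritative (the absence of a defined rule is exactly the open question): percent-decode first, then compare the host case-insensitively and the userpart case-sensitively. `%66` is used in the example because it decodes to `f` per RFC 3986.

```python
from urllib.parse import unquote

def acct_equal(a: str, b: str) -> bool:
    """One hypothetical equality rule for acct URIs (not from the draft):
    percent-decode, then compare the host part case-insensitively and
    the userpart case-sensitively."""
    user_a, host_a = unquote(a).rsplit("@", 1)
    user_b, host_b = unquote(b).rsplit("@", 1)
    return user_a == user_b and host_a.lower() == host_b.lower()

# %66 decodes to 'f', so these are equal under this rule.
assert acct_equal("acct:fred@Example.COM", "acct:%66red@example.com")
# Userparts compared case-sensitively under this rule.
assert not acct_equal("acct:Fred@example.com", "acct:fred@example.com")
```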
- abstract: when we say "user" are we restricting that to humans or can this be used for e.g. application servers or hosts or other things? - I was strongly reminded of kerberos principal names here, but they don't get a mention. Ought they? - Ought this document be held in the queue until webfinger is done, just in case the example at the end of section 3 (the "GET") turns out to need changing? (CHECK WF isn't a normative ref) - Does that abnf mean that %00@example.com is a valid acct URI? Hmmm... do you want that? I think you really ought to add a security consideration that such URIs are liable to cause s/w problems, e.g. the usual comparison flaws in access control and possibly NULL end-of-string errors. - just checking: it is ok to allow spaces in acct URI userparts, right? Would it be worth saying that, and that e.g. higher layer protocols will need to %-encode those to %20?
Before I open the can of worms on the webfinger list, I'd like to understand the reasoning behind this being a URI. I've sent this to not only the author, chairs, and IESG, but I've copied the URI Expert. As far as I can tell, this is *purely* an identifying URI; there is no concept that it will ever be "resolved" in a meaningful way. Normally, when we create such an identifier scheme, it's because we need the semantics of a particular kind of identifier in a place in a protocol that would otherwise take a URI. But looking at the webfinger document, the only place that this kind of URI will go is in the "subject" field of the JRD, which webfinger is defining and therefore needn't take a URI, or in the resource attribute of the URL to retrieve the webfinger information, which also needn't be a URI. Is there an instance in which this account identifier is going to be used in a context that *would* require a URI? I just don't see the point in creating a URI scheme whose only semantics seem to be, "It's an account of some sort that is only interesting to the thing you will hand this identifier to". Please explain why this is an appropriate use for a URI.
There's this RADEXT draft: http://datatracker.ietf.org/doc/draft-ietf-radext-nai/ Don't these two drafts do some of the same things? Is there some way one can leverage the other?
I support Stephen's discuss.
draft-ietf-mpls-tp-use-cases-and-design
- The "transport" (packet transport) vs. "transport" (TCP) terminology divergence screams out here, even from the moment one starts reading the ballot writeup. The routing and transport ADs might want to do something about that sometime. (The transport ADs get 2x votes as to what to do of course:-) - Once I did start reading, my marketing-BS detectors started firing big time. (See the specific comments below.) To be honest, I think this kind of post-facto justification marketing-spiel is mildly damaging to the RFC series. Not enough to object, but enough to be objectionable. - 1.1, I dare you to invent expansions of 5G, 6G, 7G etc.:-) And looking at this section, I don't believe all the acronyms expanded are actually used, e.g. "AIS" seems to occur exactly once in the document. There also seem to be acronyms that are expanded here but used exactly once, which seems like a waste of space. And lastly, expanding an acronym is not really the same as defining a term, and this section does the former and not the latter mostly. So overall, this section seems not so useful and more for form's sake or to make the document look "more serious", both of which seem undesirable to this reader. - 1.2, "many legacy transport devices are approaching end of life" - I'd love to see some references there (since I think that's an interesting fact, if it's a fact), but the phrase is also vague - do you mean the devices are reaching end of life or the product lines are? - 1.2, "MPLS family," "complements existing," "closes the gap," "efficient, reliable" and "emerged as the next generation transport technology of choice" all seem to me to be purely marketing terms and are all used within one paragraph here. Phrases in subsequent paragraphs outdo this one. IMO the best IETF marketing materials involve code and/or technical detail of existing deployment details and we're really not the best place from which to launch this kind of text. 
(It turned me off the rest of the document anyway fwiw, I'd never have read the whole thing if I didn't have to ballot on it.) - 2.1, "becoming inadequate," "too expensive to maintain," without references those are merely truth by blatant assertion. "a natural choice" also grates. - 2.1, "most Service Provider's core networks are MPLS enabled" seems to scream for a reference - 2.1, "it reduces OPEX" seems similarly without factual backing, "improves network efficiency" begs for a metric and "reduces end-to-end convergence time" is talking about something I'm sure, but it's not clear what. (At this point, I'm gonna stop calling out marketing text. There's too much and it'd take too long. And it turns out the only comments I would have had were negative "we don't do marketing" things anyway;-)
I'm fine with the current text wrt to "transport" (packet transport) vs. "transport" (TCP), as at least in my understanding the difference is clear in the text context of this particular draft.
I did not think Section 1.2 was very informative. A rewrite in a different style would have made it better. These last call comments from Russ Housley were also on Section 1.2: ---- I wonder if the direction of Section 1.2 can be revised to make it more of an engineering document. It currently says: In recent years, the urgency for moving from traditional transport technologies, such as SONET/SDH, TDM, and ATM, to new packet technologies has been rising. This is largely due to the fast growing demand for bandwidth, which has been fueled by the following factors: ... Please consider an approach that describes the reasons behind the transition from the network operator and network user perspectives: Traditional transport technologies include SONET/SDH, TDM, and ATM. There is a transition away from these transport technologies to new packet technologies. In addition to the ever increasing demand for bandwidth, the packet technologies offer these advantages: ... The fact that IP networks are being used for new applications and that the legacy devices are getting old does not motivate the transition to packet technologies. The advantages that packet technologies offer for these new applications is the thing that needs to be highlighted here, even if it is just a list of bullets. It seems like the only sentence that addresses this point in Section 1.2 is: "It streamlines the operation, reduces the overall complexity, and improves end-to-end convergence."
Since I haven't seen any reply from the AD, authors, or shepherd to the comments from Stephen and the 3 B's (Barry, Benoit, and Brian), I'll DISCUSS for a moment just to get an answer; I intend to move immediately to NO OBJ or ABSTAIN: This does sound like a lot of marketing fluff. *Why* did the WG think this was important to publish? Why was there "strong support in the WG"? Why is this "a reasonable contribution to the area of Internet engineering which it covers?"
I tend to agree with Stephen's general sentiment regarding some marketing in the document. However, as the write-up mentions "This document has a strong support in the working group and has been well reviewed.", I will record "no objection"
I have to agree with Stephen that this really comes off more as a glossy brochure than as an IETF document. But given the shepherd's contention that there's strong consensus in the working group to publish this, I'll say, "Mostly harmless."
I agree with Stephen's point that there doesn't seem to be any benefit in publishing this draft as an RFC.
The document says: "Since IP/MPLS is largely deployed in most SPs' networks, MPLS-TP and IP/MPLS Interworking is inevitable if not a reality. However, Interworking discussion is out of the scope of this document; it is for further study." However, notwithstanding the text in 3.4, the issues of running MPLS-TP OAM over a non-MPLS-TP network fragment (in peer-to-peer or client-server mode) are quite severe, since the non-MPLS-TP network may break the network invariants that MPLS-TP OAM assumed. This really needs to be called out in the text and explained to the reader.
I think this is harmless but it definitely felt like a glossy brochure. When do I get an MPLS-TP t-shirt?
draft-ietf-roll-security-threats
The authors are working on the SecDir review by Stephen Kent and expect to address the issues before the document appears on a telechat. http://www.ietf.org/mail-archive/web/secdir/current/msg03848.html
I think this will be a hugely useful RFC but does have a few things that could be improved. My discuss points are small, and perhaps easily fixed, but also I think important since they relate to why we're still here after a couple of years. (1) 6.4 says that consistency with BCP107 is a SHOULD. That's a MUST, (regardless of how you define MUST:-) Note that BCP107 does say when you do not need automated key management, and when you do. So if you really don't need AKM then you are still consistent with the BCP. It is not however ok to be inconsistent with BCP107. (2) 6.5.1 says: "This protocol-specific security mechanism SHOULD be made optional within the protocol allowing it to be invoked according to the given routing protocol and application domain and as selected by the system user. " If you mean optional-to-use, that's ok, but please say so. If you mean optional-to-implement, that's not clearly ok and I have more questions. Do you mean optional-to-use? If not, then BCP107 comes into play again for me unless the quoted text only relates to some mechanisms that counter Byzantine attacks, in which case you need to be clear that you're not saying that all integrity mechanisms are optional-to-implement.
I support Sean and Stewart's discuss points. Note that all my comments below are non-blocking. I'd be happy to chat about 'em and happier if we ended up agreeing but that's not needed for this document to progress as far as I'm concerned. Some general points first: - section 6: This says a lot of good things about what MUST/SHOULD/MAY be used to get better security. But it seems to say nothing about what MUST be implemented. I think you really need to make that distinction since the IETF is mainly concerned with the latter. If you don't make that distinction, then I fear we'll have to have that discussion on a case-by-case basis for each of the applicability statements and some of the value of this document will be lost. (But the case-by-case discussion is a viable way to do it too.) - I was a bit disappointed that this didn't go into more detail about RPL. The term "DAG" for example doesn't occur at all. However, it's still useful enough as-is I hope. - Cryptographic protocols and implementations can suffer from side-channel attacks; many of those require the attacker to send tens or hundreds of thousands of messages in order to recover a piece of plaintext. That probably needs to be noted somewhere, and maybe have a recommendation that LLN nodes ought to react if they see many, many errors in any cryptographic operation. That does however need to be balanced against the potential for DoS that might be created should a bad actor send many packets spoofing the source address of a victim node - we don't want the attacker in that case to be able to take the victim node off the network. It's just a hard problem, but one that this document probably ought to bring to the attention of RPL implementers who care about security. and some specifics.... 
- 3.2: Non-repudiation is so the wrong term, but don't feel bad - that's true for almost all uses of the term;-) I think you ought to get rid of that term entirely and maybe talk about evidence preservation or something (before you say it's not much use in an LLN because of storage constraints:-) - 3.2: Same topic. I don't think it's correct to say that you can't do logging. It is fair to say that you can't do comprehensive or complete logging, but I'd bet almost all devices that can do RPL could keep a medium sized circular log at least, and many could (e.g. via SD card) actually keep quite extensive logs, which in my experience compress excellently. I do agree that log retrieval via the LLN is not so likely, so if a device will never be visited or hasn't any other interface but the LLN, then it might not be worth logging, but you don't say that. Even if you need to get the log via a serial connection or JTAG it's still very valuable in my experience if you need to investigate some (mis)behaviour. So overall, I'd suggest you don't write off logging as much as you currently do. - 3.4: I'm not sure what "misappropriated" is meant to mean. Please clarify. I interpreted it as "leaked out" btw. - 3.4: I don't think "legitimacy" is a useful term here, and authenticity applies to messages more than participants. - 3.4: "faithfully" means what? - 3.4: I was surprised you didn't say that battery depletion attacks are a potential issue for LBRs in the last set of bullets here. - 4.1.1: sniffing is an odd phrase here, maybe better to say eavesdropping - 4.2.1: you say attacks can affect convergence of the routing protocol - that might be worth elaborating on, as it's not clear to this (security geek) reader whether slow/no-convergence is really an attack or (just) an example of inefficient routing. If you consider the more important audience for this draft to be routing folk, then maybe it's ok and they know already. 
But to put it another way, I'm not sure that causing convergence to be slower by a number of seconds is so much of a threat except in a tiny proportion of LLNs. - 4.3.1: There's a gap between the text and the bullet list; it's not clear whether or how "HELLO flood attacks and ACK spoofing" with RPL can lead to the threat described. I don't doubt you're right, but the text doesn't explain it in a way I can get. - 4.3.3: Couldn't the routing protocol offer resilience features that act to help mitigate DoS attacks at other (lower or higher) layers? I don't see why not in principle, but you appear to dismiss this. - 5.2.2: I don't get how comparison with historical data helps in practice, but I guess it could. Might be useful if there are some papers that could be referenced to help the reader figure out if they could use this kind of approach. - 5.3.1: I don't think "coerce" is right, maybe "convince" is better. - 5.3.1: Do ACKs really reduce the probability of attack success or rather reduce the impact? Seems more like the latter to me. - 5.3.4: Wouldn't limiting the capability of nodes to accept assertions about link or path quality be a counter to sinkhole attacks? That might be something to consider at protocol design time, but might also be something to consider when deploying, e.g. to disallow any peer from claiming to be an awful lot better than all other peers? - 5.3.5: It would have been great to see if there is any evidence as to the reality of wormhole attacks. I do sometimes wonder if these are more theoretical than practical (in terms of offering a real advantage to an adversary). - 6.1: s/improve vulnerability/reduce vulnerability/ - 6.1: It might be worth saying that one needs to be careful in deriving new confidentiality keys from new integrity keys. - 6.2: Just to make life more difficult, sorry;-) Logging of integrity check failures, if done at the wrong place, could actually make a timing side-channel attack more feasible. 
- 6.4: I don't agree that key management is not directly a ROLL security requirement. If you need some crypto then it's a requirement, full stop. The question that arises is whether (and if so how) to meet that requirement for RPL, or since we are where we are, for specific RPL applicability statements. - 6.5: When you say that conforming to the system's target level of security is a MUST, I think you're mixing up what's mandatory to implement vs. mandatory to use. Some security mechanisms might well be (and often are) MTI even though there are deployments where they do not need to be used (at present). And yes, that does lead to sometimes unused code paths, which is a pain. But being able to turn that on without e.g. updating firmware might still be the right answer. (And requiring a secured firmware update is probably more onerous anyway, though should also be done.)
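The sinkhole countermeasure floated in the 5.3.4 comment above (disallowing a peer from claiming to be vastly better than all others) could be sketched roughly as follows; the function name, metric convention, and threshold factor are all invented for illustration, not taken from the draft:

```python
# Hypothetical plausibility check on advertised routing metrics (sketch only).
# Convention assumed here: lower metric = better path.
def plausible(advertised_metric, peer_metrics, factor=4.0):
    """Reject a claim more than `factor` times better than the best
    metric heard from any other peer."""
    if not peer_metrics:
        return True  # nothing to compare against yet
    return advertised_metric * factor >= min(peer_metrics)

print(plausible(20, [50, 60, 45]))  # a believable claim
print(plausible(10, [50, 60, 45]))  # suspiciously good; rejected
```

Whether such a filter belongs in the protocol or in deployment policy is exactly the open question the comment raises.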
Thanks for a well written and important document. I particularly liked the fact that you treated byzantine attacks and key management as a part of the analysis. A couple of small comments, however: Section 6.5.1 (Architecture) misses a few aspects of the comparison of lower-layer vs. routing-protocol-layer security. Everything that you say about it is correct, but one thing you do not talk about is the problem of providing upper layers with exactly the security services they need. Some of the typical issues include the routing layer's need to access security information when it may be unaware of the exact security parameters, peer identifiers, and other aspects handled at a lower layer. Similarly, some applications may need security policies that are not easily expressed in crude lower-layer policy models (such as packet filter patterns). A lower-layer mechanism may be run hop-by-hop, whereas the routing layer mechanism may be end-to-end, or may even secure data objects rather than packets. The document also touches very little on deployment aspects of the potential security models. In my view, those are often the hardest to solve in a satisfactory manner. I liked the few words that Section 6.4 said about credential configuration, however.
History is sometimes amusing: I see that all of the capitalized words in section 6 were due to a comment made by Peter Saint Andre during evaluation of draft-ietf-roll-security-framework saying that things should be capitalized. Now you have ADs saying they shouldn't be. Isn't that wonderful? I don't think you need to go back to lowercase, but I think you do need to make one change: One way or the other, you need to remove the reference to 2119 at the top of the document. What you're doing in section 6 is not 2119 usage, and you've explained in section 6 what you are doing with those capitalized terms, so the reference to 2119 (and the template text at the top of the document) are just wrong. Yes, the idnits checker will complain. There is no requirement that your document pass idnits. You just tell the RFC Editor that this document is not using 2119 and you intended not to include the reference; end of story. I'm not going to bother putting a DISCUSS on the document for this unless Adrian really insists; I trust that you all can just take care of it. Now, that leaves Barry, Brian, and Joel's question about whether to capitalize it all. Like Barry, I'm not convinced it matters. You explained how you're using the terms, and I think that's fine. It may reduce confusion if you lowercase or choose other "magic" words, but I think that choice is entirely up to the WG.
This is one of those "I trust others" ballots: I trust the Sec ADs to be especially thorough on this document, and on a quick run-through I'm happy to let them handle the main issues. I also have to agree with Brian that using SHOULD and MAY in Section 6 in a way that varies from the meaning in RFC 2119 is, though it's explained, likely to lead to some level of confusion. That said, given what this document is describing and the fact that strict interpretation according to 2119 doesn't make sense, I think it will work well enough, and, in the end, I'm OK with it.
This is strictly a non-blocking comment... I am not a fan of the way that 2119 keywords are cannibalized in sections 6.x.
From a routing point of view I thought that this was well written, but I am concerned that the security reviewer had considerable comments which appear to be currently unresolved. Adrian points to: http://www.ietf.org/mail-archive/web/secdir/current/msg03848.html However this indirectly refers to a larger body of issues posted here: http://www.ietf.org/mail-archive/web/secdir/current/msg03712.html and the security reviewer only re-lists a subset of these in note 03848 noting that only the typos were addressed in the -01 version of the text.
It would be useful if the CIA model were defined (by value or by reference) much earlier in the document.
I have a whole bunch of problems with this draft, but understand that this is part of a way to get out of the current not-so-great situation. Many of my comments lined up with the secdir review, but since Stewart called it out in his discuss I'll support him. Here's some of my own: 0) This draft boils down to this paragraph if I'm not mistaken: A ROLL protocol MUST be made flexible by a design that offers the configuration facility so that the user (network administrator) can choose the security settings that match the application's needs. Furthermore, in the case of LLNs, that flexibility SHOULD extend to allowing the routing protocol security requirements to be met by measures applied at different protocol layers, provided the identified requirements are collectively met. I'm absolutely fine with the first sentence. I'm even okay with the second sentence; it gets done at the application layer all the time, but at the application layer they can all point to something that's all specified up and has MTI etc (think TLS). If we end up doing that here then something similar needs to end up happening. If use cases are so broad that they can't possibly pick an underlying security mechanism then you need to try again but with a smaller net. 1) s3.2: Adding "misuse" to the integrity definition strikes me as wrong. It's about determining whether data has changed. The examples used are about delayed or replayed messages, which seem to be better characterized as availability. 2) s3.2: How on earth is non-repudiation going to apply to these tiny little assets? I see how I, a person, can repudiate that I sent a message and I can see how you, as a person, can repudiate something else. Are two nodes going to be claiming one sent something while the other will say no I didn't? I hear all the time we can't do this and we can't do that because these are so constrained, but you're going to log and capture on-going messages - color me confused? 
I think you should say non-repudiation applies to people, not to device-to-device/automated communications. 4) s3.4: Note: this one is likely to be cleared after an email exchange or two because there's not much for the authors to do… This made my head pop, because I thought the only security defined in RFC 6550 has confidentiality baked in. So are we dumping the security solution in RPL? With regard to confidentiality, protecting the routing/topology information from eavesdropping or unauthorized exposure may be desirable in certain cases but is in itself less pertinent in general to the routing function.
1) s1/s3/3.1/3.2: The CIA model is one that's great, but in s3 your
security services list starts off with "proper authorization for actions" and
then talks about authentication next. Clearly authorization and authentication
need to be added in to s3.2 - no? Any chance of just changing to the 5
(confidentiality, integrity, authentication, access control, and
non-repudiation) listed in ISO 7498-2? You'd even get a stable international
reference. You don't need to have that awkward lead-in about non-repudiation
in s3.2, and you'd only need to add one: availability. If you're going to stick
with CIA then please add authorization and authentication in s3.2.
2) s3: After reading the first paragraph in s3 and comparing it to the output
of the IAB workshop (RFC1636) I'm left wondering if it's doing the same thing.
RFC 1636 says:
Securing the routing protocols seems to be a straightforward
engineering task. The workshop concluded the following.
a) All routing information exchanges should be
authenticated between neighboring routers.
b) The sources of all route information should be
authenticated.
c) Although authenticating the authority of an injector of
route information is feasible, authentication of
operations on that routing information (e.g.,
aggregation) requires further consideration.
S3 closes with:
In the case of routing security the focus is directed
towards the elements associated with the establishment and
maintenance of network connectivity.
The word focus kind of threw me, and later in s3.4 you list the fundamental
functions of a routing protocol. Is this the threats or the things you're trying to
secure? And, as Steve pointed out in the secdir review, most think of routing
security as ensuring the proper functioning of the routing protocol. And, you
say you're using definitions from RFC 4949, but not the one for "security".
Later in the paragraph, "authentication, and potentially integrity, and
confidentiality" kind of hangs there after authorization. Authentication,
integrity, and confidentiality of what? Also, if you're going to do
authentication I guess you might not need integrity, but I'd sure like to know
how that happens.
Maybe some tweaks could solve all this:
Routing security, in essence, is about ensuring the routing protocol operates
correctly [insert reference if there is one]. It entails measures to ensure ...
and then (injectors was the IAB's word and maybe we can come up with a better
one - or we define a new term in the definitions section)
State changes would thereby involve not only authorization of injector's
actions, authentication of injectors, authentication, integrity, and
potentially confidentiality of routing data, but also proper order of state
changes through timeliness, since seriously delayed state changes, such as
commands or updates of routing tables, may negatively impact system operation.
3) s3/3.1: in s3 you say:
A security assessment can
therefore begin with a focus on the assets or elements of information
that may be the target of the state changes and the access points in...
and in s3.1 you say:
An asset implies an important system component (including
information, process, or physical resource),
But asset is also defined in RFC 4949 as:
$ asset
(I) A system resource that is (a) required to be protected by an
information system's security policy, (b) intended to be protected
by a countermeasure, or (c) required for a system's mission.
resource is better than component in my mind (see definition in RFC 4949) so
how about the following in s3:
A security assessment can
therefore begin with a focus on the assets [RFC4949]
that may be the target of the state changes and the access points in…
and in s3.1:
An asset is an important system resource (including
information, process, or physical resource),
4) Please provide a pointer to the concept of "control plane". Would RFC 6192
do as a pointer or maybe add definitions in the definitions section:
control plane: Supports routing and management functions.
forwarding plane: Responsible for receiving a packet on an incoming interface,
performing a lookup to identify the packet's next hop and determine the best
outgoing interface towards the destination, and forwarding the packet out
through the appropriate outgoing interface.
Also, are we just talking about control plane security here? If that's true,
can we say so way, way sooner - like in the abstract/introduction?
abstract:
A systematic approach is used in defining and evaluating the security
threats for the control plane.
and then else where as appropriate
5) s3.1: r/components and mechanisms/assets, points of access, and process
6) s3.1: It's worth reiterating that the Figure is just about the control plane:
All of this is done on the control plane. (assuming it is)
7) s3.1: The "route generation" process is missing from the Asset/PoA lists -
shouldn't it be there? Also, there's a database but it's not listed; is that
part of "memory"? Isn't "node" without a qualifier missing too?
8) s3.2: I thought this was about the control plane? Why does the availability
paragraph talk about forwarding "services"?
9) s3.2: The last paragraph has to be more tightly coupled to ROLL. I'm afraid
of a food fight between the various routing security groups that are doing work
in this space because they're not all implementing enforcement mechanisms for
the services described in s3.2.
10) s3.3: Please add sleepy node to the definitions section: maybe:
sleepy node: A node that is not functional, but immediately available.
11) s3.3: What does this mean and why:
In addition, the choices of security mechanisms are more stringent.
12) s3.3: Highly directional traffic: Are you trying to say that the LBRs are
higher valued targets and warrant something different than the regular nodes?
13) s3.4: misappropriated seems like the wrong word based on later sections.
Masquerade seems like what you're trying to protect against, but that's covered
by the peer authentication process.
14) s3.4: How about:
In conjunction, it is necessary to be assured of
o the authenticity and legitimacy of the participants of the routing
neighbor discovery process;
NEW:
In conjunction, it is necessary to be assured that
o authorized peers authenticate themselves during the routing
neighbor discovery process;
15) s3.4: I think you could drop eavesdropping and just say unauthorized
exposure
16) s4: We need to either define the threat sources or point to RFC 4953.
There are really only two: outsiders and byzantine.
17) s4.1.1: r/sniffing (passive/passive wiretapping (reading
r/(evaluation/(e.g., evaluation
18) 4.2.2: identity misappropriation is really about peer authentication and
masquerading
19) s4.3.2: nice ascii art
20) s5.1.1: encryption does not counter deliberate exposure attacks.
21) s5.1.2: Passive wiretapping (“sniffing” to the authors) does not include
device compromise.
22) s5.1.3: TA is always a passive attack, so the description here “… may be
passive…” is wrong. Just strike may be passive.
23) s5.2.3: r/liveliness/liveness
24) s5.3.2: r/ energy store quicker/ energy store more quickly
25) s6: difficult to parse: The assessments and analysis in Section 4 examined
all areas of threats and attacks that could impact routing, and the
countermeasures presented in Section 5 were reached without confining the
consideration to means only available to routing.
26) s6.1: r/and improve vulnerability against other more direct attacks/and
reduce vulnerabilities relative to other attacks
27) s6.2: Can't do security but can keep logs ;)
28) s6.4: r/Security Key Management/Key Management
29) s6.5.1: r/diversified needs/diverse needs
30) I have to admit that I fully expect a consideration about sleepy nodes'
friend grumpy node. He's likely to cause all the problems.
While the disclaimer on 2119 language in Section 6 is fairly clear, I'd rather see it simply not used; e.g., switch to lower case and leave the disclaimer more or less intact.
draft-ietf-ospf-rfc3137bis
Thanks for this work which I support. I am balloting Yes, but hope you will address my comments. --- Can we take the opportunity to fix the Abstract, since an OSPF implementation is not synonymous with a router? OLD This document describes a backward-compatible technique that may be used by OSPF (Open Shortest Path First) implementations to advertise unavailability to forward transit traffic or to lower the preference level for the paths through such a router. NEW This document describes a backward-compatible technique that may be used by OSPF (Open Shortest Path First) implementations to advertise a router's unavailability to forward transit traffic, or to lower the preference level for the paths through such a router. END --- This document needs a section marked "Changes from RFC 3137" so that it is easy for people to find out what has changed. It may be that you intend the Appendix to capture this, and that the -00 version of the I-D was the same as the old RFC. But this is not clear. Additionally, such change logs are usually deleted by the RFC Editor. And, finally, at this stage we don't need to know the sequence of changes from revision to revision: just the summary of all changes from the RFC. --- Section 2.1 OSPFv3 [RFC5340] introduced additional options to provide similar, if not better, control of the forwarding topology; the R-bit provides a more granular indication of whether a router is active and should be used for transit traffic. You have used one of my pet phrases. I have been beaten up about it sufficiently that I am now sensitive to its use and the ambiguity it provides. Does "similar, if not better" mean the solution in 5340 is not better than (i.e., <=) the solution you describe in this document? Or does it mean that the solution in 5340 is at least as good as (i.e., >=) the solution you describe? When I use the phrase, I always mean the latter, but I can see why many non-native speakers (such as Americans ;-) get confused. Maybe just spell this out?
I was also wondering why this is informational rather than PS.
I would suggest that the document explain what has changed since the previous RFC. http://tools.ietf.org/rfcdiff?url1=rfc3137.txt&url2=draft-ietf-ospf-rfc3137bis-03.txt
Even with Stewart's explanation of why this is Informational, I'm still not convinced why this (or 3137) is not PS (or at this point IS). But I don't think it makes enough difference to change.
Thanks for addressing my DISCUSS-DISCUSS. For the record, I cut/paste it, along with the answer, below. Here is Fred Baker's feedback, from the OPS-directorate review (btw, no OPS-related concerns): If I would recommend the IESG do anything in particular with the draft, it would be to promote it to Proposed Standard; RFC 3137 is an Informational document that modifies a Proposed Standard, which seems strange. Operational experience with RFC 3137 has been that it works as advertised, and in IPv6 networks the R-bit approach is superior. Answer from Stewart Bryant: Benoit, I have discussed with the authors and the chairs and they confirm that Informational is the appropriate track. The OSPFv3 R-bit is part of the base OSPFv3 specification (RFC 5340) and hence all OSPFv3 routers should support it. The original motivation for making the OSPF Stub Router mechanism, as described in RFC 3137, an informational document is that it can be implemented purely as a local policy, with the situations under which the policy is applied and the duration of application out of scope. In other words, there is no change to the protocol or the behavior of OSPF routers other than the OSPF router temporarily designating itself a stub router. The OSPF WG has consistently applied the policy of documents requiring co-ordinated action being PS and local policy documents being Informational, and this draft conforms to that policy.
For bis drafts, it's often nice to list the differences between the old and new versions. Appendix A lists the changes from -00 to -03, but it's not clear whether it's going to be retained. Is it? And if so, could it list just the changes between 3137 and 3137bis? Should RFC 2178 be referenced as well? Curious why every other version of OSPFv2 is referenced except this one.
draft-laurie-pki-sunlight
It might be interesting to talk about what one of these logs would look like in practice—how big a log such as this would be for a typical CA, and what the size of a typical verification chain would be. In general I think this is a really interesting idea and I look forward to seeing some results from the experiment.
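For a rough sense of the scale being asked about: in a Merkle-tree log, an inclusion proof needs only about log2(N) hashes, so even a very large log yields a small verification chain. A quick sketch (assuming 32-byte SHA-256 hashes; the entry counts are illustrative, not data about any real CA):

```python
import math

HASH_LEN = 32  # SHA-256 output size used by the log's Merkle tree

def inclusion_proof_size(n_entries):
    """(hash count, bytes) needed to prove one entry is in the log."""
    if n_entries <= 1:
        return 0, 0
    hashes = math.ceil(math.log2(n_entries))
    return hashes, hashes * HASH_LEN

# A hypothetical CA with ~2 million logged certificates:
hashes, size = inclusion_proof_size(2_000_000)  # -> 21 hashes, 672 bytes
```

So the per-connection verification data stays in the hundreds of bytes even for logs with millions of entries.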
This document would benefit from several clarifications as to what security properties are being guaranteed, and under what operational assumptions. The document seems to assume, but does not state, something like the following operational model: 1. Certificate is published to log 2. Potential subjects of certificates examine the log for mis-issued certificates 3. If a mis-issued certificate is found, it is revoked If steps (2) and (3) don't happen for a given site, then there is no security benefit -- all inclusion in the log would mean is "this certificate exists", which is apparent enough from its being used in TLS. And in reality, (2) and (3) will not be followed for all sites. So the semantic of a certificate being included in a log is basically the following: "The subject of this certificate has been given the opportunity to ask for it to be revoked." The document should clarify this security model. Several specific edits on this overall theme follow. """ The intent is that eventually clients would refuse to honor certificates which do not appear in a log, effectively forcing CAs to add all issued certificates to the logs. """ This sentence makes big assumptions about reality and values. Suggest rewording in a more neutral way, e.g.: "If eventually, clients refuse to honor certificates that do not appear in a log, the CAs will effectively be forced to add all issued certificates." It would also be helpful to add a brief statement of why forcing CAs to list all certificates would be a good thing, since this might not be obvious to all readers. In the first paragraph of the Informal Introduction, several important terms are used without definition. -- "misissued" -- "publicly auditable" -- "correctness of each log" Clearly defining these terms will help clarify the security model. It's an overstatement to say that logs "ensure" that interested parties can detect mis-issue, since certificates might not be entered into the logs. Suggest "provide a channel for". 
""" Anyone can submit certificates to certificate logs for public auditing, however, since certificates will not be accepted by TLS clients unless logged, it is expected that certificate owners or their CAs will usually submit them. """ Assuming an operational model. Suggested text: "..., however, since unlogged certificates will not be accepted by TLS clients implementing this specification, it would be advantageous for certificate owners and CAs to submit them." """ "tbs_certificate" is the DER encoded TBSCertificate (see [RFC5280]) component of the Precertificate - that is, without the signature and the poison extension. """ Please clarify that this Precertificate dance is ONLY useful to subjects, not to clients. The SCT for a Precertificate is useless for TLS, since the certificate will be invalid anyway (via the poison extension). Given this obvservation, it's not clear why the Poison extension is removed from the Precertificate's TBSCertificate before inclusion in the SCT. """ TLS clients MUST reject SCTs whose timestamp is in the future. """ In light of the latency involved in steps (2) and (3), it would probably be good to recommend that if clients are using inclusion in the log as a measure of security, then they should impose some delay on SCTs. That is, not only should they reject SCTs from the future, they should reject SCTs that are less than a few days ago. """ Misissued certificates that have not been publicly logged, and thus do not have a valid SCT, will be rejected by TLS clients. """ s/TLS clients/TLS clients implementing this specification/ This protocol uses several OIDs from the Google vendor arc (1.3.6.1.4.1.11129). Should these be re-assigned somewhere under the IETF Experimental arc (1.3.6.1.3)?
It is customary for the authors to list their affiliation on the title page. """ When a chain is submitted to a log, a signed timestamp is returned, which can later be used to provide evidence to clients that the chain has been submitted. TLS clients can thus require that all certificates they see have been logged. """ This last sentence seems irrelevant. Do you mean that IF a client requires certs to be logged, THEN this allows them to verify? Suggest clarifying that this enables the requirement by providing a simple verification mechanism (vs. checking the logs directly).
Blech. I hate docs written in terms of code and APIs. Makes for brittle implementations. If this thing ever gets on the standards track, I hope it is rewritten with actual data format diagrams and protocol instructions. But as an experiment: Have at it.
The authors show no affiliation. Their email addresses indicate that they may all work for Google. For transparency reasons, if this is a company effort, rather than something the authors are doing on their own time for the good of the Internet, the authors should indicate their affiliation. If this is a "proprietary Google protocol made public", that should be indicated to the reader.
draft-cardenas-dff
Oops, sorry for not entering a ballot sooner. I did a very thorough IPdir review of this draft for Ralph; when it didn't get finished before he passed the baton to me, I happily agreed to take over sponsorship of the draft—I think it's very interesting and useful work. Ulrich is working on addressing Martin's DISCUSS. The short answer is that the packets sent on these networks typically come nowhere near the MTU size, so this hasn't been an issue. But Ulrich agrees that this is a concern and would like to have a better answer than that in the draft. Regarding Dan's comment, he asked me for some native English advice on how to update the draft to make it clearer and I gave him some; I don't know if the update to the draft that he proposes to do will actually address Dan's concern, but he's planning to spin the draft with that update and then we can ask Dan if he's happy with the new text.
Thank you for addressing Alvaro Retana's Routing Directorate
review. There is one minor point carried forward in my Comment,
but I have cleared Alvaro's review from my Discuss.
But I have a substantial number of my own concerns. I believe that it
was a fundamental mistake to state or assume that this document did not
need careful review by the Routing Area as it has clear implications for
the behavior of packet routing.
---
This point is placed first as it is not actionable by the document
authors. It is intended for consideration by the sponsoring AD only.
There seem to be a large number of relatively meaty changes to the text
(although not to the protocol) since IETF last call (file went from 85K
to 90k). I don't see that these changes were discussed on the IETF list.
Since this is an IETF consensus document, do we know that the IETF has
consensus support for the changes?
---
Thank you for bringing this work forward as Experimental and holding
off asking for publication on the Standards Track.
I really like that you have put a section about further experimentation
high up in the document. I would also like that section to describe the
experimental environment a bit: how much should this be contained? what
are the risks of running it "on the Internet"? what happens if the
experiment "escapes"?
This last point is quite important because you are not using
experimental code points. While I see you have an "ignore and forward"
code point of IP_DFF, what are the implications of that behavior?
---
Section 1.2
While this protocol has been widely deployed...
and Appendix B
The implication here is that the protocol is being brought forward in
its current form for "rubber stamping". That is, that the document is
intended to describe a protocol version that has been implemented and
deployed, and then to encourage more experimentation. That is OK, but
normally, in those cases, we publish as "Company X's Foo Protocol"
rather than with the implication of IETF consensus on the protocol
itself.
For example, if I raised an issue that required a change to the protocol
we would have the situation where either the published RFC diverged from
the deployed base (making the text need to be changed) or push-back from
the authors about making the change.
Being branded as "Company X's Foo Protocol" does not stop the document
from being AD Sponsored with IETF consensus, but it means that the
consensus is behind the publication, rather than behind the technical
content. Given the relatively short review that the document got from
the community, this might be more in keeping with reality.
---
Section 2.2
I am uneasy with the use of a direct quote from Wikipedia.
- I have read the Wikipedia license terms and am not sure I
understand them. Does the IETF copyright license infringe the
Wikipedia terms?
- Wikipedia calls for attribution and credit to the original
contributor. Is that supposed to be applied when we make a direct
quote?
- Is the Wikipedia URL considered a stable reference?
I would prefer to see the authors write their own definition of what
they mean by "Depth-first search".
---
Section 3
As I note in my Comment, this section is very valuable for setting the
scope of the protocol and the associated experiment. It may seem
obvious to the authors, but it is important that this protocol not be
used in environments that are not within the scope of Section 3. Indeed,
doing so could be extremely damaging to the stability of the network. I
would welcome a clear statement in the Introduction about the care with
which experiments should be conducted and the environment in which they
must not be tried. I don't believe this would detract from the protocol,
but would just make it clear for implementers and deployers.
---
Section 3 contains the assumption
o Is designed for networks with little traffic in terms of numbers
of Packets per second, since each recently forwarded Packet
increases the state on a router. The amount of traffic per time
that is supported by DFF depends on the memory resources of the
router running DFF, on the density of the network, on the loss
rate of the channel, and the maximum hop limit for each Packet:
for each recently seen Packet, a list of Next Hops that the Packet
has been sent to is stored in memory. The stored entries can be
deleted after an expiration time, so that only recently received
Packets require storage on the router.
You state this as necessary for router state (which is true). But it
is also necessary if your forwarding model is going to save bandwidth.
Otherwise, forwarding packets as suggested by this protocol is actually
going to cost more bandwidth than the routing protocol reduction saves.
Phrased differently, all of packet processing is about finding the
correct tradeoff between state in the packet and state in the network.
Basically, this is moving a lot of state into the packet, and I worry
about how this will develop over time.
Consider a deployment that meets the assumption. Over time the amount
of traffic grows (as we know is the case in all networks ever deployed).
At some point you will hit a catastrophe! Either the devices can no
longer store the necessary information (I don't immediately find a
description of how the protocol handles not being able to store the
information for a packet it is about to send, but I may have missed it),
or the bandwidth used may climb a bit fast (possibly stressing a low-
capacity link more than expected).
I think the way to handle this is to put in processing for the case
where the amount of traffic (or the network density) increase over a
threshold. The protocol could, for example, apply back-pressure on its
data applications, or it could simply discard packets. And it should
report issues to the operator.
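The suggested threshold behavior might be sketched as a bounded Processed Set that refuses (and counts) new entries once full, so the router discards rather than forwards when it cannot store per-packet state. Capacity and names here are illustrative, not from the draft:

```python
MAX_ENTRIES = 1024  # assumed state budget for a constrained router

class BoundedProcessedSet:
    def __init__(self, max_entries=MAX_ENTRIES):
        self.entries = {}      # (originator, seqno) -> next hops already tried
        self.max_entries = max_entries
        self.dropped = 0       # surfaced to the operator, e.g. as a counter

    def admit(self, originator, seqno):
        """True if state exists or can be created; False means drop the packet."""
        key = (originator, seqno)
        if key in self.entries:
            return True
        if len(self.entries) >= self.max_entries:
            self.dropped += 1  # cannot store per-packet state: discard
            return False
        self.entries[key] = []
        return True
```

A real implementation would presumably also signal back-pressure to local applications rather than only dropping, but the counter makes the overload visible either way.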
---
While I understand how using L2 "reliable delivery" reduces the need to
provide a certain amount of function in L3, I find the assumption in
Section 3 to be counter-productive. One of the motivations in Section
1.1 is
More frequent routing protocol updates can mitigate that problem to a
certain extent, however this requires additional signaling, consuming
channel and router resources (e.g., when flooding control messages
through the network). This is problematic in networks with lossy
links, where further control traffic exchange can worsen the network
stability because of collisions. Moreover, additional control
traffic exchange may drain energy from battery-driven routers.
This is a very good point. But you seem to address the issue by saying
that some mechanism (like L2 Acks) must be applied. Now, it seems to
this naif reader that sending an Ack for every data packet is an
increase in the traffic that negates the reduction in routing traffic.
Of course, this does cut into the previous point about "little traffic".
Taken to extreme, if there is absolutely no traffic, then there is no
need for routing exchange, and this protocol is ideal.
So, how is the deployer to work out whether the use of this protocol is
a good idea or not? Is this a point for experimentation? Is there more
precise guidance you can give related to specific routing protocols and
traffic demands / network density?
---
Section 4
This list is
ordered, first containing Next Hops listed in the RIB, if available,
ordered in increasing cost, followed by other neighbors provided by
an external neighborhood discovery.
I believe that you should say whether there is any way of ordering the
externally discovered neighbors, or whether it does not matter. Should
the order be consistent across each packet (subject to changes in
metrics that may impact the RIB)? A forward pointer to Section 11?
---
Section 4.2 and DUP processing
I have read the exchanges on this topic (see also the Comment, below),
but I continue to have concerns about the way this works:
As discussed, Section 4.2 notes that duplicate (same origin and sequence
number) packets which have the DUP flag set will not be considered
looping.
OK, but suppose that a packet gets the DUP flag set (from having failed
to get an L2 Ack earlier) and then starts looping, it will keep looping
and not be terminated by the loop suppression feature. Am I missing
something about the DUP flag being cleared again?
---
Section 4.2 - more about duplication
The document assumes that if a packet loops on a trial, then you
should stop trying alternatives and return the packet backwards. In
many easily constructed topologies, this will cause valid paths to be
missed.
---
Section 4.2 contains some scary throw-away text!
An external mechanism may use this information for increasing
the route cost of the route to the Destination using the Next Hop
which resulted in the loop in the RIB. Alternatively, or in
addition, the routing protocol may be informed.
This is so casual as to be absolutely dangerous! You don't go any
further into the issues or potential of what is going on here.
(A forward pointer to Section 12 would have helped a bit, but would
only serve to fan the flames!)
Since this is clearly not a fundamental part of the protocol you are
describing, and since you probably don't want to write substantial
text on the risks of metric modification in this way, maybe it would
be best to simply delete this text along with Section 12.
BTW Is "route cost" meant to imply you touch the value in the RIB or
the FIB? Section 12 seems to imply the RIB which is an "interesting"
overlap with the function of the routing protocol! How long is that
update supposed to last? What happens when a packet is successfully
forwarded on that link, should you decrease the cost again? And how
would you know you had increased it? What happens when there is a
routing update? Should the new cost value get flooded by the routing
protocol?
I really think you would be well-advised to cut out this piece of
speculative work.
---
As a general thought, this is effectively an "all routes explore"
protocol. It is fairly easy to see that in many fairly dense topologies
the hop limit will be exhausted before the right path is found. And you
do state that it is the intention that this protocol works for dense
topologies.
I do note that Section 3 states that "Certain topologies are less
suitable" and that this includes "topologies where the 'detour' that a
Packet makes during the depth-first search in order to reach the
destination would be too long."
Because of this contradiction, I think you need to quantify the issue
rather than skating over it. What sort of topology makes the DFF too
long? How dense is too dense?
By the way, the argument that using the installed route will likely work
and so we won't actually explore all routes cannot be used because if
the installed route was correct, we would not need this protocol.
---
Section 5 has
DFF MAY use information from the Routing Information Base (RIB),
specifically for determining an order of preference for to which next
hops a packet should be forwarded (e.g., the packet may be forwarded
first to neighbors that are listed in the RIB as next hops to the
destination, preferring those with the lowest route cost).
But Section 4 already said
This list is
ordered, first containing Next Hops listed in the RIB, if available,
ordered in increasing cost, followed by other neighbors provided by
an external neighborhood discovery.
And to confuse things, Section 11 seems to contain the definitive rules
and only uses RECOMMENDED.
Which is right?
---
Section 6.2
The Processed Set SHOULD be stored in non-
volatile memory and restored after a reboot of the router.
You need to separate the function you want to see from the way it is
implemented. But, anyway, why is this a "SHOULD"? Are there good reasons
why an implementation might decide to not restore the Set, and what
would be the consequences?
But let's look at two implications here:
1. The use of non-volatile memory is clearly a gate on throughput.
2. Do you store the information before you send the packet, or send the
packet before you store the information? One way or the other you
create a restart window that you need to address in the document.
---
Route-over issues
For route-over operation with a source in the DFF domain and a
destination outside the domain this protocol explicitly assumes that
the source S knows the address of the correct exit router R (Section
15). That may be reasonable in the 6lowpan case, but it is not
reasonable in the generalized case. How can you solve the selection
of exit routers?
For route-over, the figure 7 case, using a low capacity network as
transit for connecting external IPv6 nets is simply a BAD idea.
Whether the required ability for the ingress router to know the
identity of the egress router can be met depends upon the
properties of the routing protocol. (For example, RIPv3 will never
tell you this.) This needs far more clarity in terms of a warning to
not do it, but ideally it would have a preventative mechanism built
into the protocol. [Hint: being a bad idea means "it will kill your
network and cause unpredictable effects in the connected networks."
Why not simply remove all discussion of transit networks? Or, better
yet, recommend against it.]
---
Section 7
Version (VER) - This 2-bit value indicates the version of DFF that
is used. This specification defines value 00. Packets with other
values of the version MUST be ignored by this specification.
1. The specification doesn't do active things and so cannot ignore
packets. Do you mean "...ignored by implementations of this
specification"?
2. What does it mean to ignore a packet? Drop it? That sounds like
migrating from one version of DFF to another will not work well.
---
Section 7 - Sequence number re-use
Sequence Number - A 16-bit field, containing an unsigned integer
sequence number generated by the Originator, unique to each router
for each Packet to which the DFF has been added, as specified in
Section 13. The Originator Address concatenated with the sequence
number represents an identifier of previously seen data Packets.
That means that the sequence number seen in the network is a function
of the total number of packets using DFF sent by the originator. That
is, it is not a function of the number of packets in a flow.
Are you sure that 2^16 is large enough to prevent re-use of an
identifier? Don't you need to make the identifier {src, dst, seqno}?
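A quick back-of-envelope check of the reuse concern (the packet rate is an assumption for illustration, not from the draft):

```python
# 16-bit sequence number space, as specified in the DFF header.
SEQNO_SPACE = 2 ** 16

def seconds_until_wrap(packets_per_second):
    """Time until an originator's 16-bit sequence counter wraps."""
    return SEQNO_SPACE / packets_per_second

# At a modest 10 packets/second per originator the counter wraps in
# 6553.6 seconds (under two hours), so identifier reuse is plausible
# unless Processed Set entries expire well before then.
```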
---
Section 10
Whenever Section 9 explicitly
requests it in case of such a delivery failure, the following steps
MUST be executed:
It seems to me that whenever failure is mentioned in Section 9 there is
always a requirement that Section 10 is executed. So what does this
sentence mean?
---
I think that Section 15 is adding more assumptions to Section 3. Can you
at a minimum put a forward-pointer in Section 3? (Yes, I see one already
exists, but it says "is optimized for" while Section 15 says "MUST be
limited".) What I would really like is for you to pull the scope limits
from Section 15 and place them in Section 3.
---
The Security considerations will, no doubt, attract attention from the
Security ADs. Possibly the concern for security in this protocol can be
reduced by the fact that it is an experimental protocol, but in that
case we probably need a Section 3 assumption that "there is no need for
strong security in the network".
With the weakness of the looping issues (mentioned above), the potential
for inducing additional returning, and the dependence on attackable
acknowledgment mechanisms, this protocol seems likely to be extremely
vulnerable.
Perhaps the assumption that one can use link layer security in low power
and lossy networks will get approval from the Security ADs (in which
case you will have done the ROLL WG a huge favour :-)
---
Are originators single-homed?
There seems to be an assumption in several places in the document that an
originator will always have only one next-hop for a packet. There is no
reason for the assumption, and dropping the packet upon first return to
the originator will result in missing paths. Indeed, the preamble in
Section 4 clearly assumes that the source will have a list of next hops,
but then goes on to say:
If the Packet is eventually returned to the Originator of the Packet,
it is dropped.
If I am reading this right it is no big deal. It is not a big change to
fix the issue and allow all next hops to be tried.
Or perhaps "Originator of the Packet" is a host? Does that mean that you
intend hosts to also participate in the DFF protocol? Hmmm, reading
further this looks to be the case. So this mechanism is not just about
forwarding, but requires full participation by end systems. That really
isn't clear from the Abstract and Introduction.
Actually, it is probably a little more complicated! In some senses, the
originator will be the first node in the DFF domain. That is, when
traffic originates outside the LLN, you will not expect the nodes out
there to use DFF, so it will be the responsibility of the gateway.
Section 9.1 tries to get into this by talking about "entry into a
routing domain in which DFF is used". But it fails to clarify what
"originator" means in that context.
And this *very* messily clashes with the use of sequence number per
"originator". If originator is the source address of the packet (and
I can't see how it could be anything else unless you do address
swapping or encapsulation) and if the originator is not required to
generate the sequence number itself (and I don't think remote hosts are
expected to know about DFF) then if an originator is dual-homed into
the routing domain, you will have two routers generating sequence
numbers from the same base. I.e. you will have duplicate identifiers
for packets == disaster.
I hesitate to say this is a show-stopper, but it requires a very
significant constraint to the usability of DFF that must be clearly
documented. And possibly all that is needed is:
- a clear definition of originator
- a statement about tunneling over or into DFF domains *much* earlier
in the document
Ah, finally I found Section 13 that explains some of this. Can you
please bring the definitions forward in the text?
In his Routing Directorate review, Alvaro Retana asked...
> o Section 4.2 reads (last paragraph): "..if a router receives a
> Packet with DUP = 1 (and RET = 0) that it has already forwarded,
> the Packet is not considered looping, and successively forwarded to
> the next router.." I'm at a loss of why would the router forward
> the packet if it is a duplicate of one that it has already
> forwarded.
Ulrich helpfully responded...
> This is necessary for the following reason:
> Imagine a case where after a lost L2 ACK, DUP of the packet is set to
> 1 and a duplicate is created. The duplicate may now during its path
> again be duplicated if another L2 ACK is lost. However, DUP is already
> set to 1, so there is no way of discerning the duplicate from the
> duplicate of the duplicate. That is not much of a problem, other than
> that loop detection is not possible after the second lost L2 ACK on the
> path of a packet.
> However, if duplicates are simply dropped, it is possible that the
> packet was actually a looping packet (and not a duplicate), and so the
> DFS would be interrupted. The problem is to make sure that this "very
> instance" of a packet has passed a router before or not (sequence
> number is not enough; source route would be enough, but at a too high
> cost).
> In practice, we have not observed that to be a problem in deployments
> of thousands of routers in a network. There are some duplicates of
> packets, but not in a considerable amount.
I think this somewhat subtle point would benefit from a little more
explanation in the document (perhaps just include this text).
Maybe, as well, since this is experimental, it would make a useful point
for further research.
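Ulrich's rule could be captured almost directly in the loop-detection predicate. An illustrative restatement (field names are assumed for the sketch):

```python
from collections import namedtuple

# A previously seen packet is treated as looping only when its DUP flag
# is clear; DUP = 1 duplicates are forwarded on to the next router.
Packet = namedtuple("Packet", ["originator", "seqno", "dup"])

def is_looping(packet, processed_set):
    """processed_set holds (originator, seqno) pairs already forwarded."""
    seen = (packet.originator, packet.seqno) in processed_set
    return seen and not packet.dup
```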
---
Section 3 is very useful in setting the scope of the document. I wonder
whether it would be useful to make some references to examples that are
in scope, and some statements about common network types that are
immediately assumed out of scope. For example:
o Assumes that the underlying link layer provides means to detect if
a Packet has been successfully delivered to the Next Hop or not
(e.g., by L2 ACK messages).
And we can note that 802.15.4 has provision for immediate acks, but many
layer two technologies do not even have an option for assuring delivery.
---
Isn't there a tension in Section 3 between the problem of network
density as expressed in:
o Is designed for networks with little traffic in terms of numbers
of Packets per second, since each recently forwarded Packet
increases the state on a router. The amount of traffic per time
that is supported by DFF depends on the memory resources of the
router running DFF, on the density of the network, on the loss
rate of the channel, and the maximum hop limit for each Packet:
for each recently seen Packet, a list of Next Hops that the Packet
has been sent to is stored in memory. The stored entries can be
deleted after an expiration time, so that only recently received
Packets require storage on the router.
And the intention to support dense networks as stated in:
o Is designed for dense topologies with multiple paths between each
source and each destination.
---
In Section 3
In networks with very stable links (e.g.
Ethernet)
I think "Ethernet" is too general a term. Many LLNs use a variety of
Ethernet.
---
I agree with other ADs that the mixture of design objectives and
deployment-limiting assumptions in Section 3 is unhelpful. Perhaps
separate them out into two sections and make the assumptions more
definitive as limitations?
---
The term "PAN" is used without expansion.
---
More thoughts on storage requirements and list processing...
Section 4 has:
For each recently forwarded Packet, a router running DFF stores the
list of Next Hops to which a Packet has been sent. Packets are
identified by a sequence number that is included in the Packet
Header. This list of recently forwarded Packets also allows for
avoiding loops when forwarding a Packet. Entries of the list
(identified by a sequence number of a Packet) expire after a given
expiration timeout, and are removed.
There is a problem with the meaning of "list" and "entry". You have a
list of lists :-)
Do you mean that the Next Hop is dropped from the list, or that the
packet's list of next hops is discarded? Should be easy to clarify the
text.
---
The start of Section 4.1 is abrupt and confusing.
This specification requires a single set on each router, the
Processed Set. Moreover, a list of bidirectional neighbors must be
provided by an external neighborhood discovery mechanism, or may be
determined from the RIB (e.g., if the RIB provides routes to adjacent
routers, and if these one-hop routes are verified to be
bidirectional). The Processed Set stores the sequence number, the
Originator Address, the Previous Hop and a list of Next Hops, to
which the Packet has been sent, for each recently seen Packet.
Entries in the set are removed after a predefined time-out. Each
time a Packet is forwarded to a Next Hop, that Next Hop is added to
the list of Next Hops of the entry for the Packet.
It takes a while to work out "set of what?" You could rephrase for
clarity.
---
Section 4.1
What is a "bidirectional neighbor"? One might assume that it a neighbor
to and from which data can be passed in a single hop? The text goes on
to talk about "one-hop bidirectional routes".
Is that the moral equivalent to being connected with a bidirectional
link? Probably not in the type of network you are building, so maybe
this term needs to be in the Terminology section.
---
Section 4.2
DFF requires additional header information in each data Packet by a
router using this specification. This information is stored in a
Packet Header that is specified in this document as LoWPAN header and
as IPv6 Hop-by-Hop Options extension header respectively, for the
intended "route-over" and "mesh-under" Modes of Operations.
1. I think you have "route-over" and "mesh-under" reversed!
2. The first sentence doesn't parse. But also "requires" is unclear.
I think that the information is needed on a per-packet basis by the
receiving router, and so is encoded in the packet header.
---
Section 11 has
A
smaller list MAY be used, if desired, and the exact selection of the
size of the candidate Next Hop List is a local decision in each
router, which does not affect interoperability.
It is true that it does not affect interoperability of the protocol,
but it does affect the ability to deliver data. So it is probable that
some guidance on the "MAY" might be valuable.
---
I am curious to see no discussion of the management of this protocol
or of networks using DFF. I would imagine that since "unexpected"
things may start to happen, diagnostics would be quite useful.
Overall, this is a nicely written document. Thanks! Couple of minor thoughts below. """ P_next_hop_neighbor_list is a list of Addresses of Next Hops to which the Packet has been sent previously, as part of the depth- first forwarding mechanism, as specified in Section 9.2; """ It seems like it would be possible to run this protocol without per-packet state, under two assumptions: (1) The set of neighbors is preference-ordered (2) Communication with neighbors is bidirectional The document seems to assume both of these. Under these assumptions, the router could simply take a packet that arrives on interface N (in the preference ordering) and transmit it on interface N+1. The only issue then is loop detection, which could be done by keeping a list of recently seen serial numbers, a much smaller piece of state. """ P_HOLD_TIME - is the time period after which a newly created or modified Processed Tuple expires and MUST be deleted. """ It's not immediately clear to me why this value is a constant. Might it be useful for implementations to vary P_HOLD_TIME, for example, reducing P_HOLD_TIME to save space when there are many packets in flight?
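The stateless alternative suggested in the comment above might be sketched as follows (the names and the recent-list size are assumptions for illustration, not from the draft):

```python
from collections import deque

# Derive the next candidate from the neighbor a returned packet came
# back from, keeping only a small window of recently seen packet
# identifiers for loop detection.
RECENT_LIMIT = 256  # assumed size of the loop-detection window

class StatelessForwarder:
    def __init__(self, neighbors):
        self.neighbors = neighbors                # preference-ordered list
        self.recent = deque(maxlen=RECENT_LIMIT)  # recently seen packet ids

    def next_hop_after(self, returned_from):
        """Next neighbor to try; None means the search is just starting."""
        if returned_from is None:
            return self.neighbors[0]              # most preferred first
        i = self.neighbors.index(returned_from)   # assumes a known neighbor
        return self.neighbors[i + 1] if i + 1 < len(self.neighbors) else None

    def seen_before(self, packet_id):
        """Loop detection via a bounded recent-identifier list."""
        if packet_id in self.recent:
            return True
        self.recent.append(packet_id)
        return False
```

This keeps O(RECENT_LIMIT) state per router instead of a next-hop list per packet, at the cost of requiring the two assumptions (stable preference order, bidirectional links) to hold.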
- The write-up seems out of date: it says "reviews should be sought from 6lowpan, manet and roll" and that Ralph is the AD. Did those WG reviews happen?
- Abstract: "DFF assumes...stuff" would be better as "DFF is designed for ...when stuff" or "The design of DFF assumes ... stuff". (Sorry, that's a total nit you can ignore if you want; not sure why it bothers me;-)
- general: Other than for IETF organisational reasons, why is this IPv6-only?
- general: You say how to process one packet in a router's queue. What if that router has many queued packets? Are you saying that it ought to pick one, do the DFF thing on that, and only even look at a 2nd packet when the DFF process for the first one is complete? I'm puzzled that you don't mention how a DFF-capable router handles queues.
- general: If DFF has been "widely deployed" and you say that, then I'd expect a reference that backs that up.
- general: If this scheme has had significant academic review (and even the worst routing schemes have, so I assume this non-worst one also has:-) then I'd expect at least some reference to the academic literature. If, OTOH, this is a reasonable routing experiment for which there is no academic literature, then that seems even more noteworthy. In any case, when I finally got to Appendix B, I saw some citations. I think moving those up earlier would help, as would increasing the diversity of the author list on the cited papers. To be honest, I read the text as being somewhat optimistic in terms of how widely deployed DFF might actually be.
- section 8: What happens if the parameters differ in nodes within the same DFF routing domain?
- section 13: Starting the sequence number from 0 seems like a bad plan (for DoS resilience). Starting from a random number would be better.
- 14.1.2: Is OptTypeDFF experimental or what? I'm not sure what's needed/correct there.
- section 15: I guess this means that the determination of workable routing domain boundaries is also part of the experiment really.
If so, it would be better to say that in 1.2, and I think it is so myself. And FWIW, I don't think the text in this section is that clear - you might be better off just saying that routing domain egress/exit nodes (D and R2) need to know, or somehow figure out, that they are such nodes.
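The randomized starting point that the section 13 comment asks for is essentially a one-liner. This is an illustrative sketch only; the 16-bit width of the sequence-number field is an assumption made for the example, not taken from the draft.

```python
# Illustrative only: pick an unpredictable initial sequence number
# rather than starting from 0, for DoS resilience. SEQ_BITS is an
# assumed field width, not a value from the DFF specification.
import random

SEQ_BITS = 16  # assumed sequence-number field width


def initial_sequence_number():
    """Return a cryptographically random starting sequence number."""
    return random.SystemRandom().randrange(2 ** SEQ_BITS)
```

`random.SystemRandom` draws from the OS entropy source, so an off-path attacker cannot predict the starting value the way they could with a fixed start of 0.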
I have no general objection to the publication of this draft. It just struck me that DFF adds an IPv6 extension header to an existing packet, or encapsulates it for further tunnelling, without any mention of MTU issues. Adding bytes to an existing packet is always subject to MTU issues: e.g., if the packet is already 1500 bytes and you add another n (with n > 0), you exceed the MTU. Please add a description of the MTU issue and also some recommendations on what DFF should do in that case.
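The arithmetic behind this concern can be made concrete with a minimal check. The values are illustrative: 1500 is the common Ethernet MTU, and the size of the bytes DFF adds is an assumption for the example, not the actual header length from the draft.

```python
# Minimal sketch of the MTU concern: adding n > 0 bytes to a packet
# that already fills the MTU makes it undeliverable without
# fragmentation. DFF_ADDED_LEN is an assumed value for illustration.
MTU = 1500           # common Ethernet MTU (illustrative)
DFF_ADDED_LEN = 8    # assumed size of the bytes DFF adds


def fits_after_dff(packet_len, mtu=MTU, added=DFF_ADDED_LEN):
    """True if the packet still fits in the MTU after DFF adds bytes."""
    return packet_len + added <= mtu
```

A full-size 1500-byte packet no longer fits once anything is added, which is exactly the case the draft is being asked to address.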
There has been a discussion with Gen-ART reviewer Dan Romascanu, and all his issues have been addressed, with the exception of one new question that was posted in e-mail on March 27th. Please take a look at that issue and respond.
I'm balloting NO OBJECTION on this, but reluctantly as I share Stephen's concern: The writeup is not up to date and the now cognizant AD has not issued a ballot yet. I also wonder why this didn't come out of any WG, even though it was ostensibly reviewed in several. But there is nothing to indicate that it is problematic to run this experiment, so I won't bother ABSTAINing, let alone DISCUSSing it.
Stephen's first comment, and Pete's comments have me covered. I'd particularly like to see an answer to the question about reviews from manet, roll, and 6lowpan (though I do see Adrian's comment that a routing directorate review was done, so maybe that covers at least some of that).
I support Martin's DISCUSS on the MTU issue with adding information to packets in transit. I would think that another metric of interest for experimentation is the overall number of transmissions needed to successfully deliver a packet using DFF as compared to the number needed when a routing protocol is employed. This would need to incorporate the number of control messages sent as well, in some manner.