Early Review of draft-ietf-idr-sla-exchange-05

Request Review of draft-ietf-idr-sla-exchange
Requested rev. no specific revision (document currently at 13)
Type Early Review
Team Routing Area Directorate (rtgdir)
Deadline 2015-05-26
Requested 2015-04-27
Authors Shitanshu Shah, Keyur Patel, Luis Tomotaki, Mohamed Boucadair
Draft last updated 2015-05-26
Completed reviews Rtgdir Early review of -05 by Bruno Decraene (diff)
Opsdir Early review of -10 by Joe Clarke (diff)
Rtgdir Early review of -10 by Ron Bonica (diff)
Tsvart Early review of -10 by David Black (diff)
Assignment Reviewer Bruno Decraene 
State Completed
Review review-ietf-idr-sla-exchange-05-rtgdir-early-decraene-2015-05-26
Reviewed rev. 05 (document currently at 13)
Review result Not Ready
Review completed: 2015-05-26



I have been selected as the Routing Directorate reviewer for this draft. The Routing Directorate seeks to review all routing or routing-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Routing ADs. For more information about the Routing Directorate, please see ​


Although these comments are primarily for the use of the Routing ADs, it would be helpful if you could consider them along with any other IETF Last Call comments that you receive, and strive to resolve them through discussion or by updating the draft.

Document: draft-ietf-idr-sla-exchange-05
Reviewer: Bruno Decraene
Review Date: 22/05/2015
IETF LC End Date: 02/03/2015
Intended Status: Standards Track

Summary:  I have significant concerns about this document and recommend that the Routing ADs discuss these issues further with the authors. 

(Or the chairs as I see that this document has not yet been passed to the ADs)

In general, the readability of the document is acceptable but could be improved, both from a language and technical precision point of view. (examples below).
However, there are some normative parts of this Standard Track specification that I could not understand.

M1) The Introduction states
"In a multi-vendor network, translating SLAs into technology-specific and vendor-specific configuration requires to consider specificities of each vendor.
There does not exist any standard protocol to translate SLA agreements into technical clauses and configurations and thus both the steps of out of band learning of negotiated SLA and provisioning them in a vendor specific language can be complex and error-prone."
- I guess some would use NETCONF/YANG to address this. It could be questioned why BGP has been preferred. Especially since:
	- in the VPN context (using a lot of QoS and the uses cases described in the document) many customer connections use static routing rather than eBGP. Hence this specification would not be enough to distribute SLA and would require another protocol.
	- QoS is only one part of the configuration effort. Why using different protocols to configure different aspects?

- This may be related to draft-l3vpn-service-yang and the L3VPN Service Model WG (l3sm). May be some form of coordination would be beneficial.

M2)"The exception is where a BGP speaker, in the middle of an update path to the destination AS, aggregates prefixes. We will refer this middle BGP speaker, that aggregates routes, as an Aggregator. Aggregator is then required to insert original NLRI details in the optional advertiser field"
If you refer to the use of AS_SET, RFC6472 recommends against the use of AS_SET. So, I'm not sure that there is a need to add complexity in this specification in order to handle route aggregation.
If removed, "section 5.3 Aggregator" may also be removed.
 "Traffic Class Description
        Ascii Description of the Traffic Class"

Should it be UTF-8?

M5) SLA definition
 It would be good to define what a SLA is. Especially since this whole goal of the draft is to advertise SLA in BGP. Citing an individual draft [CPP] is not enough to have an agreed on definition, especially for a STD track RFC.
 Since QoS is not new in the IETF, there is probably a document defining it (or using a more popular terminology).
Looking in google, I don't really see matches for "IETF SLA" (outside of documents written by the authors).
Wikipedia seems to give a quite different definition, much wider than diffserv specific parameters which seems to be the main point of this BGP attribute:
"A service-level agreement (SLA) is a part of a service contract[disambiguation needed] where a service is formally defined. Particular aspects of the service - scope, quality, responsibilities - are agreed between the service provider and the service user. A common feature of an SLA is a contracted delivery time (of the service or performance). As an example, Internet service providers and telcos will commonly include service level agreements within the terms of their contracts with customers to define the level(s) of service being sold in plain language terms. In this case the SLA will typically have a technical definition in terms of mean time between failures (MTBF), mean time to repair or mean time to recovery (MTTR); identifying which party is responsible for reporting faults or paying fees; responsibility for various data rates; throughput; jitter; or similar measurable details.."

 "   Traffic Class Service (optional),
        16-bit          = type of the field
        variable-length = based on type of the service"
Please specify the content of the "variable-length" field. 
If it only contains the Data Type of the IPFIX Information Elements, I'm not sure how the encoding supports, on the receiving side, the skipping of unknown ElementID.
Given that I also don't see an end to end negotiation channel for the BGP speaker to known the capabilities of the BGP receiver, I don't see how the specification will support the introduction of new Traffic Class Services in the future.	

I don't see the relation between the QoS attribute and the NLRI.
- Is the QoS attribute only applicable to the NLRI advertised? If so what is the relation with destinationIP* advertised in the classifier Element? Should they be restricted to more specifics of the advertised NLRI?
- Also the QoS attribute may instruct "to drop entire BGP update message [Note that it is an indication to drop entire update message, not only QoS attribute]". This means that the NLRI will not be propagated, hence routed, anymore, which seems strange. To preserve routing of the NLRI, do the QoS attribute require to advertise a less specific prefix (with no QoS attribute) in addition? Or to use ADD_PATH to advertise the NLRI multiple times (with & without the QoS attribute).

Possibly same question for the relation between the QoS attribute and the AFI/SAFI of the BGP UPDATE. Is the QoS attribute to be understood in the context of the AFI/SAFI or not? e.g. if the classifier element is the ipDiffServCodePoint does it match all protocols or only the one of the AFI/SAFI?

M8) Error handling
Current text says that error handling MAY use attribute discard or MAY use treat as withdraw.
This seems underspecified as one implementation would be free to do nothing, while another could do session reset. This would open many BGP session reset in real networks.
Please specific what must be done. 
Besides, other part of the document provides some more specific/different error handling. e.g. "If there are more than one such Traffic Classes present then advertised SLA parameters MUST be ignored."
Finally, the spec needs to define when the new attribute is considered malformed.
On an editorial note, I would prefer a dedicate section related to error handling.

M9) security consideration may require some discussion.
"There is a potential for mis-behaved AS to advertise wrong SLA, stealing identity of another AS."
Agreed. But there are probably other attack vectors (e.g. modifying the attribute during propagation, setting parameters to instruct BGP to drop the message (as this seems alllowed by the specification)...)

"This resembles to problems already identified and resolved, in the routing world, thru reverse path forwarding check."
"Resembles" is not enough. "Resolved" is probably a bit quick.

"One proposal, inline to RPF, to resolve such threats is to have each BGP speaker node, in the forwarding path, perform reverse path check on source AS."
If this is a specification, it should be described in the document (quickly citing it in the security section is not enough).
It's also a bit short in term of specification. e.g. I don't see "source AS" in the forwarding path (neither in the packet nor in the FIB)

"Since we expect these messages to originate and distributed in the managed network, there should not be any risks for identity theft."
If you restrict the use of this specification/ATTRIBUTE in "managed network", this needs to be clarified from the beginning (and not at this very end of the document), and the specification should take measure to ensure that this attribute is not received from/leaked outside of this "managed network".
Defining "managed network" may also help, especially since the proposition involves multiple ASes and multiple organisations.
(otherwise, you need to handle the case when this attribute is used outside of "managed network" and therefore consider the security implications)

M10) IANA section is under specified.
e.g. you should:
- states the name of the registry that you want to create or update.
- states the name of the new entries in existing registries.
- define all your new registries. (e.g. you don't have ones for new QoS TLV subtypes (defined in §3.1), Optional Advertised id TLV, SLA event Type...)
- define the registration policy of those new registries.

Reading RFC5226 may help.

Minor (some not so minor):
m1) From an editorial standpoint, the document may benefit from an english language review.
  - Some sentences are hard to parse (at least for me). e.g. "The need to exchange SLA parameters between domains (Automated Systems (AS)), where in use-cases described in this document, BGP is a suitable protocol for inter-domain exchange [RFC4271][RFC4364].
  - Adding a full point "." at the end of each sentence may help the parsing.
  - IMHO some sentences could be rewritten to improve readability. e.g.
        highest order bit (bit 0) -
            It defines if update message MUST be dropped (if set to 1)
            without updating routing information base, when this is the
            last BGP receiver from the list of destination ASes this
            attribute is announced to, or MUST announce (if set to 0)
            further to BGP peers
        highest order bit (bit 0) -
			This flags defines how update message must be handled by the last BGP receiver in the list of destination ASes.
            If set (1) update message MUST be dropped without updating routing information base.
			If cleared (0) update message MUST be further advertised to BGP peers.
  On a side note, at this point in the document, it's not crystal clear what you mean by "update message". The QoS Attribute TLV? The QoS BGP attribute? The BGP UPDATE message? In general, in the document, please use the protocols names of the messages/fields.
  - "SLA sub-type specific value field details." I guess you mean :s/specific/specifies. 
OLD: Remaining bits are currently unused and MUST be set to 0
NEW: The lower-order seven bits of the Attribute Flags octet are unused.  They MUST be zero when sent and MUST be ignored when received. 
(Proposed text is a copy/past from RFC 4271. You are free to use another text but please specify the behaviour on the receiving side as we have seen BGP session reset in the Internet which a much clearer sentence.)

m3) That's not specific to this document, but I would find useful to have the related implementation report draft be referenced in the informative reference section.

m4) "sub type Length" Please specify exactly what part of the message is covered by the length (as some IETF spec use the length of the value field, while some other use the length of the type+length+value fields.
m5) "32-bit source AS (Advertiser)" The word "advertiser" may be misleading. (cf draft-hares-idr-update-attrib-low-bits-fix). RFC 4271 uses "Originating speaker" (SIDR seems also to use "Origin".
Multiple occurrences in the draft.

m6)  "0 = ignore Source and Destination AS list from this Value field.
            Instead refer to Source and Destination AS as defined by BGP
I'm not sure what is meant by the second sentence. Please use the specific names of BGP messages and fields.

m7) "format of the SLA message"
Giving names and number to figures could be considered.
So does adding the memory axis:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
"    Optional advertiser id total len
        16-bit Source address identifier (optional)."
I read this as the field "Optional advertiser id total len" contains a 16-bit Source address identifier.
While this field probably contain the length of "something". Please check/clarify. 

    "Optional Advertiser id TLV
        4-bit type"
You need to specify the size of the "Length" field. Especially since you introduce a somewhat unusual size of the "Type field" and some people may assume that the size of the "length" field is of the same size (4-bit), while some others may believe its the usual 1-octet. 		

m10)"    Destination AS count
        32-bit destination AS count to take variable length AS list."
I guess you mean:
number of destination ASes
This field indicates the number of destination AS present in the Destination AS list

m11) "SLA Id"
The text under "SLA Id" mixes text related to "SLA Id " and text related to  "Content". Please split the text.

m12) I don't see a description of the filed "Content as per SLA Event"

m13)  "    SLA Length
Please specify what is covered/measured by this length field.		

"    Direction
        0x1 = incoming, from destination AS towards source AS
        0x2 = outgoing, from source AS towards destination AS"
I find the terms "incoming" and "outcoming" a bit misleading. e.g.
the direction "from source AS towards destination AS" seems to be:
- outgoing in the source AS
- ingoing in the destination AS
m15)    "Traffic Class Descr Length
        08-bit, size of the length"
proposition  :s/size of the length/ length of XXX

m16) In section 3, I don't see the specification of the REQUEST SLA even type.
At the end of the document, it's said that "discussion of REQUEST message, for this purpose or any other purpose, is considered out of the scope of this document." In which case, you should probably not specify a REQUEST SLA even type.
"Given IPFIX [RFC5102] has well defined identifier set for a large number of packet attributes, IPFIX IANA registry is "


; chosen to specify packet classification attributes."
Sentence is hard to parse, which is an issue for a normative part.	
The reference should probably be listed in the reference section.

"However, since not all identifiers from IPFIX would be applicable to this proposal, only a limited set identified here can be supported by BGP SLA exchange. Any new element identifier, in future, added to the IPFIX IANA registry does not automatically mean supported for this proposal."

- This probably calls for a IANA registry to identify which element identifier can be used.
- Text should clarify that the list of accepted identifiers is defined in the subsequent list (having no name and no number).
m20) section 3 is hard to read.
- IMO the document/section 3 would benefit from an section presenting an overview of the solution
- section 3 have a single subsection (3.1) hence the interest of using subsection is limited. Given the size of section 3 (10 pages), to improve readability I would suggest the use of multiples subsection.

      "The minimum policed unit (m) and maximum packet size (M)
      parameters have no relevance for the purpose of SLA exchange.
      Thus they MUST be ignored."

Why specifying and sending such parameters in BGP if they MUST be ignored by the receiver?	  
" This rate indicates the minimum rate, measured in bytes of Layer 2 (L2) datagrams per second,"
I'm not sure why the Layer 2 size is used rather than the layer 3 size. As a consequence, you need to send additional information (L2_OVERHEAD) which may be not needed otherwise.
Draft cites RFC 2212 as the source of this TRAFFIC_CLASS_TSPEC parameter, and RFC 2212 use the IP datagram size.

"4.  Originating SLA Notification

   The QoS attribute to advertise SLA sub-type MUST be added by the
   originator of a BGP UPDATE message."
I guess you don't mean that advertising this new attribute is mandatory. So please rephrase (e.g. at least :s/MUST/MAY)   

m24)  " If a BGP node is capable of processing QoS attribute, it optionally MAY process the message."
   What message? The BGP UPDATE?
m25)   "BGP node MUST drop SLA related sub-type from the QoS attribute, if
   none of the AS from the destination list is in the forwarding path."

   There is no AS in the forwarding path. Please rephrase.
m26)   "5.2.  BGP Node not Capable of Processing QoS Attribute

   If the BGP node is not capable of processing QoS attribute, it MUST
   forward the QoS attribute message unaltered."

This section is completely useless. It should either be removed or at the minimum should not specify a behavior. e.g.
OLD: it MUST forward
NEW: as per RFC4271, it will

or should define what is meant by "processing QoS attribute". (my reading is "does not recognize")

m27) "If advertised QoS Attribute, inside an update message, is with a flag set indicating to drop that message, a receiver MUST drop message if it is the last receiver, in update path, that message is advertised to."
This is not extremely clear. Especially for a "MUST" behavior. Please rephrase using the protocols names of the messages/fields.

m28)"If the advertised SLA is from the next hop, in the reverse path, the receiver may implement advertised SLA for the whole link, the link could be physical or virtual link, associated with the next hop. "
I don't understand. Please rephrase. (e.g. which next-hop?, reverse path of what?)

"If NLRI advertised in update message is not of the next hop,"
I don't understand. Please rephrase. 

   "For cases where if earlier messages have not reached the intended receiver yet, a re-signaling is required.  A receiver may intend to request an SLA message from the originator in such case.  Since BGP messages are considered reliable, it is assumed that advertised messages always reach intended receivers.  Thus discussion of REQUEST
   message, for this purpose or any other purpose, is considered out of the scope of this document."
Some parsing issues.
The text seems to self contradict:
- "a re-signaling is required"
- "Since BGP messages are considered reliable, it is assumed that advertised messages always reach intended receivers."
   "There are well-defined recommendations that exist for traffic class mapping between two technologies. "
   Please provides references.
"AS2 can advertise the same or a subset of that SLA to AS3 in the context of tunnel's ip address."
Which tunnel are you refering to?

N1) ID Nits reports 1 error (Obsolete normative reference)
N2) In a BGP context, "AS" stands for "Autonomous System" and not "Automated System"
"     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       ~              Traffic Class Elements count/values              ~
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ "
I feel that the figure could be updated to more accurately represent both fields (length).	Something like   
       | Traffic  Count|      Traffic Class values                     |
       +-+-+-+-+-+-+-+-+                                               ~
       |                                                               |
       ~                                                               ~
       |                                                               |

N4) In general for all figures, it's easier if the name of the legend / descriptive paragraph match the name in the figure.
"Class Desc Len" in figure versus "Traffic Class Descr Length" in the legend
"Advertiser id TLVs" in figure versus "Optional Advertiser id TLV" in the legend
"Event" in figure versus "SLA Event Type" in the legend
"IPFIX IANA registry is "


; "
May be added to the reference section.

N6) There is a mix of usage of "octet" and "byte". For consistency, only one should be chosen ("octet" IMHO)

N7)[CPP]      I-D.boucadair-connectivity-provisioning-profile"
why not citing RFC 7297 instead?



Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.