Ballot for draft-ietf-idr-bgp-open-policy

Comment (2022-01-19 for -20) Sent

Firstly, thank you for this document -- if deployed, I think it will go a very long way towards reducing accidental leaks.

I have a few comments; feel free to address them or not.

1: It seems like there should also be a code-point for "Complex" - I guess that something similar could be inferred if this is negotiated and one of the other codepoints isn't sent, but having it explicit seems cleaner.

2: The document says things like "It **prevents** ASes from creating leaks,..." and "The purpose of this attribute is to **guarantee** that once a route is sent, ..." - it feels like this is overselling things. As an example, the OTC attribute could be stripped (unintentionally or maliciously) and the route would then be included in a leak, etc. Again, this is only a comment, but it seems like more weasel words would make the document stronger.

3: "If a route with the OTC Attribute is received from a Customer or RS-Client, then it is a route leak and MUST be considered ineligible" - fine, but it seems like there should be a bit more than just "considered ineligible", for example, this should be logged somewhere, or implementations should allow the operator to do the conceptual equivalent of "show route leaked" or similar....
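
The logging suggestion in point 3 can be sketched as follows. This is a minimal illustration, not the draft's text or any implementation's behavior: the `Route` class, the `check_ingress` function, the `leaked_routes` list, and the role strings are all hypothetical names chosen for the example. It covers only the rule quoted above (OTC present on a route received from a Customer or RS-Client).

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

log = logging.getLogger("otc")


@dataclass
class Route:
    prefix: str
    otc: Optional[int] = None  # ASN carried in the OTC Attribute, None if absent


# Hypothetical store that could back an operator command
# along the lines of "show route leaked".
leaked_routes: List[Route] = []


def check_ingress(route: Route, neighbor_role: str) -> bool:
    """Return True if the route stays eligible, False if it is treated as a leak.

    neighbor_role is the remote speaker's role as seen locally, assumed
    here to be a plain string such as "Customer" or "RS-Client".
    """
    if route.otc is not None and neighbor_role in ("Customer", "RS-Client"):
        # The quoted rule makes the route ineligible. Logging and recording
        # it are the reviewer's suggestion, not a requirement of the draft.
        log.warning("route leak: %s carries OTC %d, received from %s",
                    route.prefix, route.otc, neighbor_role)
        leaked_routes.append(route)
        return False
    return True
```

Keeping the rejected routes in a list, rather than only logging them, is what would let an operator inspect them later.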

Again, thank you for a really useful document,

Comment (2022-01-12 for -19) Sent

I’m happy to see this document proceeding forward. Thanks for all the work that went into it, and into the shepherding. I do have a few points I’d like to raise.

1. Section 4 talks about policy in several places. As you know, “policy” is a term with quite a bit of baggage in BGP, and without further qualification, it’s likely to be interpreted as “routing policy configured by the router’s operator”. I wondered at first if your choice of “policy” was deliberate, to imply that the specification is not expected to be hard coded into the implementation, but rather configured (or not) by the operator? But, the whole point of the spec is to move away from using policy configuration to protect against route leaks — requiring configuration to enact the import and export “policies” given in §4 would fly in the face of the document’s raison d’être. Furthermore, you close §4 with “the operator MUST NOT have the ability to modify the policies defined in this section” — and that does help, but not before considerable potential confusion is created by the unfortunate choice of terminology.

So, I think you mean that import rules (1)-(3) and export rules (1)-(2) MUST be hard coded into any compliant implementation. I strongly suggest you find a term other than “policy” to express what you mean. For example, “procedures” seems like a good replacement. You could say something like “the following procedures apply to the processing of the OTC Attribute during route reception” and “the following procedures apply to the processing of the OTC Attribute during route transmission”. (You could also try to put it in terms of RFC 4271, but it might be a bit painful.) I just grepped through §4 and a simple replacement of “policy” with “procedure” throughout seems like it would be almost sufficient.

I was ready to make this a DISCUSS until I came to the final sentence of §4. That sentence does clarify matters sufficiently to reach the minimum bar, but I think the document would be even more usable if further edits are made along the lines above.

2. Section 4 says

The OTC Attribute may be set by the egress policy of the remote AS or
by the ingress policy of the local AS.

First, my earlier comment about “policy” applies here as well; you could maybe say “on egress from the remote AS or on ingress to the local AS”. Beyond that, I suggest changing the “may” (which is admittedly lower-case) to a “might”, which has even less risk of confusion with any kind of normative meaning. Presumably what you mean is actually that if the remote AS is noncompliant with this spec, the local AS will have to set the attribute, and this is a feature, not a bug.

3. In Section 4 you write,

The same OTC Attribute which is set locally also provides
a way to detect route leaks by an AS multiple hops away if a route is
received from a Customer, Peer, or RS-Client.

I don’t understand this sentence, as written. I think maybe it needs to be several sentences. I don’t think it’s needed, or even desirable, to explain in detail in this document how you might make use of the OTC Attribute for troubleshooting, but if there’s a different document that does explain it, an informative reference might not hurt. I’m pretty sure I remember this being covered in one or more IDR and/or GROW presentations, but I don’t know if it got written down beyond slide sets.

4. In §1 you say “This document provides configuration automation using BGP Roles”. The document, per se, doesn’t provide any such thing of course — an implementation of the specification would provide such. “This document specifies” would be more correct I think.

A second point related to the same sentence is that it feels like a poor fit to refer to what you do in this spec as configuration automation, a term more usually associated with automatic, typically database-driven, generation and management of (router) configurations. In the second ¶ of §1, you identify configuration as the bad, error-prone, legacy way of preventing route leaks. So maybe something like, “This document specifies a means of replacing the configuration-based method of route leak prevention, described above, with a more automated method using BGP Roles.”

5. In your Terminology Section (§1.1) you define a number of terms that are defined again, immediately, in §2, namely Provider, Customer, RS, RS-Client, and Peer. I think the definitions provided in §2 are better than those in §1.1, and sufficient unto themselves, and that you could remove the first ¶ of §1.1.

6. You refer to transit and non-transit providers in many places throughout the document. Although these seem to some of us like well-known terms of art that need no reference, I have the feeling that may not be true for our entire community, or worse still, different people might all “know” what it means but with different definitions, and so it might be desirable to provide a citation.

7. Have you given any thought to Autonomous System Migration Mechanisms (RFC 7705)? Mostly, I just have a free-floating unease here, because with the hacks described in RFC 7705, a given router may represent itself as any one of several different ASes, and your spec does ASN embedding and enforcement. Most likely there would be no problem, since the egress rules in §4 suggest the OTC attribute is to be attached as part of route transmission; therefore, a router might be expected to attach whatever value is appropriate based on the ASN it’s currently representing itself as. It might still be worth adding a note cautioning implementors about this, though, since implementations tend to do things for the sake of efficiency that spec authors aren’t expecting. One optimization pattern is to pre-build updates and copy them to many different transport connections; in such cases the OTC value might be prepopulated and the implementor might need to give extra attention to the exception case where a particular transport connection is representing a different ASN from the router's "real" ASN.
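
The caution in point 7 can be illustrated with a small sketch. It is not taken from the draft or from any implementation; `PrebuiltUpdate`, `Session`, `attach_otc`, and their fields are hypothetical. It assumes, as the egress rules in §4 suggest, that the OTC Attribute is added at transmission time with the local AS number when a route without OTC goes to a Customer, Peer, or RS-Client. The point is only that the value is chosen per session, from the ASN that session presents, rather than baked into the shared pre-built update.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PrebuiltUpdate:
    prefix: str
    otc: Optional[int] = None  # left unset in the shared, pre-built copy


@dataclass(frozen=True)
class Session:
    neighbor_role: str  # role of the remote speaker, e.g. "Customer"
    presented_asn: int  # ASN this session represents itself as; under
                        # RFC 7705 it may differ from the router's "real" ASN


def attach_otc(update: PrebuiltUpdate, session: Session) -> PrebuiltUpdate:
    """Return the per-session copy of a shared update.

    Whether the route may be sent on this session at all is assumed to be
    decided elsewhere; this function only fills in a missing OTC value.
    """
    if update.otc is None and session.neighbor_role in ("Customer", "Peer", "RS-Client"):
        # Use the ASN presented on this session, not a global constant.
        return replace(update, otc=session.presented_asn)
    return update
```

Because the shared update is immutable and `attach_otc` returns a copy, two sessions presenting different ASNs cannot overwrite each other's value.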

Comment (2022-01-19 for -20) Sent

Nice work on this.

The shepherd writeup says this document creates a new Expert Review registry and suggests some experts, but the document doesn't actually create any registry as far as I can tell.  Is there something missing in the document, or is the shepherd writeup in error?

Comment (2022-01-19 for -20) Not sent

Thank you to Alexey Melnikov for the SECDIR review.

I agree with Warren that the following text is too strong:

* Section 4 

   The purpose of this attribute is to guarantee
   that once a route is sent to a Customer, Peer, or RS-Client, it will
   subsequently go only to Customers.

Maybe "The purpose of this attribute is to convey ..."

Comment (2022-01-18 for -19) Sent

Thank you for the work put into this document. The technique could indeed be useful, let's simply hope that not only the well-configured BGP speakers will use it ;-)

Please find below some non-blocking minor COMMENT points (no need to reply).

Special thanks to Susan Hares for the shepherd's write-up including the section about the WG consensus. 

I hope that this helps to improve the document,

Regards,

-éric

Is RFC 7908 really a normative reference?

-- Section 3.1 --
"An eBGP speaker MUST NOT advertise multiple versions of the BGP Role Capability" Should there be a description of what to do when an eBGP speaker receives multiple conflicting roles on a session? Or a pointer to Section 3.2 for more details?

Yes (for -19) Unknown

Yes (for -20) Not sent

No Objection (2022-01-17 for -19) Sent

Thanks to Alexey Melnikov for the secdir review and re-review; I echo
his comments about the usefulness of the mechanism and the quality of
the security considerations section.  Thank you for this document!

I'm rather sympathetic to John's remarks about "policy" vs (e.g.)
"procedures".

This document establishes the generic-sounding "BGP Role" capability and
a registry for the corresponding values, but the focus of the technical
mechanisms presented here is to avoid route leakage.  Might there be
other future consumers of BGP Role (perhaps including use of Role for
iBGP to get in-band confirmation of the relationship between peers) that
are unrelated to avoiding route leakage?  Is there anything in this
document we might change to facilitate such future extensions?

Section 6

Subsections for the individual actions/registries being acted upon/in
might aid readability.

Section 8.1

RFC 7908 is referenced only once, in a location that doesn't obviously
induce a normative requirement.  It does seem like a good thing to read
before implementing this document, though, so maybe the takeaway is to
add more references to it rather than demote it to informative.

NITS

Section 1

   Existing approaches to leak prevention rely on marking routes by
   operator configuration, with no check that the configuration
   corresponds to that of the eBGP neighbor, or enforcement that the two
   eBGP speakers agree on the relationship.  This document enhances the

I suggest adding an adjective to "relationship" (maybe "topology"?).

   BGP OPEN message to establish an agreement of the relationship on
   each eBGP session between autonomous systems in order to enforce

Similarly here, though maybe "nature of the relationship" in this
instance.

Section 2

   A BGP speaker may apply policy to reduce what is announced, and a
   recipient may apply policy to reduce the set of routes they accept.
   Violation of the above rules may result in route leaks.  Automatic
   enforcement of these rules should significantly reduce route leaks
   that may otherwise occur due to manual configuration mistakes.

If I understand correctly, "violation of the above rules" refers to the
enumerated rules a few paragraphs prior, and specifically not to the two
applications of policy mentioned in the preceding sentence.  Assuming
that's correct, then it seems useful to clarify the transition between
sentences, perhaps "use of such policies that violate the rules listed
above may result in route leaks" or "when these policies violate the
rules listed above, route leaks may occur".

Section 5

   An operator may want to achieve an equivalent outcome by configuring
   policies on a per-prefix basis to follow the definitions of peering
   relations as described in Section 2.  However, in this case, there
   are no built-in measures to check the correctness of the per-prefix
   peering configuration.

"built-in" is perhaps needlessly vague; we might say "in-band" or "no
measures in the protocol" instead.

Section 7

   Removing the OTC Attribute or changing its value can limit the
   opportunity of route leak detection.  Such activity can be done on

I think s/of/for/

No Objection (2022-01-20 for -20) Sent

DOWNREF [RFC7908] from this Proposed Standard to Informational RFC7908.
It doesn't look like RFC7908 needs to be a normative reference, so
citing this informationally should address this.

Document still refers to the "Simplified BSD License", which was corrected in
the TLP on September 21, 2021. It should instead refer to the "Revised BSD
License".

Thanks to Gyan S. Mishra for their General Area Review Team (Gen-ART) review
(https://mailarchive.ietf.org/arch/msg/gen-art/bnQ6_d1J8TSN6zw-579JvK8bxkQ).

-------------------------------------------------------------------------------
All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

Section 1. , paragraph 4, nit:
-    This document specifies a means of replacing the operator driven
-                                                             ^
+    This document specifies a means of replacing the operator-driven
+                                                             ^

Section 4. , paragraph 11, nit:
-    the ingress of the local AS, i.e., if the remote AS is noncompliant
+    the ingress of the local AS, i.e., if the remote AS is non-compliant
+                                                              +

Section 3.1. , paragraph 7, nit:
> , if the BGP Role Capability is sent but one is not received, the BGP Speake
>                                     ^^^^
Use a comma before "but" if it connects two independent clauses (unless they
are closely connected and short).

Section 4. , paragraph 8, nit:
> fied in this document are NOT RECOMMENDED to be used between autonomous syste
>                               ^^^^^^^^^^^^^^^^^
The verb "RECOMMENDED" is used with the gerund form.

Section 4. , paragraph 13, nit:
> epresent itself as any one of several different ASes. This should not be a p
>                               ^^^^^^^^^^^^^^^^^
Consider using "several".

Section 7. , paragraph 2, nit:
> will limit route propagation in an unpredictable way. 8. References 8.1. Norm
>                              ^^^^^^^^^^^^^^^^^^^^^^^
Consider replacing this phrase with the adverb "unpredictably" to avoid
wordiness.

These URLs in the document can probably be converted to HTTPS:
 * http://www.iana.org/assignments/bgp-parameters/bgp-parameters.xhtml#bgp-parameters-2

No Objection (2022-01-18 for -20) Sent

Hi,

Thanks for this document.  I'm no expert on BGP, but I found this document pretty easy to read and understand.  Not surprisingly, I'm always supportive of drafts that aim to minimize configuration errors and operational problems.

I have a few comments though:

(1) Did you consider explicitly defining a code point for the "complex" role?  Today, because of the desire for backwards compatibility, it is not possible for a peer to distinguish between the case that (i) no role has been provided by the peer because it doesn't support roles, and (ii) no role has been provided by the peer because it thinks that it has a "complex" peering relationship.  Unless I'm missing something, adding an explicit role for "complex" would allow the peer to distinguish between these two cases and potentially capture more errors (e.g., complex role is only valid with peer's complex role).  In theory, enabling "strict" checking might also be able to cover this case, but it feels less intuitive (i.e., enable strict, but don't advertise a role to indicate and enforce a complex relationship).

(2) The obvious OPS/Mgmt AD question: Is the associated configuration for this feature covered in the IETF BGP YANG module, or extension/augmentation that is being worked on?  If so, then an informational reference to the configuration leaves associated with this feature might be helpful to readers, although one could perhaps argue that this would end up being a forward reference, and any associated YANG module configuration should have references back to this draft anyhow.

(3) As a nit, I notice that the sender can only include a single role but the receiver must tolerate multiple roles if they have the same value.  Perhaps everything else in BGP is tolerant of receiving malformed packets, and hence this is done for consistency, but I do wonder whether it isn't better to return an error to the peer instead.
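
The receive-side behaviour discussed in point (3) can be sketched as follows. This is an illustration under stated assumptions, not the draft's procedure verbatim: `resolve_received_role` and `RoleConflict` are hypothetical names, and roles are represented as plain integers without assuming any particular code-point assignment. How the session is rejected on a conflict is left to the caller.

```python
from typing import Optional, Sequence


class RoleConflict(Exception):
    """The peer sent several BGP Role Capabilities with different values."""


def resolve_received_role(role_values: Sequence[int]) -> Optional[int]:
    """Collapse the Role values received in one OPEN into a single role.

    Returns None when no Role Capability was received, the single value
    when all received copies agree, and raises RoleConflict otherwise.
    """
    distinct = set(role_values)
    if not distinct:
        return None  # peer did not send the capability
    if len(distinct) > 1:
        raise RoleConflict(f"conflicting role values: {sorted(distinct)}")
    return role_values[0]  # identical duplicates are treated as one
```

The alternative raised in the comment, returning an error for any duplicate, would amount to raising whenever `len(role_values) > 1` instead of only when the values differ.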

Regards,
Rob