Ballot for draft-ietf-bess-evpn-inter-subnet-forwarding

Comment (2021-06-30 for -14) Sent

Thank you for the work on this document.

I have scanned the document for ART issues and found none. I only have two editorial comments.

Francesca


1. -----

FP: Please expand PE, DPI, FW (which maybe should be just replaced by forwarding), TTL, RD, on first use. PE also seems to belong in the terminology section, in my opinion.

2. -----

FP: nit - s/gratutious/gratuitous

Comment (2021-07-14 for -14) Sent

Thanks to the authors for their work in addressing my comments. Copying my (resolved) discuss points here for posterity.

----

I found this document difficult to review. Some of this might be due to the fact that I'm not an expert on EVPN, but I think some of the reason is that the document could be structured better and expressed more clearly. The only reason I'm not opposing progression of the document on the grounds that it's too unclear to implement is that I've been told, and accept on faith, that implementations *have* been successfully written starting from the spec, which implies it's implementable -- I guess by people who are already expert in EVPN; it wouldn't be implementable by me.

In any case, I do have some points I would like to discuss, that are more actionable.

1. I agree with Robert Wilton's comment on -09:

```
One question I have is whether it is possible to have a deployment where some devices support synchronous mode and others support asynchronous mode. Am I right in presuming that this is not supported and if so is this capability signaled in any way? Or is the expectation that this would be controlled via deployment choice of network device, or though configuration management?
```

This issue still exists in -14. I think it should be addressed in the document. Similarly, I agree with Warren Kumari's comment, also on -09:

```
I would strongly recommend that the authors read the OpsDir review at: https://datatracker.ietf.org/doc/review-ietf-bess-evpn-inter-subnet-forwarding-09-opsdir-lc-jaeggli-2020-07-06/ , especially the: "it would be helpful if section 4 would be more explicit for non-implementors on when symetric or asymetric modules would be chosen, as it stands the variation basically reads like the enumeration of the features of various implementations." comment (which I fully agree with).
```

It seems both of these comments could -- and should! -- be addressed by adding a few paragraphs talking about these topics. This could be done either in §4, as Warren suggests, or in some other section (e.g. you could add an "operational considerations" section).

2. Section 7.1

I’m guessing this question isn’t unique to this document, but since this is where I encountered it, I’ll ask: it seems as though the described mobility procedures are vulnerable to a condition where a particular (IP, MAC) appears at two different NVEs at the same time. If this condition exists (either innocently, or maliciously) what prevents the source and target NVEs from continually attempting to claim the (IP, MAC) from one another, flooding the network with updates all the while?

(This applies to 7.2 as well.)

Since this seems like a potential security issue, I'm including it in my DISCUSS.
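
To illustrate the kind of damping that would bound such a loop, here is a minimal sketch of a move counter over a sliding time window. The class name, method names, and data layout are hypothetical, and the thresholds (5 moves within 180 seconds) are chosen to mirror the duplicate-MAC detection defaults described in RFC 7432 Section 15.1. Nothing here is specified by the draft under review.

```python
from collections import defaultdict, deque


class MoveDamper:
    """Hypothetical helper: flags an (IP, MAC) binding as duplicate when it
    moves too often, so the NVE can stop re-advertising it."""

    def __init__(self, max_moves=5, window_seconds=180.0):
        self.max_moves = max_moves
        self.window_seconds = window_seconds
        self._moves = defaultdict(deque)  # (ip, mac) -> timestamps of moves
        self.frozen = set()               # bindings no longer re-advertised

    def record_move(self, ip, mac, now):
        """Record a move at time `now` (seconds). Return True if the NVE
        may still advertise the binding, False once it is frozen."""
        key = (ip, mac)
        if key in self.frozen:
            return False
        moves = self._moves[key]
        moves.append(now)
        # Drop moves that fell out of the sliding window.
        while moves and now - moves[0] > self.window_seconds:
            moves.popleft()
        if len(moves) >= self.max_moves:
            self.frozen.add(key)
            return False
        return True
```

With a mechanism along these lines, two NVEs claiming the same (IP, MAC) would exchange a bounded number of updates before the binding is frozen, rather than flooding indefinitely.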

----

Below are a number of questions and comments that I hope might help improve the document. I haven't chosen to make them blocking by including them in my DISCUSS; nonetheless I would appreciate replies to them.

1. I agree with the comments by several of the other reviewers, that there are just too many gratuitous acronyms in this document. They aren't the only thing that makes it hard to read, but they certainly contribute. I'm disappointed to see this hasn't been addressed between versions -09 and -14. It would have been a small matter of search-and-replace to go through and expand most of the acronyms.

2. Section 2

```
R1: The solution must allow for both inter-subnet and intra-subnet
traffic belonging to the same tenant to be locally routed and bridged
respectively. The solution must provide IP routing for inter-subnet
traffic and Ethernet Bridging for intra-subnet traffic. It should be
noted that if an IP-VRF in a NVE is configured for IPv6 and that NVE
receives IPv4 traffic on the corresponding VLAN, then the IPv4
traffic is treated as L2 traffic and it is bridged. Also vise versa,
if an IP-VRF in a NVE is configured for IPv4 and that NVE receives
IPv6 traffic on the corresponding VLAN, then the IPv6 traffic is
treated as L2 traffic and it is bridged.

R2: The solution must support bridging for non-IP traffic.
```

R1 is a little tortured, where you add all the caveats about “treated as L2 traffic”. Seems to me like it would fall out more naturally if you had simply introduced the concepts of routable and non-routable traffic, where routable traffic is that for which a suitable IP-VRF exists. That would also have the pleasant effect of making R2 say “… must support bridging for non-routable traffic” instead of “non-IP traffic”, which is technically incorrect (since per R1 you might have non-routable IP traffic).
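
As a minimal sketch of the routable versus non-routable distinction suggested above: the function names, the set-of-families representation of an IP-VRF, and the `inter_subnet` flag are all assumptions made for illustration. The EtherType values are the standard ones for IPv4 and IPv6.

```python
IPV4_ETHERTYPE = 0x0800
IPV6_ETHERTYPE = 0x86DD
ETHERTYPE_TO_FAMILY = {IPV4_ETHERTYPE: "ipv4", IPV6_ETHERTYPE: "ipv6"}


def is_routable(ethertype, ip_vrf_families):
    """Return True if a frame is routable: it carries an IP family for which
    the tenant's IP-VRF is configured. `ip_vrf_families` is a set such as
    {"ipv4"}; this representation is hypothetical."""
    family = ETHERTYPE_TO_FAMILY.get(ethertype)
    return family is not None and family in ip_vrf_families


def forwarding_action(ethertype, ip_vrf_families, inter_subnet):
    """Route only routable inter-subnet traffic; bridge everything else."""
    if inter_subnet and is_routable(ethertype, ip_vrf_families):
        return "route"
    return "bridge"
```

Under this framing, IPv4 traffic arriving at an IPv6-only IP-VRF is simply non-routable and falls through to bridging, with no special-case wording needed.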

```
R3: The solution must allow inter-subnet switching to be disabled on
a per VLAN basis on PEs where the traffic needs to be backhauled to
another node (i.e., for performing FW or DPI functionality).
```

What’s “switching”? The document is about routing vs. bridging, which do you mean? I think you mean “routing”. IMO you should get rid of the word “switching” and replace with something less ambiguous, e.g. “routing”. (Both here and the one other place in the doc where you use “switching”.)

Also, I think you don’t mean “i.e.”, I think you mean “e.g.”. The meaning of “i.e.” is “in other words”. The meaning of “e.g.” is “for example”. The best way to avoid these problems, IMO, is to simply write out what you mean, so in this case write “(for example, for performing FW or DPI functionality).” (And oh by the way, you haven’t defined or expanded FW or DPI, please do so.)

3. Section 4

```
o references to ARP table in the context of asymmetric IRB is a
logical view of a forwarding table that maintains an IP to MAC
binding entry on a layer 3 interface for both IPv4 and IPv6.
These entries are not subject to ARP or ND protocol.
```

This passage shines a spotlight on the fact that “ARP table” as it’s used in this document is a misnomer, since it’s a table that is not (necessarily) populated by ARP. I don’t propose that you change the nomenclature, since it’s firmly established even though wrong — but it might be worth adding the first sentence or one like it to your Terminology section.

4. Section 4

Figure 2 depicts BT2 being present on the ingress PE, but the text makes it clear that in the symmetric mode that this figure depicts, BT2 doesn’t actually need to be there. Wouldn’t it be clearer if you didn’t show it?

5. Section 4

I have a hard time parsing this text:

```
Each BT on a PE is
associated with a unique VLAN (e.g., with a BD)
```

So, 1 VLAN —> at least 1 BT (1:many)

```
where in turn it is
associated with a single MAC-VRF
```

So, 1 MAC-VRF —> at least 1 BT (1:many)

```
in the case of VLAN-Based mode or a
number of BTs can be associated with a single MAC-VRF in the case of
VLAN-Aware Bundle mode.
```

So, 1 MAC-VRF —> at least 1 BT (1:many)

Since this is stated as an exception I guess that means you meant the preceding two (that I parsed as 1:many) are actually supposed to be 1:1? If so I think this needs a rewrite (it probably does regardless, for clarity).

6. Section 4.1

When you write “Internet standard bit order“, do you mean “network byte order“? Although even network byte order appears to be non-applicable, since the values are shown with an explicit byte order.

I realize the definitions are merely pasted from RFC 5798 and that ship has sailed, but unless you can explain what “(in hex, in Internet standard bit-order)” is supposed to mean, I suggest removing it. (Alternately and less desirably, make it explicit that you’re providing a direct quotation of RFC 5798.)

7. Section 5.1

You say the Encapsulation Extended Community and Router’s MAC Extended Community have to be sent, but you say nothing about the required values. For Router's MAC, §8.1 specifies the required value, I suggest a forward reference to it. For Encapsulation, the closest I was able to find to a place where this is specified was section 9.1.1, but that's only an example. There really needs to be some place where it's spelled out. A bare minimum would be to cite RFC 9012 §4.1, but that just provides the syntax -- you really should say something more about how to decide what value to send. For that matter, it could be what valueS to send -- is it legal for a NVE to advertise multiple Encapsulation Extended Communities? You don't say it isn't, and there are potential reasons to do so.

8. Section 5.2

```
o Using MAC-VRF Route Target (and Ethernet Tag if different from
zero), it identifies the corresponding MAC-VRF (and BT). If the
MAC- VRF (and BT) exists (e.g., it is locally configured) then it
```

You use “e.g.” so I presume there might be other reasons the MAC-VRF and BT might exist even if not locally configured?

```
imports the MAC address into it. Otherwise, it does not import
the MAC address.

o Using IP-VRF route target, it identifies the corresponding IP-VRF
and imports the IP address into it.
```

You don’t provide any conditional language in this bullet about “if the IP-VRF exists”. Why is that caveat required for MAC-VRF but not for IP-VRF?

9. Section 5.2

```
The inclusion of MPLS label2 field in this route signals to the
receiving PE that this route is for symmetric IRB mode and MPLS
label2 needs to be installed in forwarding path to identify the
corresponding IP-VRF.
```

I was unable to make head nor tail of this paragraph. I suppose §5.4 is where the behavior is actually specified, so in a way it doesn’t matter (although maybe a forward reference would help).

10. Section 5.2

```
If the receiving PE receives this route with both the MAC-VRF and IP-
VRF route targets and if the receiving PE does not support either
asymmetric or symmetric IRB modes, then if it has the corresponding
MAC-VRF, it only imports the MAC address. Otherwise, if it doesn't
have the corresponding MAC-VRF, it must not import this route.
```

If it doesn’t support either asymmetric or symmetric IRB modes, then doesn’t that mean it doesn’t implement this specification at all? In that circumstance, how do you expect your “must not” to be respected?

11. Section 5.3

```
If host B's (MAC, IP) has not yet been
learnt either via a gratuitous ARP OR via a prior gleaning procedure,
a new gleaning procedure MUST be triggered
```

Since you’ve used MUST here, you MUST provide a reference to where the “new gleaning procedure” is specified.

Also, has not been learnt by whom? The procedure must be triggered where?

12. Section 5.3

The second paragraph, that begins "Consider a subnet A", is tremendously confusing to a first-time reader (or at least to this first-time reader). I realize you probably think you're being helpful by providing a worked example, but as I read through it, it was the opposite of helpful. This is especially true because §5 and its subsections are about "Symmetric IRB Procedures" -- and the paragraph in question provides no procedures.

Some options to improve the situation --

- Remove the paragraph entirely.
- Preface the paragraph with "as an example to show why advertisement as RT-5 is required,"

13. Section 5.4

```
o global mode: VNI is set to the received label2 in the route which
is domain-wide assigned. This VNI value from received label2 MUST
be the same as the locally configured VNI for the IP VRF as all
PEs in the NVO MUST be configured with the same IP VRF VNI for
this mode of operation.
```

What action is to be taken if this MUST is violated?

14. Section 6.1

```
For asymmetric IRB mode, Router's MAC EC is not needed because
```

Please either expand “EC” or add it to your definitions section. (Also applies to 5.1)

15. Section 6.2

```
o If only MAC-VRF route target is used, then the receiving PE uses
the MAC-VRF route target to identify the corresponding IP-VRF --
i.e., many MAC-VRF route targets map to the same IP-VRF for a
given tenant. In this case, MAC-VRF may be used by the receiving
PE to identify the corresponding IP VRF
```

Do you mean “in this case, the MAC-VRF *route target* may be used…”?

16. Section 6.2

```
If the receiving PE receives the MAC/IP Advertisement route with MPLS
label2 field and it uses symmetric IRB mode
```

This entire section is entitled “asymmetric IRB procedures“. Why is there specification language regarding symmetric procedures in it? (I’m pretty sure this is not the only place this kind of problem appears.)

17. Section 7.3

```
On the source NVE, an age-out timer (for the silent host that has
moved) is used to trigger an ARP probe. This age-out timer can be
either ARP timer or MAC age-out timer and this is an implementation
choice. The ARP request gets sent both locally to all the attached
TSes on that subnet as well as it gets sent to all the remote NVEs
(including the target NVE) participating in that subnet. The source
NVE also withdraw the EVPN MAC/IP Advertisement route with only the
MAC address (if it has previously advertised such a route).
```

Wouldn’t the source NVE only withdraw the route after a timeout had expired? As you have written this paragraph, in case the silent TS has not moved, the following would happen:

```
Time t: age-out timer fires, ARP probe is sent
Time t: NVE withdraws route advertisement
Time u > t: TS receives ARP probe, sends ARP reply
Time v > u: NVE receives ARP reply
Time v: NVE re-advertises route
```

Presumably this churn isn’t what you intended.
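
For contrast, a minimal sketch of one alternative ordering, in which the withdrawal is deferred until the probe goes unanswered. The function name, the event strings, and the `probe_timeout` parameter are hypothetical; the draft does not define such a timeout.

```python
def silent_host_events(reply_delay, probe_timeout):
    """Return the ordered (time, event) list for one age-out cycle, with the
    timer firing at time 0. `reply_delay` is None if the TS has moved and
    never answers locally. Withdrawal happens only if the probe times out."""
    events = [(0.0, "age-out timer fires, ARP probe sent")]
    if reply_delay is not None and reply_delay <= probe_timeout:
        events.append((reply_delay, "ARP reply received, route kept"))
    else:
        events.append((probe_timeout, "probe timed out, route withdrawn"))
    return events
```

In this ordering a silent TS that has not moved produces no withdraw and re-advertise pair at all.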

18. Section 9.2

How does the NVE learn what subnets are behind its attached TS?

19. Section 9.2

What about if TS4 wants to reach SN1? How does it know where to send the packet? (I suppose the answer may be the same as for #18.)

Comment (2020-07-13 for -09) Sent

Bigger stuff:

I'll see Benjamin's DISCUSS (meaning I support it) and raise him the two seemingly normative references to [TUNNEL-ENCAP], which is also undefined.

As someone not from the Routing Area, I found this a little hard to read because it becomes quite dense with acronyms in some places.

Is Section 13 intended to remain in the final published version?  Is it needed?

In the Abstract, please expand all acronyms on first use in the abstract, per the I-D Guidelines.

Lesser stuff:

Thanks for the glossary in Section 1.  I note that, although defined in the glossary here, the terms "DGW", "Ethernet A-D route", "GRE", "GW IP", "IPL", "ML", and "VXLAN" are not used anywhere else in this document.

There are lots of places where a single hyphen is used to break up a sentence, such as to introduce an example.  These should be em dashes, or commas, or maybe semicolons, but not simply hyphens.

A few nits:

Section 2:

* "... all the inter-subnet forwarding are performed ..." -- s/are/is/

* "... connected to the same PE, wanted to communicate ..." -- remove the comma

Section 3:

* "A MAC-VRF can consists of one ..." -- either drop "can", or use "consist"

Comment (2020-10-14 for -11) Sent for earlier

I support the DISCUSS ballot position of Erik Kline

I support the DISCUSS ballot position of Alvaro Retana

I support the DISCUSS ballot position of Ben Kaduk

Not much to add to the feedback of my peer ADs.

** Please respond to the SECDIR feedback (and thank you Chris Lonvick for doing it!)

====
Thanks for addressing my COMMENT.

Comment (2020-07-14 for -09) Not sent

I'm balloting NoObj - initially I had this as a DISCUSS, but others already have my issues in their positions, and so I will let them carry it :-)

Like others, I found the document a challenging read - it is very full of acronyms, run-on sentences and misplaced commas. While one can figure out the meaning, it's hard to keep the big picture in mind when having to re-read sections.

I would strongly recommend that the authors read the OpsDir review at: https://datatracker.ietf.org/doc/review-ietf-bess-evpn-inter-subnet-forwarding-09-opsdir-lc-jaeggli-2020-07-06/ , especially the: "it would be helpful if section 4 would be more explicit for non-implementors on when symetric or asymetric modules would be chosen, as it stands the variation basically reads like the enumeration of the features of various implementations." comment (which I fully agree with).

Comment (2021-06-30 for -14) Sent

No objection as I didn't notice any transport related issues.

However, the idnits check returned some outdated references; see below:

   == Outdated reference: draft-ietf-idr-tunnel-encaps has been published as
     RFC 9012

   == Outdated reference: A later version (-05) exists of
     draft-ietf-bess-evpn-irb-extended-mobility-03

   == Outdated reference: A later version (-11) exists of
     draft-ietf-nvo3-vxlan-gpe-10

and downref

   ** Downref: Normative reference to an Informational RFC: RFC 7348

   ** Downref: Normative reference to an Informational RFC: RFC 7637

Some more nits I found as I read through:

-- Section 7 : s/expexted/expected, s/moblity/mobility

Comment (2020-07-14 for -09) Sent

Thank you for the work put into this document.

Please find below a couple of non-blocking COMMENTs (and I would appreciate a reply to each of my COMMENTs).

I hope that this helps to improve the document,

Regards,

-éric

PS: as a side note, I found that this document uses too many acronyms even for short words (e.g., "SN" instead of "Subnet"). There are also very long sentences that, when combined with acronyms, make reading difficult.

== COMMENTS ==

-- Section 2 --
About "to bridge non-IP and intra-subnet traffic and to route inter-subnet IP traffic": suggest clarifying the text for the case where the IP-VRF is IPv6-only; I assume that IPv4 packets will then be bridged and not IP-forwarded (and vice versa).

-- Section 4.1 --
Suggest replacing "then the IRB interface MAC address MUST be the one used in the initial ARP reply or ND Neighbor Advertisement (NA) for that TS." by "then the IRB interface MAC address MUST be the one used in the initial ARP reply or ND Neighbor Advertisement (NA) or Router Advertisement (RA) for that TS" because routers' MAC addresses are also advertised by Router Advertisements.

-- Section 5.1 --
Should also mention NDP when writing "(via an ARP request)" in the first paragraph.

In the same vein, please add "NDP cache" to "Furthermore, it adds this TS's MAC and IP address association to its ARP table".

As I am not an expert in EVPN, I am puzzled by the math about the Length field "either 40 (if IPv4 address is carried) or 52 (if IPv6 address is carried)."
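
The two values appear to follow from the field sizes of the MAC/IP Advertisement route defined in RFC 7432 Section 7.2, assuming both MPLS label fields are present (as in symmetric IRB mode). A minimal worked sketch, with hypothetical dictionary keys and function name:

```python
# Fixed-size fields of the EVPN MAC/IP Advertisement route, in octets.
FIXED_FIELDS = {
    "route_distinguisher": 8,
    "ethernet_segment_identifier": 10,
    "ethernet_tag_id": 4,
    "mac_address_length": 1,
    "mac_address": 6,
    "ip_address_length": 1,
    "mpls_label1": 3,
    "mpls_label2": 3,  # assumed present, as in symmetric IRB mode
}


def route_length(ip_address_octets):
    """Total route length in octets for a given IP address size."""
    return sum(FIXED_FIELDS.values()) + ip_address_octets


print(route_length(4))   # IPv4 address: prints 40
print(route_length(16))  # IPv6 address: prints 52
```

The fixed fields sum to 36 octets, so a 4-octet IPv4 address gives 40 and a 16-octet IPv6 address gives 52.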

-- Section 5.2 --
This section also only mentions IPv4 ARP table, please add IPv6 NDP cache.

-- Section 6.1 --
Same comments as for section 5.1

-- Section 6.2 --
Same comments as for section 5.2

-- Section 7 --
Good to state "Although the language used in this section is for IPv4 ARP, it equally applies to IPv6 ND."; even if I would have preferred to use by default IPv6 ND ;-)

Please note that in IPv6 there are often at least TWO IPv6 addresses per MAC (one link-local fe80::... and one global); so, "In the following subsections, it is assumed that the MAC and IP addresses of a TS have one-to-one relationship (i.e., there is one IP address per MAC address and vice versa). " is obviously never the case for IPv6. I understand that the rest of the paragraph explains how to handle the case but it could be easier to treat IPv6 in a separate sentence.

-- Section 7.1 --
Although it is about mobility, this section also appears to apply to Duplicate Address Detection, but it is unclear what to do when the same IP appears with a different MAC (i.e., an actual IP address collision). Or is it covered in other documents?

== NITS ==

-- Section 1 --
"BD and subnet are equivalent terms", while in the rest of the document "IP subnet" is often used. If "IP subnet" and "subnet" are synonyms, then I suggest using only one for consistency, or at least mentioning that "IP subnet" and "subnet" are the same concept (or explaining the difference if they are not identical).

Comment (2021-02-23 for -13) Sent for earlier

I think there might still be complications for TSes with multiple IPv6
GUAs (which can be very normal and is in line with BCP 204), but I guess
time will tell and other docs / future work will have a chance to
resolve any issues encountered.

Yes (for -09) Unknown

No Objection (2020-11-09 for -11) Sent for earlier

[Thanks for addressing my DISCUSS.]

No Objection (for -09) Not sent

No Objection (2021-06-10 for -14) Sent for earlier

Thank you for addressing my discuss point!

No Objection (for -09) Not sent

No Objection (2021-07-13 for -14) Sent

I agree with John's point that the presentation of the material in this document
is very dense and likely doesn't lend itself to easily writing interoperable
implementations.

This document uses RFC2119 keywords, but does not contain the recommended
RFC8174 boilerplate. (It contains some text with a similar beginning.)

Found terminology that should be reviewed for inclusivity; see
https://www.rfc-editor.org/part2/#inclusive_language for background and more
guidance:

 * Term "traditionally"; alternatives might be "classic", "classical",
   "common", "conventional", "customary", "fixed", "habitual", "historic",
   "long-established", "popular", "prescribed", "regular", "rooted",
   "time-honored", "universal", "widely used", "widespread".

-------------------------------------------------------------------------------
All comments below are about very minor potential issues that you may choose to
address in some way - or ignore - as you see fit. Some were flagged by
automated tools (via https://github.com/larseggert/ietf-reviewtool), so there
will likely be some false positives. There is no need to let me know what you
did with these suggestions.

Section 2. , paragraph 6, nit:
-    traffic is treated as L2 traffic and it is bridged.  Also vise versa,
-                                                                ^
+    traffic is treated as L2 traffic and it is bridged.  Also vice versa,
+                                                                ^

Section 7. , paragraph 6, nit:
-    Depending on the expexted TS's behavior, an NVE needs to handle at
-                         ^
+    Depending on the expected TS's behavior, an NVE needs to handle at
+                         ^

Section 7. , paragraph 7, nit:
- 7.1.  Initiating a gratutious ARP upon a Move
-                          -
+ 7.1.  Initiating a gratuitous ARP upon a Move
+                         +

Section 5.5. , paragraph 2, nit:
> ese subnets. The reason for this is because ingress PE needs to do forwarding
>                                  ^^^^^^^^^^
The word "because" means "for the reason that" and thus introduces redundancy.

Section 9.1. , paragraph 5, nit:
> e actual implementation may differ. Lets consider data-plane operation when T
>                                     ^^^^
Did you mean "Let's" (let's = let us; lets = 3rd person singular of "let")?

Section 9.1. , paragraph 6, nit:
> n the egress NVE, if the packet arrives on Ethernet NVO tunnel (e.g., it is
>                                 ^^^^^^^^^^
The usual preposition after "arrives" is "at", not "on". Did you mean "arrives
at"?

Section 9.2. , paragraph 2, nit:
> e actual implementation may differ. Lets consider data-plane operation when a
>                                     ^^^^
Did you mean "Let's" (let's = let us; lets = 3rd person singular of "let")?

Document references draft-ietf-idr-tunnel-encaps, but that has been published
as RFC9012.

Document references draft-ietf-bess-evpn-irb-extended-mobility-03, but -05 is
the latest available revision.

Document references draft-ietf-nvo3-vxlan-gpe-10, but -11 is the latest
available revision.

No Objection (2020-07-16 for -09) Sent

I agree with the other ADs that this document seemed terse in places and won't repeat those same comments here.

One question I have is whether it is possible to have a deployment where some devices support synchronous mode and others support asynchronous mode. Am I right in presuming that this is not supported and if so is this capability signaled in any way? Or is the expectation that this would be controlled via deployment choice of network device, or through configuration management?

Regards,
Rob