Minutes interim-2022-idr-02: Mon 10:00
minutes-interim-2022-idr-02-202201241000-01

Meeting Minutes Inter-Domain Routing (idr) WG
Date and time 2022-01-24 15:00
Title Minutes interim-2022-idr-02: Mon 10:00
State Active
Last updated 2022-02-15

IDR Interim Meeting on Jan. 24 2022

Meeting materials:

https://datatracker.ietf.org/meeting/interim-2022-idr-02/session/idr

Notes:

https://notes.ietf.org/notes-ietf-interim-2022-idr-02-idr

1. BGP routes with color (BGP-CT, BGP-CAR) - 1 hour

1) Update of Problem Statement in SPRING [Joel Halpern]

Two problem statement drafts to parallel the solution drafts. Had hoped
to have one. Asked authors to form a design team to try to come up with
one document. Joel is observing the calls, but is not a participant and
is trying to stay neutral. Meetings are regular - once or twice a week.
They have agreed on some items, but have not posted these in a draft.
There are a number of items they have not agreed upon. They hope to
have a draft they can circulate soon for a common problem statement.

They thus don't currently have a common problem statement to help IDR
understand, "What problem are they trying to solve?"

Jeff: Would you be willing to comment on issues that are not achieving
consensus?

Joel: I have not confirmed with the authors in the design team as to
what they agree upon. I don't think it would help this discussion,
so I'd rather not.

2) An update on BGP CT [Kaliraj Vairavakkalai]

[BGP CT presentation.]

From problem statement slide: "Resolve traffic over intra/inter-domain
tunnels of a certain TE characteristic, including best-effort tunnels".
"Intent driven service-mapping."

Solution constructs:

  • Re-use RFC 4364 and 8277 encodings. NLRI encodes RD:tunnel-endpoint.
    New AFI/SAFI.
  • Transport class route target identifies TC it belongs to. (Transport
    class is effectively the color.)
  • Transport RIB is where routes for a given TC are installed, and
    service routes resolve over these Transport RIBs.
  • Deals with problematic resolution schemes (e.g. BGP SR-TE-Policy)
    where a unified resolution scheme of composite keys color:ip or
    ip:color wasn't always acceptable.
  • Desired resolution scheme via a "Mapping Community"; e.g. the color
    Extended Community.
  • Covers existing RFC 4364 inter-as scenarios.

Transport classes can be used to implement network slicing.

Protocol observations:

  • Mapping community permits resolution over strict color tunnel types.
    Also, permits fallback to "best effort".
  • Rewrites doable between domains that don't agree on same semantic
    simply by replacing the route-target.
  • RD is used to distinguish between different transport-class routes
    for the same TunnelEndpoint when propagating across route-selection
    pinch points.
  • Can also use add-paths.
  • RD allows uniquely identifying originating PE across multiple
    domains for troubleshooting.

  • (Commentary) Uses existing encodings for how labels and nexthops are
    carried. In future, Kaliraj would like WG to consider having
    forwarding information scoped at the "nexthop level". It's a
    per-nexthop entity; label, other parameters. Other drafts that share
    that idea, including multiple nexthops in same route, labels in the
    nexthop for different applications. I think we remove this information
    from the NLRI.

  • Since route targets are used, RT-Constrain (RFC 4684) can be used.
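
The resolution behavior described in the lists above can be sketched as follows. This is a minimal illustration, not the draft's procedure: the `TransportRibs` class, its method names, the tunnel names, and the use of class 0 for best effort are all assumptions made for the example.

```python
from typing import Dict, Optional

BEST_EFFORT = 0  # assumed identifier for the best-effort class (illustrative only)


class TransportRibs:
    """One RIB per transport class, keyed by tunnel endpoint (hypothetical model)."""

    def __init__(self) -> None:
        self._ribs: Dict[int, Dict[str, str]] = {}

    def install(self, transport_class: int, endpoint: str, tunnel: str) -> None:
        # A transport route is installed into the RIB of the class it belongs to.
        self._ribs.setdefault(transport_class, {})[endpoint] = tunnel

    def resolve(self, endpoint: str, color: int,
                fallback: bool = True) -> Optional[str]:
        """Resolve a service route's next hop using its mapping community (color)."""
        tunnel = self._ribs.get(color, {}).get(endpoint)
        if tunnel is None and fallback:
            # Optional fallback to the best-effort RIB.
            tunnel = self._ribs.get(BEST_EFFORT, {}).get(endpoint)
        return tunnel


ribs = TransportRibs()
ribs.install(100, "1.1.1.1", "gold-tunnel")
ribs.install(BEST_EFFORT, "1.1.1.1", "ldp-tunnel")
ribs.install(BEST_EFFORT, "2.2.2.2", "ldp-tunnel-2")

print(ribs.resolve("1.1.1.1", 100))                   # gold-tunnel
print(ribs.resolve("2.2.2.2", 100))                   # ldp-tunnel-2 (fallback)
print(ribs.resolve("2.2.2.2", 100, fallback=False))   # None (strict)
```

Keeping the class in a per-RIB lookup, rather than in the route key, is what lets a domain boundary remap it by rewriting an attribute only.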

Re-use of RFC 4364 RT-Constrain machinery:

  • Existing machinery works "out of the box".
  • Avoids multiple loopbacks on Egress-PE along with problems of path
    hiding at RR/ASBR.
  • TunnelEndpoint can be an Anycast address in multiple domains.
  • "RD is an identifier of convenience."
  • Treating color as an attribute (adjective) rather than a noun
  • Helps where domains have different numbering of color values. You
    can rewrite attributes, rather than NLRI.
  • (Comment from Keyur in chat about RT-C needing upgrading. Jeff
    responds that as long as it's an extended community, we have
    proposals for doing that in RT-C. For example, ESI in EVPN. Also
    proposals for doing RT-C vs. non-extended community attributes.)

Implementations shipping since JUNOS 21.1. Uses IANA allocated code
points.

CT vs. CAR observations:

  • CAR "where is my color" - prefix or LCM.
  • Without changes to the RT-C protocol, RT-C can't be used for CAR
    unless LCM is always used.
  • CAR tries to leverage ORF-like mechanisms.
  • CAR NLRI combines key/non-key fields. Kaliraj's opinion is that this
    is a bad idea. Special care needed for withdraw cases along with
    increased implementation complexity.
  • Route reflector case, it'd be nice if it was address family
    agnostic. That's possible if NLRI doesn't have non-key fields.
  • (Notes from chat between Keyur, John Scudder, Jeff Haas - if we have
    clear key fields in NLRI, it's possible to have a generic route
    reflector that doesn't need per SAFI intelligence. John Scudder
    indicates that having a general purpose mechanism for this would be
    beneficial.)
  • (Note from chat from Swadesh, since CAR has clear key/non-key field,
    this is already doable.)

  • CAR SAFI used as transport family between PEs and service family
    between CE-PE in VPN-CAR. Cases is overloaded.

  • Like VPN-CAR, will there be one for each other application? E.g. VPLS,
    EVPN, etc. CT solves problem with new attributes that can work with
    all existing service families.

CT vs. CAR - route selection example:

  • "How to find the best path for color 100 for tunnel 1.1.1.1".
    "Implementation needs to walk the full table, all entries, to find
    appropriate entry (due to LCM overriding color)". Longest-prefix match
    not helpful.
  • How do you implement fallback?
  • "We had to deal with these issues in other proposals" (SR-TE-Policy)

More observations:

  • CAR requires addpath for ebgp-peering. This isn't a defined feature.
  • Scaling method proposed in CAR draft, hierarchical transport,
    increases the recursive resolution and ecmp nexthop load on ingress
    PEs. This may not be practical on lower end devices.
  • Scaling method in CT, MPLS namespaces, works with no changes on
    ingress PEs. Only BN/SH need to be upgraded.

**End of presentation.**

Questions:

Dhananjaya Rao (DJ): The observations presented reflect a lack of
understanding of the CAR solution. Many inaccurate statements. A number of
speculative assertions made from the position of the presenter. We did
respond to a number of comments on the list. Responded to more of them
yesterday.

Kaliraj: Could you be specific about the questions here? (CT vs. CAR
proposal observations slide, #10)

DJ: CT draft itself. Question is whether complexity of re-using IP-VPN
import/export model is actually needed at the underlay layer. LU has
been used for a couple of decades. Thinks CAR addresses it in a much more
consistent way, similar to LU. W.r.t. RD, consider it problematic. If
there are two ABRs originating a route for a local-PE, presence of RD
prevents the use of multipath within that local domain at the ingress
border node. This is because the ingress treats them as different
routes. Breaks local convergence when there is a failure at the egress
ABR. Slows convergence. When this was pointed out the response was "you
can ignore the RD". This proves the RD is not needed. You can have two
routes pointing to same dataplane LSP - which one do you propagate
upstream? One, other, both? Hacky workarounds to solve a very basic
problem. CAR doesn't have this issue, similar to BGP-LU.

Kaliraj: If you're familiar with L3VPN, L3VPN does multipath in the VRF
after stripping the RD.

DJ: VPN multipath happens where it's needed, the PE. Here we're
talking about hop by hop transit. You're trying to impose the same
design there. They're not equivalent.

Kaliraj: RD is designed to help you get past pinch points. VRFs strip
the RDs for the multipath calculation. You are taking the case where the
deployment is using ASBRs to distribute the BGP routes. There is a case,
just like LU, of egress PE originated routes. In that case you won't see
this problem.

(Related notes from chat: add-paths for ebgp, needed if you don't have
RD, is not a supported feature. See
draft-pmohopat-idr-fast-conn-restore)
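
The RD exchange above comes down to which fields form the route key. A minimal sketch, assuming a toy route model (the `Route` tuple, the RD strings, and the ABR names are hypothetical): keyed on (RD, endpoint), two ABR-originated routes are distinct routes, so no multipath set forms; keyed on endpoint alone, as a VRF does after stripping the RD, they can be combined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Route(NamedTuple):
    rd: str        # route distinguisher (hypothetical values below)
    endpoint: str  # tunnel endpoint, e.g. the egress PE loopback
    next_hop: str


def group(routes: Iterable[Route], strip_rd: bool) -> Dict[object, List[str]]:
    """Group next hops by route key; with strip_rd the key is the endpoint only."""
    table = defaultdict(list)
    for r in routes:
        key = r.endpoint if strip_rd else (r.rd, r.endpoint)
        table[key].append(r.next_hop)
    return dict(table)


routes = [
    Route("abr1:1", "10.0.0.3", "abr1"),
    Route("abr2:1", "10.0.0.3", "abr2"),
]

with_rd = group(routes, strip_rd=False)
without_rd = group(routes, strip_rd=True)

print(len(with_rd))            # 2 distinct routes, one next hop each
print(without_rd["10.0.0.3"])  # ['abr1', 'abr2'], a multipath candidate set
```

Where the RD is stripped (only at the PE, or also at transit border nodes) is the point on which the two speakers differ; the sketch does not take a side.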

Srihari: Some comments on the list still not answered. My email, along
with Kaliraj's presentation, has concerns that hopefully you'll address
in your presentation.

DJ: Will try to address some of the concerns in the presentation.

Jeff Haas: Let's have the presentation of BGP CAR before further
discussion.

3) An update on BGP CAR [Dhananjaya Rao]

(43 minutes in)

Presentation:

  • Color represents intent. See multiple existing drafts about color.
  • CAR route could be originated at egress PE, ASBR, etc.
  • A color (extended community) on a service route selects how this
    works; like SR-TE.
  • New SAFI
  • Multiple encapsulations
  • Efficient/extensible NLRI
  • Multiple color domain support.
  • Color provides more than one instance of a route. Directly comparable
    to BGP-LU.
  • "Don't need to use VPN constructs like import and export to achieve
    this."
  • NLRI key: Endpoint:Color
  • Color is consistent across devices within a "color domain"
  • NLRI optimized for the common single color domain case.
  • Identical routing semantics as bgp ipv4/v6/lu. Routes stored in color
    adj-rib, route selection.
  • No need for VPN import/export for each underlay hop.

  • ECMP aware paths at every hop.

  • CT procedures require "hacky workarounds" to bring diverse paths
    together.

  • Subscription model for E,C.

  • Consistent with SR Policy data model
  • Compatible with multiple transport models. Can use CAR and also SR
    Policy (slide "Seamless BGP CAR and SR Policy co-existence with E,C
    model")

Path availability and domain local convergence (this slide generated
quite a bit of jabber chat):

  • Example uses ASBR originated routes
  • RR uses add-paths.
  • ASBRs set next-hop-self, so path pruning for multiple paths from RR is
    expected. ECMP in Domain 2 is thus hidden behind 211/212 in topology
    diagram.
  • When a contributing E,C (231/232) fails, it's localized due to the NHS
    behavior at Domain 2 ASBRs.
  • Claim is BGP-CT wouldn't have these properties.

Extensible, future-proof NLRI Encoding:

  • SAFI carries key/non-key info.
  • Introduced a key length for transparent propagation through route
    reflectors.
  • TLVs for non-key fields. Not unusual; see BGP-LS.
  • Per route unique data in NLRI non-key TLVs, rest in attribute.
  • Provides packing efficiency in Updates.
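
The key-length idea above can be illustrated with a small sketch. The byte layout used here (one-byte total length, one-byte key length, opaque key, then type/length/value triples) and the `LABEL_TLV` type code are assumptions for illustration only and are not the CAR draft's wire format. The point shown is that a route reflector can index routes by the key bytes without parsing the non-key TLVs.

```python
import ipaddress
import struct
from typing import Dict, Tuple

LABEL_TLV = 1  # hypothetical non-key TLV type code


def encode_nlri(key: bytes, tlvs: Dict[int, bytes]) -> bytes:
    """Encode [total_len][key_len][key][type,len,value ...] (illustrative layout)."""
    body = bytes([len(key)]) + key
    for tlv_type, value in tlvs.items():
        body += struct.pack("!BB", tlv_type, len(value)) + value
    return bytes([len(body)]) + body


def split_key(nlri: bytes) -> Tuple[bytes, bytes]:
    """Return (opaque key, non-key bytes) without interpreting either part."""
    total_len, key_len = nlri[0], nlri[1]
    key = nlri[2:2 + key_len]
    non_key = nlri[2 + key_len:1 + total_len]
    return key, non_key


def parse_tlvs(data: bytes) -> Dict[int, bytes]:
    """Parse the non-key part; only nodes that need the contents do this."""
    out: Dict[int, bytes] = {}
    i = 0
    while i + 2 <= len(data):
        tlv_type, length = data[i], data[i + 1]
        out[tlv_type] = data[i + 2:i + 2 + length]
        i += 2 + length
    return out


# Key = endpoint 1.1.1.1 plus color 100; the label is non-key data.
key = ipaddress.ip_address("1.1.1.1").packed + struct.pack("!I", 100)
first = encode_nlri(key, {LABEL_TLV: struct.pack("!I", 3001)})
second = encode_nlri(key, {LABEL_TLV: struct.pack("!I", 3002)})

rib: Dict[bytes, bytes] = {}
for nlri in (first, second):
    k, non_key = split_key(nlri)
    rib[k] = non_key  # same key: the second update replaces the first

print(len(rib))  # 1
```

A withdraw would carry only the key in this model, which is where the "special care for withdraw cases" concern raised earlier applies.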

Encapsulations:

  • Multiple encapsulations supported
  • Signaled via non-key TLVs
  • MPLS labels, Label index, SRv6 SIDS, etc.

  • Separate values for different encapsulations

  • Beneficial for co-existence, migration, etc.
  • Avoids originating multiple routes for different encapsulations.

CAR Next-hop resolution:

  • Resolution is recursive and color aware; E,C via N,C.
  • (N,C) provided by other color-aware mechanism; SR Policy, IGP
    flex-algo, CAR itself.
  • Resolution supports fallback to alternative colors or best effort
  • Traverse domains with less diverse intent or over color-unaware
    islands.

  • Resolution also mapped via traditional mechanisms; RSVP-TE/IGP/LU.
    Supports brownfield.
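
A minimal sketch of the recursive, color-aware resolution described above, under stated assumptions: the three lookup tables, the node and tunnel names, and the depth limit are hypothetical, and the function is not taken from the CAR draft. It tries the requested color first, then any configured fallback colors, then best effort.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Key = Tuple[str, int]  # (node, color)


def resolve(node: str, color: int,
            car_routes: Dict[Key, str],   # (E, C) -> BGP next hop N
            tunnels: Dict[Key, str],      # (N, C) -> color-aware path
            best_effort: Dict[str, str],  # N -> color-unaware path
            fallback_colors: Sequence[int] = (),
            max_depth: int = 8) -> Optional[List[str]]:
    """Return the path segments used to reach (node, color), or None."""
    if max_depth == 0:
        return None
    for c in (color, *fallback_colors):
        if (node, c) in tunnels:      # e.g. SR Policy or IGP flex-algo
            return [tunnels[(node, c)]]
        if (node, c) in car_routes:   # CAR route: resolve its next hop
            via = resolve(car_routes[(node, c)], c, car_routes, tunnels,
                          best_effort, fallback_colors, max_depth - 1)
            if via is not None:
                return via + [f"car:{node}:{c}"]
    if node in best_effort:           # last resort: best effort
        return [best_effort[node]]
    return None


car_routes = {("E3", 100): "N2"}
tunnels = {("N2", 100): "flexalgo-to-N2", ("N5", 200): "srpolicy-to-N5"}
best_effort = {"N9": "igp-to-N9"}

print(resolve("E3", 100, car_routes, tunnels, best_effort))
# ['flexalgo-to-N2', 'car:E3:100']
print(resolve("N5", 100, car_routes, tunnels, best_effort, fallback_colors=(200,)))
# ['srpolicy-to-N5']
print(resolve("N9", 100, car_routes, tunnels, best_effort))
# ['igp-to-N9']
```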

Multiple color domains:

  • Network domains where color-intent mappings are different.
  • Uses local-color-mapping extended community. Optional, only used when
    going across a color domain.
  • Color ext-comm on service route also gets re-mapped in parallel.

  • CAR NLRI (E,C) is immutable, preserved end-to-end. Eases tracking of
    route.

  • E (Prefix) is unique in inter-domain transport network; e.g. PE. Makes
    (E,C) unique end-to-end even if color is local to a color domain.
  • DJ suggests single color domain is the most common use case (99%). In
    chat, Srihari disagrees.
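
The interaction between the immutable NLRI color and the optional local-color-mapping (LCM) community can be sketched as below. The `CarRoute` model and the linear scan are illustrative assumptions: the scan mirrors the "walk the full table" concern raised in the CT presentation, and an implementation could instead maintain a secondary index on the effective color.

```python
from typing import List, NamedTuple, Optional


class CarRoute(NamedTuple):
    endpoint: str
    color: int                 # color carried in the NLRI key, never rewritten
    lcm: Optional[int] = None  # local-color-mapping value, if attached


def effective_color(route: CarRoute) -> int:
    """Local color of a route: the LCM value when present, else the NLRI color."""
    return route.lcm if route.lcm is not None else route.color


def find(routes: List[CarRoute], endpoint: str, local_color: int) -> List[CarRoute]:
    """Naive lookup by (endpoint, local color); scans every route."""
    return [r for r in routes
            if r.endpoint == endpoint and effective_color(r) == local_color]


routes = [
    CarRoute("1.1.1.1", 100),           # single color domain, no LCM
    CarRoute("1.1.1.1", 200, lcm=100),  # crossed a color domain, remapped to 100
    CarRoute("1.1.1.1", 300),
]

matches = find(routes, "1.1.1.1", 100)
print([(r.color, r.lcm) for r in matches])  # [(100, None), (200, 100)]
```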

VPN CAR:

  • CE provides intent.
  • Different capabilities between PE and CE

There are two implementations of CAR.


Jeff Haas: There are many discussions in the chat.

Jeff: CAR is up to 12 authors. IESG RFC procedure is 5 authors; think about
how you want to reconcile that.

Srihari Sangli: Slide 9 (Path availability and domain local
convergence.) node 211 has two next hops for (E3, C1), will it make a
local decision on which one to choose?

DJ: Yes, 211 advertises one prefix using next-hop-self.

SS: How does 211 forward? Local decision based on label it gets packet
from. It can't choose different intent paths on that received label to
231,232. For example, different flex-algo paths. At ingress, E1, can't
choose which intent path to use.

DJ: The path computation is for the intent of one path. E3,C1. End to
end path from E1 to E3 is for one intent. For that intent, for MPLS
you'd have an end to end LSP. When there is a failure, we'd get local
convergence that suppresses that churn toward E1. Perhaps you're saying,
if we had a different intent, what would we do? We'd have a different
route; e.g. E,C2.

DJ: Any path needs availability and reliability. If there is a failure of
any single node, you need to have an alternative path. Path computation
is always local.

JH: Node 212 will have two paths via add-paths. 212 will next-hop-self,
but it provides the ECMP at 212. If you wanted to have the additional
paths propagated, that'd require add-paths at the node (e.g. 212)?

DJ: That's up to the operator. If you wanted different forwarding,
that'd be different intents and you'd have two different routes with
color.

Swadesh: It'd be different intent. E1 should be receiving two different
colors. We should not mix this by using an RD.

JH: Perception of what color means as it routes across the entire
domain, you've given a point that highlights the points of the
proposals. You're seeing diversity of forwarding being encoded in
different colors rather than diversity of same color.

Swadesh: You want to go via 231, that's an intent. If you want a
different intent to go via 232, that's a different color. RD would have
different local convergence problem.

SS: Assumption is that E1, provide will know all of the exit points and
therefore will choose the intent. This means it needs to map the topology
across the various domains.

Swadesh: That's your desire, to go via either 231 or 232. Why would a
core router choose that? That's the requirement you are bringing to
this. BGP-LU doesn't have any such thing. I'm not sure where you're getting
the use case (for this type of steering).

SS: If you're comparing to BGP-LU, it provides very straightforward
reachability. Now we're bringing intent into this, intent that can be
changed and mapped to different color values. I'd thus need visibility
for multiple domains.

Swadesh: You're saying the intent is being carried via 231 and 232 - you
have to choose anyway.

Kaliraj: In use case specified, 231 route and 232 route with different
colors. At E1, you'd have different transport ribs. In a given transport
rib, you'd have either the 231 route or the 232 route.

Swadesh: Intent is to go via 231? (Yes)

Kaliraj: There is no confusion.

Swadesh: You can't provide for local convergence. Our problem with the
RD model. It's opaque. In transport, you're trying to put E3,C as
something opaque. At 231, 232 you'd strip the RD. [indistinct] What's
the point of carrying two routes upstream which have the same LSP
downstream.

Kaliraj: per-transport class per-prefix allocation mode. That's how
local convergence is achieved when ASBRs are advertising the routes. In
the example we show only one mode. In the draft we discuss that route
can be originated either at the PE or at the ASBRs. Just like BGP-LU.

Swadesh: That means you'd be using same RD.

Kaliraj: RD has different role than color. Just for uniquely propagating
the routes. Color is separate, we don't want to mix them together.

Swadesh: RD is creating opaqueness on a global tunnel endpoint. More
memory usage.

Kaliraj: Provides better reaction to events due to better visibility.

Swadesh: Unique RDs at 231/232 pushed.

Jeff: Time check. Have to stop.

Jeff: Where the route origination happens, matters. We'll have to take
this to the mail list, ideally topic by topic. Getting clarification in
the absence of a unifying document is needed.

Keyur: For DJ and Swadesh, please make sure the concept of intent is put
into the draft.


Other notes of interest from chat:

  • The key/non-key mechanism, if used, probably should be generalized
    rather than having it as a one-off in the CAR draft. (JGS, JH).
    Kaliraj suggests perhaps different MP_REACH_NLRI.
  • BGP-LS and flowspec provide examples of how tricky TLVs in NLRI can
    be. (JH)
  • Operational "niceness" for tracking routes is different in CT vs. CAR.
    CT tracks to PE, CAR tracks to original intent. (JH)
  • In response to "CT carrying state in three places", endpoints, color
    are distinct things and may vary on a domain by domain basis. (JH)
  • RD used by CT has room for a color (JH)

  • Discussion about Kaliraj not liking the current way label stack is
    already in NLRI and we had to do RFC 8277 to address bugs in it. Would
    prefer to not compound the issues and would like to clean things up
    instead.

  • John Scudder notes slide decks have inappropriate "confidential"
    tagging and requests that they be resubmitted without it. Sue concurs.

2. BGP Autoconfiguration - 0.5 hour

Three proposals under consideration: LLDP, draft-minto, L3DL.

What scopes do we want to consider for adoption, L2/L3/both? Which draft
in those spaces.

Adoption discussion will happen in the list.

Brief review of problem space, noted the design team worked through
requirements. DT draft was refreshed a few days ago.

1) draft-minto-idr-bgp-autodiscovery [Jeyananth Jeganathan (Minto)]

Presentation from slides.

  • Link local multicast to avoid media dependency.
  • TLV based.
  • Periodically advertise transport information with life time.
  • Refreshes on interesting events
  • Loosely coupled with BGP.
  • Strictly for service discovery.

(PDU discussion)

  • Base TLV, specific TLV for BGP service discovery
  • message id for debugging
  • Changes: Refresh request to support faster discovery after restart
  • Sender driven lifetime. Config change can drive refresh.
  • Local address to support iBGP.
  • Single address changed to address list.
  • Support MAC for helping bootstrap reachability to BGP loopback
    addresses for iBGP.

2) BGP Autoconfiguration LLDP Discovery [Acee Lindem]

  • Proposal has been around for 3 years.
  • Targeted toward data center environments.
  • Supports peering on loopback addresses.
  • Supports explicit signaling of parameter changes.
  • Carried in LLDP using the IETF OUI that IANA has already assigned.
  • BGP Group IDs haven't been standardized, maybe standardize a few.
  • LLDP is periodic broadcast at Layer2. Each message replaces prior
    message's information.
  • No native authentication, relying on MACsec.
  • LLDPv2 isn't required, but would be a benefit to this proposal.
    (LLDPv2 is under development in IEEE). It would provide for multiple
    PDUs, and better incremental updates.
  • Reachability to loopback is beyond scope of discovery - it's expected
    to be bootstrapped by something else; e.g. OSPF or BGP.
  • Nokia has rudimentary implementation. Another switch vendor (unnamed)
    may have an implementation (not Cisco)

Jeff: Juniper had done a prototype implementation of LLDP as well, but
not shipping it.

Jeff: some comparison between draft-minto and draft-lldp-discovery.

  • Lifetime of LLDP messages is the lifetime of the discovery.
  • Timers, how to come up fast?
  • How to tear down state.

Acee: Since lldp replaces on update, then removing the state from the
LLDP packet is removing discovery. It should go into the draft. Or
config state with no values.
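
The replace-on-update behavior Acee describes can be sketched as follows. This is a minimal model under stated assumptions: the `NeighborTable` class and the TLV names (`chassis-id`, `bgp-peering-address`) are hypothetical and are not taken from the LLDP draft. Each received advertisement replaces everything previously learned on that port, so omitting the BGP information withdraws the discovery.

```python
from typing import Dict, Optional


class NeighborTable:
    """Per-port neighbor state where each advertisement replaces the last."""

    def __init__(self) -> None:
        self._state: Dict[str, Dict[str, str]] = {}

    def receive(self, port: str, tlvs: Dict[str, str]) -> None:
        # No merging: the new message fully replaces the prior one.
        self._state[port] = dict(tlvs)

    def bgp_peer(self, port: str) -> Optional[str]:
        # Hypothetical TLV name used for the discovered peering address.
        return self._state.get(port, {}).get("bgp-peering-address")


table = NeighborTable()
table.receive("eth0", {"chassis-id": "leaf1", "bgp-peering-address": "192.0.2.1"})
discovered = table.bgp_peer("eth0")
print(discovered)  # 192.0.2.1

# The sender drops the BGP information from its next advertisement.
table.receive("eth0", {"chassis-id": "leaf1"})
after_removal = table.bgp_peer("eth0")
print(after_removal)  # None
```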

Minto: What's the interval configuration? keychain name? loopback
address deployment?

Acee: LLDP has its own advertisement interval. Keychain name needs
common keychain configuration, but it's optional. Hadn't considered
other protocols used for discovery.

Minto: You mentioned ospf is required.

Acee: No, not required. Any way of doing this - perhaps even BGP itself.
We're simply not making this part of the discovery protocol.

Jeff: Discussion about what to adopt needs to happen. Exactly one
domain? Solution per domain? L2 discovery is problematic in some cases,
LLDP document discusses that. Switch in the middle. It can also tie us
to specific link types. Discussion?

Acee: I think there are other applications of LLDP that deal with
switch in the middle.

Acee: Can LLDP solve the switch in the middle problem?

Jeff: Different MAC.

Randy: L3DL can solve the switch in the middle problem. Uses standard
MAC solution.

Jeff: Will bring the adoption discussion to the mail list.

(from chat)

Donald Eastlake: That's what I was going to say. Switch in the middle is
no problem - just use a different MAC multicast address

Randy: @donald: it's specifically in the l3dl draft

3. BGP Flowspec v2 - 0.5 hour

Presentation:

  • Most of the considerations have to deal with partial deployments.
  • Key/non-key issues.
  • Capabilities for incremental deployment? (Jeff's document)
  • Examples with a deployment that may have FSv1, FSv2, partial feature
    coverage in each.
  • One option is if you don't understand it, perhaps you keep it in the
    flowspec[v2].
  • How tightly do we tie actions to individual match criteria? Do we make
    this decision of the working group adoption call?

Kaliraj: Not sure what actions you're talking about. Perhaps redirect to
ip? Think we should scope this based on the forwarding nexthop
behaviors. BGP multi-nexthop might permit us to scope this together.

Sue: We've gathered all of the match work from prior proposals together.
As for actions, if you can supply more feedback on that, it'd be
helpful.

Kaliraj: Actions not in the NLRI is good.

Jeff: the interface-set draft has some challenges. Need to incorporate
the learnings from that. Similar to CAR optional entries, non-key data.
If the behavior needs to change on a hop by hop basis, then it cannot be put
in an NLRI key field. Already have some canonicalization issues in flowspec
today like this. Where we go there as part of partial deployment...

Sue: there are two ways: extended community (FSv1), wide community. The
wide community has more space for extended actions. (Wide community is
heading toward WGLC) If you think it might not fit into a wide
community, please let her know.

Kaliraj: Community carrying the actions would be fine, but
(multi-)nexthop attribute perhaps better match.

Sue: If you have a proposal, send it and we can talk. Will likely rip
out NLRI based one.

Jeff: Requirement at functional level is match in the NLRI key fields,
actions need bundling. Extended communities are easy to manipulate, but
they don't have relationships to each other. As we've been adding
actions, how they interact in combinations is problematic. Ordering,
perhaps conditional logic. We have functional requirements, wide
communities may be just an option.