Minutes IETF114: sidrops
minutes-114-sidrops-00

Meeting Minutes SIDR Operations (sidrops) WG
Date and time 2022-07-27 14:00
Title Minutes IETF114: sidrops
State Active
Last updated 2022-08-15

SIDROPS 114

Session: Wednesday July 27 2022, 14:00 - 16:00 (UTC)

Session recording: https://www.youtube.com/watch?v=FVYH7O0jN4Q

1) Agenda bashing and Chair's slides - [5 minutes]

Warren Kumari: I'm the OPS Area Director, and my term ends in March. I still
enjoy being the OPS AD but I would like for others to run. So, if you would
like to know more about this role please talk to me.

2) Igor Lubashev/Sriram Kotikalapudi - [20 minutes]
Source Address Validation Using BGP UPDATEs, ASPA, and ROA (BAR-SAV)

https://datatracker.ietf.org/doc/html/draft-sriram-sidrops-bar-sav-00

Slides:
https://datatracker.ietf.org/meeting/114/materials/slides-114-sidrops-source-address-validation-using-bgp-updates-aspa-and-roa-bar-sav-00

Keyur Patel (as a working group member): I think it's a great presentation.
Two points I will make: It would be great to see how this solution solves the
problem for iBGP cases while you have eBGP cases. Most of the problems will
probably also require you to have a control within eBGP; it's quite possible in
this scenario that across iBGP a solution may not be honoured, or may not be
enforced. But what is critical is if we can zoom into iBGP cases; that would be
phenomenal.

Igor: It's very valuable to also look at iBGP. One of the things that we want
to do on the internet is that if one network will not enforce it, maybe the
next network should have a chance to still do it. Obviously, all the SAV
filtering is best done as close to the source as possible.

Jeff Haas: I offer two observations for you. The core enhancement from RFC 8704
is, on one part, that you can add stuff to your source address validation from
additional BGP data that's not being used for forwarding, so being able to seed
it from other sources is great. The presentation is an excellent example of
that. We talked about ROAs as one example, and I'm glad to see this is going
forward. But the second thing ties into the slide you are displaying here
(slide: The problem: some stats), about the economics: why haven't we seen more
of this stuff? It's one part that the tooling for adding 8704 is not out there,
but the bigger one is that Source Address Validation in hardware is predicated
on burning FIB resources to do the extra lookup to see if you can actually do
Source Address Validation. It's cheaper than doing firewalling, so from that
perspective it's a wonderful thing. But it's still an additional cost in your
FIB. In cases where you have SAV covered by what you are using for forwarding
you basically get it for free, just at the cost of extra forwarding lookups.
Every other thing we are looking at here, and as you started expanding use
cases, you're looking at effectively doubling or tripling the size of your FIB
to be able to implement this functionality. So, part of what you're fighting
against is the economic cost of something that's there for security that isn't
actually selling moving bits around. It's actually helping stop you moving
bits.

Igor: Economics is definitely a driver, probably more for some networks than
others. Some networks would actually benefit. We see that 15% of the networks
chose to implement something. So there is some value there. But for anything we
do, it's important that it is as economical as possible, especially for the
small networks, because that is where the Source Address Validation is done
best, especially for the first movers.

Ben Maddison: There are three separate things. Firstly, to continue from the
point that Jeff made, certainly all of that is true, but one caveat to that
is, speaking from the networks that I operate, we typically run out of
interfaces long before we run out of packets per second. That interaction with
the hardware is not necessarily a dealbreaker, even for reasonably large
networks that have a similar sort of shape. The second point is to reiterate
what I mentioned in SAVNET (a new WG): using RPKI objects in this way
breaks the fail-open semantics that they have in the current use case, and I
think we need to think quite hard about that. I think that there are other use
cases where we want something that's like a sticky RPKI object. I think that is
going to be required when we replace the IRR, so I think there is other useful
work that would need the same thing. The third point is that your example of
pruning the customer cone using the ASPA is problematic semantically. I
can imagine a scenario where a customer wants to use a transit link purely in
the outbound direction and never intends to advertise any inbound reachability
over it, and therefore doesn't include that adjacency in their ASPA, but expects
the return traffic to continue working. Using the ASPA to prune the Source
Address Validation filter in that way breaks that assumption. If we want to
do that, we have to be very careful how we document it, so it doesn't end up
being a nasty surprise at four in the morning. It's a corner case, but a valid
corner case. All of those points, accumulated, leave me wondering: if we want
to use RPKI objects for Source Address Validation, I think I would prefer
looking at defining new objects with those precise semantics, rather than
trying to shoehorn the stuff we already have into this hole. I think it's
potentially worthwhile to do this, and I'll be happy to spend cycles on trying
to get it done, but I don't love the idea of reusing the existing objects.

Igor: With regards to your comment about the implementation, yes, you are
right. The devil is in the implementation. If you have a temporary loss of
cache, your implementation should expect this to happen and not start to deny
a lot more. To your last comment: Absolutely, having a purpose-built signal is
much less of a hack than using another signal that was not built for the
purpose. I see it as a trade-off between doing one more thing that is new,
versus a signal that already exists. Well, ASPA doesn't really exist, so it's
an opportunity.

Koen van Hove: I will also continue on the point that Ben made, regarding what
you expect the failure condition to be, because we have seen that RPKI
publication points don't meet this 100% uptime availability. At any time there
is some publication point that isn't quite working as it should. So you're not
retrieving the ROAs or, in the future, the ASPA objects. From what I understood,
that would mean that a lot of traffic is being blocked, which indeed goes
against the fail-open nature of the RPKI. What happens if part of your cache
goes?

Igor: You cache your information, and you assume it's still valid for 24
or 48 hours. And only after 48 hours you remove it. Other people might come up
with better heuristics.

Geoff Huston: In looking through this, what I understand you're trying to do
is to take the simple case of Source Address Validation, which is a stub
network where you have enforced symmetric traffic, and you're trying to
synthesise that form of stub by taking an arbitrary subset of networks and
saying what is the total amount they can advertise in that arbitrary subset,
which is a closed connectivity domain, and you're looking at the union of all
possible source addresses. But you're using a tool -the ASPA- which is a
policy-based tool that has directionality involved. You're not using a
connectivity tool that doesn't care. In my head there is a lingering doubt.
When you try and assemble that closed, connected customer cone - which is a
cone of connectivity, where all the possible source addresses are now known
because of ROV etc., and therefore I can now apply this ACL - your implicit
assumption is that the policy directionality is isomorphic to straight
connectivity. I don't think that's the case. If that is the case, there is a
problem there. I'm not reassured that it's not true.

Igor: Everything that has been done since 2000 is looking at the signals, and
BGP is the same thing. It has directionality. ASPA is just another signal like
that, and that is information we do have and we don't have others right now.
What can we do, given what we have, because that's all we can do immediately.
When we build something new, it has to have the properties that it's cheap for
early adopters and they get immediate value. We think, by getting ASPA and ROA
information, we can enhance the state of the art we have.
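
Igor's answer to the fail-open question above (keep cached data for 24 or 48
hours, then drop it) can be sketched as follows. This is only an illustration
of that heuristic: the names CachedObject, GRACE_SECONDS and usable_objects
are hypothetical, the 48-hour figure is the upper bound mentioned in the
discussion, and the objects' own validity periods are deliberately ignored.

```python
from dataclasses import dataclass

# Assumption: 48 hours, the upper bound mentioned in the discussion.
GRACE_SECONDS = 48 * 3600


@dataclass
class CachedObject:
    payload: str          # e.g. a ROA or ASPA, opaque for this sketch
    last_fetched: float   # time of last successful fetch, seconds since epoch


def usable_objects(cache, now, grace=GRACE_SECONDS):
    """Return payloads still trusted for SAV filter construction.

    An object stays usable while its last successful fetch is no older
    than the grace period, so a short publication point outage does not
    immediately shrink the permit list.
    """
    return [obj.payload for obj in cache if now - obj.last_fetched <= grace]
```

With this approach an outage shorter than the grace period leaves the filter
unchanged; what should happen after the grace period expires is exactly the
open question raised in the discussion.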

Geoff Huston: An overly restrictive view of the prefixes coming from this
bounded set of networks will create filters that are too enthusiastic. The
problem with going down this path is the operator push-back, where an otherwise
perfectly valid packet gets discarded because of some automated tool, which
then becomes an operational cost. That's the underlying concern when I review
this work. You're starting from a small set that's constrained instead of from
a large set that is maybe overly liberal, but at least it encompasses all that
connectivity. There is more work to do here, obviously, but policy and
connectivity don't always align.

Igor: What SAV is doing is that it's trying to expand the number of prefixes
that it will find to put in your permissive list. It's trying to make the list
more permissive. Maybe this goes against the original purpose of ASPA. I think
it should be explored, another ASPA record type.

Geoff Huston: I suppose we agree that this is not ready yet, but it's an
interesting path to take.

Keyur Patel (speaking as a working group member): You talk about customer cone
and relationships. Isn't this problem wider than that? The attack can also
happen from a peering AS.

Igor: Yes, the algorithm works on peering interfaces and customer interfaces.
When you look at a peering interface you're trying to discover their customer
cone.
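
The customer cone idea discussed above can be illustrated with a small
sketch. It is a simplification and not the procedure from the draft (which
also uses BGP UPDATE data): it assumes ASPA data is available as a mapping
from customer AS to its declared provider ASes and ROA data as a mapping from
origin AS to prefixes. The function names, AS numbers and prefixes are
hypothetical documentation values.

```python
def customer_cone(root_as, aspa):
    """Return root_as plus every AS reachable via customer->provider claims.

    aspa maps a customer AS to the set of provider ASes it declares.
    """
    # Invert the mapping: provider AS -> ASes that declare it as provider.
    customers_of = {}
    for customer, providers in aspa.items():
        for provider in providers:
            customers_of.setdefault(provider, set()).add(customer)

    cone = {root_as}
    stack = [root_as]
    while stack:
        asn = stack.pop()
        for customer in customers_of.get(asn, ()):
            if customer not in cone:
                cone.add(customer)
                stack.append(customer)
    return cone


def permitted_sources(root_as, aspa, roas):
    """Union of ROA prefixes for every AS in the cone of root_as.

    roas maps an origin AS to the set of prefixes it is authorised for.
    """
    prefixes = set()
    for asn in customer_cone(root_as, aspa):
        prefixes |= roas.get(asn, set())
    return prefixes


# Hypothetical data: 64502 is a customer of 64501, which is a customer
# of 64500. 64503 declares a different provider only.
aspa = {64501: {64500}, 64502: {64501}, 64503: {64999}}
roas = {
    64501: {"192.0.2.0/24"},
    64502: {"198.51.100.0/24"},
    64503: {"203.0.113.0/24"},
}
print(sorted(permitted_sources(64500, aspa, roas)))
```

In this data AS 64503 does not list 64500 as a provider, so its prefix is
left out of the permit list for the interface towards 64500. That is the
outbound-only transit corner case Ben Maddison raised: traffic sourced from
203.0.113.0/24 arriving on that interface would be filtered.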

Ben Maddison: I think the failure mode Keyur is hinting at is when you have a
combination of a regionalised peering and a partial transit service offered by
that peer, or even to one of your own customers in a different region. Then it
breaks this kind of closed traffic relation, and you end up not discovering
potentially valid sources, even under the expanded algorithm, because those
paths don't show up. But I don't think that's related to the RPKI stuff, but
to the algorithm assumptions.

Igor: There are two things there. One: If those networks actually bother to
create an ASPA record, they would list themselves as providers, so it would
discover it. In a particular location where the customer is, you would likely
receive BGP messages coming from that interface.

Ben Maddison: There is a gap there, because of the fact that you don't have to
talk about lateral peering in ASPA records. The paths don't show up in
BGP because of policy, the adjacencies don't show up in the ASPA because they
are peerings, and as a result that gets left out of the cone. I'm sure there is
a gap there that we would need to look into. It only happens in the partial
transit case.

Igor: We can look at it. It's an interesting case.

3) Tom Harrison - [10 minutes]
Signed TAL
https://datatracker.ietf.org/doc/html/draft-ietf-sidrops-signed-tal

Slides:
https://datatracker.ietf.org/meeting/114/materials/slides-114-sidrops-draft-ietf-sidrops-signed-tal-10-00

There were no questions

4) Tim Bruijnzeels - [20 minutes]

Delegated CAs and Repositories

Slides:
https://datatracker.ietf.org/meeting/114/materials/slides-114-sidrops-delegated-cas-and-repositories-00

Chris Morrow: Yes please, standardise.
Tim: Ok, in that case I'll send something to the list.

5) Alexander Azimov - [20 minutes]

ASPA - 08+

Slides:
https://datatracker.ietf.org/meeting/114/materials/slides-114-sidrops-aspa-01.pdf

Ben Maddison: I think I said everything on the mailing list. To reiterate: I
think it's less work for the Relying Party maintainers if there are fewer
objects. I think that the overwhelmingly more common case is that networks have
the same -or very close to the same- transits on both address families. One of
the things I tried to point out in the recent discussion is that, for an
operator that is used to a user interface that presents this model, where
both address families are the same topologically, the day that this
operator then needs to read these objects to see what is actually being
transmitted on the wire is much less surprising if it doesn't diverge too much
from that mental model. The other thing to point out is that there has been a
lot of progress in new implementations over the last few weeks, and all of
these are based on the version 8 profile. I think rolling that back is quite a
lot of work for a fair number of people. I'm quite strongly in favour of the
version 8 change.

Alexander: What are you suggesting to do with the RTR specification?

Ben Maddison: Certainly the formats in the RTR protocol and the ASN.1 diverge,
but I don't feel that's a huge problem. I think that the consideration for the
RTR protocol is to make things as convenient as possible for routers to use it
in policy decisions. And I think the existing format is mostly fairly well
suited for that. Whether we do this translation when it arrives at the router
or whether it's being processed by the Relying Party, I think the profile
change can co-exist with the existing RTR specifications pretty comfortably.

Alexander: We can split policies based on the RPKI cache, but don't you think
that for debugging things can become really complicated?

Ben Maddison: I think you're right, but the structure to use in
provisioning tools and conflict management tools is inevitably different
from what is convenient for the router to store in its internal data
structures. So the translation has to happen somewhere. Probably the best place
for that translation to happen is the RP, because the compute is cheap there and
my feeling is that it's the least surprising place for it to occur, because
you're going from one protocol to another.

Alexander: In the meantime, I'm reading the chat and I see Randy Bush is still
opposing this change and he's authoring the RTR draft. I don't know how to make
everybody happy about the ASPA object style. Maybe we should ask the chairs to
be more involved in the process.

Ties de Kock: We have an internal implementation of the version 8 profile and
I'll explain why I prefer the version 7 profile in hindsight. In the end, the
user interface that users will be presented with may or may not align closely
with the objects that people create. It's all about making sure that the right
objects are created and that there is no confusion in these objects themselves.
What we realised after implementing is that we could create quite a lot of edge
cases in the content of version 8, where the content semantically overlaps and
you need to take a union there within the objects, and covering this with a
proper set of test objects was very hard. This is the main reason why I prefer
the version 7 object, even though I really like the idea of having a single
signed object per AS. I'm just afraid that covering all these cases where IPv4
and IPv6 overlap or not could lead to interesting edge cases.
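
The union Ties de Kock mentions can be sketched as below. This is not the
ASN.1 profile itself: it assumes, following the description later in this
discussion, that a version 8 style object carries a list of provider entries,
each with an optional address family limit, and that a relying party has to
flatten these into one provider set per address family. The function name and
the tuple representation are hypothetical.

```python
def flatten_providers(entries):
    """Flatten (provider_as, afi_limit) entries into per-AFI provider sets.

    afi_limit is "ipv4", "ipv6", or None, where None means the provider
    applies to both address families (an assumption of this sketch).
    """
    per_afi = {"ipv4": set(), "ipv6": set()}
    for provider_as, afi_limit in entries:
        for afi in per_afi:
            if afi_limit is None or afi_limit == afi:
                per_afi[afi].add(provider_as)
    return per_afi


# Overlapping content: 64500 appears once without a limit and once
# limited to IPv4. The union makes the second entry redundant.
entries = [(64500, None), (64500, "ipv4"), (64501, "ipv6")]
print(flatten_providers(entries))
```

The overlapping entries in the example are the kind of edge case a test suite
would need to cover: several different encodings flatten to the same result,
and a validator has to decide whether redundant entries are acceptable.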

Ruediger Volk: When I look at the proposed change of the profile, it looks like
version 7 is much less complex than version 8. I guess some of the implications
Ties mentioned are related to the more complex data structure. On the other
hand, my understanding is that the more complex data structure is not used in
any significant way to express more functionality. I'm not happy about moving to
version 8. I was expecting the work on 8210-bis would not have to be redone.
On the question of whether we are actually delaying the creation of the
operational system: if a re-run of 8210-bis is required, I would strictly oppose
the idea. Moving on with version 8, I'm very unhappy with the added complexity.
My main concern is that I don't want to see development of the operational
system being delayed. But complexity is going to have costs and should be
avoided.

Tim Bruijnzeels: This discussion started with a desire to use up less space, and
the AFI limit came to be as an additional thought in the process. So the first
proposal I made then was to have a single AS object with two distinct lists,
one for each address family. Then the address family limit was introduced as a
way to compress even further, and then the idea came to be that this might
actually reflect what people want to do. All in all, this can express the exact
kind of data that you can express now with version 7. In that sense, it really
is a matter of preference. I want to second what Ruediger said: I would hate
for this discussion to delay deployment and experience with ASPA. If it would
come to that, I'm willing to change my implementation to follow whichever
profile the WG finds acceptable. To comment on the data format versus 8210-bis:
if you look at ROAs, you can have multiple prefixes in a single ROA object. You
don't have this structure in your router. You would have to validate multiple
ROA objects and make a union of everything. That is what gets sent to the
router. Similarly, whatever the profile is, the translation can happen at
different levels. It can happen in the RP, as is currently done for ROAs
already. It can also happen in the UI, where I can present users with an
interface that allows them to provide a common list and my software can figure
it out. It's trivial to do that. My main message is that I want this to work.
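
The ROA translation described above, where multi-prefix ROA objects are
flattened into a union of per-prefix entries before being sent to the router,
can be sketched as follows. The input is assumed to be ROAs that have already
been validated; the function name and the tuple layout are hypothetical, and
the prefixes and AS numbers are documentation values.

```python
def roas_to_vrps(validated_roas):
    """Flatten validated ROAs into a set of (prefix, max_length, origin_as).

    validated_roas is an iterable of (origin_as, entries), where entries
    is a list of (prefix, max_length) pairs from a single ROA object.
    Using a set makes the result the union over all objects, so an entry
    repeated in several ROAs appears once.
    """
    vrps = set()
    for origin_as, entries in validated_roas:
        for prefix, max_length in entries:
            vrps.add((prefix, max_length, origin_as))
    return vrps


validated_roas = [
    (64500, [("192.0.2.0/24", 24), ("2001:db8::/32", 48)]),
    (64500, [("192.0.2.0/24", 24)]),          # duplicate of an entry above
    (64501, [("198.51.100.0/24", 24)]),
]
for vrp in sorted(roas_to_vrps(validated_roas)):
    print(vrp)
```

The router never sees the grouping into objects, only the flat set, which is
the point being made about where the translation can live.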

Randy Bush: What is presented to the user in the UI is arbitrary. In either
scheme you can present separate or joint in the UI; it makes no difference.
What is on the wire is almost never seen by the operator. A few X.509 keys
actually look to the garbage on the wire. When I want to see what's being
published, I don't look into the repository, I don't look on the wire, I look
at my router. What is on the router is separated, IPv4 and IPv6. The 8210-bis
change, while technically and procedurally possible, is not what we want to do,
because we want to keep the burden of any hacks north of the router. The whole
purpose of 8210-bis is to minimise the load on the router. And the router chews
up IPv4 and IPv6 separately. Like it or not, IPv4 and IPv6 topologies are not
congruent. This is especially seen in Asia. Is there any operator in this
meeting who actually uses multi-protocol BGP so they have IPv4 and IPv6 in a
single configuration with their peer, or is it all the rest of us, who have
separate sessions for IPv4 and IPv6? It's not pretty, but this is the reality.

Rob Austein (relayed by Warren Kumari): I am extremely uncomfortable with
requiring transit on different AFIs to be on the same path when we well know
that sometimes they are not. Maybe I misunderstood the question.

Chris Morrow: It sounds like a lot of this discussion should be on the mailing
list.

Ben Maddison: In response to Ruediger: I don't think we're arguing here about
more or less complexity in the system as a whole. We're discussing where the
complexity should be dealt with. But we need to fix this issue sooner rather
than later.

Alexander: I'm not the person to declare consensus. Version 7 is simple and can
fly faster. Nevertheless, let's continue the discussion on the list. I will try
to summarise this discussion on the list.

Sriram Kotikalapudi: Can you go to slide 9? When you have a provider that has
no tier 1, we said in the draft they can use an AS0 ASPA and that's fine. I
think we also said that an IXP route server will also register an AS0 ASPA;
that's also fine. A bit more tricky, and not in the draft yet, is the case of a
transit provider who happens to be present as a client on an IXP route server.
In that case, they should register an ASPA with the RS as the provider. Do you
agree we should include that in the draft?

Alexander: It doesn't matter if they are Tier 1. If you think something is
missing from the document, add it.

Sriram Kotikalapudi: I'll discuss some other things as well.

Randy Bush: If it's too complex, maybe you should take that as a warning. IXPs
that put their ASN in the path, are against specifications. It's not worth it,
please reduce complexity.

Alexander: I'm doing my best. Although, I need to stress that they are in the
wild.

Ben: It's true that a transit-free network at a non-transparent RS should
list that RS as one of its providers, but that is wildly uncommon. In the
interest of simplicity, the document should emphasise that a non-transparent
IXP RS is just a transit provider, although it forwards on MAC addresses and
not IP addresses.

Alexander: Thank you for the comments. I think we're on the same page here.

Sriram Kotikalapudi: It turns out that the non-transparent IXP is less complex
than the transparent IXP, but both of them can be taken care of in the draft
without too much complexity. It appears to me that if the RS is transparent,
it's maybe worthwhile to just add an ASPA by the client. We need to think about
that carefully.