Minutes IETF113: sidrops
minutes-113-sidrops-00

Meeting Minutes SIDR Operations (sidrops) WG
Date and time 2022-03-25 11:30
Title Minutes IETF113: sidrops
State Active
Last updated 2022-04-15

sidrops 113

Friday 25 March 2022
11:30 UTC
https://www.youtube.com/watch?v=5e211IkY_ic

Agenda
1. Chair slides - Chris Morrow
Chris handed the microphone to Allison Mankin of the IETF ombudsteam. Allison
explained that the ombudsteam has received quite a number of cases from people
who are concerned about, and feel uncomfortable with, the discussions in
sidrops. Allison reminded us of the Code of Conduct. There is a need for more
respect and care. Warren added that he has never had any code of conduct issues
in any of the working groups for which he is an AD. Lars-Johan Liman: I would
like to add one other recommendation that has saved me a few times: when you
feel upset about something that someone has written and you write an answer, do
that, but don't send it. Let it sit there for a couple of hours, go back, have
a cup of coffee, read it again, and you may find you want to change a few words
here and there, or change the entire tone of the message. Warren: Yes, and if
the main discussion is between two or three people, and there are more than
two or three e-mails a day, you might want to stop, take a step back and ask
yourself if you really should be having this discussion right now. Allison:
We, as the ombudsteam, will continue to monitor these cases, so do us a favour
and keep reading the Code of Conduct and think about how to be good to each
other. Warren: If someone is feeling stressed or attacked, feel free to mention
it to the chairs and feel free to mention it to me.

2. Job Snijders - Update on Resource Signed Checklist (RSC) and rpki-client
https://datatracker.ietf.org/meeting/113/materials/slides-113-sidrops-sidrops-rsc-00

Jeff Haas: A quick operational question; Is there any long term concern about
adding a large number of objects to the RPKI system and impacts on the various
applications that use it? Job Snijders: It is very important to know that RSC
files are distributed outside the global RPKI repository system. To illustrate
what that means exactly: ROAs or CRLs, or manifest files are distributed inside
the global repository system, so if you use rsync or RRDP, those are the files
you pull into the system, but RSC files are not distributed through that means.
They are distributed in a one-to-one fashion. So I could generate an RSC file,
e-mail it to you and the global participants in the ecosystem would never know
that I generated one and sent it to you. So there is no burden on the global
system.
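
A minimal sketch of the one-to-one model Job describes, assuming the checklist
is a list of file names with SHA-256 digests. The signer hashes the files and
hands the checklist plus the files directly to the recipient, who recomputes
the digests; nothing touches the global repositories. The signing and
signature validation step is deliberately omitted, and the function names
`build_checklist` and `matches_checklist` are hypothetical, not part of any
RPKI tool.

```python
import hashlib


def build_checklist(files):
    """Map each file name to the SHA-256 hex digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def matches_checklist(checklist, name, data):
    """Return True if `data` hashes to the digest listed for `name`."""
    expected = checklist.get(name)
    return expected is not None and expected == hashlib.sha256(data).hexdigest()


# The signer builds the checklist and sends it (e.g. by e-mail) with the file.
checklist = build_checklist({"loa.txt": b"example letter of authority"})
```

The recipient would call `matches_checklist` on each file received, after
validating the signature on the checklist itself (not shown).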

Warren Kumari: What is the relation between this draft and "RPKI has no
identity"? It feels like there is a close relation. Job Snijders: The
relationship has been noted in the RSC draft itself. The RSC draft references
the "no identity" draft and the RSC draft explains that RSC files cannot be
used to confirm identity. All it does is confirm that somebody has
possession of the private keys and that the resources with which it's signed
are subordinate to the certificate authority. So from my perspective there is
no conflict.

Ruediger Volk: Referring back to Jeff's question about what the load is on the
general distribution mechanism, I was surprised by your request for revocation
tooling, which quite obviously will have an impact on the distribution system.
Job Snijders: The only load on the global system is that, if you revoke an RSC,
the serial is appended to the CRL of that CA. So, per RSC that you're revoking
you're adding a few bytes to a CRL. But then again, the RSC files could be
short-lived. This is something we'll have to figure out in the wild.

3. Ben Maddison - Discard Origin Authorization (DOA)
https://datatracker.ietf.org/doc/slides-113-sidrops-sidrops-doa/

Jeff Haas: I have not read your draft. My question is: Is this intended to
address RTBH with an adjacent AS? Ben Maddison: That is the purpose of the
Peer-AS ID field. The default behaviour is that this will not allow transit for
RTBH routes, but if you add one of your transit providers to that list of
Peer-AS IDs, that's a signal to the receiver that you have authorised that
transit and it should be matched and accepted.

4. Ignas Bagdonas - BGPsec performance scalability
https://datatracker.ietf.org/doc/slides-113-sidrops-sidrops-bgpsec-scalability/

Sriram Kotikalapudi: We did some studies on caching signatures that have
already been verified. During signature verification you cache, per segment of
the AS path, the signatures that you have verified. The next time the same
update arrives, or another update that has a common AS path segment with a
previous one, you can make use of the cache. That is another way of improving
the performance. Perhaps you have thought of it? Ignas Bagdonas: Yes, I did.
This is actually contrary to the recommended practices for using elliptic curve
signatures. You can do this only if your random number is stable, and that
leaks your key. That's not the right thing to do. Caching is possible, and by
rearranging a few things here and there you can cache, and that's the point.
However, signature signing and verification for AS paths longer than, in this
particular instance, 4 or 5 hops becomes less computationally expensive than
calculating the hash. And that is the problem. So you're not limited by the
performance of elliptic curve as such, you are limited by the overall
performance of the memory system.
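
The verification-side caching idea from this exchange can be sketched as
follows. This is an illustration only, not BGPsec code: `fake_verify` is a
hypothetical placeholder for real signature verification, and the class name
and cache key layout are assumptions. Note that the cache key is itself a
digest computed on every lookup, so the hashing cost raised in the discussion
is still paid even on a cache hit.

```python
import hashlib


class VerifiedSegmentCache:
    """Remember (signed data, signature) pairs that already verified."""

    def __init__(self, verify_fn):
        self._verify_fn = verify_fn  # stand-in for real signature verification
        self._seen = set()
        self.verify_calls = 0

    def check(self, signed_data: bytes, signature: bytes) -> bool:
        # Length-prefix the data so the (data, signature) split is unambiguous.
        material = len(signed_data).to_bytes(4, "big") + signed_data + signature
        key = hashlib.sha256(material).digest()  # computed on every lookup
        if key in self._seen:
            return True
        self.verify_calls += 1
        ok = self._verify_fn(signed_data, signature)
        if ok:
            self._seen.add(key)  # only successful verifications are cached
        return ok


def fake_verify(data: bytes, sig: bytes) -> bool:
    # Hypothetical placeholder, NOT real cryptography.
    return sig == hashlib.sha256(data).digest()


cache = VerifiedSegmentCache(fake_verify)
segment = b"AS64500 AS64501"
sig = hashlib.sha256(segment).digest()
cache.check(segment, sig)  # first sight: verifier is called
cache.check(segment, sig)  # shared segment seen again: served from the cache
```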

Job Snijders: You asked: Do we care? I can indicate, just like in IEPG, I do
care. And I do think now is a good time to start work on this. I think version
zero will give us valuable operational feedback on how it works in the wild,
provided that BGPsec router key publication becomes easily accessible to
operators, and from there migrating to a performance enhanced version seems a
very logical and organic way to further the development of this protocol.

5. Sriram Kotikalapudi - ASPA Verification Algorithms: Enhancements and RS
Considerations
https://datatracker.ietf.org/meeting/113/materials/slides-113-sidrops-aspa-verification-procedures-01

Ben Maddison: I'm a little confused by the route server handling. What is
broken prior to this update? There is no difference from the perspective of
AS4: if the route server is transparent, it can be ignored altogether, and if
it's a non-transparent route server, then it's indistinguishable from a transit
provider. All that needs to happen in order for AS4 to correctly detect this as
a leak is for AS1 to create some ASPA with any content, as long as it
doesn't have AS3 in it. And that's what the previous version of the algorithm
said. I find the additional corner cases a complication rather than a
simplification. Sriram Kotikalapudi: If you focus on AS4, from AS4's point of
view what you said is correct. It doesn't need to know whether the route server
is transparent or not; it needs the ASPAs that the two RS clients should have,
with any ASN. But if you look at it from the point of view of AS3, when the RS
client AS3 is validating the AS path, it helps for it to see that AS1 has
registered an ASPA including the RS ASN in it. And that is one reason to... It
is not necessary to include the RS ASN in the ASPA, like you said, because they
are already assuming that the non-transparent route server is a rarity.
However, in cases where the RS is non-transparent, or even otherwise, it helps
the route server client. To AS4 it doesn't matter, but for the route server
client AS3 it matters to some extent to have the ASN of the route server in the
ASPA. Ben Maddison: I think it's important to realise that AS3 knows it's
speaking to a route server. I don't think that having corner cases in the
protocol helps anyone. I think it's more complication. And the validation that
AS3 applies can take into account its local knowledge. I think this makes the
validation procedure harder to understand. Alexander Azimov: The problem
appears in a slightly different topology. Imagine, on this drawing, that AS4 is
a customer. And it received a prefix from AS3, and in this case, if AS1 hasn't
signed AS2 as its provider, it will treat such a route as a route leak. So, one
hop away from the non-transparent IXP, we cannot distinguish whether it is a
route leak or a non-transparent IXP. Ben Maddison: Are you talking about the
case where the IXP route server is non-transparent? Alexander Azimov: Yes. And
a slightly different drawing where AS4 is a customer of AS1. Ben Maddison: In
that case, we embrace the fact that the route server is a transit provider.
Alexander Azimov: I agree with you. In this direction, I had a plan to change
the document. Sriram Kotikalapudi: To add to what you already said: the
non-transparent case is not so rare. Ben, can you send an e-mail to the list?
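
Ben's point that any ASPA from AS1 that omits AS3 is enough to flag the leak
can be illustrated with a heavily simplified hop check. This is a sketch under
assumptions, not the draft's algorithm: it only covers a route received from a
customer or peer, where every hop from the origin onward must be a
customer-to-provider hop, and it ignores route servers, downstream paths and
prepending. The ASPA table, the AS numbers and the function names are
hypothetical and do not reproduce the slide topology.

```python
def hop_check(aspas, customer, provider):
    """Classify one hop against the customer's ASPA, if it has one."""
    providers = aspas.get(customer)
    if providers is None:
        return "no-attestation"
    return "provider" if provider in providers else "not-provider"


def verify_upstream(aspas, as_path):
    """Simplified check; as_path is ordered origin first, sending neighbor last."""
    results = [hop_check(aspas, c, p) for c, p in zip(as_path, as_path[1:])]
    if "not-provider" in results:
        return "invalid"   # some AS attested its providers and this hop is not one
    if "no-attestation" in results:
        return "unknown"   # nothing contradicts the path, but it is not fully attested
    return "valid"


# Hypothetical: AS1 attests AS2 as its only provider; AS3 leaks AS1's route.
aspas = {1: {2}}
leak_result = verify_upstream(aspas, [1, 3])
```

With this table `leak_result` is `"invalid"`, whatever else AS1 put in its
ASPA, as long as AS3 is not listed.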

6. Job Snijders - RPKI & Certificate Transparency
https://datatracker.ietf.org/meeting/113/materials/slides-113-sidrops-sidrops-ct-00

Ties de Kock: I like the idea of CT, and I see a lot of value in applying CT to
EE certificates, like the Resource Signed Checklist, because it's very hard to
observe objects and to know what was actually published. Unless you have CT
on EE certificates, you cannot show an important attack in RPKI, which is the
omission of objects from the view that you present to somebody. It is critical
that EE certificates are included. Job Snijders: It would be cool if EE
certificates could be included in the CT log infrastructure, and I'm not
excluding that path, but to reduce the scope and get somewhere, I think it's
great if we start with CAs, and maybe add more to it later. Ties de Kock: I
don't see much benefit in removing the code path, so that for a CA certificate
you submit it to the log and incorporate the SCT, and for EE certificates you
don't. We'd have to prototype this. If you want the Relying Parties to check
CT, they will need to check the attestations that are in the CA certificates.
That means that when you want to create a CA certificate, you need to get
enough responses from qualified logs (at least in the web context), which
implies that log availability puts an upper bound on CA availability. More
brittleness in the RPKI scares me a lot, being an actual CA operator. How do
you think about this risk? Job Snijders: I consider RP implementations, at this
point in time, out of scope. RPs are believers and they just absorb RRDP and
rsync. Separately, we would create verifiers, maybe based on existing RP code
bases, and they would absorb the logs and maybe use net monitoring alerts. An
RP in the RPKI context is a believer, not a verifier.
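
The availability concern raised by Ties can be made concrete with a small
calculation. This is a sketch under stated assumptions: each log is up
independently with the same probability, and a CA certificate can only be
issued when at least `required` of `n_logs` logs respond. The numbers and the
function name are hypothetical and do not come from the presentation.

```python
from math import comb


def issuance_availability(n_logs, required, p_up):
    """Probability that at least `required` of `n_logs` independent logs respond."""
    return sum(
        comb(n_logs, k) * p_up**k * (1 - p_up) ** (n_logs - k)
        for k in range(required, n_logs + 1)
    )


# Hypothetical example: three logs, each up 90% of the time.
two_of_three = issuance_availability(3, 2, 0.9)
three_of_three = issuance_availability(3, 3, 0.9)
```

Requiring responses from all three logs gives an issuance availability below
that of any single log, while accepting two of three raises it. The policy on
how many responses are required therefore directly shapes how brittle issuance
becomes.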

Koen van Hove: In WebPKI, the end game of CT is that if a CA really
misbehaves, we remove it from our trust, and we no longer trust this CA. What
do you see as the end game for RPKI? There are currently no alternatives for
the RIRs. Job Snijders: If an RIR misbehaves, I will remove them from my trust
store. Koen van Hove: So the goal is to see if an RIR misbehaves? Job Snijders:
The first call is to engage with the RIR, confirm the situation with them and
request an RFO. But if the same type of incident happens over and over again,
or if there are systematic issues, it could motivate some operators to
temporarily remove, or permanently cease using, that Trust Anchor. So the goal
of transparency is to be able to hold organisations accountable. Distrusting
the root is the end of the process.

Russ Housley: I have real problems with this work. My concern is that RPKI,
unlike WebPKI, was constructed so that the CA is authoritative for the
resources that it issues. For the WebPKI, all of the roots are able to do
anything with any aspect of the name space. When we started working on this,
the Internet Architecture Board suggested IANA ran 0/0 and the RIRs would be
subordinates. To accommodate easier transfers amongst the RIRs, each of the
RIRs became equal roots for 0/0. I argue that you don't need this (CT) if you
go back to the first model. Job Snijders: I should clarify: in the RPKI
ecosystem there are 22,000 CAs. The ones I think CT should apply to are the
RIRs that have the 0/0 certificates and their intermediate operational
certificates. The moment this bounces to an LIR, they can only harm themselves.

Ruediger Volk: Following Russ, some of the basics of WebPKI and RPKI are very
different. You should keep in mind that in RPKI, CA and identity are not really
the same thing, so looking at WebPKI is not the most valuable thing to do.
Establishing tracking mechanisms and monitoring for what is in the RPKI is
important: for resource holders, an independent signalling of what the global
view of their resources is. WebPKI is directing you onto bad tracks. I don't
want to dismiss this effort overall, but I don't think it's heading in the
right direction.

Ben Maddison: We need to distinguish better between signing events and
publication events. They happen close together in time, but they are not the
same. This is about signing events that are not visible through any theoretical
version of the publication system. CT is about the signing events. Also, I know
and trust that my RIR does things with the right intentions. But that is not
where the chain of trust needs to stop. I need to be able to demonstrate to a
third party when the RPKI causes some substantial outage for one of my
customers. Having some version of CT allows me to use this in a more robust
fashion.

7. Koen van Hove - RPKI off the beaten happy path
https://datatracker.ietf.org/meeting/113/materials/slides-113-sidrops-rpki-off-the-beaten-happy-path-00
(1.52.16)
Job Snijders: I think that Publish in Parent is a technique that helps the
entire ecosystem. One of the fears is that a sibling CA of yours can do
something that somehow knocks you out; for instance, in the partial RPKI data
example you listed, if CA 3 has an issue, CA 4 disappears... A lot of
scenarios are alleviated if Publish in Parent is used. For this reason and
other reasons, we as an ecosystem should strive to make "Publish in your
Parent" the default setting, because it makes life easier. If you publish in
the parent, the parent can, out of band, apply some restrictions. Like with
e-mail: in the SMTP protocol it's not encoded that I can only send you up to 10
MB, but if I try to send you a 10 MB e-mail, your mail server might say it's
too large. This is local policy; each parent repository can apply local policy
as it sees fit. Koen van Hove: You make a good point, but then you get a
first-class citizen (Publish in Parent) and a second-class citizen (Delegated).
I think that's a consequence of that solution that people need to be aware of.

Tim Bruijnzeels: I think this is an example of a number of issues that may
occur, and the question is how we should deal with them. Also, to be a bit more
specific, my feeling is that there are things to be discussed with regard to
these suggestions Job just made. I think there might be work there, but the
current reality is that parent CAs can only be reactive. I think we should look
at more pro-active measures. If we want to do things with repositories, that
implies that we need to look at the publication protocol. It also implies that
we may need to think about what trusted repositories are. The current reality
is that we can only be reactive.

Ruediger Volk: Thinking of bad characteristics of rsync, it has been identified
as a danger spot and is being replaced. For the volumetric attacks that you
described, I think they will blow up earlier than they hit the routers. What
you really should be checking is your first slide, where you only showed the
ROAs that certain CAs are supposed to publish, and you did not show what
resource sets the CAs were holding. The issue that you constructed depended on
the unusual idea that the delegation of resources was not hierarchical and was
overlapping between siblings. The monitoring and tracking system should show
that. And the policies that this should not be happening when running your
registry have not been formally raised, but are very well understood. Koen van
Hove: I want to point out that this was based on a real-life example which
currently exists in the APNIC and IDNIC relationship. It doesn't happen a lot,
but it happens. Ruediger Volk: That boils down to Russ's previous remark about
having single or multiple roots, and not having clear and formal policies about
how the resources are managed under the overlapping roots.

Ben Maddison: I think this is an important problem. We need to be clear on what
the action we take is. I'm not convinced that any of the actions we take
against the potential DoSes that exist should be changes to a protocol. I
think we need to be much clearer on how Relying Parties are dealing with
placing limits on their willingness to traverse trees and lots of directories
and objects. It's not necessarily the case that they need to implement the same
protection mechanisms. But it would be good if there was some collaboration
between Relying Party implementers to document what recognised attack vectors
there are and how they deal with them. This could be an informational document.
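
As one illustration of the kind of limit Ben mentions, a Relying Party could
bound both how deep it descends and how many objects it accepts in total. This
is a minimal sketch and does not describe any existing RP implementation: the
tree layout, the limit values and the name `walk_with_limits` are all
hypothetical.

```python
def walk_with_limits(tree, root, max_depth, max_objects):
    """Walk a CA tree, skipping CAs that exceed the depth or object budget.

    `tree` maps a CA name to (object_count, [child CA names]).
    A skipped CA's children are never visited.
    """
    accepted, skipped = [], []
    total = 0
    stack = [(root, 0)]
    while stack:
        ca, depth = stack.pop()
        count, children = tree[ca]
        if depth > max_depth or total + count > max_objects:
            skipped.append(ca)
            continue
        total += count
        accepted.append(ca)
        stack.extend((child, depth + 1) for child in children)
    return accepted, skipped


# Hypothetical tree: CA "B" publishes far more objects than the budget allows,
# and "A1" sits deeper than the configured depth limit.
tree = {
    "TA": (10, ["A", "B"]),
    "A": (5, ["A1"]),
    "A1": (5, []),
    "B": (1000, []),
}
accepted, skipped = walk_with_limits(tree, "TA", max_depth=1, max_objects=100)
```

Different implementations could pick different budgets, or apply them per
repository rather than globally, which is why documenting the recognised attack
vectors may be more useful than mandating one mechanism.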

Jared Mauch: This reminds me of the early days of Usenet news. You would have
these files and they would get transmitted over this protocol. One of the
companies that decided to build commercial software to run a Usenet news
server found out that leveraging the underlying operating system was actually
inefficient. The data is still just data. An implementation should look at: do
we abstract this out and store it in our own internal data store, optimised for
that use case? Maybe that historical context is of use in this. We should be
looking beyond the actual file systems. Koen van Hove: I agree, and for RRDP a
lot of implementations already do that. The rsync protocol makes it more
difficult to achieve that. But rsync is still a requirement.
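
Jared's suggestion of an internal data store instead of a directory tree could
look roughly like this. It is a sketch only, using an in-memory SQLite database
keyed by object URI; the schema, the function names and the example URI are
assumptions and are not taken from any Relying Party software.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory object store keyed by publication URI."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE objects ("
        "uri TEXT PRIMARY KEY, sha256 TEXT NOT NULL, body BLOB NOT NULL)"
    )
    return db


def put(db, uri, body):
    """Insert or replace an object, recording its SHA-256 digest."""
    digest = hashlib.sha256(body).hexdigest()
    db.execute("INSERT OR REPLACE INTO objects VALUES (?, ?, ?)", (uri, digest, body))


def get(db, uri):
    """Return the stored bytes for `uri`, or None if it is not present."""
    row = db.execute("SELECT body FROM objects WHERE uri = ?", (uri,)).fetchone()
    return row[0] if row else None


db = open_store()
put(db, "rsync://rpki.example.net/repo/a.roa", b"example object bytes")
```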

Ties de Kock: We had the initial reports about these issues and we resolved
parts of these attack vectors. It got a lot harder to do a DoS attack on all
Relying Parties worldwide. However, you also showed that if you want to attack
a specific instance, you can still do that, because it's really hard to set
these limits in such a way that they cannot be abused in the recursive case,
which still needs to be quite wide, because some CAs are quite wide. So, in my
opinion we need some work on that in sidrops, so that at least relying party
instances can detect when the administrative domain changes while traversing
the tree. For some entities it may be logical that they have an extremely large
repository, while a non-RIR has fewer objects in there. I think we should
continue investigating this issue.

Warren closes the meeting and thanks the speakers.