Minutes for SIDR at IETF-95
minutes-95-sidr-1

Meeting Minutes Secure Inter-Domain Routing (sidr) WG
Date and time 2016-04-06 17:00
Title Minutes for SIDR at IETF-95
State Active
Last updated 2016-04-29

AGENDA:

TWO sessions:

Monday, April 4, 2016 (ART)
14:00-15:30        Monday Afternoon session I        Atlantico B
Wednesday, April 6, 2016 (ART)
14:00-16:00        Wednesday Afternoon session I        Buen Ayre A


Timings inserted as noted in Meetecho recording


Session 1:

Monday, April 4, 2016 (ART)
14:00-15:30        Monday Afternoon session I        Atlantico B

Timing: 00:01:35 into Meetecho recording

(Sue Hares taking minutes into etherpad)

1)  Administrivia & Draft status                                       1400-1410

    Presenter: Chairs                                                           

   - Mailing list: http://www.ietf.org/mail-archive/web/sidr/index.html
   - WG Resources: http://tools.ietf.org/wg/sidr/ 
   - Minute taker? - Sue Hares 
   - Jabber Scribe? Jared /sigh  
   - Blue Sheets- please sign 
   - Agenda Bashing
   
   Draft status: please see the chairs' slides.  
Discussion: None 

2)  Existing WG Drafts               

Timing: 00:09:40 into Meetecho recording                                  

a)  RPKI Repository Delta Protocol                                     1410-1430
    RPKI Repository Delta Protocol
    draft-ietf-sidr-delta-protocol-02
    https://tools.ietf.org/html/draft-ietf-sidr-delta-protocol-02 

    Presenter: Oleg Muravskiy, RIPE NCC

    Discussion: 
        Jeff Haas, Juniper: I am not a current operator, but I did work on IRR 
        infrastructure years ago. 
        When you have snapshots vs deltas, when kept long term (secondary server)
        It gives the ability to see history. 
        This is benefit to research. 
        Oleg: We will save it for 17 hours. 
        
        Jeff: I would suggest to provide daily snapshots and provide deltas per day.
        
        Chris Morrow, Google: What is the benefit of total size versus snapshot? 
        
        Oleg: The amount of data is the benefit of not keeping all the data. 
        
        Chris: It does not matter to me.  Hold deltas for 30-60 days. 
        
        Jeff: Suppose you have an operational oops on the existing file store.  If you 
        have an oops, and a monthly snapshot plus deltas, you can't reconstruct things if
        you are missing one day's worth of data.  If your backup is always one day
        old, the amount of data you could lose is very small.  With the lifetime of
        certificates, you want to keep the window small.
        
        Oleg: There is no reason to keep the past data for protocol or the 
        data and the information. Goal is provide current state as fast as possible.
        
        Tim Bruijnzeels, RIPE NCC: Could keep data, but not in this protocol.
        If you reboot your validation server, you 
        may have to wait while you upload the refresh.  What is the valuable 
        amount of data we save so that we do not have to download the data? 
        
        Ruediger Volk, DT: Are you concerned about the network transfer volume?  
        The difference I am seeing is a handful of packets needed to initiate the
        connection.  Is this in the noise area? 
        
Randy Bush (IIJ): We have two issues: 1) how long we will keep the data, 
and 2) when parties will not use old data. We found the second useful. 
rsync has a much narrower slide (1 hour) versus this 8 hours.  We don't want to use this
protocol for historical analysis.  Give me more motivation to mess with this data?  
Oleg: More options - we do not keep them for specific time but limit on size.
Randy Bush (IIJ): Not predictable and will decay over time. as RPKI grows, etc.
Oleg: how long to keep files for people to download, how large data footprint would be.
How much useful vs not so useful (certs vs manifest/CRL)
Tim B (RIPE): Manifests are there to detect replay.  If we can trust https:// and
identity of server instead, we can reduce the 48M from the RIPE server. 
Currently we use the default Java trust anchors, and additional manual trust anchors 
manually.
Sandy Murphy, Parsons: Some of your comparison was in terms of the relying party 
download.  The original concern was how much load on the server. Are you satisfied 
that RRDP is reducing that load.
Oleg: This is correct.  The RRDP reduced the data on the server.  Now, we are focused 
on the client. 
Sandy: You are making decision based on the current state - which is not the full 
deployment.  
Oleg: You are correct, we are in the beginning of deployment.  We will scale linearly. 
Sandy: You have implemented with RRDP and so has RPKI.net.  Anybody else?
Oleg: No.
Randy (IIJ): DRL is doing interoperability test.  I am running a child and running 
2 caches for testing.  We had interesting problems with TLS in our test.  You really
want a running server to be running TLS 1.2.   
Declan Ma (ZDNS): What about RFC6486, it does not mandate behavior for missing manifest. 
Oleg: This is the RRDP protocol that publishes the draft.  For our validation, we 
ignore anything that does not validate. 
Chris Morrow:  Your chief change would be to make the manifest/CRL be updated less
frequently.  The manifest should be what you expect other people to see and use.
Turning the crank on your machinery more often would tell you when something is broken. 
ROAs and certificates would be a small number (10%... for example out of 600K 10% is 6000)
(gain or loss of a customer) 
Turning the crank on the manifest 1/hour may be useful. 
DT Network might gain 10 people per hour.  - so turning it more often is good.
Oleg: Turning this machinery is not meant to check the machinery, but to handle changes. 
Oleg: These items I show in this chart (43 MB). 
Ruediger Volk: This data only needs to change when an object changes.  When you start 
optimizing this process, you should understand what you are optimizing.  Here, we are 
looking for inconsistencies that rsync causes in rsync data. 
Tim: We can discuss on the list.  We should have a short lived (medium) to protect 
against attacks.  
If we have https, and we can trust the data. 
Then, we can turn the crank 1-3 times per day. 
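
The retention question discussed above (how many deltas a server keeps before a
relying party has to fall back to the full snapshot) can be illustrated with a
minimal sketch.  It assumes that a relying party can catch up with deltas only
when an unbroken run of serials, from its own serial + 1 up to the server's
current serial, is still published.  The function and parameter names are
hypothetical and are not taken from the draft.

```python
def plan_sync(client_serial, server_serial, available_deltas):
    """Decide how a relying party could catch up with a repository.

    available_deltas is the set of delta serials the server still publishes
    (an assumption for this sketch; retention policy is the open question).
    Returns ("noop", []), ("deltas", [serials to apply]) or ("snapshot", []).
    """
    if client_serial == server_serial:
        return ("noop", [])
    if client_serial > server_serial:
        # The client is ahead of the server: its state cannot be trusted.
        return ("snapshot", [])
    needed = list(range(client_serial + 1, server_serial + 1))
    if all(serial in available_deltas for serial in needed):
        return ("deltas", needed)
    # At least one delta in the chain has been dropped by the server.
    return ("snapshot", [])
```

With deltas 11-13 retained, a client at serial 10 can catch up to serial 13 with
deltas alone; if delta 11 has already been dropped, the same client has to fetch
the full snapshot.  This is why the retention window (by time or by size)
determines how often relying parties pay the snapshot cost.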

Timing: 00:52:27 into Meetecho recording 

b)  Validating an RPKI Repository                                      1430-1450
    RPKI Certificate Tree Validation by a Relying Party Tool
    draft-ietf-sidr-rpki-tree-validation-00
    https://tools.ietf.org/html/draft-ietf-sidr-rpki-tree-validation-00

    Presenter: Oleg Muravskiy 
    Discussion: 
        Oleg: What scope should this document have? 
        Ruediger: From each implementation, I would like to have an explanation about 
        what the customer gets.  That needs a common effort to systematically map the
        problem space and the solution space.
        Sandy: Did you have a list of the decisions you made? if so, please give us 
        the list of categories. 
        Oleg: We had these decisions by category. 

Timing: 00:55:35 into Meetecho recording 

c)  Validation Reconsidered -03                                        1450-1510
    RPKI Validation Reconsidered
    draft-ietf-sidr-rpki-validation-reconsidered-03
    https://tools.ietf.org/html/draft-ietf-sidr-rpki-validation-reconsidered-03

    Presenter: Tim Bruijnzeels

    Discussion: 
Randy Bush (IIJ): I am not against this proposal. There are almost no grandchildren in
existence.  (I am about to have one in the RPKI.) 
My issue is that we have spent lots of time talking about  
My issue is that we have real issues with real failures with lifetimes. 
Russ Housley: Slide 6 was what we had been looking for.  
How does this impact the validation algorithm in RFC 3779? 
Tim: I cannot speak about RFC3779 in general.
I hope the document explains how this would be applied in the RPKI. 
I would welcome feedback. 
Randy (IIJ): Rob says the changes are minimal to RFC3779. 
Is OpenSSL happy to take changes deep in validation? Not asked.
Doug Montgomery: 
I did not read the latest draft.  
If 192.168 is part of the ROA - are you treating the ROA atomically?
Tim: I would advise against fate sharing. 
The ROA would be invalid as a whole, they would fate share. All resources have to
be in the EE certificate.  Change to say "on the *Valid* set of resources in the
EE certificate".
The ROA would be rejected in this draft. 
The way is to generate different ROAs for different prefixes. 
Jeff: I think it is to have a looser validation - you want people to keep their
house clear and all that.
In the IRR, you cannot see what part is valid.   But in the ROA chain, you can see
what part is valid.
The ROA does provide the ability to see the hierarchy.
One question, if the ROA is for a /24 and that shrinks to /25.  Is a portion of it 
still valid? 
Tim: We keep this as simple as possible. So no, that would not be possible. 
Sandy: We are going to have to review other published RFCs to compare it. 
Since we have seen people selling off address space, the transfer of a longer prefix 
(/25) as part of a /24.
You will need to consider this example. 
Tim: We have had people who together hold a larger prefix, and then the people split 
it into  smaller prefixes. 
We found people want to split the ROA, but they cannot because there is only one 
signature on the ROA. 
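
The fate sharing described by Tim above can be sketched as follows: a ROA is
accepted only if every one of its prefixes falls inside the valid set of
resources of its EE certificate.  This is a simplified illustration, not the
algorithm text from the draft; the function names are hypothetical, and a
prefix is only checked against single resource blocks (coverage by the union of
two adjacent blocks is not handled).

```python
import ipaddress


def _covered(prefix, resources):
    """True if prefix lies inside at least one resource block."""
    return any(
        prefix.version == block.version and prefix.subnet_of(block)
        for block in resources
    )


def roa_accepted(roa_prefixes, ee_valid_resources):
    """Accept the ROA only if ALL of its prefixes are covered (fate sharing)."""
    resources = [ipaddress.ip_network(r) for r in ee_valid_resources]
    prefixes = [ipaddress.ip_network(p) for p in roa_prefixes]
    return bool(prefixes) and all(_covered(p, resources) for p in prefixes)
```

For Jeff's example, a ROA for 192.0.2.0/24 whose EE certificate resources have
shrunk to 192.0.2.0/25 is rejected as a whole: roa_accepted(["192.0.2.0/24"],
["192.0.2.0/25"]) returns False.  Issuing one ROA per prefix, as Tim suggests,
limits the damage to the ROA whose prefix is no longer covered.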

3)  Other Work, Not WG Drafts

Timing: 1:13:45 into Meetecho recording 

a)  TA Applicability Statement                                               1510-1530
    RPKI Multiple "All Resources" Trust Anchors Applicability Statement
    draft-rir-rpki-allres-ta-app-statement-00
    https://tools.ietf.org/html/draft-rir-rpki-allres-ta-app-statement-00

    Presenter: Andy Newton
      Discussion: 
Andy: We desire to publish this as an RFC. 
Sandy:  What happens if the working group consensus is a change?
Andy: We would publish this somewhere else. 
Ruediger: Are you talking about the trust anchors and access to trust anchors? Or the
content of the root certificates?
If the trust anchors are pointing to root certificates that overclaim as you say, 
then there is a requirement that what the overclaim is and what they mean.
You might recall
I reported in Yokohama that the Root certificates do overlap, was 2000 particular 
values.  
For the relying parties, the implied inconsistencies and overlap - please explain 
what the content is and why you are doing it.   What is semantics of root certificates.
Tim: We think this is in the draft. 
We trust each other, and we have overlaps. 
There is a large overlap between root certificates. 
You can see this in the certificates issued below. 
Ruediger Volk: As a receiving party, I am careful to detect overlaps and inconsistencies. 
If there are overlaps, I demand that you explain these. 
I also request that you minimize. 
Randy (IIJ): A spin on what he is saying: 
If I am a rigorous relying party, I need an algorithm to programmatically determine what
you are doing.
Some RIR - were going to root at 0/0. 
Terry Manderson: When I think back to RPKI, we wanted unique areas. 
If we are changing this. That says PKI is the wrong idea.
Randy (IIJ): There is a problem of transfer and make before break. Ruediger wants a 
data structure (like stat files) that will tell if a duplication is correct.
Andy: We are providing this in stat file 
Chris: Even in the case of a single root.  We need to have a double accounting to 
have a make-before-break. 
So, we need to have a system for managing this point. 
The data structure is what is missing. 
Andy: Current RPKI will allow make-before break. 
Randy (IIJ): Ruediger wants to know when the duplicate is valid. 
Ruediger:  Customer will put in their contract that a competing certification will 
not be certified. 

[going away: 2:26pm BA time] 

Session 2:
    
    Wednesday, April 6, 2016 (ART)
14:00-16:00        Wednesday Afternoon session I        Buen Ayre A

Start:
Timing: 00:01:37 into Meetecho recording 

1)  Administrivia                                                      1400-1410

    Presenter: chairs

2)  Other Work, Not WG Drafts

a)  Route Server Origin Validation                                     1410-1430
    Signaling RPKI Validation Results from a Route-Server to Peers
    draft-kklf-sidr-route-server-rpki-light-00
    https://tools.ietf.org/html/draft-kklf-sidr-route-server-rpki-light-00

    Presenter: Thomas King

b)  Requirements for RPKI Relying Parties                              1430-1450

    Presenter: Declan Ma

c)  RPKI Deployment Considerations: Problem Analysis                   1450-1505
    and Alternative Solutions
    draft-lee-sidr-rpki-deployment-01
    https://tools.ietf.org/html/draft-lee-sidr-rpki-deployment-01
    Presenter: Yu Fu

d)  Wither Adverse Actions?                                            1505-1515
    

3)  Discussion                                                         1515-1600

    - Chartered items nearly done, what now for sidr?

    - Intended status for the BGPsec protocol?


1) Administrivia

   - Minute taker? - Sriram 
   - Jabber Scribe? John Scudder 

Timing: 00:05:25 into Meetecho recording 

a)  Route Server Origin Validation                                     1410-1430
    Signaling RPKI Validation Results from a Route-Server to Peers
    draft-kklf-sidr-route-server-rpki-light-00
    https://tools.ietf.org/html/draft-kklf-sidr-route-server-rpki-light-00

Thomas King: 
    
    How to signal prefix-origin validation results to peers?

    Mauricio Oviedo -- Costa Rica IXP
    Good to see large IXPs such as yours enabling RPKI validation.
    Enabled this in Costa Rica national IXP
    Customer experience -- enabling RPKI results in better routing operation in the
    country.
    Ecuadorian experience is documented in a draft that was very helpful.
    Robert: would like to get in touch to discuss the draft in detail
    Sriram: Good with not including the attribute for the case when no validation
    is done.  Not having the community conveys the message that it was not validated.
    As long as "not found" is not the default
    Robert: need to agree on text for how receiver interprets that
    Doug: Fourth value for "could not access RPKI" could be useful, more useful
    than removing the community.
    Robert: building on other draft that has only three values - but if enough
    demand, we could change that    
    
    Ruediger: Not judging whether you should extend or not.  "Not yet classified" 
    attribute will be better than saying nothing; Matching for a specific attribute
    is much easier to configure than "not any of the others"
    
    Oliver: new ROA changes validation; possibility of flapping
    
    Distinction between valid and "I have no idea" is useful
    In general, a 4th state is a good idea; worth taking another look
    
    TK: I will talk to authors of the other draft
    
    John Scudder: Makes me sad that these comments didn't come before; 
    But since it is coming up now, we'll take another look why we should need it
    Asked chairs what we should do
    
    Sandy: Puzzled what we would do differently with 4th state
    Oliver: 4th state is something the operator can do something with it
    4th state can have added value; holding state until you have real state;
    It's a little slippery slope why you need to have it!
    
    Ruediger: IETF should not say the additional info will not be made available;
    certified vs not certified is important
    some part of edge may be RPKI enabled, some may not.

    Doug: one part of net is using one cache, one part uses another, and
    one cache is down.
        
        
        Sandy: juniper code -- status of the support.  looking glass that is
        Juniper shows a value that says whether or not it has been verified
        Jeff: displayed by CLI if verified or not; by default it is unverified
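
The three validation states and the proposed 4th state from the discussion above
can be sketched as follows.  The three-state classification follows the usual
prefix-origin validation logic (covering entries, origin AS match, maximum
length); the UNVERIFIED state and the cache_reachable parameter are hypothetical
and only illustrate the "could not access RPKI" case raised by Doug.  AS 0
handling is omitted for brevity.

```python
import ipaddress
from enum import Enum


class State(Enum):
    VALID = "valid"
    NOT_FOUND = "not-found"
    INVALID = "invalid"
    UNVERIFIED = "unverified"  # hypothetical 4th state: no RPKI data available


def classify(prefix, origin_as, vrps, cache_reachable=True):
    """Classify a route against validated ROA payloads.

    vrps is a list of (prefix, max_length, asn) tuples.
    """
    if not cache_reachable:
        # Distinguishes "I have no idea" from "not found".
        return State.UNVERIFIED
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_length:
            return State.VALID
    return State.INVALID if covered else State.NOT_FOUND
```

A receiver can match on an explicit UNVERIFIED value directly, which is the
configuration argument Ruediger makes: matching one specific value is simpler
than matching "not any of the others".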
    

=========================================================

Timing: 00:27:40 in Meetecho recording
    
b)  Requirements for RPKI Relying Parties                              1430-1450

    Presenter: Declan Ma
    
Plans to submit a draft on it soon
    
Comments/Questions:
    
TimB (RIPE NCC): we recently started writing a doc how we do validation in our case; 
It came up it would be useful to have a common document written between implementors, 
try to find commonalities, write a standards document to describe how relying party's 
validation must be done.
How would that relate to this requirements document.

Ma: we are trying to provide a framework -- this doc is simple and abstract

Tim: This doc can reference another doc that describes validation and provides broad 
context.  This document takes a high level view.  There could be a common document 
between the implementors to describe validation.

Ruediger: I was thinking what is the relation between yours and Tim's draft he
presented yesterday.
I would be happy if each implementation comes with a clean doc about what that 
implementation is doing
So we can compare what actual different implementations are doing
No implementation will actually do everything
Yes, so there may be value in explaining the whole problem field

Declan Ma:  one is relationship of our document and Tim's document.  This document
discusses, for example, delivering cache to bgp speakers - that is not covered
in Tim's document.  Second, this is a summary and digest of other documents about
relying party so it will not go into details of how relying party reacts.

Randy: This is fine theory; users/implementors will benefit from Tim's doc
I think Tim's doc is excellent; users will benefit from knowing what relying parties 
(like RIPE, Dragon's) are doing.  Can't promise we can produce such a document.


Timing: 00:44:30  in Meetecho recording

------------

c)  RPKI Deployment Considerations: Problem Analysis                   1450-1505
    and Alternative Solutions
    draft-lee-sidr-rpki-deployment-01
    https://tools.ietf.org/html/draft-lee-sidr-rpki-deployment-01
    Presenter: Yu Fu

Yu: We have RPKI deployment in China; we want to share experience
We offer RPKI service to our subscribers
6.29% (39500) prefixes can be validated Internet wide

Operational errors by CAs are inevitable; cause significant impact on Internet routing
Unilateral resource revocation
CA may maliciously offer different views to different ISPs

Mentioned use of Suspenders and INRD (file external to RPKI repository)

Data sync -- uses rsync -- not well standardized, problems of security, stability, etc 
Solution is RRDP -- effectively eliminates a number of consistency related issues
CCNIC has procedure that improves on rsync (?)

Problems of stages and incomplete deployment -- can cause Invalid routes

Comments:
    
    Carlos: I have two different types of comments
    Your talk mixes a bit of what Adverse actions document states; 
    better to focus on your deployment experience
    You said 18% of members of LACNIC have RPKI -- that would mean ~2000 ; not right
    18% of prefixes or IPv4 space may be correct
    Try not to ascribe the Ecuador and Costa Rica efforts to the governments
    
Tim: I have a long list of comments; I would be interested in talking to you

Sandy: Does Mauricio (the Costa Rica person) want to add anything more;
Would also be interested to hear what the Japanese folk you mentioned have to say about 
their experience

Yu: We have communications with the Japanese. JPNIC has deployed RPKI in their country

Sandy: Some of the things you brought up are issues in the WG and have solutions in the WG
Not sure what you are proposing; Are you proposing alternate solutions to what is in 
the WG?

Yu: That is not the case

Mauricio (Costa Rica): Interesting to add more details about the deployments
that have been successful.   Also from ISP perspective, valuable to add that information

Sandy: You said "CA might revoke all resources associated with ISP"
Are you suggesting a mechanism that would prevent the revoking of resources (recovering
resources from a departed customer)?

Yu: CA must let the affected party know that they are being revoked

Sandy: Let us see what comments the WG adoption call brings 

Randy: I come from Japan; the draft will benefit if you have communication with the 
ISP/JPNIC etc. in Japan.
You said "rsync does not work", where are the numbers to support it?
People in the WG have done detailed measurements

Ruediger: I was not clear what deployment experience you gained; 
RPKI deployment has many independent components;
running CAs, various uses of the RPKI data by RPs, etc. are different things
You should get into those specifics
Otherwise, we only get fuzzy notions that something is happening

---------------

Timing: 01:11:55 in Meetecho recording

d)  Wither Adverse Actions?                                            1505-1515
    
Adverse actions:
    
    Sandy: Our AD Alvaro has said something; 
    RIR document published in GROW
    
    Alvaro: Some things can happen with RPKI; those can also happen with RIRs
    It is not clear that RIRs need to be talked about in SIDR (?)
    We would be overstepping.
    
    Ma: Making that comparison in the adverse action draft is useful
    Sandy: Be careful not to go against the advice of our AD
    


------------

Timing: 01:16:10 in Meetecho recording

Chris: SIDR Future

Alvaro asked us to discuss the state of SIDR today/tomorrow
What is next for SIDR
Create Ops area or SEC area group to manage ongoing work related to deployment etc.

Randy: We do have a bunch on the table; with current rate of pace the WG has to be 
around for a while
Chris: Alvaro is suggesting limit new work and plan for wrap up
Randy: Alvaro is optimistic

Jeff: Dormant is a valid state; do not shut down
For this work, OPS and SEC stuff is too expert level
Sandy: Do you see a difference between dormant and ... 
Jeff: You want some place where RPKI/BGPsec questions can be addressed 
Geoff: Wash your hands; declare success; 
Chris: I don't care either way
Geoff: I am not directing it at you; I am speaking to the room
What we are currently doing in the WG has nothing to do with routing

Tim: I have a feeling a lot of the work and a lot of the technology is in place; 
other groups can pick up

Dan York: I have spent time looking at deployment issues;
here we are -- published -- we get on
Our protocol design worked great in 2015; it doesn't work in 2018 during real deployment
Makes sense to have the expertise continue and weigh in during deployment
time when the expertise is needed again
Need some place (eg SIDR ops) where that group of experts are available

Sean: It is hard to start a WG; even harder to kill it
Keep the mailing list alive
draw a line; close it out; if we need to change crypto alg, etc., it can be done elsewhere
You can say now 'we are done'

Yu: If we have tech problems in our deployment, 
we need to have a place to bring it to for discussions

Wes George: I am more concerned about what goes on with BGPsec;
not implemented yet; May be declare it experimental; 
if we need expertise we can regroup for new work on that

Wes Hardaker: I agree with Wes about BGPsec needs help; fact is there are two 
implementations and they are interoperable; it is not experimental; 
in fact, the revisions in the latest v-15 were motivated by inputs from two implementors  

Wes: Misspoke on implementation ...

Russ White: I agree with Wes George; the spec is solid and probably useful
I don't see how I can deploy it in my network; I don't know how it scales

Wes H: Every protocol standard in IETF ... we don't know if it will scale;
IETF produces a ton of protocols that we don't know will scale until after deployment

Randy: speaking of ... How is Flowspec going?

Jared: Many networks have deployed Flowspec; jury is still out; but Flowspec is being used
and has been around for years

John Scudder: Have we digressed from "future of the WG"

Russ: Financial considerations forbid BGPsec

Wes George: BGPsec -- if it can be seen as something that can be deployed then we need 
a WG to be there; 
If we need to revisit SIDR work, we can recharter
We probably didn't get it right
Not that we need somebody to call; what I meant is that if there are tweaks necessary,
that cannot be done in OPS or SEC

Jeff: experimental definition from RFC 2026; experimental is not the right status;
"Proposed standard" is right;
It can be dead eventually but "proposed standard" is right

Wes H:  This discussion is one IETF too early; On BGPsec we may still have work to do 
 before it gets an RFC number (there are unknowns still)
 
 Randy: We need the mailing list to close the WG docs still in progress
 
 Ruediger: Splitting up stuff does not make sense
 The notion that it is a little premature to close down ...
 I will offer a +1 on that
 We need to see more progress on deployable BGPsec
 Moving the group to OPS or SEC will not matter a lot
 
 Chris: Alvaro, should we continue in Berlin and do you have the info you need?
 
 Alvaro: We probably will continue to talk some more
 We are not going to stop the work; we can continue the discussion
 
 Russ Mundy: It seems to me that the AS path bit in the charter has not really been met
 
 Sandy: Alvaro, BGPsec experimental vs. standard
 
 Alvaro: Deployment of BGPsec will not happen soon;  Origin val is progressing but slow;
 BGPsec implies changes to Internet and means operational changes as well;
 two choices are experimental or standard; I am encouraging the WG to think it is 
 experimental
 
 Alvaro did a 'Hmm' vote
 
 Alvaro: OK, we'll continue discussion on the list   
 
 Sandy: some question about validity?? on the SIDR list
 
 Russ Mundy: I strongly object to Experimental; it discourages deployment
 
Dan York: Agreeing with what Russ Mundy said; are we just writing off BGPsec by calling 
it experimental
 
 Ruediger: Experimental means IETF does not consider it a significant security improvement
  
Meeting adjourned.