Minutes IETF102: rift
minutes-102-rift-00

Meeting Minutes Routing In Fat Trees (rift) WG
Date and time 2018-07-19 17:30
Title Minutes IETF102: rift
State Active
Last updated 2018-07-19

Thursday, July 19, 2018

13:30-15:30         Thursday Afternoon session I ; Centre Ville

RTG    RIFT       Routing In Fat Trees Working Group

Chairs: Jeff Tantsura,         Jeffrey (Zhaohui) Zhang

WG Status Web Page: http://tools.ietf.org/wg/rift/

1) 13:30-13:50 - WG update: interim report, plan going forward
Jeff Tantsura, Jeffrey Zhang
20 minutes

2) 13:50-14:20 - draft-ietf-rift-rift
Tony P
30 min

(slide 7)
TonyP: For ZTP you have to nail the spines first. Bruno commented that the spec
often does not explain "why?" So, why not nail the bottom? Because the Clos
depth often varies, so starting at the bottom gives varying depths to the top.
It is simpler to nail the super-spines and percolate down. But then you don't
know when you're a leaf. So nailing at the bottom can also be desirable, but
nailing at the top must be done.
Bruno: "There are 2 ways to construct a software design: make it so simple that
there are obviously no deficiencies, or make it so complex that there are no
obvious deficiencies." The draft is very complex. There might be some scenario
where I can create a loop and things will go haywire. How can we be sure it is
correct? We have an opportunity to make it simpler.
TonyP: The requirements are well specified; if you can make it simpler, go
ahead. Show me how it breaks.
Pascal: There are limits to what you can do with startup reflection; we looked
at how to do disaggregation; this aspect of the spec will be safer.
TonyP: This (ZTP) has been done before.
Alia: ZTP stuff starts with simple principles; when cascaded out into an FSM it
starts to sound complex, but the basic principles amount to 4 or 5 things.
Bruno: Ties back to lack of explanation. How can we be sure it works without
that?
TonyP: I can put in an epistemology, but people don't like to see that in specs
at the IETF.
Jeffrey Zhang: We don't need loads of explanation, just a few sentences where
particular questions have been raised.
Lou Berger: We don't want to bloat the spec and we don't typically do that
here.
Jeff Haas: The draft has to be written for an audience that is writing code and
that does not always have the same background or competence as us.

(Slide 8)
Sue Hares: Did you consider an alternative, from down to up? It's the bounce.
TonyP: The trick is: do I have any sudden adjacencies?

(slide 9)
(Anonymous contributor): If everything was fully connected, would you get the
same result?
TonyP: In full connectivity it would be optimal; this shows it works even for
non-optimal cases.
Pascal: In full connectivity, each leaf picks exactly two up-links, since each
up-link reaches all of the spine routers.
Bruno: If I only pick 2 up-links then I lose half of my bandwidth.
TonyP: No, this is just for flooding, not forwarding!
Alvaro (as individual): Why do you want two paths to each spine router, not 1?
TonyP: It's a nice number for redundancy. Some people like 3 instead.
Pascal: Unless the Clos is fully cabled you will get more than two paths, as in
this example in fact.
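
As an illustration of the slide 9 exchange, the following is a minimal sketch
of picking flooding up-links so that every spine router is reached by two of
them where the cabling allows. It is not the procedure from the draft: the
greedy selection, the function name pick_flood_uplinks, and the node names are
all assumptions made for this example only.

```python
from typing import Dict, List, Set


def pick_flood_uplinks(reach: Dict[str, Set[str]], redundancy: int = 2) -> List[str]:
    """Greedy sketch (an assumption, not the draft's algorithm).

    `reach` maps each up-link neighbor to the spine routers it can reach.
    Up-links are chosen until every spine is covered `redundancy` times,
    or as many times as the cabling permits.
    """
    spines = set().union(*reach.values())
    # A spine cannot be covered more often than there are up-links reaching it.
    need = {
        s: min(redundancy, sum(s in r for r in reach.values()))
        for s in spines
    }
    remaining = dict(reach)
    chosen: List[str] = []
    while any(n > 0 for n in need.values()):
        # Take the up-link that satisfies the most outstanding coverage;
        # sorting first makes ties resolve to the smallest name.
        best = max(
            sorted(remaining),
            key=lambda u: sum(need[s] > 0 for s in remaining[u]),
        )
        chosen.append(best)
        for s in remaining.pop(best):
            if need[s] > 0:
                need[s] -= 1
    return chosen


# Fully cabled: every up-link reaches every spine, so two up-links suffice.
full = {n: {"s1", "s2"} for n in ("n1", "n2", "n3", "n4")}
# Partially cabled: more than two up-links are needed for the same coverage.
partial = {"n1": {"s1"}, "n2": {"s2"}, "n3": {"s1", "s2"}, "n4": {"s1"}}
```

With the hypothetical topologies above, the fully cabled case selects two
up-links and the partially cabled case selects three. The selection affects
flooding only, not forwarding.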

(slide 11)
Sue Hares: You are doing statistical multiplexing; why did you choose your
particular algorithm?
TonyP: Empirically, I picked something easy but played with various
approaches. Other suggestions are welcome.
Sue Hares: This type of algorithm is susceptible to traffic problems, so run it
through a variety of traffic patterns. From past experience, this particular
algorithm needs a field test.
TonyP: Assuming hashing is uniform, whatever I do is better than nothing.
Barak: Do you take the downlink speed into account? There could be failures
that are remote to you but that affect your uplink.
TonyP: Then I push more information south and burden the server further. That
makes link state less scalable. This is a design choice.

(slide 12)
Pascal: We get 8-10% loss when flooding - why?
Tony: Running lots of instances will push the limits. There is no rate limit in
the spec. It is an implementation choice how to optimize this.

Sue: This is close to what TRILL does, and it is deployed and secure.

TonyP: If you have an absolute clock in the network, you can put the absolute
time in the timestamp and then security increases; reject if time is in future
or too far in past. Will probably put this idea into -04.
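
A minimal sketch of the acceptance check TonyP describes, assuming receivers
share an absolute clock. The function name and the max_age and max_skew
parameters and their defaults are illustrative assumptions, not values from
the draft.

```python
import time
from typing import Optional


def timestamp_acceptable(
    sent_at: float,
    now: Optional[float] = None,
    max_age: float = 60.0,   # assumed limit on how old a packet may be (seconds)
    max_skew: float = 1.0,   # assumed tolerance for clock disagreement (seconds)
) -> bool:
    """Reject timestamps that lie in the future or too far in the past."""
    if now is None:
        now = time.time()
    if sent_at > now + max_skew:
        return False  # claims to have been sent in the future
    if now - sent_at > max_age:
        return False  # too old; could be a replay
    return True
```

For example, with now=100.0 a timestamp of 99.5 is accepted, while 102.0
(future) and 10.0 (too old) are rejected.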

Bruno: Will the negative stuff you just described be in the -03 draft? (Yes)

3) 14:20-14:50 - Hackathon report and related discussion
Bruno Rijsman
30 minutes

Bruno: Two implementations were developed independently from the spec on paper
without comparing notes. Interop was achieved in one hour.
Bruno: I have given Tony feedback by uploading the RIFT draft to a Google Doc
and adding comments.
TonyP: This is an excellent way to collaborate on standards development. IETF
should learn this.
Bruno: RIFT defines a data model for its messages using Thrift, which makes
message build / parse code easy to generate, and you can be quite confident it
is correct.
Wolfgang Beck: Protocols used to use ASN.1 for message definition but the
compilers had some high-profile bugs in them. So this is not perfect.
Bruno: I am willing to do an FRR implementation and contribute it, but only if
operators have interest, so please let me know.
Bruno: There will be another interop test.
Jeff Haas: Should formalize Thrift-type message definitions. There are holes in
Thrift - let's fix them.
TonyP: Thrift is a very serious community so upstreaming enhancements will be
hard.
TonyP: Thrift has no formal specification, just one source code base which acts
as the de facto spec. But Thrift still has the edge as a message definition
language. The compiler has something like 28 different language back-ends.
Lou Berger: I have crazy, unshareable ideas on model-based language definition
of protocols.
Lou: Some people are interested in formal model-based protocol definitions;
let’s get a draft out.

JeffH: Worked with Ron B on defining BGP4 in <some modelling language>. We got
really close.
Bruno: For BGP5 let’s use meta-models not model-based encoding. Every time we
introduce a new AF, everyone who propagates it must understand it even if they
do nothing with it. So you must describe the meta-structure of the DB as well
as the structure of the encoding.
JeffH: ASN.1 is fine but there’s more than one way to serialize it, which is
what killed it. There should be 1 way on the wire.

JeffT: Thanks, Bruno, for doing this work; it is important for the IETF to
prove that what is on paper will work.
JeffH: YANG has output to gRPC but if we don’t like it we have options.

Alia: Let’s try playing with different encaps at the next hackathon.

===

JeffT: Looking for volunteers to write security stuff, threat models, YANG
models.

4) 14:50-15:10 - next steps

-Wrap Up:

Chairs:    Jeff Tantsura (jefftant.ietf@gmail.com)
           Jeffrey (Zhaohui) Zhang (zzhang@juniper.net)