Shepherd writeup
draft-filsfils-spring-large-scale-interconnect

This document is proposed for publication on the Independent Stream as an
Informational RFC.

The document describes an application of Segment Routing to scale the network
to support hundreds of thousands of network nodes, and tens of millions of
physical underlay endpoints.  This use-case can be applied to the
interconnection of massive-scale DCs and/or large aggregation networks. 
Forwarding tables of midpoint and leaf nodes only require a few tens of
thousands of entries.
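
As a rough illustration of why the forwarding tables stay small, the sketch below assumes a simplified model in which a leaf node only holds entries for the nodes of its own leaf domain plus the core nodes, because SIDs are reused across leaf domains. The model, the function names, and all figures are assumptions made for this illustration; none of them are taken from the draft.

```python
def total_nodes(leaf_domains: int, leaf_nodes_per_domain: int, core_nodes: int) -> int:
    """Total number of nodes in the network (all leaf domains plus the core)."""
    return leaf_domains * leaf_nodes_per_domain + core_nodes


def leaf_fib_entries(leaf_nodes_per_domain: int, core_nodes: int) -> int:
    """Entries a leaf node holds under the assumed model: own domain plus core only."""
    return leaf_nodes_per_domain + core_nodes


# Hypothetical figures, chosen only to match the orders of magnitude in the text.
LEAF_DOMAINS = 30
LEAF_NODES_PER_DOMAIN = 20_000
CORE_NODES = 500

print(total_nodes(LEAF_DOMAINS, LEAF_NODES_PER_DOMAIN, CORE_NODES))  # whole network
print(leaf_fib_entries(LEAF_NODES_PER_DOMAIN, CORE_NODES))           # one leaf node
```

Under these assumed figures the network holds hundreds of thousands of nodes, while a single leaf node needs only a few tens of thousands of entries, because its table grows with the size of one leaf domain rather than with the number of leaf domains.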

The document was brought to the ISE in January 2018 after having been discussed
in the SPRING working group. The working group decided to not progress with
use-case documents and the chairs suggested that this document should be taken
to the Independent Stream (confirmed to me by Martin Vigoureux).

Expert reviews were provided by James Uttaro, Daniel Voyer, and Ronald Bonica
(all reviews included below).

The authors produced revision -09 and then I did my ISE review. The authors
produced versions -10, -11, and -12 to address all of my comments.

=== James Uttaro review

a) The authors may want to consider adding a view towards the future of SR in
the construction of SFCs and how that construction can be part of this approach.
IMO using SR provides a method to create a homogeneous approach across DC and
Network; in many cases the difference between DC and Network is becoming moot.
b) Although this draft is specific to scale, it would be useful to have a
section that addresses the use of BGP LS in terms of recovery, convergence, and
persistence of SRTE policies.
c) One of the themes of this is that a large-scale network can be built
without RSVP. There are those that disagree. It would be useful to state
that RSVP can still be used in this environment, at least within a subset of the
overall domain.

Section 3

"Agg routes ( Agg1, Agg2, Agg3, Agg4 ) are redistributed .."

It is clear that this reduces the amount of routing state required. It may be
useful to point out that a schema needs to be developed to ensure the
redistributed routes do not form a routing loop. AT&T utilizes a bit map schema
for OSPF routes, and CVs for BGP.

"Unique SRGB sub-ranges ."

This section calls out SRGB ranges and SID values but is incomplete. I would be
explicit, as further down in this section you identify an SRTE policy with SID
values not mentioned here. Certainly one can figure it out, but it is better to
be specific and match your example.

" The SR PCE is made of two components. "

The list of segments provided to node A is ( 16003 ( Agg3 ), 16005 ( Assume Agg4
), 18001 ). Shouldn't it be the anycast SID of 18006? If not, please identify
what 18001 is; is it DCI4?

Section 5

" A core node connects only one leaf domain"

I do not think you can make this assumption. Why is this assumption being made?

"Each leaf domain has 600 leaf node segments ."

I am not sure I understand this statement. A metro in my world is not on the
order of 6K nodes. I guess what is being described in this section is the fact
that redistributing a selected set of network elements, coupled with the
re-use of SIDs in DC spaces, allows for this amount of scale.

Section 6.1

"In this simplified ."

Yes, I would agree, and I would explicitly say that a larger range provides for
various applications in terms of TE within a given domain.

Section 6.4

Shouldn't the SRTE policy be {16006, 18006 . } instead of {16006, 17006 . } ?

Section 6.5

Same comment as above: it would be clearer to be explicit in the diagram and
call out the SID value for all nodes, i.e., Agg, DCI.

A general comment I would make is the need to detail the other
benefits/drawbacks of using Binding SIDs in a network. Are there scale
implications if they are not used? If so, that should be called out.

Section 8.3

The heading of this section is "Scale". IMO the more important aspect of
reducing the number of protocols is simplicity.

=== Daniel Voyer review

This draft describes the options an operator can use to make an
end-to-end architecture more scalable in terms of segment identifiers
(SIDs).

With the current SR architecture, the SIDs are contained within a 16-bit field,
which allows for a maximum of 65536 values.

The large-scale interconnect design provides a good understanding of how we can
scale beyond the 16 bits by using a unique SID: an anycast SID used to mask a
set of SIDs. The concept is the following:

- Select and use a unique set of SIDs in the core network, while reusing an
identical SID range for the islands of network beyond the core network.

- At the edge of the core, facing various other networks such as
datacenter/metro/access networks, add an anycast SID which will be stacked in the
packet and will mask the SIDs of the island networks (datacenter/access/metro).
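
The two steps above can be sketched as follows. This is a minimal illustration only: the island names, the node names, and every SID value are hypothetical and are not taken from the draft or from the review. It shows the same island-local SID being reused in two islands and disambiguated by the core-unique anycast SID stacked in front of it.

```python
# Hypothetical anycast SIDs, unique across the core, one per island border.
ANYCAST_SID_BY_ISLAND = {
    "metro-1": 17001,
    "dc-1": 17002,
}

# Hypothetical island-local SIDs; the same range is reused in every island.
LOCAL_SID_BY_NODE = {
    "leaf-a": 18001,
    "leaf-b": 18002,
}


def build_sid_stack(island: str, node: str) -> list:
    """Return the SID stack: the island's anycast SID first, then the node's local SID."""
    return [ANYCAST_SID_BY_ISLAND[island], LOCAL_SID_BY_NODE[node]]


# The local SID 18001 appears in both islands; only the anycast SID differs.
print(build_sid_stack("metro-1", "leaf-a"))
print(build_sid_stack("dc-1", "leaf-a"))
```

In this sketch the core only ever needs to know the anycast SIDs, while the island-local SIDs are resolved once the packet has reached the island border.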

The understanding provided by this document is essential for the large-scale
networks that typical operators may have, and it offers better utility of the
segment routing architecture.

=== Ron Bonica review

> This draft is not ready for publication.
>
> MAJOR ISSUES:
>
> - It is not complete. Sections 9, 10 and 11 are all TBD
>
> - The title is misleading. The draft demonstrates that it is possible to
> interconnect millions of nodes without running out of SIDs. While this is
> a necessary precondition for scaling, it is not sufficient to
> interconnect millions of nodes.
>
> There are two ways the authors could go to correct this problem. One is to
> change the title. The other is to identify all of the other things that
> might be required for scaling and address them. For example, the authors
> might want to bring the controller back into scope and explain how a
> controller or group of controllers might support millions of nodes.
>
> Personally, I think that it would be easier to change the title.
>
> MINOR ISSUES:
>
> - In Section 6.2, the authors talk about a condition in which the operator
> might choose to not redistribute the Agg nodes' routes into the Metro/DC
> domains.  While I don't believe that this materially impacts the SID
> budget, the authors might want to make this clear to the reader.
>
> - The authors might want to beef up Section 7, describing how this
> strategy could be deployed in a brownfield with SR-MPLS.
>
> - I am not sure that Section 8 has anything to do with the main topic of
> the document