minutes-115-rtgwg-202211101300-01

IETF 115 RTGWG Agenda

Chairs: Jeff Tantsura (jefftant.ietf@gmail.com)
Yingzhen Qu (yingzhen.ietf@gmail.com)

WG Page: http://tools.ietf.org/wg/rtgwg/
Materials: https://datatracker.ietf.org/meeting/115/session/rtgwg

15:00-16:30 - Wednesday, November 9 Session III
***************************

Meeting Administrivia and WG Update
Chairs (10 mins)

Stewart: we had a very productive meeting with the authors this
afternoon, hopefully this draft will progress soon.

=============================================
WG document Update
=============================================

YANG Models for Quality of Service (QoS)
Aseem Choudhary (10 mins)

https://datatracker.ietf.org/doc/draft-ietf-rtgwg-qos-model/

No questions asked. There are still review comments to be addressed.

SRv6 Path Egress Protection
Huaimo Chen (10 mins)

https://datatracker.ietf.org/doc/draft-ietf-rtgwg-srv6-egress-protection/

Ketan: The majority of this draft is SPRING and LSR related, you
mentioned that it was reviewed in LSR, how about SPRING?
Huaimo: I sent this draft to SPRING for comments a couple of years ago.

Yingzhen: before LC we will need to get feedback from SPRING.
Jeff: there are a few documents that share WGs. We will try to get
SPRING to review this.
Yingzhen: The next presentation is on similar topic, let's have a
discussion after it.

=============================================
Individual draft
=============================================

SRv6 Egress Protection in Multi-homed scenario
Weiqiang Cheng & Wenying Jiang (6 mins)

https://datatracker.ietf.org/doc/draft-cheng-rtgwg-srv6-multihome-egress-protection/

Darren Dukes: I haven't seen the PSD SID discussed in SPRING, did I miss
it?
Weiqiang: Not yet. We'll try to present it in SPRING.
Darren: or just send an email and discuss it.
Weiqiang: We will.
Huaimo: I'd like to explain the differences between these two drafts. My
understanding is that this draft is trying to provide service
protection: on the backup egress node, a backup node needs to be
configured. The behavior of the SID should be mirrored or similar to the
primary. At the ingress node, we can use the backup SID to achieve
protection. Our draft, by contrast, uses a mirror SID and needs no
configuration; we mirror the primary node and forward the packet. One
focuses on the service and the other is just protection.
Weiqiang: these two drafts are different and solve problems at different
levels.
Ron Bonica: it is a problem that needs to be solved but there are a few
impediments. First, this needs to be brought to SPRING as well as 6MAN
since it changes the semantics of the SRH. Second, it does change the
meaning of fields in the routing header, so how do you maintain
compatibility?
Weiqiang: agreed with you, we need comments from both SPRING and 6MAN.
Regarding the 2nd question, I think it's compatible with SRH processing
rules.
Ron: you changed the next segment and segments left entries.
Weiqiang: that's right. we introduced new flavors.
Ron: finally, how does this interact with C-SID?
Weiqiang: It's compatible with the C-SID draft.
Jeff: this needs more discussions.
Ketan: question on slide #6. The penultimate node could be several hops
away, so how does this work? Assume P1/A1 is the penultimate node. When
P2 detects the failure, how does it know the end/overlay node behavior,
since it's in the underlay?
Jeff: please send the questions to the list.

SRv6 Midpoint Protection
Huaimo Chen (10 mins)

https://datatracker.ietf.org/doc/draft-chen-rtgwg-srv6-midpoint-protection/

Jeff: have you confirmed with SPRING and 6MAN that your changes are
compliant?
Huaimo: right now only the end point can change the destination.
Yingzhen: it is not clear in the draft whether you are going to change
the SRH or use another layer of encapsulation.
Huaimo: not changing the SRH, only Segments Left. When one node fails,
the previous endpoint decrements Segments Left. It's a simple change,
not violating RFC 8200.
Yingzhen: if you skip one node, you can't guarantee there is no loop.
Huaimo: it's after the IGP converges, we go along the shortest path.
There should be no loop.
Jeff: IGP convergence is slow.
Huaimo: there are two stages: before the IGP converges, the node
adjacent to the failure will do TI-LFA. After convergence, every node
knows about the failure, and the endpoint node will bypass the failed
node.
Darren: I don't think this change has been discussed. If you keep the
original list and just decrement Segments Left, you will break it. You
can't do that.
Huaimo: I'll send a pointer of the discussion thread.
Jeff: Please send another email to SPRING and discuss it before we can
consider adoption.
Ketan: again, the same question as for the previous presentation. You're
assuming directly connected nodes, and this is not going to cover all
cases. I'm wondering how practical it is, or whether it only works with
a hop-by-hop path.
Huaimo: there is no assumption that the endpoint is directly connected
to the failed node, it can be multiple hops away.
Ketan: it's not clear in the draft.
Huaimo: after the IGP converges, it knows the failed node and hence
bypasses it.
Jeff: Please discuss it in SPRING.
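
The bypass behavior Huaimo describes (the previous endpoint decrements
Segments Left a second time when the next segment's node has failed,
leaving the segment list itself untouched) can be sketched roughly as
follows. This is only an illustration of the idea as stated in the
discussion, not the draft's specified procedure; the `Srv6Packet` type,
the `end_with_midpoint_skip` function, and the `failed` set (standing in
for what the endpoint learns after IGP convergence) are all hypothetical
names. The compatibility concerns raised above are not addressed here.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Srv6Packet:
    """Simplified SRv6 packet holding only the fields this sketch needs."""
    dst: str                 # IPv6 destination address (the active segment)
    segment_list: List[str]  # SRH segment list, stored in reverse path order
    segments_left: int       # index of the active segment in segment_list


def end_with_midpoint_skip(pkt: Srv6Packet, failed: Set[str]) -> Srv6Packet:
    """Advance the packet at the endpoint that owns the current destination.

    Ordinary endpoint processing decrements Segments Left once and copies
    the new active segment into the destination address. In this
    hypothetical variant, segments whose node is known to have failed are
    skipped by decrementing Segments Left again.
    """
    if pkt.segments_left == 0:
        return pkt  # already at the last segment, nothing to advance
    pkt.segments_left -= 1
    # Skip failed segments, but never go past the final segment (index 0).
    while pkt.segments_left > 0 and pkt.segment_list[pkt.segments_left] in failed:
        pkt.segments_left -= 1
    pkt.dst = pkt.segment_list[pkt.segments_left]
    return pkt
```

For a path N1, N2, N3 the segment list is stored as [N3, N2, N1] with
Segments Left starting at 2; if N2 is in `failed` when N1 processes the
packet, the destination becomes N3 and Segments Left becomes 0.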

Generalized IPv6 Tunnel (GIP6)
Qiangzhou Gao (10 mins)

https://datatracker.ietf.org/doc/draft-li-rtgwg-generalized-ipv6-tunnel/

Greg Mirsky: thank you for sharing your imagination. Are you aware of
the work on MPLS actions?
Qiangzhou: No.
Greg: Please check the work in MPLS and DetNet, so as not to duplicate
work there.
Qiangzhou: I don't think it's duplicate.
Greg: since you're not aware of the work, you can't do a comparison. I
don't think what you're proposing is required.
Darren Dukes: more for the WG chairs: this draft is defining all new
encapsulations, is this the right WG?
Jeff: Usually there are requirements. I'm waiting for feedback from the
transport area. The cost of doing this work is high while the benefits
are not that great. I understand what you're trying to do but I'm not
sure the industry really needs it.
Acee: this is an interesting exercise: you map all encap headers into
the IPv6 routing header. This would have been a good research paper, but
considering how long these things have evolved and been deployed, it
will never see the light of day. With this scope, where your encap is
trying to map all cases, it's never going to happen.

From chat:
Ketan Talaulikar
Does this GIPv6 tunnel thing not belong to intarea?

BGP Blockchain
Mike McBride (10 mins)

https://datatracker.ietf.org/doc/draft-mcbride-rtgwg-bgp-blockchain/

David Lamparter: for blockchain, you typically have proof of work etc.
The only thing I see this being useful for is the proof of something,
such as an address. Are you looking for something like that? Or is it
still too early?
Mike: not yet, but good point.
Jeff Haas: having this as part of the protocol control loop is probably
not a good thing. I'd suggest taking them, throwing them into the fire,
and keeping warm. BGP is about policy.
Dino: Agreed with Jeff, it's not for the protocol. It could be used for
administrative things, such as proof of ownership, registries.
Jeff: we're happy to provide a place for this work, but please continue
to work on it and add more meat.

From chat:
Eduard V
Smart Contracts are based on the particular Blockchain platform. Which
one is proposed? Why?

Jeffrey Haas
One of my criticisms for trying to use blockchains is even though they
are distributed for the work, the chain itself is effectively
centralized. The market for much of what we’d want for a competitor to
pki for bgp resources is further decentralization.

==============================================

SCION Update and Q&A
Nicola Rustignoli (20 mins)

Related drafts at the moment (including some planned ones):
draft-dekater-panrg-scion-overview
draft-rustignoli-panrg-scion-components (recently discussed at
panrg and updated)
draft-dekater-scion-pki (PKI specification)
Routing - Control plane specification (to be written)
Forwarding - Data Plane specification (to be written)

Jeff: thank you for sharing this work with us.

13:00-15:00 - Thursday, November 10, Session II
***************************

====================================================================
** Invited Talk, ACM Sigcomm 2022 Best Paper Award **
====================================================================

Software-defined network assimilation: bridging the last mile towards
centralized network configuration management with NAssim
Huangxun Chen (30 mins)

https://dl.acm.org/doi/10.1145/3544216.3544244
https://amyworkspace.github.io/hxchen/files/sigcomm22-nassim.pdf

Acee: Are you saying you use the PDF as input and generate this JSON?
Huangxun: we use HTML documents.
Acee: so the initial inputs are the html docs you get from the web?
Huangxun: yes.
Jeff: you're trying to deal with inconsistent CLIs from different
vendors, it's the wrong place to start. Look at YANG, it will at least
give you some consistent behavior. It's a noble task.
Huangxun: CLI is the entry-level way to configure network devices. YANG
is more advanced, but it does not fully address multi-vendor networks.
Each vendor has its own vendor-specific YANG. Our current project is to
work on different vendor-specific YANG models. I agree this field still
needs effort to address the heterogeneity.

From chat:
Tony Li
Couldn't this be done with YANG?
Amy
YANG did not fully address the semantic interoperability between
multiple vendors due to vendor-specific YANGs. We think the
parsing-validating-mapping methodology could also be applied to YANG. We
discuss it more in the discussion session of our paper:
https://amyworkspace.github.io/hxchen/files/sigcomm22-nassim.pdf

=============================================
New Routing Architecture Proposal
=============================================

Routing on Service Addresses
Dirk Trossen (15 mins)

https://datatracker.ietf.org/doc/draft-trossen-rtgwg-rosa/

Daniel Huang: from your presentation, it seems the service
identification/address is globally unique. Does it have to be assigned
by a higher management utility? The service discovery is removed from
DNS and put into routing, so it reduces delay.
Dirk: you may rely on domain names, so we rely on existing governance.
You can use your own name space or your own stuff. Second, we put this
in-path: the ingress only deals with the incoming service request, so
it's faster than DNS. The performance could be so much better.
Roland: Adding application info into the network may bring complexities
into the network, so what kind of service/location semantics to put in a
gateway?
Dirk: the only application information is the description of the service
identifier. If it is hash-based, the lookup can be fast. The
announcement may need additional verification, but it's OK for that not
to happen at data speed.
Jeff: please continue the discussion on the list.

From chat:
Eduard V
DNS has so many extensions. It makes sense to analyze which ones are
supported by ROSA and which are not. A comparison to DNS looks
mandatory for better understanding.
Feng Yang
Does that mean the SAG needs to learn full of the services advertised ?
That will put a big burden to the SAG
Dirk Trossen
01:00:49
@Eduard fair point to outline which DNS capability we intend to
'replace'
Dirk Trossen
01:01:53
@Feng no, the SAG utilizes the namespace-specific resolution service,
e.g., the DNS, it does not learn all services available (which would
indeed be a significant scalability challenge)
Louis Chan
01:04:17
Just processing the URL in a different way could be a solution too.
Replacing DNS would be a huge processing load on routing entities
Markus Stenberg
01:06:03
I don't really see what extra it brings over dns -> anycast address ->
(some routing happening), but perhaps I should read the draft. e.g.
adding new AF is pretty big hammer.
Dirk Trossen
01:07:37
@Louis at the SAR (i.e., the domain-internal elements), the binary names
(encoded from the URLs) are indeed processed through a hash-based
lookup, so much faster than a DNS lookup. In eBPF, that is what we do,
i.e., extract the serviceID from the EH, do the hash based lookup to
determine the next hop (possibly doing a secondary step to select one of
possibly more next hops) and forward. This is fast (100ks of requests
per second), even in eBPF. Also, the load is smaller since the SAR only
processes the local client load, rather than needing to resolve all
clients of a domain.
Dirk Trossen
01:09:34
@Markus less latency due to missing DNS lookup and the ability to
realize ingress scheduling of requests, in addition to anycast type of
ROSA-level routing.
Markus Stenberg
01:10:18
If those anycast addresses are static, they should be cacheable and have
long TTL etc. Is this just optimizing initial single rare requests to
bunch of seldom used services or what?
Dirk Trossen
01:10:48
@Markus this allows, for instance, for multi-site retrieval, scheduled
at the ingress, without involving any lookup and using the existing IPv6
routes to the unicast replicas.
Dirk Trossen
01:12:10
@Markus indeed, for long-lived anycast addresses, the routed mode of
ROSA is the same, essentially, but you could do per request scheduling
for those (anycast) services.
Darren Dukes
01:14:48
@Dirk Trossen The draft mentions LISP similarities, is it correct to say
you convert the service url into an identifier then make a map request
equivalent based on the service identifier? However you do this in-band
by sending the service request to a special map server - called ingress
SAR who forwards the request and then the host updates the service
mapping locally when they get a response from the receiving service.
Dirk Trossen
01:19:33
@Darren that is indeed correct. In a manner, the SAR is the mapping
service, particularly in the scheduled mode where indeed one of possibly
several service instance IPs is chosen, based on some possibly
service-specific selection mechanism. In routed mode, the SAR forwards
to another SAR, until it reaches the instance. This is different from
the mapping process in LISP.
Dirk Trossen
01:24:10
Differences come in the mapping federation, itself. SARs are not
federated but include mapping/routing entries populated by an overlay DV
protocol. Any locally unknown service address is being directed to the
SAG, which may either direct the packet to another ROSA domain or to the
public Internet.
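
The hash-based service lookup Dirk describes in the chat (extract the
service ID, look it up in a hash table, optionally pick one of several
next hops, and hand unknown service addresses to the SAG) might look
roughly like the sketch below. It is a toy model under stated
assumptions, not the eBPF implementation or the draft's encoding: the
`service_id` encoding (a truncated SHA-256 of the name), the
`ServiceAddressRouter` class, and the random next-hop choice are all
hypothetical.

```python
import hashlib
import random
from typing import Dict, List


def service_id(name: str) -> bytes:
    """Encode a service name as a fixed-length binary ID (hypothetical encoding)."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:16]


class ServiceAddressRouter:
    """Toy SAR: maps binary service IDs to one or more next hops."""

    def __init__(self, gateway: str) -> None:
        self._gateway = gateway                  # stands in for the SAG
        self._table: Dict[bytes, List[str]] = {}

    def announce(self, name: str, next_hop: str) -> None:
        """Record that the named service is reachable via next_hop."""
        self._table.setdefault(service_id(name), []).append(next_hop)

    def forward(self, sid: bytes) -> str:
        """Return the next hop for a service ID using a single hash lookup."""
        hops = self._table.get(sid)
        if not hops:
            return self._gateway      # locally unknown service: send to the SAG
        return random.choice(hops)    # secondary step: select one next hop
```

A real selection step could be service-specific rather than random, as
mentioned for the scheduled mode.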

KIRA: Distributed Scalable ID-based Routing with Fast Forwarding
Roland Bless (15 mins)

https://publikationen.bibliothek.kit.edu/1000148953

Acee: if there are node IDs you don't know, how do you know how to reach
the one you want to talk to?
Roland: you choose the closest node you know in ID space, and you can
tell whether you're making progress in ID space. If you don't know a
node that's closer to the target, then there's some inconsistency in
the routing table or the node doesn't exist any more.
Acee: if you want to talk to someone you don't know, do you send
discoveries out of all your interfaces?
Roland: not all, only one.
Acee: what if it's a wrong way? or partitioned?
Roland: if the ID overlay is consistent, we'll always make progress.
Acee: so the ID tells you which way to go?
Roland: exactly.
Acee: I think I saw similar work years ago.
Roland: yes, there are previous works.
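
The forwarding rule Roland describes (pick the closest known node in ID
space, and treat the absence of any closer node as an inconsistency or a
non-existent target) can be sketched as below. The XOR distance metric
is an assumption chosen for illustration, since the discussion only says
"closest in ID space"; `xor_distance` and `next_hop` are hypothetical
names, not KIRA's actual interface.

```python
from typing import Iterable, Optional


def xor_distance(a: int, b: int) -> int:
    """Distance between two node IDs (assumed metric: bitwise XOR)."""
    return a ^ b


def next_hop(self_id: int, known: Iterable[int], target: int) -> Optional[int]:
    """Pick the known node closest to the target in ID space.

    Returns None when no known node is strictly closer to the target than
    this node is, i.e. when no progress in ID space can be made.
    """
    best = min(known, key=lambda n: xor_distance(n, target), default=None)
    if best is None:
        return None
    if xor_distance(best, target) >= xor_distance(self_id, target):
        return None  # no progress possible: inconsistency or unknown target
    return best
```

Because every hop must strictly reduce the distance to the target, a
consistent ID overlay always makes progress, which matches Roland's
answer to the "wrong way" question.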

From chat:
Eduard V
01:11:30
Is KIRA a replacement for BGP?

Tony Li
01:11:59
See bullet 1.1

Boris Khasanov
01:12:08
How does the KIRA communicate with routers?

Roland Bless
01:32:54

Eduard V said:

Is KIRA a replacement for BGP?

No, it is not. It's designed for control plane connectivity, so routing
of data packets would be a different thing. One of KIRA's use cases is
its use as in-band control plane fabric for SDN.

Roland Bless
01:34:40

Boris Khasanov said:

How does the KIRA communicate with routers?

Not sure what you mean. Currently, we are using IPv6 packets and
link-local addresses for communicating between KIRA routing instances.
So a router would always have to implement KIRA for its control plane.
Using this connectivity, you could simply use ssh netconf or whatever to
connect to the router in order to configure it, e.g., its data plane
routing like OSPF or BGP.

=============================================
Individual draft
=============================================

Requirement of Fast Fault Detection for IP-based Network
Lily Zhao (10 Mins)

https://datatracker.ietf.org/doc/draft-guo-ffd-requirement/00/
Framework of Fast Fault Detection for IP-based Networks
Haibo Wang (10 Mins)

https://datatracker.ietf.org/doc/draft-wang-ffd-framework/00/

Greg Mirsky: the requirement slide. what's your goal with this work?
Haibo: to let the network help endpoints quickly detect failures.
Greg: the whole network or the device to detect failure?
Haibo: the network detects the failure, then notifies the access points.
Greg: you plan to do it as distributed or centralized?
Haibo: distributed.
Greg: so each device in the network must know about the failure.
Haibo: yes, this is the first step, later there might be optimizations.

Greg: that's already done in IGP.
Haibo: the network synchronization part is not described in the
framework draft yet.
Jeff: as a participant, I want to talk about machine learning clusters.
The goal is to converge with a number of entities, not a number of
seconds. The infrastructure is parallel without a single point of
failure; the goal is to detect a failure ASAP and route around it in the
IP network. This is commonly implemented on hosts today with techniques
like FlowBender or a variety of others. If you need to notify a
controller you're into seconds and your machine learning job is dead.
The requirement is not suitable for machine learning clusters.
David Black: I'm one of the original designers of NVMe over Fabrics and
the NVMe over TCP transport. I'm surprised by the storage network
configuration shown here; it is unrealistic. The active/passive
configuration has typically moved to active/active these days, which
means the second path is active. If there is a failure on the first path
there is an opportunity to immediately use the second path to get the
failure information communicated without having to go through all the
switches. That's better for NVMe because it's not relying on switch
interactions. It seems to me the failure detection is being built on IP
accessibility; that's not a good idea, as routing is the authority for
the topology and for which IP addresses are reachable. Please don't
reinvent that. The draft labels the security considerations as N/A, not
applicable, which might also not be acceptable. It's a great vector for
DoS attacks. When the switch side detects a link failure it should turn
the link off, so the other end notices it pretty quickly and you don't
have a problem with the two ends disagreeing on the link failure.
Jeff: all storage protocols do implement their own liveness mechanisms
at much lower layers than IP, and they are very fast, 2 RTTs, we're
talking about nanoseconds.
Sasha: (the slide using NVMe as an example) in this configuration, if
sw1 and sw2 were connected, and the hosts were using loopback addresses,
then when a failure happens the switch would reroute and the hosts would
remain unaware of the failure; that's what network operators would
prefer. If the switches are not connected, personally I see it as a poor
network design and we shouldn't propagate new functions to hosts.

From chat:
Tony Li
01:27:31

The similarities to the UPA work in LSR are not small.

=============================================
Individual draft from other WG (if time allows)
=============================================

SRv6 Deployment Consideration
Qiangzhou Gao (10 mins)

https://datatracker.ietf.org/doc/draft-tian-spring-srv6-deployment-consideration/

Jeff: which of these services can't be done using SR-MPLS?
Qiangzhou: MPLS is only used in the backbone but not the campus, while
SRv6 can achieve end-to-end connection.
Jeff: (New Network Deployment slide) looking at the features
implemented, nothing prevents using other technologies. I'll take this
to the list.
Gyan: is there any interconnect technology you're using? The core is
SRv6 and the campus is different? (voice was breaking up, please take it
to the list)

From chat:
Tony Li
01:43:34

SRv6 is required to waste more bandwidth.
Boris Khasanov
01:46:14

@Tony: "SRv6 or not SRv6, that is the question..." :)

Intelligent Routing Method of SR Policy
Feng Yang (10 mins)

https://datatracker.ietf.org/doc/draft-yang-sr-policy-intelligent-routing/

Jeff: Due to time constraints, please take questions to the list.

Signaling In-Network Computing operations (SINC)
Zhe Lou or Luigi Iannone (15 mins)

https://datatracker.ietf.org/doc/draft-zhou-sfc-sinc/

Jeff: currently SFC doesn't accept new drafts, we're looking for ADs'
advice on next steps.

From chat:
Tony Li
02:00:14

This seems somewhat similar to MPLS MNA