Meeting Notes - TEAS WG Session - IETF 122

Notes: https://notes.ietf.org/notes-ietf-122-teas

WG ICS: https://datatracker.ietf.org/meeting/122/session/33938.ics
Datatracker: https://datatracker.ietf.org/group/teas/about/

Thursday, March 20, 2025

15:00-16:30 Bangkok Time
https://www.timeanddate.com/worldclock/converter.html?iso=20250320T080000&p1=28

Room: Chitlada 2 --
https://datatracker.ietf.org/meeting/122/floor-plan?room=chitlada-2

Materials: https://datatracker.ietf.org/meeting/122/session/teas
Note taking: https://notes.ietf.org/notes-ietf-122-teas
Onsite tool: https://meetings.conf.meetecho.com/onsite122/?session=33938

Video stream: https://meetings.conf.meetecho.com/ietf122/?session=33938

Audio stream: https://mp3.conf.meetecho.com/ietf122/33938.m3u
Zulip: https://zulip.ietf.org/#narrow/stream/teas

Post session materials:

Recording: http://www.meetecho.com/ietf122/recordings#TEAS
YouTube: https://www.youtube.com/watch?v=bTfJ-Clu0Zo

Slot#) Start | Duration | Information

01) 15:00 | 10 min | Title: Administrivia & WG Status

Presenter: Chairs

Dhruv (regarding draft-ietf-teas-actn-vn-yang): for the VN YANG we are
in AUTH-48, but we cannot reach one author.

Pavan: Let us give him a couple of days; if we do not hear back by the
end of this week we will move him to contributor.

02) 15:10 | 10 min | Title: WG Draft updates

Draft: WG Drafts (not on agenda)

Presenter: Chairs

Zafar (regarding draft-ietf-ns-ip-mpls): I have some concerns about the
conclusion of the mailing list discussion because I noticed that the
sample size is very low.

Pavan: We think that the call we made was correct, based on the
responses that were sent to the list and the sample size. If there are
reservations, please share your concerns on the list.

Zafar: Understood

03) 15:20 | 10 min | Title: Scalability Considerations for Network Resource Partition

Draft: https://datatracker.ietf.org/doc/draft-ietf-teas-nrp-scalability/07/

Presenter: Jie Dong

Zafar: In your draft you are asserting that the scale of NRP instances
is in the thousands. We need to discuss this because I believe it is in
the order of tens. It is a transport network for 5G and you do a 1:1
mapping, so it is not that many. Since this is a scaling draft it is
good to clarify what the number is.

Jie: The draft describes use cases of different scenarios of using
network slicing and NRP. In an early version we mentioned that the
number of NRPs required may be around 10, but there are also other
cases that require hundreds or even thousands. We mention use cases in
the draft where we need a per-application or per-tenant NRP; these are
not yet deployed but we have seen some requirements. That is why we
need the scalability draft: tens of NRPs are easy to achieve.

Zafar: If this is a scalability draft it should guide people that doing
a 1:1 mapping does not scale. Anyway, we can talk a bit offline.

Zafar: There is also another WG document (draft-ietf-teas-ns-ip-mpls);
are you going to take these two documents together toward WG LC?

Pavan: There are still open items in draft-ietf-teas-ns-ip-mpls. The
expectation is that these will be resolved before Madrid and then we
will look at progressing these two documents together. There may be some
text to be moved around these two documents but we will make the call
when we are done with the draft-ietf-teas-ns-ip-mpls document.

Zafar: Thank you. This is what I was expecting since there is some
dependency.

Adrian: To Zafar's point, the NRP was introduced in order to achieve
scalability. As we did the framework document, we were pushed to say
that one option is to have one slice per NRP. This is not saying it is
a good choice, but it is a possible choice, and it is up to the
deployment what it does. I think what Jie is trying to do with this
document is to say these are the scaling properties, and if you make
this choice this is the consequence. So it must be allowed that it
scales to thousands, but be aware that, if you do that, this is what
you get.

Pavan: We did have an interim sometime back and we did have some
discussion about what the maximum number needs to be, but nobody came
up with a number, so that is where we are. The only recommendation we
made is to just mention a minimum size.

04) 15:30 | 10 min | Title: Applicability of ACTN for POI service assurance

Draft: https://datatracker.ietf.org/doc/draft-poidt-teas-actn-poi-assurance/05/

Presenter: Paolo Volpato

Oscar: One of the goals of this document is to identify the existing
YANG models, but you mention later that it was not clear where to
address the gaps. What is the expected output?

Paolo: The document is describing the different cases for failure,
performance and protection in optical and IP networks. The idea ... We
have started this part of the analysis but we will try to complete it
by the next IETF. We will try to look at the different models to see if
everything which is currently available is enough to support the cases
described in the draft. If this is not the case, we will raise a flag
indicating which bit is missing in which specific YANG model. This is
our target for the moment, but we will accept more feedback from the WG.

Pavan: This seems fair. We did do a poll for this document last IETF and
the notes say that there was sufficient support for considering it for
adoption. Oscar and I will talk about initiating the process for the WG
adoption and see where it takes us.

05) 15:40 | 10 min | Title: DC aware TE topology model

Draft: https://datatracker.ietf.org/doc/draft-llc-teas-dc-aware-topo-model/04

Presenter: Luis Contreras

Pavan (as introduction to this presentation and the next presentation):
The next couple of presentations may not end up in the TEAS WG, but
they are presented here because there is some relevance either to work
that has been done in the past or is currently ongoing in the TEAS WG.
Let's discuss where these documents belong at the end of the
presentations.

Pavan: The last time you presented was at the San Francisco IETF ...

Lou: I do not see network resources in the YANG model, are you thinking
about tying these to the network resources?

Luis: Yes, there is no network here, but the idea is to associate this
with points of attachment to the network, and so with the nodes behind
which these network resources sit. For example, a DC gateway can be
associated with a node that allows the reachability of these resources.

Lou: So you are planning to add an attachment point and not to describe
the required network resources. What about the requirements an
application has of the network?

Luis: I am not sure it would fit here.

Lou: Ok, so this will be just limited to resources which are available.

Luis: Yes. The idea is that this model is to be consumed by the cloud
manager, which will know the needs of the application; with that
knowledge it can select the proper DC, so there is no need to express
the requirements of the applications here.

Lou: Ok. And, what about the available network resources?

Luis: that will be part of the linkage with the TE model

Lou: So we will have leafrefs from this model to other models?

Luis: Yes, I think so.
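The leafref linkage discussed above could be sketched as follows. This
is a hypothetical illustration only: the module name, prefix, and node
names are invented for this example and are not taken from the draft;
only the RFC 8345 base network model (ietf-network) is a real module.

```yang
module example-dc-aware-topo {
  yang-version 1.1;
  namespace "urn:example:dc-aware-topo";
  prefix dct;

  // RFC 8345 base network model, providing the generic node list
  import ietf-network {
    prefix nw;
  }

  augment "/nw:networks/nw:network/nw:node" {
    description
      "Hypothetical augmentation: tie a DC resource description to
       the network node (e.g., a DC gateway) it attaches to.";
    leaf attachment-node-ref {
      type leafref {
        path "/nw:networks/nw:network/nw:node/nw:node-id";
      }
      description
        "Reference to the node behind which the DC resources sit.";
    }
  }
}
```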

Pavan: Lou, do you have any thoughts on what would be the home for this
work?

Lou: I am not sure there is a good home. If you bring in the network
resources, maybe you can say it fits here.

Oscar: If we do Traffic Engineering of those and link them to
resources, it might fit here, but I still need to see the TE part in
this.

Lou: So I think your question really belongs to the AD

Italo: I have a question on slide 2: is the solution applicable only to
TE networks or to any type of network?

Luis: I would say it is generically applicable. The reference to the TE
network comes from the previous effort mentioned on this page (the
SF-aware TE topology model), but I think it can be generically
applicable also to non-TE networks.

Italo: Ok, so we can make it clear in the next draft update

Pavan: Let's discuss offline and see what can be done.

Oscar: There is some interest in this topic and we have had some
discussion. So I would not rule it out for the moment; I would expect
to see how we link these computing resources with TE, and then let's
see if it makes sense here.

06) 15:50 | 10 min | Title: OAM for Network Resource Partition NRP in SR

Draft: https://datatracker.ietf.org/doc/draft-gong-teas-spring-nrp-oam/00/

Presenter: Liyan Gong

Greg: I have shared my comment on the list and I think this is very
timely work that really fills the gaps.
My question is about the benefits of using BFD or STAMP to troubleshoot
consistency of NRP selector programming in the data plane.
I agree that using Echo request/reply in ICMP and LSP Ping is
reasonable, and it corresponds to how these methods (either Ping or
Traceroute) have been used to find problems and localize them in
troubleshooting. But this is not really what BFD or STAMP have been
designed for. If someone wants to monitor consistency of NRP, then the
same rules that you describe for BFD or STAMP would be applicable also
to data traffic.
Why would data traffic encapsulated with an unknown NRP selector ID not
generate the same message as suggested? The problem I see is that,
since this needs to be processed in the control plane, it could be used
as an attack vector by sending packets with a misprogrammed NRP
selector ID. I think it would be reasonable if these packets are
dropped; then, because the BFD session does not come up, the operator
can use appropriate tools to do the troubleshooting and localize the
problem.

Liyan: I understand your comment. For BFD we did not make any
extension; we just added the NRP resource availability check. In the
current flow there is no check on NRP resources; we just forward the
BFD packet or reply to it. But now we have to track it along the path
of the forwarding devices. For the second point, in the next slide I
will elaborate more on the round trip, where we have not extended BFD.
Maybe the BFD processing can be done in the control plane and it is not
necessary to extend it in the data plane.

Greg: We can take this discussion to the list, because my concern is
that you reference a "BFD reply" packet type which I am not very
familiar with, since BFD does not have request/reply messages.

Pavan: This is related to NRP and it is necessary work, but TEAS is not
the right home for it. We will work with you offline to figure out
where this needs to go.

Jie: I want to understand whether PING and STAMP are used to check the
availability per-hop or just at the egress node

Liyan: Per-hop along the path.

Jie: But PING and STAMP are only processed at the egress. If there are
no resources available in the transit node, how can it respond with a
PING or STAMP message?

Pavan: Please take this question to the list.

07) 16:00 | 10 min | Title: In-Place Bandwidth Update for MPLS RSVP-TE LSPs

Draft: https://datatracker.ietf.org/doc/draft-alibee-teas-rsvp-inplace-lsp-bw-update/01/

Presenter: Zafar Ali

Pavan: This is a shipping feature for us and it has been deployed in
large auto-bandwidth networks, where there is frequent churn in the
network. As said by Zafar, we could not agree on the open issue about
the error message. We are using the existing error code because the
procedure does not change. Zafar has a different opinion, so any input
on that would be greatly appreciated, maybe not here but on the list.

Lou: The ability to signal an in-place bandwidth decrease was in the
original RFC3209, so it is not clear what problem you are trying to
solve. There was an ambiguity in whether the error is destructive or
not, which was fixed quite a while ago in an update to RFC3209. We can
presume everybody is implementing that now; if not and they have an
issue, the answer is to go and implement that, because we have already
fixed that problem. I am trying to understand what problem we are
trying to fix here, and it seems to be an implementation problem. If
this is the case, at best we can have an informational document that
explains how to use the existing RFCs. Am I missing something here, or
is it more than just an implementation problem?

Zafar: Firstly, this is not for bandwidth decrease but for bandwidth
increase case.

Lou: There is already a procedure for that, which is make-before-break.

Zafar: Yes, but we are saying that instead of doing make-before-break,
you can do an in-place bandwidth modification. This is not just
bandwidth decrease, but both decrease and increase.

Pavan: RFC3209 always talks about using make-before-break for a
bandwidth increase, while for a bandwidth decrease there was always
room for doing it in-place. This is understood. What we are advocating
is to try to do it in-place even for an increase, if possible. The only
thing which is not specified in the standard is what a transit node
does when it receives an in-place bandwidth request which it cannot
accommodate.

Lou: This is based on the node implementation. Sometimes, if you are
doing an increase, you have already released the bandwidth so you
cannot just revert. I would have to go and look at what spec text is
there, but that is part of the description of in-place versus
removed-on-path error, and I do not know if there is text talking about
this specific one.

Pavan: It is a shipping feature and we did not see the need to
standardize it. But there are implementations which do not handle
in-place requests coming in, so this is an attempt to fix that.

Lou: It was left as an implementation choice previously, so are you
saying you do not want that implementation choice?

Zafar: RFC3209 only talks about make-before-break for a bandwidth
increase. If you look at slide 4, if you do an in-place bandwidth
update, R2 gets into an inconsistent state, because R2 did the
admission, which did not fail at R2 but failed at R3.

Oscar: Let's continue the discussion on the list.

Himanshu: We have a shipping product which has been deployed using the
same approach. For us there was no need to have a specific
configuration to enable in-place update (it is enabled by default), and
if it fails, MBB is used. It works and it has been there for a long
time.

08) 16:10 | 15 min | Title: Multipath Traffic Engineering

Draft: https://datatracker.ietf.org/doc/draft-kompella-teas-mpte/00/

Presenter: Kireeti Kompella

Andrew: I think it is nice to keep the slack concept. When you mention
that the DAG is a result of a CSPF output, I would actually drop the
CSPF term, because where you are invoking slack it is not actually
shortest path first. I would say it is the result of another
computation. The slack is a property of the thing that is doing the
computation, and I do not know if it actually needs to be standardized,
since different implementations may have different ways to compute
their slack. It is worth describing the concept of slack, the fact that
it can be non-ECMP, and how you bound it. I do not know if that needs
to be standardized.

Kireeti: You are right and we typically do not standardize the
algorithms. It is a good point so we will just put in a description and
then leave it like that. We can say there is value in this and
implementers may decide to do this and operators can ask for this, but
we do not standardize it.

Pavan: For those who were not at the side meeting, there is a lot to
unpack in a 10-minute slot, but keep looking at this space and there
will be more.

Adjourn 16:25