Minutes IETF124: nmop: Mon 19:30
minutes-124-nmop-202511031930-00
| Meeting Minutes | Network Management Operations (nmop) WG | |
|---|---|---|
| Date and time | 2025-11-03 19:30 | |
| Title | Minutes IETF124: nmop: Mon 19:30 | |
| State | Active | |
| Last updated | 2025-11-19 |
Network Management Operations (nmop) WG Agenda - IETF 124
- When: Mon, Nov 3, 2025
- Co-Chairs: Benoît Claise & Reshad Rahman
- Secretary: Thomas Graf
Current WG's Priorities
- P1: NETCONF/YANG Push integration with message brokers & time
series databases
- P2: Anomaly detection and incident management
- P3: Issues related to deployment/usage of YANG topology modules
- P4: Consider/plan an approach for updating RFC 3535
- P5: Knowledge Graphs
Compact Agenda
Session 1
| Slot | Priority | Topic | Presenters |
| :-: | :-: | :-: | :- |
| 14:30 - 14:35 | | Agenda Bashing & Introduction | Chairs |
| 14:35 - 14:50 | P1 | YANG-Push to Message Broker Integration | Thomas Graf |
| 14:50 - 15:00 | P2 | Validate Network Telemetry Messages Implementations | Thomas Graf |
| 15:00 - 15:10 | P2 | YANG data model for Network Incident Management | Qin Wu |
| 15:10 - 15:20 | P3 | SIMAP concept | Olga Havel |
| 15:20 - 15:35 | P4 | RFC 3535 20 years later | Luis Contreras |
| 15:35 - 15:55 | P1 | YANG Message Keys for message broker integration | Thomas Graf |
| 15:55 - 16:05 | P2 | AI-based Network Management Agent | Xing Zhao |
| 16:05 - 16:15 | P3 | Model for distributed authorization policy sharing | Lucia Cabanillas |
| 16:15 - 16:20 | | Generalized capability principles | Nigel Davis |
| 16:20 - 16:25 | | Graph based meta schema | Wim Henderickx |
| 16:25 - 16:30 | | Open | All |
Session 2
| Slot | Priority | Topic | Presenters |
| :-: | :-: | :-: | :- |
| 14:30 - 14:35 | | Agenda Bashing & Introduction | Chairs |
| 14:35 - 14:50 | P2 | China Telecom: Sharing your incident | Chongfeng Xie |
| 14:50 - 15:05 | P3 | SIMAP Hackathon results | Vivekananda Boudia |
| 15:05 - 15:20 | P5 | Knowledge Graphs Design Team update | Lionel Tailhardat |
| | | | Michael Mackey |
| 15:20 - 15:30 | P3 | draft-havel-nmop-simap-yang and side meeting summary | Olga Havel |
Session 1: Detailed Agenda
1. Agenda Bashing & Introduction (Chairs) (5 min)
2. YANG-Push to Message Broker Integration (15 min)
- Presenter: Thomas Graf
- Slides
- Draft: draft-ietf-nmop-yang-message-broker-integration,
draft-ietf-nmop-message-broker-telemetry-message
Benoit Claise as a chair: I believe I know what Mahesh is asking. Can
you briefly jump to slide 11? Mahesh, since the authors are asking for
an OPS directorate review, I asked Thomas to list all the document
references, their status and who the reviewers were. This helps the OPS
directorate choose a reviewer who is perhaps already familiar with the
topic and avoids having to read all the documents to understand the
context.
Mahesh Jethanandani: That's very helpful but that's not why I came
forward. Regarding the first slide where Paul asked to add references.
How stable are those references?
Thomas Graf: These are references to research papers.
Mahesh Jethanandani: So they are stable. Good.
Thomas Graf: There is actually also an IETF blog post on YANG adoption
in the industry from Benoit. However, it is a bit outdated so we opted
for these two references.
Diego Lopez: On behalf of Nacho, who was not able to join, I suggest
adding text to describe the relationship between Data Catalog and
Streaming Catalog.
Thomas Graf: That's a good idea. Will do this in the next revision.
Holger Keller (chat): Stream catalog vs. Data catalog. DT calls it data
catalog, streaming is Not yet "implemented" everywhere
Benoit Claise (chat): Holger: "data catalog" made me think of the YANG
catalog
Rob Wilton: Yes, I agree, the telemetry message should be a YANG data
structure.
Reshad Rahman: We had a chat in private. Can you add more on why a YANG
data structure is the right choice?
Thomas Graf: To my understanding, a container is meant for data used in
the context of a YANG datastore, and notifications are meant for
notifications out of a YANG datastore. A YANG data structure defines
data that is neither data from a YANG datastore nor a notification.
Since we are describing a telemetry message between a message broker
producer and consumer, we believe that a YANG data structure is the
right choice.
3. Validate Network Telemetry Messages Implementations (10 min)
- Presenter: Yannick Buchs
- [Slides](https://datatracker.ietf.org/meeting/124/materials/slides-124-nmop-sessb-validate-network-telemetry-messages-implementations)
- Draft: draft-ietf-nmop-message-broker-telemetry-message
Rob Wilton (chat): On the question of Informational vs Standards Track,
it should be Standards Track.
Mahesh Jethanandani (chat): I would agree with Rob that it should be
Standards Track.
Rob Wilton (chat): standards track because it is effectively defining an
API (the sx:structure message) between two separate devices.
4. YANG data model for Network Incident Management (10 min)
- Presenter: Qin Wu
- Slides
- Draft: draft-ietf-nmop-network-incident-yang
Reshad Rahman as an individual: I provided some comments on the document
last week. I understand it has not been updated yet. One of the things
which surprised me a little bit when I read the doc was if I remember
properly that the incident server creates the incident but the incident
client decides how to resolve it. I found that a bit odd because the
server, based on device alarms, finds issues and decides there's an
incident, correct? It was not clear to me whether the client has enough
information to resolve the incident or not. There is maybe a gap in the
document or my understanding.
Qin Wu: That is a good and valid point. I haven't thought that through,
but it depends on how the incident client and server are implemented,
whether they rely on OSS/BSS to resolve incidents, or follow an
intent-based networking approach. We will clarify this and make the text
clearer.
Chongfeng Xie: You mentioned the demarcation in multi-domain scenarios,
correct? It is based on the assumption that this incident is caused in
one domain, right? But maybe in some cases an incident affects multiple
domains. How do you deal with such scenarios?
Qin Wu: That's a good question. We gave some examples of how you can use
this incident in the YANG data model. If we consider a multi-domain OSS
or orchestrator that correlates all incident related information from
all domains, for example the optical and IP domains, an AI related
algorithm could be used to locate the incident. Happy to take your cases
as new input.
Nigel Davis: Regarding your question on why it is the client that
decides the incident is cleared: although the server may think all the
alarms have cleared, it's only the client that can get the external
view. For example, the fiber has been fixed, the repair crews have gone
and all the tests have been done, which the server doesn't know. So the
client has the broader picture, which is why it then clears the
incident.
Reshad Rahman: But what about a case if the client thinks that the
problem has been resolved but the server is still detecting any alarms,
maybe the wrong fiber was fixed.
Nigel Davis: Then the client shouldn't be able to clear the incident. It
should really be once all the alarms are gone. Then the client has the
additional responsibility of saying it's now really genuinely gone
because it has that broader picture. But you're right, it shouldn't be
able to do it when the alarms are present. And if it did, then a new
incident would come up immediately.
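The clearing rule discussed above can be illustrated with a minimal sketch. It is not taken from draft-ietf-nmop-network-incident-yang; the `Incident` class, its fields and the alarm name are hypothetical and only show the intended behavior: the client may clear an incident only once the server no longer sees any active alarm.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Incident:
    # Hypothetical model: alarms the server still detects for this incident.
    active_alarms: Set[str] = field(default_factory=set)
    cleared: bool = False

    def client_clear(self) -> bool:
        """Client request to clear; refused while any alarm is still active."""
        if self.active_alarms:
            return False
        self.cleared = True
        return True


incident = Incident(active_alarms={"los-port-1"})
refused = incident.client_clear()    # alarm still present, request is refused
incident.active_alarms.clear()       # server observes that all alarms are gone
accepted = incident.client_clear()   # client confirms using its broader view
```

In this sketch the server-side alarm state acts as a precondition, while the final decision stays with the client.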
Reshad Rahman: So I think that part in the doc maybe needs some extra
love.
5. SIMAP concept (10 min)
- Presenter: Olga Havel
- Slides
- Draft: draft-ietf-nmop-simap-concept
Mahesh Jethanandani (chat): Just want to make sure that the SIMAP side
meeting is not covering any WG-related work.
Reshad Rahman (chat): My recollection is that it's for
draft-havel-nmop-simap-yang i.e. not a WG document yet
6. RFC 3535 20 years later (15 min)
- Presenter: Luis Contreras
- Slides
- Draft: draft-ietf-nmop-rfc3535-20years-later
Benoit Claise: What do you expect from this working group? What I see
from the bullet points, gathering feedback from the working group, these
are operator requirements, right? So I go and say this is what I would
prefer, but I'm not an operator. And you also mentioned going back to
RIPE and NANOG. So knowing that there are more people than just
operators, what do you expect here from NMOP? Because right now it seems
like you're doing the work and you're going to validate with operators
only. So you see where I'm slightly confused?
Luis Contreras: I think whatever feedback is useful. At the end
operators set the priorities but the work should be done by the entire
community. So whatever feedback is useful for sure. But our point here
is that we are somehow stuck on gaining more feedback.
Benoit Claise: From NEMOPS you had a good list of requirements from
operators. I guess that you don't want more feedback about new
requirements here, you actually just want feedback on the priorities.
Luis Contreras: Yes, feedback on the priorities.
Rob Wilton: I was just going to comment that having feedback from the
vendors who are likely to implement these sorts of requirements might
also be interesting, especially on whether those requirements are likely
to be fulfilled or not. Having some sort of feedback coming back from
the vendors might be useful.
Benoit Claise: So you set requirements that you don't believe as a
vendor that the operators need.
Rob Wilton: No. Requirements that we don't believe as a vendor that we'd
be able to implement with current software and current hardware.
Benoit Claise: OK, with current hardware and software.
7. YANG Message Keys for message broker integration (20 min)
- Presenter: Thomas Graf
- Slides
- Draft: draft-netana-nmop-yang-message-broker-message-key
Reshad Rahman: For me, subject is a schema, right?
Thomas Graf: Correct.
Reshad Rahman: You try to get multiple schemas in the same topic.
Thomas Graf: An example is YANG-Push: one session, but it has multiple
subscriptions. Subscription = Subject.
Rob Wilton: The deduplication process, is it YANG-Push specific?
Thomas Graf: Yes, the schema comparison.
Reshad Rahman: It was not clear why you needed to go from one topic for
everything to one topic per schema.
Thomas Graf: You give me the segue. Slide 12 answers that. In short,
it's about reducing the amount of stored messages, since message keys
enable topic compaction, and reducing the number of messages to be
consumed, since only the messages of interest are consumed.
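As a rough illustration of why message keys reduce the amount of stored data, the following sketch compacts a simplified in-memory log by key. It is an assumption-laden toy, not the behavior of any specific message broker, and the key format (node name plus interface name) is hypothetical rather than taken from the draft.

```python
from typing import Dict, List, Optional, Tuple

# A record is (offset, key, value); this layout is an illustrative assumption.
Record = Tuple[int, str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the most recent record per key, preserving offset order."""
    latest: Dict[str, Record] = {}
    for record in log:
        _, key, _ = record
        latest[key] = record  # later offsets overwrite earlier ones
    return sorted(latest.values(), key=lambda record: record[0])


# Hypothetical keys: node name plus the YANG list key of the interface.
log = [
    (0, "router1:eth0", "oper-status=up"),
    (1, "router1:eth1", "oper-status=up"),
    (2, "router1:eth0", "oper-status=down"),
    (3, "router1:eth0", "oper-status=up"),
]
compacted = compact(log)
```

After compaction only one record per key remains, which is the current state of each indexed object. A consumer that reads the compacted topic processes two records here instead of four.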
Reshad Rahman: When you have multiple topics, do you worry about message
re-ordering? Are you leveraging timestamps?
Thomas Graf: Ordering is already built into the message broker itself.
There is a concept of message IDs.
Nigel Davis: How is compaction applied? Immediately or later?
Thomas Graf: Later, I believe this is called lazy compaction.
Nigel Davis: In lazy you can delay up to a defined value the time until
compaction should be performed.
Thomas Graf: This is exactly what the message broker usually does.
Nigel Davis: With compaction, I only have the current state. As a
message broker consumer, if I lost state, how can I re-consume the
current state?
Thomas Graf: This is done by changing the consumer group id. With the
consumer group id the message broker knows which messages were already
delivered to which consumer. As a message broker producer I have the
choice whether the topic is compacted or not and, if not, what the
retention time of that topic is.
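The role of the consumer group id can be sketched with a toy single-topic broker that keeps one read position per group. The `TinyBroker` class and the group names are hypothetical. The sketch also assumes that a previously unseen group starts from the earliest retained message, which in real message brokers depends on consumer configuration.

```python
from collections import defaultdict
from typing import Dict, List


class TinyBroker:
    """Toy single-topic broker that tracks one read offset per consumer group."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.offsets: Dict[str, int] = defaultdict(int)

    def produce(self, message: str) -> None:
        self.log.append(message)

    def poll(self, group_id: str) -> List[str]:
        """Return messages not yet delivered to this group and advance its offset."""
        start = self.offsets[group_id]
        self.offsets[group_id] = len(self.log)
        return self.log[start:]


broker = TinyBroker()
broker.produce("eth0 up")
broker.produce("eth1 up")

first = broker.poll("collector")      # both messages are delivered
again = broker.poll("collector")      # nothing new for the same group
replay = broker.poll("collector-v2")  # a new group id reads from the start
```

A consumer that lost its state can therefore re-read the retained (or compacted) topic by switching to a new group id.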
Nigel Davis: Is that written in the document?
Thomas Graf: No. Because that specific knowledge is how message brokers
work. In the document we describe how YANG and YANG-Push can be used for
indexing messages and naming topics.
Camilo Cardona: Please also use other examples than ietf-interfaces. I
do know it is an example everybody knows. I would fancy something like
quality of service with multiple queues, having two or more keys in a
list.
Thomas Graf: In slide 15, we mention the next steps. One of the items is
that we take YANG-Push subscription examples from the previous IETF 123
hackathon and see whether the specification for generating message keys
and topic names works.
Holger Keller: Obviously I like the idea. Regarding the creation and
consumption of topics and how this works. Should that not be mentioned
in the document as well?
Thomas Graf: You are probably referring to the automatic topic creation
(in Apache Kafka: auto.create.topics.enable), which is needed for the
message broker producer, and perhaps also the wildcard consumption on
the message consumer. I don't think this should be described further
since this is more related to how a message broker works and might
differ from implementation to implementation. The document scope is on
using YANG-Push and YANG to create message keys and how to index YANG
data in a message broker.
Reshad Rahman: I start a poll to understand how many people have read
the document. Yes 11, No 14, No Opinion 1
Rob Wilton: Great presentation. This is really interesting. I think I
need to go through the slides in more detail since you went through them
quite quickly. One of the questions which came up today at the NETCONF
working group: Why not do gNMI rather than YANG-Push Lite? I think this
is a great presentation explaining why it is important not just to get
the data out to the next hop but to integrate it into a wider system.
And if YANG-Push is inherently better than gNMI for various reasons,
that's a compelling reason to the market why this is a good solution,
because I think people will want to implement this because they get
other bits that come along with it rather than because it's there
already.
Thomas Graf: Absolutely. One example is the discoverability in YANG-Push
what we can subscribe. The same thing we also have with the Stream
Catalog which describes what can be subscribed on a Message Broker.
Rob Wilton: On another item. You mention "statistics" for one of the
three metrics because you are not always getting statistics with
periodical subscriptions. Maybe use a different term there. And you
mentioned partitions. I am not sure whether this is a layer violation
when adding or removing partitions from a topic. You may not want to do
that.
Thomas Graf: Ack on the term statistics. There are indeed documents
describing the challenges of changing the partition count through the
topic lifecycle, its implications and how to deal with it. Sure, we can
extend the document, describe these scenarios and perhaps refer to those
documents.
Reshad Rahman: On your first slide you mentioned no network access and
database access needed. The second part, no access to database needed,
surprised me a bit. Are you trying to remove database? Is that a goal? I
mean, I'm not questioning whether one should or should not have.
Thomas Graf: I can give you an example. When we developed anomaly
detection as a proof of concept, we ingested all the data into a
database. Since the data is queried periodically in near real-time,
there are a lot of queries, right? Therefore, databases do not scale
well for automation and analytics. Usually databases are used more so
that humans can access the data on demand, and not only the current but
also historic data. Streaming therefore scales better, and if possible
we want to have everything in streaming. This is where the industry is
heading, and there is also more and more of a blur between streaming,
storing the data temporarily in a time series database and storing it
long-term.
Diego Lopez: Just a short question. Maybe this might be extremely naive
or not. Have you considered other message brokers apart from Apache
Kafka?
Thomas Graf: Absolutely! We mentioned two message brokers, Apache Pulsar
and Apache Kafka, because Apache Pulsar is a bit newer than Apache Kafka
and everything which we discussed today works on both message brokers.
But there are also others. So most of the things are supported by most
message brokers. Topic compaction is not supported by everyone because
that's a bit more advanced, but these techniques are really generic. I
was showing the picture of NASA in the 60s at the beginning. What we're
doing here is no rocket science really.
Reshad Rahman as chair: We already asked this question many times. There
is no normative reference from
draft-ietf-nmop-yang-message-broker-integration to this document,
correct?
Thomas Graf: Correct!
Camilo Cardona (chat): @thomas, also since you have a length limitation,
maybe add a determination function using hashing to generate it for
paths that are longer.. you will hit that limitation, I promise
Holger Keller (chat): @camilo, yes, we already changed naming Convention
because of that. Feels like dos-times
Thomas Graf (chat): @Camilo and Holger, we already discussed this and
have raised an issue to be investigated:
https://github.com/network-analytics/draft-netana-nmop-yang-message-broker-message-key/issues/2.
In the document we used a conservative length and the hashing as
solution is already on our minds. We intend to propose solutions in the
next document version.
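The hashing idea raised in the chat could look roughly like the following sketch. It is not the solution planned for the draft. The function name `bounded_name`, the limit of 64 characters and the 16-character digest are arbitrary assumptions chosen for illustration.

```python
import hashlib


def bounded_name(path: str, max_len: int = 64, digest_chars: int = 16) -> str:
    """Return path unchanged if it fits, else a truncated prefix plus a hash suffix.

    max_len and digest_chars are illustrative assumptions, not broker limits.
    """
    if len(path) <= max_len:
        return path
    # The digest is computed over the full path, so the result is deterministic
    # and two long paths sharing the same prefix still map to different names.
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()[:digest_chars]
    prefix = path[: max_len - digest_chars - 1]
    return f"{prefix}-{digest}"


short_name = bounded_name("ietf-interfaces.interfaces.interface")
long_name = bounded_name("a." * 100)
other_long_name = bounded_name("a." * 101)
```

Short paths stay human readable, while long paths keep a readable prefix and remain unique through the hash suffix. The trade-off is that the original path can no longer be recovered from the shortened name alone.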
8. AI-based Network Management Agent (10 min)
- Presenter: Daniele Ceccarelli
- Slides
- Draft: draft-zhao-nmop-network-management-agent
Benoit Claise: A quick one. I was just writing down all the things that
I learned so far this week: AI agent, network management agent, A2C,
A2A, A2N. I was looking up the side meetings this week: AI network, AI
agent discovery, AI agent protocol. There is IRTF, there is IETF. I
believe at this time we might need guidance on where we work, right?
Because I'm sure we're going to receive the same type of AI related
stuff everywhere. So the question is: Is it a contribution for OPS?
Don't answer now, right? But this is like a generic statement for the
week.
Daniele Ceccarelli: At least it means that there is interest in it.
Rob Wilton: So maybe rather than an IETF, we need an AITF. On slide 7,
about what's the best way of doing it: instead of putting it into the
document, we should put it somewhere else, on GitHub, and do it
dynamically. I suggest using something which is already widely used in
the industry such as MCP and A2A. And I think that if we try and do
something like NETCONF YANG, we'll be so far behind by the time we get
it done. We have to follow the flow and know that it is changing. Hence,
for this document to be useful, I think it should specify some sort of
requirements and behavior, but the details of how it is done might be
changing quite dynamically.
Daniele Ceccarelli: I completely agree with you.
Joe Clarke: It is hard to follow AITF. I have actually been building
agents for the IETF NOC here. I tend to agree on MCP. In that regard, I
think making it independent makes more sense to me because it allows me
to plug in things and I can have an agent that understands controller A
and controller B. And the other thing I would caution you against is
going to closed loop. Don't make it mandatory that the loop is always
closed. There should be the ability to have oversight if needed. And the
goal shouldn't always be to autonomize everything.
Daniele Ceccarelli: You are absolutely right. So with the term closed
loop, I don't mean something which necessarily keeps the operator out of
the loop. There could be cases where there are just insights that are
provided to the operator and the operator decides what to do. I don't
know if you call it closed loop or not.
Mahesh Jethanandani: All right. I just wanted to acknowledge that I'm
glad the question came up on AI early in the week on Monday. So
hopefully by Friday we have an answer.
Chongfeng Xie: What function can be provided by the AI agent controller?
Daniele Ceccarelli: Anything from analyzing the configuration of the
network to implementing any type of use case.
Chongfeng Xie: This requirement is not clear to me yet.
Reshad Rahman as chair: Please follow up on the mailing list.
9. Model for distributed authorization policy sharing (10 min)
- Presenter: Lucia Cabanillas
- Slides
- Draft: draft-cabanillas-nmop-authz-policy-sharing-model
Daniele Ceccarelli as IVY chair: So first of all I beg your pardon but I
missed to read the document. I'm just asking based on the presentation
that you just gave. I was wondering about the relationship between this
work and what is done at IVY working group with respect to capabilities
entitlements.
Lucia Cabanillas: To be honest, I was not aware of that. Let's talk
offline.
Diego Lopez: Potentially yes but it's not intended to. I mean the point
is that when we're talking about capabilities we're talking about what
you can do but not how you are controlling. Some policies may refer to
capabilities. If we have a formal mechanism for expressing capabilities,
we could refer policies to capabilities. But not yet. The main intention
here is to have a focal point. I think of this as a little bit like
going a step forward with SDN etc., where you have a central control
point that distributes a configuration. Something similar with policies
and making it machine readable. That's roughly speaking the idea.
10. Generalized capability principles (5 min)
- Presenter: Nigel Davis
- Slides
- Draft: draft-davis-nmop-generalized-capability-principles,
draft-davis-ivy-equipment-capability-application
Reshad Rahman: How is this related to RFC 9196 (systems and notification
capabilities)?
Nigel Davis: We want to build on top of RFC 9196 and make it more
generic.
Rob Wilton: When people say capabilities, I think they mean many things
to many different people. So I think it's a complicated problem space
and there are certainly some aspects of this work which definitely need
to be addressed. Some of that is already being addressed. My one fear
here is that you end up creating a really complicated system and that
becomes too complicated at one point. And I think you're right in saying
you need to do these two things in parallel. You need to bring an
implementation and solve simple problems at the same time. And in this
whole problem space, I would try not to solve everything. I would try
and solve the low hanging fruits first.
Nigel Davis: Absolutely. I think if we get the framework right we'll
have this fractal recursive thing we can keep opening up and opening up
as far as you want to go without it being horribly complicated on the
surface. So that's the sort of thing we're looking for.
11. Graph based meta schema (5 min)
- Presenter: Wim Henderickx
- Slides
- Draft: draft-henderickx-meta-graph-schema
Mahesh Jethanandani: Thanks for bringing this work in the OPS area and
specifically into NMOP. NMOP generally tends to be the entry point for
such work but usually the destination is somewhere else and in this
particular case I believe this work belongs in a work group that is not
quite formed yet.
Wim Henderickx: That's exactly the question. If you look to the there.
In my view the mechanism is something to be standardized. And then of
course at NMOP itself how you consume and how it's being used. So the
main question is where does it belong in IETF.
Nigel Davis: I would like to join the draft. I just want to highlight a
couple of pieces of work that were done in ONF at TMF, what we call the
component system platform, which I think is very much the same sort of
shape and purpose as you've got. It's a meta schema essentially for a
more specific model, but we ended up using hypergraph and hyperedge
rather than just graph because we found that it was a more powerful
mechanism.
Wim Henderickx: So just to be clear, I personally did the
implementation. I know another company who actually built upon the same
basis. It's actually used at scale in certain things. But I am happy to
compare the hyper graph.
Michael Mackey: I was just wondering what you described earlier on the
graphing and everything else it looks exactly like RDF and I'm wondering
why you're coming up with your own type.
Wim Henderickx: RDF tries to standardize the whole entity whereas what
we try to do is decouple a very small layer to connect that data.
Michael Mackey: I wouldn't agree with that.
Wim Henderickx: Okay, we can discuss that. But for example the other
multi-actor operation stuff is not described in RDF at all. So there are
a number of things in this graph that RDF doesn't really talk about at
all.
Michael Mackey: Okay. But that's not a graph schema then.
Wim Henderickx: No, it's the set of attributes and operations that you
use to actually interact with the data in various ways in order to
offload some of the capabilities from the client to the server. That's
not described in RDF at all.
Michael Mackey: What I would like to point out is that we had a
hackathon where we translated YANG models directly into RDFS schemas. We
have instance data and connected it to a graph. We have a session at
NMOP on Wednesday.
Wim Henderickx: What I'm trying to say is that's the data but then the
question is what do you do with the data. That's what I talk about the
operational side of things and so what you see is that they typically
describe the data concept but not so much how you interact with it.
12. Open Mic (5 min)
Session 2: Detailed Agenda
1. Agenda Bashing & Introduction (Chairs) (5 min)
2. China Telecom: Sharing your incident (15 min)
- Presenter: Chongfeng Xie
- Slides
- Network Incident
Dan Voyer: Great presentation. I saw knowledge graph in your slides, and
we have a few knowledge graph documents in the working group. Did you
build a knowledge graph and how does it align with the documents that we
already have in the working group?
Chongfeng Xie: Currently, since the standardization process has not been
finished, we used a proprietary solution for the definition of the
knowledge graph.
Thomas Graf: Great presentation. Just to follow up on what Dan just
said. In the anomaly detection documents, we have the architecture, the
semantics and also the life cycle, and at the working group we are
working on the knowledge graph. You probably saw in the architecture
that we also intend to work on postmortem systems to have statistical
information from the anomaly detection and the network incidents. We
envision that when this information is gathered, we can in the end get
additional insights through knowledge graphs and large language models,
and then go over the life cycle and do optimizations and refinements. So
to answer your question, it's in scope. I think that's the right
direction.
Thomas Graf: Maybe one question on the slides. It's slide number four.
What was the reasoning, why did you go directly to packet captures?
Chongfeng Xie: As mentioned, that was the first step. We tried to ping
from the base station to the mobile core, but the network seemed normal.
So we did packet capturing at the downstream next to find the underlying
problem. It shows that our endeavor was successful. We found that the
source address in the data plane of the packet was empty.
Robert Wilton (chat): Chongfeng, when you said 930 AI agents. Are all of
these AI agents doing different roles/configuration, or are some of
these agents the same, and you have a higher number for
scale/performance reasons. Oh, and thanks for a great presentation!
Chongfeng Xie (chat): There have been multiple types of AI Agents
developed, some are doing fault diagnosis, some are for network routine
inspection, some are for network optimization, and Network agents are
also tied to their network type, different types of network may have
their AI agents, but the number of '930' is about the total number of AI
Agents in CT's product network currently.
Thomas Graf (chat): @Chongfeng, Can you detail what does data quality
mean in this context. Are you referring to schema and semantics? Or
delay and loss of data? Or data integrity?
Chongfeng Xie (chat): Data quality in this context is about the corpus
itself. Being high quality means that the corpus is large-scale,
accurate, professional and multi-type (image, video, CoT, evaluation
data etc.). The quality of the corpus will influence the performance of
the Network Large Model directly. To guarantee the quality of the
corpus, several processes will be applied, such as text extraction,
corpus cleaning, deduplication, desensitization and quality assessment.
Thomas Graf (chat): @Chongfeng and @Dan, To detail on the points on
knowledge graph, Agentic AI and LLM's. The goal of Network Anomaly
Detection is fault identification, impact scope analysis and identifying
the causality chain. In
https://datatracker.ietf.org/doc/html/draft-ietf-nmop-network-anomaly-architecture-05#section-2.3
we are describing the principle of knowledge based detection. In
https://www.linkedin.com/pulse/agentic-ai-its-network-analytics-applicability-thomas-graf-kirle/
I am describing the importance of structured data and its semantics, how
ontologies help humans and machines interact, knowledge graphs
to store knowledge and establish relationships and how LLM's with
Agentic AI can make use of it. In
https://datatracker.ietf.org/doc/html/draft-ietf-nmop-network-anomaly-lifecycle
we describe how that knowledge can be refined. Therefore I see that the
next step in Network Anomaly Detection is to detail the scope of the
Postmortem system. Besides operational and analytical metrics which
knowledge needs to be stored, linked among and graphed as well. From
there we can explore how that information can be used with Agentic AI to
refine and optimize the detection rules and the symptom and causality
information.
Chongfeng Xie (chat): Thank you for providing the related documents,
which have laid a wonderful foundation for anomaly detection in network
operation. I agree with you, as a next step we can explore how the
existing information can be used with Agentic AI to optimize anomaly
detection and handling.
3. SIMAP Hackathon results (15 min)
- Presenter: Olga Havel
- Slides
- Side meeting updates
No comments
4. draft-havel-nmop-simap-yang and side meeting summary (10 min)
- Presenter: Olga Havel
- Slides
- Hackathon results
Summary: Discussion on a programmatic way of representing profiles in
YANG, and on whether or not the authors should wait until this is
addressed by the NETMOD working group.
Reshad Rahman: I have a question to Kent. I was not aware about the
profile work until this morning. Are you aware of the profile work at
all?
Kent Watsen as NETMOD chair: No, not really following.
Mohamed Boucadair as AD: I thought at some point this was part of the
templating part as well and the profiling was one of the requirements of
the templating effort. There was some work about the templating here and
the profiling was part of that, but maybe I'm mistaken on the actual
requirement.
Kent Watsen as NETMOD chair: You are referring to the template
requirements issue tracker? I actually don't recall it referring to
profiles but I'll check it again.
Reshad Rahman: I remember a template discussion in NETMOD. I follow
NETMOD and I don't recall profiles.
Kent Watsen as NETMOD chair: I don't recall it either.
Italo Busi: Just a quick comment on profile. We have a draft in TEAS.
The big issue is making it programmable because I saw a lot of
implementations which do a profile of RFC 8795 because the YANG module
is too big. Those who implement the YANG module implement just a subset
of the leaves but we think maybe this is a more general problem if you
have a YANG model which has 10 leaves and you have an application where
you need only eight. Does it mean that we need to create another YANG
model which only defines 8 leaves or we say for this use case you have
only to implement this part and the other two you can take it out and
maybe this is a more generic problem. We can discuss at NETMOD.
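The "10 leaves, only 8 needed" example can be made concrete with a small sketch that treats a profile as an agreed allow-list of leaves. This is only one possible reading of the discussion, not the mechanism of the TEAS draft or of RFC 8795. The leaf names `leaf-1` to `leaf-10` and both helper functions are hypothetical.

```python
from typing import Dict, Set


def apply_profile(instance: Dict[str, object], profile: Set[str]) -> Dict[str, object]:
    """Keep only the leaves that the agreed profile lists."""
    return {name: value for name, value in instance.items() if name in profile}


def outside_profile(instance: Dict[str, object], profile: Set[str]) -> Set[str]:
    """Report leaves present in the data that the profile does not cover."""
    return set(instance) - profile


# Hypothetical model with 10 leaves and a profile that needs only 8 of them.
instance = {f"leaf-{i}": i for i in range(1, 11)}
profile = {f"leaf-{i}" for i in range(1, 9)}

trimmed = apply_profile(instance, profile)
extra = outside_profile(instance, profile)
```

Expressing the subset as data rather than as a second YANG module is what would make the profile programmable, at the cost of needing tooling that understands the allow-list.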
Reshad Rahman: Italo, yes please send an email to NETMOD.
Olga Havel: Number 2 on slide 17 is a core question for everyone here.
Benoit Claise: Sorry for the interruption, Kent and Italo we're
discussing profiles here.
Olga Havel: So my understanding, but Italo please correct me, is that if
you have a model that is very thorough and you need a much simpler
model, what you do is you copy and paste, cut and take bits out of that
model and then the operators implement that model. But I have
questions about tooling, versioning, library, capabilities, discovery,
you know, so is it part of the IETF approach or is it something that is
kind of just done for TE?
Italo Busi: No, I don't think people have basically cut anything out of
the model. It's just that some of the operators are not using leaves
which are optional; they are simply not instantiated in the
implementation. You just implement the limited set of leaves that you
need. I think it's a generic issue because the YANG models have broader
applicability and maybe some attributes are not needed in some
deployments or in some applications.
Olga Havel: But what's the definition of the model? Which model?
Italo Busi: People agree before implementation on a subset of RFC
8795. You just do not implement all the optional leaves. So you don't
get the information.
Kent Watsen: And this is what we were talking about on the side. It
could be addressed in two ways. One is in YANG with existing
capabilities. As you're using a grouping, you could refine it and put in
an if-feature statement that can never be satisfied, like "foo and not
foo". So basically it eliminates the ability to configure that leaf.
Kent Watsen: In YANG next, one of the items is the ability, when you're
using a grouping, to say certain leaves are not inherited. And in the
template reqs discussion, I think there was one idea, when you're using
a template, to delete certain nodes from the template that you're
inheriting, but I'm pretty sure that idea was downvoted so it may not
have been carried forward or picked up in the current work. We can pick
it up offline.
Olga Havel: But in any case, do you think it should be in TEAS or should
it be in NETMOD as a generic solution?
Mahesh Jethanandani as AD: On the question of which working group: it
depends on whether it is a language or a modelling issue. If it is a
modelling issue, I think TEAS is still the right place. If it is a YANG
language issue, then it would be NETMOD.
Olga Havel: But even if it's a modeling issue and it claims to be
generic, let's say that can be applied to the data center domain, should
it be done in the TEAS working group or should it go to NMOP?
Reshad Rahman as chair: I think the answer that Olga is looking for, and
that we're looking for as a working group, is: do we use RFC 8345 or do
we look at RFC 8795, which uses TEAS profiles, which for most of us were
an unknown until recently? I think on Olga's slides she pointed out some
gaps, such as no versioning support. As chair I'm very reluctant to say
you should follow profiles today when, as far as I know, there have only
been discussions on profiles in TEAS.
Reshad Rahman as chair: Where I stand today is to stay on RFC 8345, but
that might change when there is more discussion on profiles in the
context of NETMOD. That doesn't mean we will ask you to change what you
have done, but maybe there will be a new document. At this moment,
waiting for profiles is something I'm not comfortable with at all. But
maybe people know more about profiles than I do. So please speak up.
Kent Watsen: When Med mentioned profiles with templates, I was just
saying that templates could be used as a way of implementing profiles.
So perhaps look at the template work in NETMOD and see if that would
address your concern.
Olga Havel: I'm personally not proposing profiles. That was proposed by
TEAS and they're saying that profiles should be used. My question is
about how I use them, because they're not part of YANG. I would
therefore prefer not to use them until they become the agreed way to do
it in a generic way.
Kent Watsen: When you say agreed, do you mean published?
Olga Havel: If I'm an application developer, I'll go to the YANG module,
and if you are saying that I need only to understand 10 attributes, my
formal definition of that YANG module is still thousands of attributes.
I don't understand what the model definition is then. Do I cut the model
into smaller models, or is it just instantiation and instance examples
that are shown as a simplification of that model? How do we define that
interface for the big model if we want to allow some kind of profiling,
and how do we understand versioning, capabilities, etc.?
Kent Watsen: I think it's probably something to do in the uses
statement. If it's a grouping and you're using the grouping, then in
that uses statement you want to be able to say that only a subset of it
should be carried forward. How to do that right now? Phil Shafer
actually mentioned maybe you could use a status deprecated or status
obsolete. I had this idea of putting in the if-feature statement "foo
and not foo". So there are different ways of removing things, but
nothing's very clean.
Olga Havel: But they're saying per use case. So you can have hundreds of
use cases. Would you have hundreds of features?
Kent Watsen: I think if you have 100 use cases then you have 100
groupings.
Michael Mackey: I could almost withdraw my question because it's mainly
answered already by Olga and Kent's discussion. I just want to
reinforce that if you suddenly have to trawl through a very large
model, I may consider a part optional but somebody else might not, and
you end up in a mess, especially when it's very large and you've got
dependencies between the parts. Groupings, okay, but maybe we're
talking about an existing model where groupings might not even exist and
your use case might not match my use case. Whereas the SIMAP model, of
which we've seen a picture from the knowledge graph approach, is very
generic, very simple and very navigable.
Aihua Guo: We are describing the profiles, but it's out of the scope of
the draft to describe a programmatic way of defining the profiles. You
asked how we do that today: we pick the attributes we want to support
and that's it.
Olga Havel: So it's some kind of description, user manual or something
like that.
Aihua Guo: But I still think in the future if we have a programmatic way
of describing profiles that will be better. We can follow the profiles
and create an implementation that applies to any model not just the TE
topology profiles.
Italo Busi: I am aware of several instances of profiles. I fully agree
on following up on a programmatic way. We don't want a YANG model per
use case.
Benoit Claise as chair: The key question is how long shall we wait for
this programmatic way, using RFC 8345 or RFC 8795 as the basis. Actually
we asked the very same question almost a year ago. It is something
that we are repeating. So that's really the key question. How long do we
wait?
Olga Havel: If the community agrees that profiles are acceptable. The
agreement was that Italo would look at the SIMAP requirements, identify
which ones are supported by RFC 8795 and then we would compare the
approach in our draft with how to define profiles in the textual form
and make a decision.
Benoit Claise as chair: But we are back where we were 9 months ago. Let's
see how we are progressing in NETMOD.
Reshad Rahman as chair: I'm very reluctant to wait for something which
hasn't even started at NETMOD. Now I'll go and read the profiles draft
next week. So maybe my opinion will change then. For me RFC 8345 is
still the way to go. I still encourage the profile work to be
programmatic.
Italo Busi (chat): I do not think having a programmatic solution is a
blocking issue but a facilitator. Implementations already exist with no
need for a programmatic solution.
5. Knowledge Graphs Design Team update (15 min)
- Presenter: Michael Mackey
- Slides
- Design team updates
Reshad Rahman: These three existing documents were kind of the impetus
for forming the design team. I've attended some of those calls. Have you
guys decided whether it's going to stay as is, or you don't know yet and
keep going?
Michael Mackey: I would say don't know yet. We did discuss this this
week. I was keen for the draft that I'm part of because it's really
almost like a white paper of the problem. It's not really offering any
solution. It's offering possible technologies and solutions that could
be applied, but it's really just trying to outline what the problem
is. I was hoping that maybe it could be moved a bit forward, but there's
a question of whether or not to do that before we have agreed the
deliverables for the design team. So for that reason I think it's TBD.
Benoit Claise: A question that I've been debating with myself is: Yes,
we could do work in the IETF, but under what circumstances? Which
process? So let's say you developed an ontology, would it be an RFC? No,
because it has to evolve. So that's something that we'll have to take
some time on. It's easy to have something in the charter and say we
should do another draft. There is something which is half open source in
it and it evolves. So we don't want to make the same mistake as we have
made with YANG and with Semver.
Michael Mackey: My personal opinion, not the design team, is that maybe
the type of relationships that we want to track via the use cases could
all be defined just from a semantic point of view. This is the
relationship we're trying to discover. These are some examples of those
types of relationships. The implementation can be external.
Reshad Rahman: A question to our AD. I think you sent an email regarding
ONIONS this morning. Do you think possibly, yes, no, don't know, whether
this kind of work could happen in the context of ONIONS or is it too
early to say?
Mahesh Jethanandani as AD: I think it's a little too early for me to say
if that would be part of ONIONS. I think the ONIONS charter is still
being worked upon. But given the question of open source and that it
is constantly evolving, I wonder whether from a standards perspective
IETF is the best place.
Benoit Claise: That's what we've been discussing. On one side we are
networking experts; on the other side we have a process that is too
rigid. So if not done here, where then?
Mahesh Jethanandani as AD: That's a valid question. I was probably a
little more focused on the challenge you were talking about. Where
there are models, they're fragmented, they're inconsistent. That problem
certainly belongs to IETF, and that's something that doesn't work.
Michael Mackey: We're starting with YANG models, but there's more than
YANG. There's BMP, IPFIX etc.
Mahesh Jethanandani as AD: So I'd certainly like to confer with Med also
and see whether it makes sense to try to standardize this or at least do
the work in IETF.