Network Management Operations (nmop) WG Agenda - IETF 121

Current WG's Priorities

For P4 items: We propose to submit those to the forthcoming NEMOPS
Workshop. We will discuss later, after the NEMOPS Workshop takes place,
whether to keep this item in the charter or remove it.

Compact Agenda

Session 1

| Slot | Priority Label | Topic | Presenters |
| :-: | :-: | :-: | :- |
| 09:30 - 09:40 | | Agenda Bashing & Introduction | Chairs |
| 09:40 - 09:50 | P3 | Pick a Concept Name: Digital Map or not Digital Map | Chairs & Adrian |
| 09:50 - 10:10 | P3 | Digital Map: Concepts & Requirements | Olga |
| 10:10 - 10:40 | P1 | YANG-Push to Message Broker Integration | Thomas |
| 10:40 - 10:45 | P2 | Next Steps for the terminology draft | Chairs |
| 10:45 - 11:00 | P2 | Incident Management YANG Module | Qin |
| 11:00 - 11:20 | P2 | Follow up to the Anomaly Interim | Vincenzo/Alex |
| 11:20 - 11:30 | | Flash Teasers | Robert/Rob/Xing/Diego |

Session 2

| Slot | Priority | Topic | Presenters |
| :-: | :-: | :-: | :- |
| 18:00 - 18:05 | | Introduction | Chairs |
| 18:05 - 18:20 | P1 | Validate Configured Subscription YANG-Push Publisher Implementations | Yannick |
| 18:20 - 18:45 | P3 | Digital Map Hackathons | Sherif/Henry |
| 18:45 - 18:55 | P2 | Network Anomaly Lifecycle Hackathon | Vincenzo |
| 18:55 - 19:00 | | Knowledge Graph for Network Operations Hackathon | Mike |

Session 1: Detailed Agenda

1. Agenda Bashing & Introduction (Chairs) (10 min)

2. "Digital Map"

2.1. Pick a Name for the Concept (10 min)

Thomas: It's great. A new term makes it easier. Would suggest plural,
i.e., "simaps".

2.2. Concepts & Requirements (20 min)

Italo: Regarding multi-layer versus topology hierarchy: are we going to
talk also about navigation between different abstraction levels, or only
about navigation between different network layers (L3, L2, etc.)? That
should be defined first, before defining a name.

Olga: It's between nodes, links, etc. I think it's both, but more about
abstraction layers.

Italo: My point is whether there are different layers, or whether for a
single layer we see dependencies on other layers.

Olga: I think it is both. When we say multi-layer, that includes
hierarchy. Let's continue the discussion on the mailing list.

Mahesh: The subsections in section 3 are going more into requirements.

Olga: We'll change to "the user retrieves" rather than "the user should
retrieve".

Mahesh: It appears that only 4 are being defined. The others are empty.

Med: We actually have open issues for this.

Olga: Only four use cases because this is still being looked at. One
approach: business value. Another approach: API use cases (when you need
read, write, etc). Will continue discussion on the list.

Rob: Since the document introduces the topic, diagrams might be helpful
for understanding.

Olga: It is covered in one of the requirements.

Rob: Most of the requirements are northbound related, towards the user
who consumes it. Do we have to describe anything on how to retrieve the
data from network devices? Not sure whether it is a requirement. Worth
mentioning it.

Olga: I would say it's outside the scope. We assume IETF modules will be
used to retrieve the different technologies, but focusing on the
northbound interface here.

Med: If I understand Rob correctly, Rob would like to ensure that the
topology data is mappable towards the operational network models.

Rob: The commonality of defining SIMAP is about saying "we expect
retrieval using one of these mechanisms".

Olga: We did a hackathon item on IS-IS topology. The challenge is in
multi-vendor environments.

Rob: For the hackathon, could say "the vendor is expected to use
OpenConfig models, for example".

Olga: If we could say IETF it would be easy; however, reality is
different.

Benoit: Not sure we can add these requirements within IETF.

Rob: It is not a requirement, more like guidance.

Med: It is not only about YANG data. It can also relate to other kinds
of data.

Olga: We moved the requirements from the other documents to this
document so that we can reach agreement more quickly.

Italo: I like that it is solution driven. I think we can discuss
one-by-one, "what is required, rather than how it should be done".

Olga: We will put each item on the mailing list.

3. YANG-Push to Message Broker Integration (30 min)

3.1. An Architecture for YANG-Push to Message Broker Integration (20 min)

Benoit: Is there something you expect from this WG?

Thomas: We need a very good wrap-up on the YANG-Push notification and
capability side. Would be great if you could review the documents, give
feedback, address open points now, so we can move on up the chain.

Med: Is there any specific point in this draft that's worth zooming
into? There's a lot in this document, especially for those unfamiliar
with this area.

Thomas: On the subscription part of YANG-Push, discovery, notification
and transformation part, would be good to get feedback from the WG.
Right now we have many implementations going on so the earlier we can
address the feedback the better.

Med: We have a strong dependency on what's going on in other WGs. How is
progress of other documents?

Thomas: For 1.5 years we have been discussing the YANG-Push notification
header, and we now have agreement that we want an entirely new YANG
module. But since everything else depends on this header, it's important
we come to an agreement. That's the focus.

Benoit: As a remark, we asked specifically that NMOP be scheduled before
NETCONF, since, as Med mentioned, the document gives a good overview.

Rob: Not sure if in scope, but one area of the architecture that isn't
clear to me is how you get the YANG schemas that come off the device
into Kafka, and validate the data all the way through, and what the
encoding is. E.g., is JSON the right encoding? This is the "elephant in
the room" issue that we should spend time on (somewhere, maybe this
document).

Thomas: I think it fits into this architecture document. The encoding is
not normal JSON, it includes namespaces. For the next revision I will
point some solutions out.
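
To illustrate the encoding point: in the JSON encoding of YANG data (RFC 7951), member names are qualified with the YANG module name wherever the namespace changes, so a generic JSON consumer has to handle keys of the form `module:name`. The following is a minimal Python sketch only. The payload is illustrative and the helpers `split_qualified` and `modules_used` are hypothetical names; nothing here is taken from the draft or from any implementation discussed in the session.

```python
import json

# Illustrative payload only; not taken from the draft or any implementation.
payload = json.loads("""
{
  "ietf-interfaces:interfaces": {
    "interface": [
      {"name": "eth0", "enabled": true, "ietf-ip:ipv4": {"enabled": true}}
    ]
  }
}
""")


def split_qualified(member):
    """Split a member name into (module, name); module is None if unqualified."""
    module, sep, name = member.partition(":")
    return (module, name) if sep else (None, member)


def modules_used(node, found=None):
    """Collect every module name that appears in namespace-qualified keys."""
    found = set() if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            module, _ = split_qualified(key)
            if module:
                found.add(module)
            modules_used(value, found)
    elif isinstance(node, list):
        for item in node:
            modules_used(item, found)
    return found


print(sorted(modules_used(payload)))
```

A consumer behind the message broker could use the collected module names to decide which YANG schemas it needs in order to validate a message, which is one possible reading of the schema question raised above.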

3.2. Operator/Implementer Contribution: YANG-Push Next Steps (10 min)

Mahesh: As you were talking about implementation support from different
vendors, I was trying to understand "on change". Is there a specific
issue there?

Thomas: We got feedback from almost every implementer that on-change is
hard to implement because of patch-id. It's hard to keep track of the
state and its changes. For example, when an interface goes down, often
more context about that interface is needed (rather than just notifying
the interface name). Keeping track of the state and applying changes is
challenging to implement, on the YANG-Push publisher side and equally on
the YANG consumer side.

Mahesh: In gNMI, there is a concept of target-defined, in which, if a
subscription request is at a container level and something inside
changes, we just send the node that is marked on-change. But I can see
how providing a wider context would be useful to send as part of
on-change.

Thomas: I think the key is to look at the entire end-to-end chain with
our engineering choices. Sometimes we tend to solve it on one side easily
and create a problem on the other side. Good that we have the end-to-end
chain, and multiple implementations and operators, and now we're getting
good feedback where we can propose changes.

Andy (chat): Agree with all problems raised with YANG-Push.
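
The state-tracking difficulty described in this discussion can be sketched as follows. On-change updates in YANG-Push carry a YANG Patch (RFC 8072) with a `patch-id` and a list of edits, so a consumer can only reconstruct the full context of a changed node if it already holds the earlier state. This is a deliberately simplified Python sketch under stated assumptions: the cache is a flat dictionary keyed by the edit target, `apply_patch` is a hypothetical helper, only a subset of the patch operations is handled, and the error semantics of `create` and `delete` are ignored.

```python
def apply_patch(state, patch):
    """Apply the edits of one YANG Patch-like dict to a flat {target: value} cache.

    Simplified: 'create' is treated like 'replace' and 'delete' like 'remove'.
    """
    for edit in patch["edit"]:
        op, target = edit["operation"], edit["target"]
        if op in ("create", "replace"):
            state[target] = edit["value"]
        elif op == "merge":
            merged = dict(state.get(target, {}))
            merged.update(edit["value"])
            state[target] = merged
        elif op in ("delete", "remove"):
            state.pop(target, None)
        else:
            raise ValueError(f"unsupported operation: {op}")
    return state


# Hypothetical prior state held by the consumer.
state = {
    "/interfaces/interface=eth0": {
        "name": "eth0",
        "oper-status": "up",
        "description": "uplink",
    }
}

# Hypothetical on-change update: only the changed leaf is sent.
patch = {
    "patch-id": "1",
    "edit": [
        {
            "edit-id": "e1",
            "operation": "merge",
            "target": "/interfaces/interface=eth0",
            "value": {"oper-status": "down"},
        }
    ],
}

apply_patch(state, patch)
print(state["/interfaces/interface=eth0"])
```

Without the cached entry, the consumer would only learn that `oper-status` is `down`, with no description or other context for the interface, which is the gap the feedback above points at.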

4. Anomaly Detection and Incident Management (40 min)

4.1. Next Steps for the Terminology Draft (Chairs, 5 min)

Adrian: We need to draw a line, and either get the last few terms agreed
or taken out again.

Thomas: The proposal makes perfect sense. The document is in good
shape. I only have one minor comment. Will comment on the mailing list.

Benoit: I looked at the draft with a fresh pair of eyes, good job
Adrian. I like the workflows there. When I was thinking of new terms,
they helped. One observation: a reference to Wikipedia is not ideal :-)

Rob: What's the rush to publish? I understand that the working group
wants to finish and move on. However, could we keep it open until the
first document that references it is ready to publish, and publish them
together?

Med: End of January?

Benoit: Anyone not happy with the plan?

No one objected.

4.2. Incident Management YANG Module (15 min)

Benoit: What is the scope of the document? Does it include service?

Qin: Other SDOs also include service for incident management. Service
is not in scope of incident management.

Thomas: I suggest focusing on causes and symptoms for incident
correlation.

Thomas: Incident management defines incident notifications. Anomaly
documents relevant-state notifications. How do we align the basic
structures? Asking the authors, chairs and the working group. Do we need
a dedicated document, similar to what we did for terminology?

Qin: Either way works for me.

4.3. Follow up to the Anomaly Interim (20 min)

Nacho: Have you explored how Knowledge Graphs could be used?

Vincenzo: Not yet. Was not our primary focus.

Med: Knowledge Graph is not on the charter now. It's great that we
already have 3 documents. As a chair, however, I was not happy that the
anomaly documents already relate to the knowledge graph document. I
believe that would be a discussion with NEMOPS. If the authors of the 3
documents could come together and propose how to move forward, that
would be very helpful.

Nacho: We found in anomaly a good use case where Knowledge Graph could
contribute and show its value.

Dan Voyer: I think these two documents are very important for covering
our use cases, and operators will value the symptoms definitions.

Adrian Farrel: I agree with Dan. These documents are important. To the
question whether relevant-state notification should be defined in the
document: yes, I believe so. And please note, in the terminology we
define relevance.

Vince: That's exactly what we are aligning to. When we talk about
relevant states, we are referring to the terminology.

Benoit: We would like to raise the question whether it is the right time
to adopt those two documents.

Med: I believe we have a good indication for the next steps.

Benoit: We have someone who voted no.

Balazs (reason for "no" on anomaly lifecycle): I believe that there is
no difference between incident, fault and anomaly. I understand that
incident management can take input from many sources, but so does fault
management.

Benoit: I believe this is being defined in the terminology document and
should be raised.

Vince: The focus of the anomaly documents is on anomaly, not on
incident and fault.

Anomaly Lifecycle Adoption Poll

Yes 23
No 1
No Opinion 3

Anomaly Semantics Adoption Poll

Yes 20
No 0
No Opinion 11

Med: We will follow up on the mailing list.

5. Flash Prez: 1-slide Teasers (10 min)

The slides are available here.

5.1. A YANG Template Framework (Robert Peschi)

5.2. AI based Network Management Agent (NMA): Concepts & Architecture

5.3. NETCONF YANG-Push Observability (Rob Wilton)

5.4. ETSI TC DATA (Diego Lopez)

Session 2: Detailed Agenda (Hackathon-focused)

1. Agenda Bashing & Introduction (Chairs) (5 min)

2. Validate Configured Subscription YANG-Push Publisher Implementations (15 min)

3. Digital Map Hackathons (25 min)

Presentation 1

Benoit (slide 13): Is this based on RFC 8345 ietf-network?

Sherif: Correct.

Med: Clarification on the discovery aspect. Could you elaborate how you
do it?

Sherif: We use the YANG files for each of the topologies. We use them
internally to do some mappings on different layers and then use data
from different sources such as config or LSDB offline.

Samir: Can you elaborate on whether you are using operational and
configuration data, and whether you can integrate with a controller?

Sherif: Running config might not be enough. We need structured YANG
data. Controller integration is possible.

Olga: Our focus was the northbound interface. How it will be collected
is out of scope. It just needs to be agnostic.

Presentation 2

Olga: I just wanted to remind everyone how we ended up with two
hackathon projects from the interim meeting: to understand which
approach is most applicable. It appears that the TE-modelled approach is
difficult to apply, and an ad-hoc profile was used to work around the
problem. I think this should be addressed in the responsible working
group, NETMOD I presume. Are profiles mandatory?

Henry: We started simple as well but ended up with a more complex model.
With SIMAP we would like to start simple as well. The complexity might
increase as we go along.

Med: I would like to remind everyone that the aim of the hackathon
experiments is to find out whether it is applicable or the complexity is
too high. Therefore, this is really helpful. Having two different
approaches helps gain experience in applicability. Please continue, and
it would be great if you could sync with the other hackathon team so we
have comparable outputs from the experiments, which helps with the
decisions.

Italo: TE and non-TE need to work together. We have them in the same
topologies. I agree with Henry: as we go along, new requirements are
coming up and need to be addressed. And I agree it needs to be addressed
in NETMOD.

Aihua: Profiles are not mandatory for SIMAP. They are used when the
entire topology is not known. Agree, this should be addressed in NETMOD
with a generic method. The goal of the hackathon was to show what is
possible with the current capabilities. We identified some gaps, such as
bidirectional links. In this demo we were able to show the underlay
forwarding path by using the TE topology model.

Julien: As an implementer for optical networks, I don't think it is that
complex. I don't think we should use complexity as an argument.

Rob: In the first presentation it was somewhat confusing to see the
bidirectional links in the map. I don't know whether this could be
hidden or not. It might be confusing from a user perspective.

Oscar: We will certainly need TE for optical networks. The question is
whether TE is needed in the core model or not.

Olga: Agree. The discussion is about the core model.
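
As background to the bidirectional-link comments above: RFC 8345 models links as point-to-point and unidirectional, so a user-facing view that wants one bidirectional link has to pair two link entries. The following Python sketch shows one way this could be done. It is an illustration only, not what either hackathon team implemented; the sample network, the node names and the helper `pair_links` are hypothetical, and termination points and parallel links are ignored for brevity.

```python
# Hypothetical network instance using RFC 8345 node and link member names.
network = {
    "network-id": "example-l3",
    "node": [{"node-id": "R1"}, {"node-id": "R2"}, {"node-id": "R3"}],
    "ietf-network-topology:link": [
        {"link-id": "R1-R2",
         "source": {"source-node": "R1"},
         "destination": {"dest-node": "R2"}},
        {"link-id": "R2-R1",
         "source": {"source-node": "R2"},
         "destination": {"dest-node": "R1"}},
        {"link-id": "R2-R3",
         "source": {"source-node": "R2"},
         "destination": {"dest-node": "R3"}},
    ],
}


def pair_links(net):
    """Group unidirectional links by unordered node pair.

    Returns (bidirectional, one_way): dicts keyed by a frozenset of the two
    node ids, with the matching link ids as values.
    """
    pairs = {}
    for link in net["ietf-network-topology:link"]:
        ends = frozenset((link["source"]["source-node"],
                          link["destination"]["dest-node"]))
        pairs.setdefault(ends, []).append(link["link-id"])
    bidirectional = {k: sorted(v) for k, v in pairs.items() if len(v) == 2}
    one_way = {k: v for k, v in pairs.items() if len(v) == 1}
    return bidirectional, one_way


bidirectional, one_way = pair_links(network)
print(bidirectional)
print(one_way)
```

Whether such pairing belongs in the model itself or in the presentation layer is the open question raised in the discussion; the sketch takes no position on that.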

4. Network Anomaly Lifecycle (Hackathon) (10 min)

5. Knowledge Graph for Network Operations (Hackathon) (5 min)

Nacho: Have you considered using YIN for the transformation?

Michael: We did, and we saw how you did it. The reason why we did it
differently was because the parser is incredibly fast. We can translate
the entire device in 2-3 minutes.