Paper: Device Network Management - Current Status, and Future Direction
slides-nemopsws-paper-device-network-management-current-status-and-future-direction-02
| Slides | IAB workshop on the Next Era of Network Management Operations (nemopsws) Team | |
|---|---|---|
| Title | Paper: Device Network Management - Current Status, and Future Direction | |
| Abstract | Robert Wilton, Cisco Systems; Nick Corran, Cisco Systems. This document gives a perspective of where we believe the industry is with regard to network management and telemetry based on Rob's experience as a recent IETF OPS Area Director for Network Management and our joint experience designing and implementing network management technologies for large IP/MPLS Internet scale backbone routers. | |
| State | Active | |
| Last updated | 2025-01-23 |
Network Working Group R. Wilton
Internet-Draft N. Corran
Intended status: Informational Cisco Systems
Expires: 22 May 2025 18 November 2024
Device Network Management - Current Status, and Future Direction
draft-wilton-nemops-net-mgmt-future-latest
Abstract
This document gives a perspective of where we believe the industry is
with regard to network management and telemetry, based on Rob's
experience as a recent IETF OPS Area Director for Network Management
and our joint experience designing and implementing network
management technologies for large IP/MPLS Internet-scale backbone
routers.
About This Document
This note is to be removed before publishing as an RFC.
The latest revision of this draft can be found at
https://rgwilton.github.io/network-mgmt-future/draft-wilton-nemops-
net-mgmt-future.html. Status information for this document may be
found at https://datatracker.ietf.org/doc/draft-wilton-nemops-net-
mgmt-future/.
Source for this draft and an issue tracker can be found at
https://github.com/rgwilton/network-mgmt-future.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 22 May 2025.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. End Goal
2.1. Long term vision
2.2. Strategic Medium Term Goals
2.2.1. Small improvements to existing network management
protocols
2.2.2. Improvements to the YANG language
2.2.3. Better Data Models
2.2.4. Continued engagement with operators
2.2.5. More efficient execution, smaller steps
2.2.6. Availability of Open Source solutions
3. Security Considerations
4. IANA Considerations
Acknowledgments
Informative References
Appendix A. Current technology and solutions
A.1. CLI interfaces
A.2. SNMP & MIBS
A.3. YANG based Network Management Protocols
A.3.1. NETCONF
A.3.2. RESTCONF
A.3.3. gNMI Protocol Suite
A.4. YANG & YANG Data Models
A.4.1. YANG
A.4.2. Network Device YANG Data Models
A.4.3. Network and Service YANG Models
Authors' Addresses
1. Introduction
The focus of this document is predominantly on the technology
requirements on network devices rather than network management agents
or network-wide controllers that hold a wider view of the network.
In addition, the document focusses more on the historical network
management configuration rather than YANG telemetry solutions such as
YANG Push [RFC8639] [RFC8641] or gNMI.
Any complete telemetry solution is also likely to be interested in
IPFIX [RFC7011] and BMP [RFC7854], both of which should be considered
alongside YANG based solutions when monitoring network devices and
subscribing to telemetry data.
The main body of this document provides suggestions for which
problems the IETF should focus on for evolving network configuration.
The appendices explore the current landscape of network management
tools, protocols, and models for network devices.
2. End Goal
2.1. Long term vision
The obvious and cliched long term goal would be for a network to be
completely self-managed, automatically recovering from any failures
or configuration errors, with strong projections of future capacity
planning and network evolution. Such a network would be configured
through high level statements of intent, with the network or
controller(s) making intelligent and automatic decisions as to how to
enact that intent with little or no human involvement. The network
would also self-monitor, continuously comparing the actual network
state with the desired intent, reconfiguring on the fly to meet the
service level requirements.
Although this may be an achievable goal in the medium to long term,
we believe that such a goal remains a reasonable time away, and there
are significant areas of unknown complexity that would need to be
solved before being able to achieve this. Generative AI, and other
machine learning techniques may help, and these technologies appear
to be evolving rapidly, but it is still unclear if they will be able
to manage this level of complexity in a robust and comprehensible
way, and even if they are, what the resource and financial cost of
relying on such technologies would be. In short, we believe that it
would be foolish to assume that AI will solve all network management
problems and no further short or medium term technology investment is
required.
Hence, the following sections present our view of the most pragmatic
and effective improvements for Network Management technologies in the
short to medium term.
2.2. Strategic Medium Term Goals
The overall theme of this section is that IETF energies would be
better spent making the existing functionality a bit better rather
than trying to come up with the next big idea. Hence, this section
contains the authors' views of improvements that would likely help
vendors and operators move to more network automation and to extract
maximum value over the next 5 to 10 years.
The summary of recommendations:
1. Improve the existing network management protocols to make
them easier to implement, deploy, and use. E.g., the NETCONF
RFC is specified to support arbitrary XML data that is not
modelled in YANG, making the protocol harder to specify and
harder to understand. Conversely, the YANG RFC contains
normative text regarding how it should behave with NETCONF.
2. The IETF should carefully update the YANG language to make it a
little bit better, but without imposing a large cost on updating
tool chains. The update should focus on small, meaningful
improvements rather than turning YANG into a much bigger
language.
3. Control the proliferation of YANG data models. Ideally we would
have one industry supported external data model for devices
rather than both IETF and OpenConfig. The IETF should decide
whether to focus only on network and service YANG models, or
whether to also provide complete device YANG models.
4. The IETF should maintain continued engagement with operators (as
NMOP WG already is) to ensure that IETF's network management
focus is on solving the most urgent and most important problems
that network operators are currently facing.
5. Solutions should be delivered in a reasonable time frame. I.e.,
it is better to get to 90% functionality in 1-2 years than to
100% in 5+ years. An agile, iterative approach is best.
6. This is somewhat outside the scope of the IETF, but it would help
to have freely available open source implementations of the
protocols, particularly of the client code.
Some additional detail of the recommendations is provided in the
following sections.
2.2.1. Small improvements to existing network management protocols
For devices using YANG data models, there has been strong industry
adoption of using NETCONF as the protocol for editing and querying
configuration of devices. However, this protocol has various
scenarios that are not clearly specified, increasing the cost of
implementation and the risk of incompatible implementations arising,
and making the protocol more complicated than it needs to be.
A 2.0 version of NETCONF could:
* be optimized to specify the minimum functionality required to
manage network devices using YANG, e.g., mandate consistent
with-defaults handling for all server implementations.
* make all extra functionality optional, perhaps moving it to a
separate document (e.g., XPath filtering).
* consider whether there are any legacy features that are no longer
useful and could be removed altogether (e.g., shared candidate).
* model all NETCONF RPC operations in YANG data models.
* support JSON encoding of YANG data by default, while also
allowing support for CBOR and XML.
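To illustrate why inconsistent with-defaults handling is a burden for
clients, the following is a minimal sketch of how the same
configuration is reported under three of the with-defaults modes
(report-all, trim, and explicit). The schema defaults, leaf names, and
the report() helper are hypothetical and chosen only for
illustration; the report-all-tagged mode is omitted for brevity.

```python
from typing import Any, Dict

# Hypothetical schema defaults for two leaves of an interface entry.
SCHEMA_DEFAULTS: Dict[str, Any] = {"enabled": True, "mtu": 1500}


def report(client_config: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return the leaves a server would report for one with-defaults mode."""
    # Effective values: schema defaults overlaid with what the client set.
    effective = {**SCHEMA_DEFAULTS, **client_config}
    if mode == "report-all":
        # Every leaf is reported, including those holding default values.
        return effective
    if mode == "trim":
        # Leaves whose value equals the schema default are omitted.
        return {
            leaf: value
            for leaf, value in effective.items()
            if leaf not in SCHEMA_DEFAULTS or SCHEMA_DEFAULTS[leaf] != value
        }
    if mode == "explicit":
        # Only leaves the client set are reported, even if equal to a default.
        return dict(client_config)
    raise ValueError(f"unknown with-defaults mode: {mode}")


# The client explicitly sets "enabled" to its default and overrides "mtu".
client_config = {"name": "eth0", "enabled": True, "mtu": 9000}
for mode in ("report-all", "trim", "explicit"):
    print(mode, report(client_config, mode))
```

A client that must work with servers using different modes has to
normalize these differing views of the same configuration itself,
which is the cost that a mandated consistent behaviour would remove.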
2.2.2. Improvements to the YANG language
The YANG language specification should be updated. However, it is
important that there is a clear focus and strategy for updating the
language. A large number of issues, tracked on GitHub, have been
analyzed, but a very critical eye is needed to ensure that
the language doesn't evolve into an overly complex second version.
The next version of YANG should focus on:
* merging in the core versioning changes
* any small changes to the language that significantly improve
modelling of difficult cases
* any small generalizations to the language that make it more widely
usable (e.g., adding a base float type)
* deprecation of functionality that adds unnecessary complexity, to
be removed in a future version (e.g., sub-modules)
* fixes for any bugs or omissions in the existing specification.
2.2.3. Better Data Models
It would be better for both vendors and operators if there was a
single set of standard/open data models rather than the competing
sets from IETF and OpenConfig that are incompatible ecosystems.
However, from the authors' perspective it is hard to see how these
two ecosystems can combine. If anything, the gulf between the two
ecosystems seems to be widening, with very different designs to
the data models, and diverging protocol specifications that are being
optimized for the different styles of data models. E.g., the IETF
protocols are being developed in a direction with direct and explicit
datastore support, whereas the OpenConfig models combine intended
configuration and operational state into a single data model.
Having different data models and protocols greatly increases the cost
for vendors, and adds indecision in the market because nobody wants
to heavily invest in a technology that may have a limited lifetime.
Even though there may not be any way to converge the IETF and
OpenConfig ecosystems, the IETF should try hard to ensure that there
is no further fracturing of the YANG ecosystem.
The IETF could, strategically, decide that it doesn't want to invest
in device YANG models, but given that it has already published a
large number of them, this may not be the best strategy. Assuming
that the IETF still wants to develop and improve the ecosystem of
IETF YANG data models then there should be more efforts to ensure
that the data models work well together and function as a cohesive
API. The IETF should:
* Develop a mechanism to define sets of IETF and other SDO YANG
models that are known to work well together, e.g., perhaps via
defining YANG packages [I-D.draft-ietf-netmod-yang-packages].
* Define a more efficient mechanism for evolving YANG data models.
Rather than having all of the YANG modules residing in RFCs, which
are slow and expensive to update, it would be better to have a
working copy of the IETF YANG models with fixes and enhancements
applied, stored in GitHub and readily available for use. Over
time, as these models become stable, they could be published in
RFCs, if necessary.
* Consider whether assets, such as YANG models, should be specified
in documents at all, or whether the RFCs should only document the
abstract overview of the YANG data model structure, with the
details of the code assets versioned within a git repository
(perhaps backed by IANA).
* Check whether the YANG data models are complete enough to support
particular standard deployments and configurations. E.g.,
are all the required IETF YANG models available to configure an
L3VPN service, or are there basic bits of functionality that would
also be needed that are missing? The IETF should aim to fill in
any gaps in the models to ensure that at least the basic
functionality can be defined in a vendor agnostic way.
2.2.4. Continued engagement with operators
The NMOP WG was deliberately chartered to encourage and bring more
operator engagement into the IETF. Although the WG was only
recently chartered, it is currently working well, particularly as
there is increased energy from some operators towards more network
automation and towards making significant improvements to the tools
and techniques that are available, e.g., see
[I-D.ietf-nmop-network-anomaly-architecture] and
[I-D.ietf-nmop-yang-message-broker-integration].
This equitable collaboration between operators, vendors, and
universities is great for the IETF, and should be used as an example
of a collaborative project within the IETF sphere that is working
well. This close collaboration means that the focus is directly on
solving the most critical problems, with the running code being
developed by multiple vendors at the same time to ensure that the
solution is efficiently implementable.
2.2.5. More efficient execution, smaller steps
The IETF has a reputation for being slow to standardize new protocols
and features, and partly that is the cost of a full consensus based
approach. One benefit of the increased time is that it allows for
more reviews and more implementation experience before the
specification is finalized. However, the IETF also needs to
understand that the slowness comes at a cost, and for network
management it would be better to have a solution driven approach,
embracing the IETF mantra of "rough consensus and running code".
E.g., it is arguably better to have a solution that achieves all the
critical functionality and 90% of the desired functionality delivered
in 1-2 years, rather than a full solution that achieves all of the
desired functionality but takes 5 years to achieve consensus
and be standardized.
This is the approach that is being taken with the YANG Push Telemetry
work. Driven by a group of operators, the focus is on staged
"minimum-viable-product" deliverables, where each deliverable is
specified to require the minimum agreed functionality to meet a set
of goals. The participating vendors are developing implementations
at the same time as the drafts are being standardized, which quickly
highlights potential problems in the proposed standards so that
those issues can be mitigated sooner, and gives us higher confidence
as the drafts progress towards RFC.
2.2.6. Availability of Open Source solutions
More focus should also be given to the availability of open source
solutions that are easy for operators to adopt and that are shown to
interoperate well with vendor implementations. Although the IETF is
not in the game of conformance checking of implementations, it could
still be helpful for the Network Management related working groups to
collectively invest in supporting an open source "reference"
implementation that keeps pace with the standards, and robustly
implements the core functionality defined in the specifications.
The goal here is to lower the barrier to entry so that operators and
vendors can make better use of the existing network management
configuration and telemetry solutions.
3. Security Considerations
Security of network management operations is of high importance due
to the sensitive nature of the information.
4. IANA Considerations
This document has no IANA actions.
Acknowledgments
Informative References
[I-D.draft-ietf-netmod-yang-packages]
Wilton, R., Rahman, R., Clarke, J., Sterne, J., and B. Wu,
"YANG Packages", Work in Progress, Internet-Draft, draft-
ietf-netmod-yang-packages-04, 21 October 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-netmod-
yang-packages-04>.
[I-D.ietf-nmop-network-anomaly-architecture]
Graf, T., Du, W., and P. Francois, "An Architecture for a
Network Anomaly Detection Framework", Work in Progress,
Internet-Draft, draft-ietf-nmop-network-anomaly-
architecture-01, 20 October 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-nmop-
network-anomaly-architecture-01>.
[I-D.ietf-nmop-yang-message-broker-integration]
Graf, T. and A. Elhassany, "An Architecture for YANG-Push
to Message Broker Integration", Work in Progress,
Internet-Draft, draft-ietf-nmop-yang-message-broker-
integration-05, 19 October 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-nmop-
yang-message-broker-integration-05>.
[RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
and A. Bierman, Ed., "Network Configuration Protocol
(NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011,
<https://www.rfc-editor.org/rfc/rfc6241>.
[RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
"Specification of the IP Flow Information Export (IPFIX)
Protocol for the Exchange of Flow Information", STD 77,
RFC 7011, DOI 10.17487/RFC7011, September 2013,
<https://www.rfc-editor.org/rfc/rfc7011>.
[RFC7854] Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP
Monitoring Protocol (BMP)", RFC 7854,
DOI 10.17487/RFC7854, June 2016,
<https://www.rfc-editor.org/rfc/rfc7854>.
[RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF
Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017,
<https://www.rfc-editor.org/rfc/rfc8040>.
[RFC8199] Bogdanovic, D., Claise, B., and C. Moberg, "YANG Module
Classification", RFC 8199, DOI 10.17487/RFC8199, July
2017, <https://www.rfc-editor.org/rfc/rfc8199>.
[RFC8299] Wu, Q., Ed., Litkowski, S., Tomotaki, L., and K. Ogaki,
"YANG Data Model for L3VPN Service Delivery", RFC 8299,
DOI 10.17487/RFC8299, January 2018,
<https://www.rfc-editor.org/rfc/rfc8299>.
[RFC8342] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
and R. Wilton, "Network Management Datastore Architecture
(NMDA)", RFC 8342, DOI 10.17487/RFC8342, March 2018,
<https://www.rfc-editor.org/rfc/rfc8342>.
[RFC8345] Clemm, A., Medved, J., Varga, R., Bahadur, N.,
Ananthakrishnan, H., and X. Liu, "A YANG Data Model for
Network Topologies", RFC 8345, DOI 10.17487/RFC8345, March
2018, <https://www.rfc-editor.org/rfc/rfc8345>.
[RFC8466] Wen, B., Fioccola, G., Ed., Xie, C., and L. Jalil, "A YANG
Data Model for Layer 2 Virtual Private Network (L2VPN)
Service Delivery", RFC 8466, DOI 10.17487/RFC8466, October
2018, <https://www.rfc-editor.org/rfc/rfc8466>.
[RFC8526] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
and R. Wilton, "NETCONF Extensions to Support the Network
Management Datastore Architecture", RFC 8526,
DOI 10.17487/RFC8526, March 2019,
<https://www.rfc-editor.org/rfc/rfc8526>.
[RFC8527] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
and R. Wilton, "RESTCONF Extensions to Support the Network
Management Datastore Architecture", RFC 8527,
DOI 10.17487/RFC8527, March 2019,
<https://www.rfc-editor.org/rfc/rfc8527>.
[RFC8528] Bjorklund, M. and L. Lhotka, "YANG Schema Mount",
RFC 8528, DOI 10.17487/RFC8528, March 2019,
<https://www.rfc-editor.org/rfc/rfc8528>.
[RFC8639] Voit, E., Clemm, A., Gonzalez Prieto, A., Nilsen-Nygaard,
E., and A. Tripathy, "Subscription to YANG Notifications",
RFC 8639, DOI 10.17487/RFC8639, September 2019,
<https://www.rfc-editor.org/rfc/rfc8639>.
[RFC8641] Clemm, A. and E. Voit, "Subscription to YANG Notifications
for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641,
September 2019, <https://www.rfc-editor.org/rfc/rfc8641>.
[RFC9182] Barguil, S., Gonzalez de Dios, O., Ed., Boucadair, M.,
Ed., Munoz, L., and A. Aguado, "A YANG Network Data Model
for Layer 3 VPNs", RFC 9182, DOI 10.17487/RFC9182,
February 2022, <https://www.rfc-editor.org/rfc/rfc9182>.
[RFC9291] Boucadair, M., Ed., Gonzalez de Dios, O., Ed., Barguil,
S., and L. Munoz, "A YANG Network Data Model for Layer 2
VPNs", RFC 9291, DOI 10.17487/RFC9291, September 2022,
<https://www.rfc-editor.org/rfc/rfc9291>.
Appendix A. Current technology and solutions
This section of the document gives a perspective of the current
landscape of existing network management solutions that may be found
on network devices, along with a brief mention of network and service
YANG models, and lists some of the issues with those existing
technologies.
A.1. CLI interfaces
Most vendors offer a command line interface (CLI) for configuring
devices and reporting the operational state of devices (e.g., via
_show commands_).
For some devices, these CLIs offer interactive text based interfaces
to underlying external management models, whereas for other devices,
the CLIs are defined independently of any programmatic external
data model, which can make it hard for network engineers to migrate
from a familiar CLI to a very different programmatic data
model.
Generally, these CLI based interfaces offer configuration for the
full capabilities of the device, including all optional functionality
and features. They also generally ensure that the device can be
configured in an efficient way (e.g., consider scenarios where ad hoc,
data model specific templates allow a particular configuration to be
represented concisely on the CLI and also implemented in the
device's hardware in a resource efficient way).
In some cases, network controllers are used as a bridge between
offering a north bound programmatic data model and a south bound
interface for configuring and managing the device by CLI.
In all cases, the CLIs are generally not designed or optimized to be
manipulated programmatically, lacking consistent structure and
typing, making the solution more fragile when used in this manner.
Most devices also offer _show commands_, or the equivalent, for
reporting operational state in a text based format, either using
tables, or free-form text reporting of relevant fields. Automated
parsing of this output can be very fragile, particularly for values
that are occasionally outside the anticipated ranges and so may skew
the table formatting or be truncated. For some devices, these show
commands are available in different variants that control the level
of detail reported, and often the same fundamental information may be
reported in multiple separate show commands. I.e., there can be a
level of duplication in the data that is reported. Device vendors
seem to more commonly apply version management to the configuration
aspects of the CLI rather than operational show commands.
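As a small illustration of this fragility, the sketch below parses a
fixed-width table by column offset. The table layout, the column
widths, and the format_row() and parse_row() helpers are all
hypothetical; they stand in for a device's show command and a script
that scrapes it.

```python
from typing import Tuple


def format_row(name: str, status: str, in_pkts: int) -> str:
    """Stand-in for a device rendering one row of a fixed-width table."""
    # Widths are minimums: a longer value pushes later columns right.
    return f"{name:<12}{status:<8}{in_pkts:>10}"


def parse_row(line: str) -> Tuple[str, str, int]:
    """Naive parser that trusts the column offsets of the table header."""
    return line[0:12].strip(), line[12:20].strip(), int(line[20:])


# A short interface name fits its column, so the parse is correct.
print(parse_row(format_row("Gi0/0/0", "up", 1200)))

# A longer name overflows its column and shifts the other fields, so
# the status field silently picks up the tail of the interface name.
print(parse_row(format_row("TenGigE0/0/0/10", "up", 1200)))
```

The second call does not fail; it returns a truncated name and a
corrupted status value, which is the kind of silent error that makes
automation built on show command output hard to trust.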
Generally, access to the show commands is via a synchronous
operation that queries the device and waits for it to collate, sort,
and format the data before returning it. These mechanisms are likely
to be less efficient than having the device push its operational
state off the device, particularly if only changes in the data are
pushed, and if those changes don't occur at high frequency (or some
form of efficient dampening mechanism is employed).
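The difference between polling and a dampened on-change push can be
sketched as follows. This is a simplified model, not the behaviour of
any particular protocol: the on_change_updates() helper, the sample
data, and the dampening rule (send an update only if the value differs
from the last one sent and the dampening period has elapsed) are
assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, observed value)


def on_change_updates(samples: Iterable[Sample], dampening: float) -> List[Sample]:
    """Return the updates a device would push under on-change with dampening."""
    pushed: List[Sample] = []
    for ts, value in samples:
        if not pushed:
            # The initial state is always sent.
            pushed.append((ts, value))
            continue
        last_ts, last_value = pushed[-1]
        # Send only when the value changed and the dampening period expired.
        if value != last_value and ts - last_ts >= dampening:
            pushed.append((ts, value))
    return pushed


# Ten one-second observations; the value changes once, at t=3.
samples = [(float(t), 10 if t < 3 else 11) for t in range(10)]
updates = on_change_updates(samples, dampening=5.0)
print(len(samples), "polled samples vs", len(updates), "pushed updates")
print(updates)
```

In this sketch a poller would transfer all ten samples, whereas the
push model sends two updates, with the change at t=3 held back until
the dampening period ends at t=5.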
A.2. SNMP & MIBS
Although there is currently a widely deployed base of SNMP and MIBs
used for monitoring operational data via periodic polling, we expect
there to be a significant decrease in deployments over the next 10
years, at least when used for network management, although there will
inevitably be a long tail of deployments before everyone has migrated
to newer technologies. SDOs, like IETF, have generally stopped
writing new MIBs, or making significant updates to existing MIBs, or
the SNMP protocol. Similarly, it appears that most vendors are
investing much more heavily in more recent network management
protocols and data models rather than investing in either SNMP,
updating existing MIBs with new OIDs, or creating new MIBs.
For distributed servers, the design of the SNMP protocol is
inherently expensive to implement, generally requiring lots of
information to be cached before it can be returned.
SNMP & MIBs never achieved significant traction in the industry for
configuration of core network devices, with operators required to
either use the CLI, YANG based management protocols, or other
proprietary management interfaces or APIs.
A.3. YANG based Network Management Protocols
This section describes the current modern network management
protocols, which are predominantly YANG based or optimized for use
with YANG.
A.3.1. NETCONF
NETCONF [RFC6241] is an XML based network management protocol.
Originally it was specified to work with generic XML based network
management data, but now it is generally expected to be used in
conjunction with YANG modeled configuration and operational data.
More recently, NETCONF was extended to support the NMDA [RFC8342].
This hasn't yet seen wide adoption, but there is gradually increasing
interest.
NETCONF is one of the main network management protocols used for
configuring devices, used alongside the CLI and gNMI.
Most of the NETCONF protocol is reasonably well specified, but there
are aspects of the protocol that have more patchy implementation support,
including a shared candidate datastore, confirmed commit capability,
and XPath based filtering. Some areas of the specification are
unclear or hard to understand because the definition of the expected
behaviour is split between the NETCONF and YANG RFCs.
Some aspects of the NETCONF specification give flexibility for the
server to implement the behaviour in different ways (e.g., different
YANG defaults handling and reporting, startup configuration handling,
writable running vs candidate configuration). This flexibility makes
it easier for device implementations, but increases the complexity
for clients because they must be able to interoperate with different
server behaviour.
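The following sketch shows the kind of branching a client needs in
order to cope with this flexibility. The capability URIs are those
defined in [RFC6241]; the plan_edit() helper and the textual step
descriptions are hypothetical and only outline the sequence of
operations a client might issue.

```python
from typing import Iterable, List

BASE = "urn:ietf:params:netconf:capability:"
CANDIDATE = BASE + "candidate:1.0"
WRITABLE_RUNNING = BASE + "writable-running:1.0"
STARTUP = BASE + "startup:1.0"


def plan_edit(capabilities: Iterable[str]) -> List[str]:
    """Choose an edit workflow from the capabilities a server advertises."""
    caps = set(capabilities)
    if CANDIDATE in caps:
        # Stage the change in the candidate datastore, then commit it.
        steps = ["lock candidate", "edit-config candidate",
                 "commit", "unlock candidate"]
    elif WRITABLE_RUNNING in caps:
        # No candidate available: edit the running configuration directly.
        steps = ["lock running", "edit-config running", "unlock running"]
    else:
        raise ValueError("server advertises no writable datastore")
    if STARTUP in caps:
        # A distinct startup datastore must be updated explicitly.
        steps.append("copy-config running to startup")
    return steps


print(plan_edit([CANDIDATE]))
print(plan_edit([WRITABLE_RUNNING, STARTUP]))
```

Every such branch is code that a client must write and test against
each server variant, which is the client-side complexity described
above.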
There is some early work within the NETCONF WG to update the NETCONF
protocol. This could be a good opportunity to drive for more
baseline conformity in behaviour across all network devices that
support the new protocol version.
A.3.2. RESTCONF
RESTCONF [RFC8040] is a newer _REST_ style network management
protocol that runs over HTTP and uses YANG data models. Broadly,
RESTCONF offers similar functionality to NETCONF. RESTCONF has
achieved more traction as a northbound interface to network
controllers, whereas the main programmatic network management
interfaces to devices remain NETCONF and gNMI.
RESTCONF initially offered a "simulated combined datastore" view of
the data, done as an effort to simplify the interface. However, the
NMDA architecture effectively changed this to a datastore aware
architecture, more closely mirroring NETCONF. There seems to be more
support for ensuring that RESTCONF maintains feature parity with
NETCONF. It has been suggested that RESTCONF could just replace
NETCONF as the single IETF protocol to network devices, but there
doesn't appear to be a strong industry backing for going in that
direction.
RESTCONF supports encoding the data in both JSON and XML. The RFC
specifies XML as the mandatory-to-support encoding, but it seems
likely that, in the seven years since the RFC was published,
the JSON encoding has become much more popular than XML.
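For comparison, the sketch below renders the same small piece of
YANG-modelled data in both encodings. It uses a simplified fragment of
the ietf-interfaces module (other leaves, such as the interface type,
are left out), and the to_json() and to_xml() helpers are
hypothetical. The JSON form qualifies the top-level member with the
module name, whereas the XML form uses the module's XML namespace.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, Dict

IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def to_json(entry: Dict[str, Any]) -> str:
    """JSON encoding: the top-level name is prefixed with the module name."""
    return json.dumps({"ietf-interfaces:interfaces": {"interface": [entry]}})


def to_xml(entry: Dict[str, Any]) -> str:
    """XML encoding: the top-level element carries the module namespace."""
    root = ET.Element("interfaces", {"xmlns": IF_NS})
    item = ET.SubElement(root, "interface")
    for leaf, value in entry.items():
        child = ET.SubElement(item, leaf)
        # YANG booleans are written as "true"/"false" in XML.
        child.text = str(value).lower() if isinstance(value, bool) else str(value)
    return ET.tostring(root, encoding="unicode")


entry = {"name": "eth0", "enabled": True}
print(to_json(entry))
print(to_xml(entry))
```

The JSON form maps directly onto native data structures in most
programming languages, which may partly explain its popularity with
client developers.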
Some enhancements to RESTCONF, in many cases, mirroring similar
enhancements being made for NETCONF, are being considered for
standardization within IETF.
A.3.3. gNMI Protocol Suite
gNMI is a newer, industry defined, gRPC based network management
protocol that carries data modelled in YANG, encoded via JSON or
Protobuf. The _gNMI family_ of protocols includes other related
protocols for the management and orchestration of devices (e.g.,
gNOI, gNSI, gRIBI, Bootz). These are often modelled using gRPC and
separate Protobuf definitions rather than leveraging YANG's RPC,
Action, and Notification mechanisms.
The stewardship of this protocol suite predominantly falls on the
operator community, but with strong leadership by a principal network
operator. This allows the protocol to evolve more quickly, although
potentially in non-compatible ways that could break existing
deployments. The specifications tend to be less precisely specified
than the equivalent IETF protocols, and generally have a lower level
of technical review, meaning that there are more likely to be
interoperability issues between different implementations.
There are efforts underway to improve interoperability via a
conformance test suite that is being collectively maintained.
A.4. YANG & YANG Data Models
A.4.1. YANG
The YANG data modelling language exists in two versions, YANG 1 and
YANG 1.1. The effective differences between the two versions are
relatively minor and YANG models using both versions are deployed.
The IETF NETMOD working group is at the early stages of considering a
new version of the YANG language, considering over 100 potential
issues and enhancements! At the time of this draft publication, it
is unclear whether consensus will converge around a relatively small
update to the language, or a more significant new version. It is
anticipated that any new version of the YANG language would likely
take several years to specify and gain consensus. Care must be taken
to strike the right balance of making enough improvements to the
language to make an upgrade worthwhile, vs bloating the language with
too many features, i.e., suffering from second system syndrome. A
future version of the language should be framed clearly around the
set of problems it is aiming to solve, e.g., minor fixes to the
existing specification, ease of use improvements, or making it easier
to model specific problem domains, hopefully without introducing too
much additional complexity.
A.4.2. Network Device YANG Data Models
There appears to be somewhat of a fracture in the industry as to
whether YANG models should be modelled using datastores (as per the
IETF Network Management Datastore Architecture), or whether they
should adopt OpenConfig's style, where a single data model contains
intended configuration, applied configuration, and operational state
in a combined data tree, using a structural naming convention.
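The two styles can be contrasted with a small sketch. The leaf names
and values below are simplified and hypothetical rather than taken
from any published module. In the OpenConfig style, a single tree
holds sibling "config" and "state" containers; in the NMDA style, the
same path is read from different datastores.

```python
from typing import Any, Dict

# OpenConfig style: one tree, with intended values under "config" and
# applied values plus derived state under "state".
openconfig_entry: Dict[str, Dict[str, Any]] = {
    "config": {"name": "eth0", "mtu": 9000},
    "state": {"name": "eth0", "mtu": 1500, "oper-status": "UP"},
}

# NMDA style: the same leaves at the same path, in separate datastores.
nmda_datastores: Dict[str, Dict[str, Any]] = {
    "intended": {"name": "eth0", "mtu": 9000},
    "operational": {"name": "eth0", "mtu": 1500, "oper-status": "up"},
}


def unapplied(intended: Dict[str, Any], applied: Dict[str, Any]) -> Dict[str, Any]:
    """Return intended leaves whose applied value differs or is missing."""
    return {k: v for k, v in intended.items() if applied.get(k) != v}


# One request returns both containers in the OpenConfig style...
print(unapplied(openconfig_entry["config"], openconfig_entry["state"]))

# ...whereas the NMDA style needs one read per datastore.
print(unapplied(nmda_datastores["intended"], nmda_datastores["operational"]))
```

Both comparisons find the same unapplied leaf; the difference lies in
whether the split between intended and applied values is expressed in
the model structure or in the choice of datastore.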
In some ways, the OpenConfig style leads to a simpler combined data
tree, but the YANG files themselves, through the frequent use of
groupings, are generally much harder to read than the NMDA equivalent,
unless compiled into a more readable format. The OpenConfig style
doesn't lend itself well to modeling special configuration, e.g.,
boot configuration, or ephemeral configuration, both of which can be
modelled cleanly using the NMDA datastore architecture. Further,
there are aspects of the YANG language that somewhat conflict with
the OpenConfig style, meaning that there are various YANG language
constructs, e.g., presence containers or choice & case statements,
that are problematic to use with OpenConfig modelling.
Conversely, using models designed around the NMDA effectively
requires extensions to the NETCONF [RFC8526] and RESTCONF [RFC8527]
protocols, which require the target datastore to be specified in each
operation. Consequently, operations and requests act on either
configuration or operational data, not both together.
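For example, with the RESTCONF extensions in [RFC8527], each
datastore is exposed as its own resource under "{+restconf}/ds/", so a
client that wants both configuration and operational state issues two
requests. The sketch below only builds the two URLs; the host name and
the helper function are hypothetical.

```python
# Sketch only: the host name and helper name are hypothetical.
RESTCONF_ROOT = "https://device.example/restconf"


def datastore_url(datastore: str, data_path: str) -> str:
    """Return the URL of a data resource within a named NMDA datastore."""
    return f"{RESTCONF_ROOT}/ds/ietf-datastores:{datastore}/{data_path}"


# Two separate requests are needed: one per datastore.
config_url = datastore_url("running", "ietf-interfaces:interfaces")
oper_url = datastore_url("operational", "ietf-interfaces:interfaces")
```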
In terms of implementation, many network devices store and manage
configuration data separately from operational data due to the
different constraints and requirements on the different data sets,
e.g., configuration must be transactional and fully consistent,
whereas the operational data is generally only ever eventually
consistent. This means that queries or subscriptions that require
both configuration and operational state in a single response require
the system to fetch the information from two different subsystems and
to merge the data into a single response before returning it.
Depending on where the applied configuration is tracked in the
system, the same merging may be required when combining 'applied
configuration' with system defined operational state (e.g., counters
and protocol network state).
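The merge step can be sketched as follows. This is a simplified
illustration, not a description of any particular implementation: the
two dictionaries stand in for separate configuration and operational
subsystems, and it assumes that applied configuration simply mirrors
the configuration store.

```python
# Sketch only: both stores and all values are hypothetical stand-ins.
CONFIG_STORE = {"eth0": {"mtu": 9000, "enabled": True}}          # transactional
OPER_STORE = {"eth0": {"oper-status": "up", "in-octets": 123456}}  # eventually consistent


def combined_get(name):
    """Build one response from two subsystems, OpenConfig-style."""
    config = dict(CONFIG_STORE.get(name, {}))
    # Assumption: applied configuration mirrors the configuration store.
    state = dict(config)
    # Operational leaves come from the second subsystem and are merged in.
    state.update(OPER_STORE.get(name, {}))
    return {"config": config, "state": state}


response = combined_get("eth0")
```

Because the two reads are not performed atomically, the merged
response is only as consistent as the less consistent of the two
sources.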
A.4.2.1. Standards based YANG models (IETF, IEEE, BBF, 3GPP)
Various SDOs, e.g., IETF, IEEE, BBF, and 3GPP, are all in the process
of defining YANG models that provide network management interfaces for
the network protocols that they are responsible for.
For the IETF, these data models are designed around the NMDA,
allowing the same models to be used for configuration and extended to
cover operational state, aligning paths and definitions wherever
possible. This approach also allows other views (i.e., datastores)
on the data to be provided (e.g., factory-default, startup
configuration, system defaults, or ephemeral configuration).
IETF has already produced RFCs defining network device YANG data
models covering many of the key network protocols defined by the
IETF. Where published, the YANG models generally provide good
coverage of the protocol in question, including optional
functionality. The problems with the set of IETF YANG models
published so far are that they have taken a very long time to reach
standardization, that they make use of YANG extensions that are not
yet widely implemented (e.g., Schema Mount [RFC8528]), and that there
are significant gaps in the YANG modules published to date, e.g., the
IETF doesn't yet have a published RFC for BGP, L2VPN, or EVPN
functionality.
A.4.2.2. Industry based YANG Models (OpenConfig)
The OpenConfig industry consortium also defines a set of YANG models
for configuring and monitoring devices. Like the gNMI protocol, the
stewardship of these models predominantly falls on the operator
community, but with strong leadership by a principal operator.
Vendors implementing the models can also make suggestions and provide
comments on proposed changes and additions to the data models,
particularly when they would be hard, or impossible, to implement
effectively on the network devices. However, decisions are generally
less open than, say, the IETF's consensus based procedures.
The OpenConfig models are focussed on solving the configuration
requirements of those operators who participate in the forums, and
hence they are somewhat more focussed on solving particular network
designs and protocol choices. This can mean that some technologies
may not currently be covered by the OpenConfig YANG models at all,
and it can be harder to get additions accepted, or those additions
could undergo significant breaking changes if more operators start to
pick a particular technology and collectively decide that a different
approach to modelling would be better.
The OpenConfig models evolve at a much faster rate than those in the
IETF with a lower bar to review and more willingness to make breaking
changes to just fix issues, or improve the models, and then move on.
There are efforts to restrict those breaking changes to an annual
basis, but this will still likely mean that many deployments that
move between software releases more slowly would see breaking changes
in the management model whenever they update. Models are likely to
gain more stability over time, but it is still very likely that there
will be issues with version skew in the models, the burden of which
is likely to fall on the clients or controllers using the models.
Generally, OpenConfig models are restricted to using YANG 1, rather
than using the updated YANG 1.1 specification.
A.4.2.3. Vendor specific YANG models
Most large Internet routers expose YANG data models for configuring
and monitoring the device.
There are various choices for the sources of these data models:
* based on an existing internal data model
* based on the CLI (or show commands)
* based on existing published or draft models (e.g., IETF or
OpenConfig)
* designed from scratch.
Each of these choices has advantages and disadvantages.
Generating or basing the external model on an internal model normally
has the advantage that it is easy to translate the configuration for
consumption by the system. However, it has the disadvantage that
the external model may leak internal details and structures, may not
leverage the full capabilities of YANG, and may not be as easy to
use. If the internal model is quite different from the CLI then
network operators familiar with the CLI must still learn the new
model structure. It probably also forces some level of versioning on
the internal data structures or, alternatively, the ability to handle
version skew between the generated models and the internal data
model.
Basing the vendor device YANG model on the CLI makes the models more
familiar, but the structure and extensibility of the CLI and YANG
differ somewhat, potentially making for less well structured YANG
models (compared to designing the YANG models from scratch). One
strong advantage of this approach is that it allows a clean bijective
conversion between the CLI and the equivalent YANG.
Basing the vendor device YANG model on existing SDO or Industry YANG
models potentially allows for network operator familiarity (but not
with respect to the CLI) and conformability, but unless the device is
a green field development, the way particular features are modelled
in the external model may differ significantly from the internal
device representation, requiring more complex, and potentially less
efficient, mapping and internal representation (e.g., expansion of
config and less efficient use of hardware resources). Hence, it is
likely that deviations and augmentations to the external models will
be required to ensure that the external model can be mapped
reasonably cleanly into internal representations. A further concern
is version skew if the published models change over time but more
stability is required in the vendor's external model to support
existing customer deployments. A final concern here is trying to
predict the right public model family to base the models on, i.e.,
which YANG models will likely end up succeeding in the market in the
medium term.
The final choice is to define the model entirely from scratch. This
potentially allows for a better solution, but at a greater
development cost. The advantages and disadvantages of this approach,
relative to those described above, generally depend on how closely
the model maps to the existing CLI, internal model, or industry or
SDO models.
Generally, in all cases, you would desire and expect the vendor
models to have full parity with the configuration that can be
expressed via the CLI, leveraging all of the device configuration
capabilities.
A different set of choices may be made for the operational data
(e.g., show command equivalents), although many of the same
advantages and disadvantages equally apply.
A.4.2.4. Problems with the YANG model ecosystem
One of the biggest problems that is slowing the adoption of YANG and
automated network management is the fracture between standard network
management models for managing devices, documented in
Appendix A.4.2.1 and Appendix A.4.2.2:
* OpenConfig YANG is more cohesive and complete for various
deployments.
* IETF YANG is more complete for some specific protocols, but it may
not be sufficient to be deployed on its own, retaining some large
gaps that must be filled with draft models, or augmented with
vendor proprietary models.
In addition to this, every vendor has their own legacy CLI and their
own data models, which may be entirely independent, be based on the
CLI, or perhaps be based on an internal data model. Most devices are
likely to have separate internal data models that differ from the
external data models, and that won't necessarily even be defined in
YANG.
All of these data model families define their properties in different
ways that are not completely compatible with each other. Further, it
isn't clear which external YANG data models, if any, will dominate in
the market, and hence modifying the internal data models to align
with a particular external data model family could be a risky
strategy if the wrong data model is chosen. Hence, this generally
requires some form of 'mapping' of data in external model families
into internal model families, which has its own set of challenges and
complexities, see Appendix A.4.2.5.
It is unclear which external YANG data models, if any, will end up
dominating the market place, and hence, reworking (perhaps based on
previous non YANG technologies) or aligning a device's internal data
models to better suit the style of a single external model family is
likely to be a risky strategy.
A.4.2.5. Problems with mapping between internal and external data model
families
Mapping between external and internal data model families brings its
own set of issues.
The first obvious problem occurs when the external and internal data
models are not fundamentally defined in the same modelling language
and where equivalent concepts are modelled in different ways. For
example, the concept of how filtering is performed can be specified
in an optimized form in the data model, or it can be defined purely
as a protocol operation.
Secondly, even when both external and internal models are represented
in the same domain language (e.g., YANG) then there is a fundamental
choice about how to map data (configuration or operational) between
the external and internal model families, and what represents the
source of truth of configuration data for the device.
The perhaps naive, and most obvious, approach is to try to convert
configuration data in the external model into configuration data in
the internal data model, and then store the configuration in that
internal format. Whenever a request is made to read the current
configuration, the device converts its internal configuration back to
the requested external representation. For the device, the source of
truth for the configuration is always stored in the internal native
format. Such a choice would allow clients to query the configuration
in different formats (e.g., device-native, OpenConfig, or IETF), or
send in separate configuration requests in different families (e.g.,
the bulk of the configuration could be defined as OpenConfig YANG,
but overridden with native CLI or YANG to cover the parts of the
configuration that are not expressible in OpenConfig). Alas, this
approach also brings significant problems.
Unless the internal and external data models are very closely aligned
(and this isn't generally possible when different incompatible
external model families exist) then exact bijective mappings are not
possible, since there is always a loss of data. When you request to
read the configuration back, even in the same model family as first
configured, you will receive a slightly different version of that
configuration data, perhaps with default values added/removed, or
differences in the names of arbitrary identifiers. It is the authors'
opinion that this is not the best way of trying to solve this
problem.
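The loss of data on a round trip can be shown with a deliberately
small sketch. The leaf names, the default value, and the
normalization rule are all assumptions invented for this example;
real mappings are far larger but exhibit the same effect.

```python
# Sketch only: leaf names, defaults and normalization are hypothetical.
INTERNAL_DEFAULT_MTU = 1500


def external_to_internal(ext):
    """Map external config to the internal format, which is the source of truth."""
    return {
        # The internal model normalizes identifiers to lower case ...
        "if_name": ext["name"].lower(),
        # ... and always stores an explicit MTU, so 'not set' is lost.
        "mtu": ext.get("mtu", INTERNAL_DEFAULT_MTU),
    }


def internal_to_external(internal):
    """Map the stored internal config back to the external format."""
    return {"name": internal["if_name"], "mtu": internal["mtu"]}


original = {"name": "Eth0"}
round_tripped = internal_to_external(external_to_internal(original))
# 'round_tripped' now has a default value added and a changed identifier,
# so it no longer equals what the client originally sent.
```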
The alternative solution, for configuration, is to only map the
external configuration down into the internal configuration in a
single direction (but allow for configuration errors to be correctly
propagated back). The device persists the configuration in the
external format as the source of truth, so that any queries to return
the applied configuration are able to return the exact configuration
originally provided. This approach allows for more complex mappings
than the bidirectional mapping approach described above, but requires
that the external client manage configuration in different model
families effectively.
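A minimal sketch of this one-way approach is shown below, assuming a
hypothetical device object with a single validation rule. The class,
the mapping function, and the MTU limits are illustrative only.

```python
import copy


def map_down(ext):
    """One-way mapping from external to internal config; errors propagate up."""
    mtu = ext.get("mtu", 1500)  # hypothetical internal default
    if not 64 <= mtu <= 9216:   # hypothetical platform limits
        raise ValueError(f"mtu {mtu} is not supported")
    return {"if_name": ext["name"].lower(), "mtu": mtu}


class Device:
    def __init__(self):
        self.external_config = {}  # source of truth, stored exactly as provided
        self.internal_config = {}  # derived; never mapped back to external

    def apply(self, ext):
        # Map first, so a mapping error leaves the stored config untouched.
        internal = map_down(ext)
        self.external_config = copy.deepcopy(ext)
        self.internal_config = internal

    def get_config(self):
        # Reads return the persisted external form, with no reverse mapping.
        return copy.deepcopy(self.external_config)


device = Device()
device.apply({"name": "Eth0"})
```

Here a read returns exactly what the client wrote, and a rejected
request leaves the previously accepted configuration in place.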
A.4.2.6. Problems with how the IETF creates and manages YANG models
It is hard to argue that the IETF has been anything less than very
successful at encouraging and advancing interoperability between
devices over the last four decades. Some aspects that make the IETF
process very successful also somewhat act to its detriment. One key
observation is that new technology and advances generally move fairly
slowly in the IETF, and once standardized, are often even slower to
change further. Generally, it is much easier to slow down or block
work within the IETF than it is to bring new ideas. Although the
slow pace of initial standards development and subsequent evolution
can be frustrating, it has the benefit that once the technology
becomes mature and is implemented, those protocols and
implementations can be stable over a relatively long time period.
For some operators and deployments this isn't necessarily important;
for others, it can reduce long term costs.
A.4.3. Network and Service YANG Models
The IETF has also specified various YANG models that exist at the
Service or Network-wide layer rather than models for managing
specific devices. E.g., L3VPN [RFC8299] and L2VPN [RFC8466] define
_Service_ YANG models. [RFC9182] and [RFC9291] define _Network-wide_
YANG models. In addition, network wide topologies can be modelled
using [RFC8345], along with many augmentations that have been
published or are being developed. [RFC8199] helps characterize the
difference between service and device (element) YANG models, but
doesn't cover the network-wide layer classification.
There has been somewhat stronger adoption of the network and service
IETF YANG models by operators, sometimes used in conjunction with
OpenConfig YANG models for configuring elements or otherwise device
native CLI or YANG models.
These models generally fall outside the scope of the YANG models
discussed in the rest of this document, because they do not directly
apply to network elements.
We are not aware of other industry attempts at defining Network or
Service YANG models, but MEF has been working on defining APIs at
various management layers, mostly built around OpenAPI specifications
rather than YANG.
Authors' Addresses
Robert Wilton
Cisco Systems
Email: rwilton@cisco.com
Nick Corran
Cisco Systems
Email: ncorran@cisco.com