Network Working Group                                           N. Davis
Internet-Draft                                                     Ciena
Intended status: Informational                                 A. Farrel
Expires: 21 July 2024                                 Old Dog Consulting
                                                         18 January 2024

                 Some Key Terms for Incident Management
                draft-davis-nmop-incident-terminology-00

Abstract

   This document sets out some key terms that are fundamental to a
   common understanding of Incident Management.

   The purpose of this document is to bring clarity to discussions and
   other work related to Incident Management, in particular YANG models
   and management protocols that report, make visible, or manage
   incidents.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 21 July 2024.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Davis & Farrel            Expires 21 July 2024                  [Page 1]
Internet-Draft            Incident Terminology              January 2024

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Security Considerations
   4.  Privacy Considerations
   5.  IANA Considerations
   Authors' Addresses

1.  Introduction

   Incident Management is an important aspect of network management and
   control solutions.  It deals with the reporting, inspection,
   correlation, and management of events within the network where those
   events have a negative effect on the network's ability to forward
   traffic in an optimal way.  Incident management extends to include
   actions taken that work toward recovery of optimal network behavior.

   A number of work efforts within the IETF seek to provide components
   of an Incident Management system, such as YANG models or management
   protocols.  It is important that a common terminology is used so that
   there is a clear understanding of how the elements of the management
   and control solutions fit together, and how the incidents will be
   handled.

   This document sets out some key terms that are fundamental to a
   common understanding of Incident Management.

2.  Terminology

   The terms are presented below in an order intended to let the reader
   build understanding by reading from top to bottom.

   Resource:  A component or commodity that can be used in a valuable
      way in the performance of some activity.

   State:  A particular condition that something is in (at a specific
      time).

   Change:  A modification to the state of a resource in time.

      *  Most changes are not noteworthy (and are not relevant).

      *  Perception of change depends upon the sampling rate/accuracy/
         detail and perspective.
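
   The dependence on sampling rate noted above can be illustrated with
   a minimal Python sketch.  The timeline, the function names, and the
   sampling periods are assumptions made purely for this example and
   are not defined by this document.

```python
from bisect import bisect_right

# Hypothetical timeline of a resource's state: (time in seconds, state).
# The resource goes down at t=4 and recovers at t=6.
TIMELINE = [(0, "up"), (4, "down"), (6, "up")]


def state_at(timeline, t):
    """Return the state in effect at time t."""
    times = [start for start, _ in timeline]
    return timeline[bisect_right(times, t) - 1][1]


def observed_changes(timeline, period, end):
    """Sample every `period` seconds and list the changes an observer sees."""
    changes = []
    previous = state_at(timeline, 0)
    for t in range(period, end + 1, period):
        current = state_at(timeline, t)
        if current != previous:
            changes.append((t, previous, current))
        previous = current
    return changes


# An observer sampling every second sees both state changes.
fast = observed_changes(TIMELINE, period=1, end=10)
# An observer sampling every ten seconds sees no change at all.
slow = observed_changes(TIMELINE, period=10, end=10)
```

   Under these assumptions the one-second observer perceives two
   changes, while the ten-second observer perceives none, although the
   underlying resource behaved identically in both cases.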

   Occurrence:  A particular relevant change.

      *  The change is potentially without a plan or intent.

      *  An occurrence may be an aggregation or abstraction of smaller
         occurrences.

      *  Applies to all scales and scopes, i.e., is essentially fractal
         (can recurse indefinitely).

      *  Note that occurrence is used here with respect to the temporal
         dimension.

   Event:  The state modification in an occurrence.

      *  Compared with a change, which takes place over a period of
         time, an event happens at a measurable instant.

   Incident:  An event that has a negative effect, i.e., an effect that
      is not as required/desired.
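
   One possible way to picture how the terms Occurrence, Event, and
   Incident relate is sketched below in Python.  This is an
   illustrative reading of the definitions above, not a normative data
   model: the class names, field names, and the "negative_effect" flag
   are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """The state modification in an occurrence, at a measurable instant."""
    resource: str
    instant: float                 # time at which the state was modified
    old_state: str
    new_state: str
    negative_effect: bool = False  # assumption: judged by the observer

    def is_incident(self) -> bool:
        # An incident is an event that has a negative effect.
        return self.negative_effect


@dataclass
class Occurrence:
    """A relevant change; may aggregate smaller occurrences."""
    event: Event
    parts: List["Occurrence"] = field(default_factory=list)

    def depth(self) -> int:
        # Occurrences can nest to any depth ("essentially fractal").
        return 1 + max((p.depth() for p in self.parts), default=0)


# A port going down, abstracted into a link-level occurrence.
port_down = Occurrence(Event("port-1", 10.0, "up", "down", True))
link_down = Occurrence(Event("link-A", 10.0, "up", "down", True),
                       parts=[port_down])
```

   In this sketch an incident is not a separate kind of object but an
   event seen as having a negative effect, which matches the way the
   definition is phrased.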

   Problem:  A state regarded as undesirable that needs to be dealt with
      and overcome.

      *  There is a need to change to a desirable/appropriate state.

      *  Note that there is a historic aspect to this.  The current
         state may be operational, but if there was a failure that
         remains unexplained, the network is in a state of unexplained
         recent failure.  Although the network has recovered, that
         state is a problem.

      *  Note that whilst a problem is unresolved it requires attention.
         A record of a resolved problem may be maintained in a log of
         history.

      *  Note that the network may be in a state which is considered to
         be a problem from several perspectives (e.g., there is loss of
         light causing services to fail).  A state change (so that the
         light recovers) may cause the problem to be resolved from one
         perspective (the services are now operational) but may still
         leave the problem as unresolved from another perspective
         (because the loss of light has not been explained).  There can
         be further developments (the reason for the temporary loss of
         light is traced to a microbend in the fiber that is repaired)
         that cause another problem to be resolved.  But this leaves a
         final problem still unresolved (why did the microbend occur in
         the first place?).
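
   The loss-of-light example above can be sketched as a record that
   tracks resolution separately for each perspective.  The following
   Python is a minimal illustration only: the class, its fields, and
   the perspective names are hypothetical and are not defined by this
   document.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProblemRecord:
    """A problem tracked separately from each perspective (hypothetical)."""
    description: str
    # Maps a perspective name to True once resolved from that perspective.
    perspectives: Dict[str, bool] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)  # history of resolutions

    def resolve(self, perspective: str, note: str) -> None:
        self.perspectives[perspective] = True
        self.log.append(f"{perspective}: {note}")

    def unresolved(self) -> List[str]:
        return [p for p, done in self.perspectives.items() if not done]

    def needs_attention(self) -> bool:
        # A problem requires attention while any perspective is unresolved.
        return bool(self.unresolved())


problem = ProblemRecord(
    "loss of light",
    {"service": False, "loss-of-light": False, "microbend-origin": False},
)
# The light recovers, so the services are operational again.
problem.resolve("service", "light recovered, services operational")
# The loss of light is traced to a microbend, which is repaired.
problem.resolve("loss-of-light", "microbend found and repaired")
```

   After these two steps the record still needs attention, because the
   question of why the microbend occurred remains open, while the log
   retains a history of what has been resolved.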

   Alert:  The indication of the potential existence of a problem.

   Notification:  Communication of a state change.

      *  May be an alert.

   Alarm:  An indication to a human operator highlighting the potential
      presence of a problem.

      *  The alarm state change is an event.

   Transient:  A state, considered as a problem, that persists for a
      limited amount of time before becoming resolved without direct
      action by an operator or control system.

   Intermittent:  A state that is not maintained, but keeps occurring in
      some meaningfully short time frame.
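
   The distinction between Transient and Intermittent can be sketched
   as a simple classifier over the intervals during which a state was
   observed.  The thresholds "max_duration" and "recurrence_window",
   the "self_cleared" flag, and the returned labels are assumptions
   chosen for this example; this document does not define what counts
   as a "limited amount of time" or a "meaningfully short time frame".

```python
from typing import List, Optional, Tuple

# (start, end) of a period in which the state held.  Assumption: only
# the last interval may have end=None, meaning the state still persists.
Interval = Tuple[float, Optional[float]]


def classify(intervals: List[Interval], max_duration: float,
             recurrence_window: float, self_cleared: bool = True) -> str:
    """Label a sorted list of state intervals (illustrative rules only)."""
    if not intervals:
        return "none"
    if intervals[-1][1] is None:
        return "ongoing"
    # Gaps between the end of one interval and the start of the next.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(intervals, intervals[1:])]
    if gaps and all(gap <= recurrence_window for gap in gaps):
        # The state is not maintained but keeps recurring soon after.
        return "intermittent"
    start, end = intervals[0]
    if len(intervals) == 1 and self_cleared and end - start <= max_duration:
        # Short-lived and resolved without operator or controller action.
        return "transient"
    return "unclassified"
```

   For example, with a five-second duration limit and a sixty-second
   recurrence window, a single two-second interval that cleared by
   itself is labelled "transient", while three short intervals
   separated by gaps of under a minute are labelled "intermittent".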

   Cause:  The activity, event, etc. that gives rise to an (undesired)
      event, condition, or behavior.

   Detect:  To notice the presence of something (state, activity, form,
      etc.).

      *  Hence also to notice a change (from the perspective of the
         viewer).

   Condition:  The state of something with regard to its working order.

      *  Here, this term is used where the state is an issue with
         operation.  For example, "signal degraded" is a condition that
         indicates an issue with the operation.

3.  Security Considerations

   This document specifies terminology and has no direct effect on the
   security of implementations or deployments.  However, protocol
   solutions and management models need to be aware of several aspects:

   *  The exposure of information pertaining to incidents may make
      available knowledge of the internal workings of a network (in
      particular its vulnerabilities) that may be of use to an attacker.

   *  Systems that generate management information (messages,
      notifications, etc.) when incidents occur may be attacked by
      causing them to generate so much information that the management
      system is swamped and unable to properly manage the network.

   *  Reporting false information about incidents (or masking reports of
      incidents) may cause the management system to function
      incorrectly.

4.  Privacy Considerations

   In general, Incident Management will not expose information about
   end-user activities or user data.  The main privacy concern is for a
   network operator to keep control of all information about incidents
   to protect their privacy and the details of how they operate their
   network.

5.  IANA Considerations

   This document makes no requests for IANA action.

Authors' Addresses

   Nigel Davis
   Ciena
   United Kingdom
   Email: ndavis@ciena.com

   Adrian Farrel
   Old Dog Consulting
   United Kingdom
   Email: adrian@olddog.co.uk
