IDR Working Group D. Freedman
Internet-Draft Claranet
Intended status: Standards Track R. Raszuk
Expires: January 1, 2012 Cisco Systems
R. Shakir
C&W
June 30, 2011
BGP OPERATIONAL Message
draft-frs-bgp-operational-message-00
Abstract
The BGP Version 4 routing protocol (RFC4271) is now used in many
ways, crossing boundaries of administrative and technical
responsibility.
The protocol lacks an operational messaging plane which could be
utilised to diagnose, troubleshoot and inform upon various conditions
across these boundaries, securely, during protocol operation, without
disruption.
This document proposes a new BGP message type, the OPERATIONAL
message, which can be used to effect such a messaging plane for use
both between and within Autonomous Systems.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 1, 2012.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
Freedman, et al. Expires January 1, 2012 [Page 1]
Internet-Draft bgp-operational-message June 2011
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. BGP OPERATIONAL message . . . . . . . . . . . . . . . . . . . 5
3.1. BGP OPERATIONAL message capability . . . . . . . . . . . . 5
3.2. BGP OPERATIONAL message encoding . . . . . . . . . . . . . 5
3.3. PRI Format . . . . . . . . . . . . . . . . . . . . . . . . 6
3.4. BGP OPERATIONAL message TLVs . . . . . . . . . . . . . . . 9
3.4.1. ADVISE TLVs . . . . . . . . . . . . . . . . . . . . . 9
3.4.2. STATE TLVs . . . . . . . . . . . . . . . . . . . . . . 10
3.4.3. DUMP TLVs . . . . . . . . . . . . . . . . . . . . . . 11
3.4.4. CONTROL TLVs . . . . . . . . . . . . . . . . . . . . . 13
4. On the use of STATE and DUMP TLVs . . . . . . . . . . . . . . 16
5. On the use of ADVISE TLVs . . . . . . . . . . . . . . . . . . 17
6. Error Handling . . . . . . . . . . . . . . . . . . . . . . . . 19
7. Security considerations . . . . . . . . . . . . . . . . . . . 20
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24
10.1. Normative References . . . . . . . . . . . . . . . . . . . 24
10.2. Informative References . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 26
1. Introduction
In this document, a new BGP message type, the OPERATIONAL message, is
defined, creating a communication channel over which messages can be
passed, using a series of contained TLV elements.
The messages can be human readable, for the attention of device
operators, or machine readable, in order to provide simple self-test
routines which can be exchanged between BGP speakers.
A number of TLV elements will be assigned to provide for these
message types, along with TLV elements to assist with description of
the message data, such as describing precisely BGP prefixes and
encapsulating BGP UPDATE messages to be sent back for inspection in
order to troubleshoot session malfunctions.
The use of OPERATIONAL messages will be negotiated by BGP Capability
[RFC5492]. Since the messages are in-band with the BGP session, they
can be assumed to be authenticated as originating directly from the
BGP neighbor.
The goal of this document is to provide a simple, extensible
framework within which new messaging and diagnostic requirements can
live.
2. Applications
The authors would like to propose three main applications which BGP
OPERATIONAL TLVs are designed to address. New TLVs can easily be
added to further enhance current applications or to support new
applications.
The set of TLVs is organised in the following four functional groups
comprising the three applications and some control messaging:
o ADVISE TLVs, designed to convey human readable information to be
passed, cross boundary to operators, to inform them of past or
upcoming error conditions, or provide other relevant, in-band
operational information. The "Advisory Demand Message" ADM
(Section 3.4.1.1) is an example of this.
o STATE TLVs, designed to carry information about BGP state across
BGP neighbors, including both per-neighbor and global counters.
o DUMP TLVs, designed to describe or encapsulate data to assist in
realtime or post-mortem diagnostics, such as structured
representations of affected prefixes / NLRI and encapsulated raw
UPDATE messages for inspection.
o CONTROL TLVs, designed to facilitate control messaging such as
replies to requests which can not be satisfied.
The means of reporting information carried by these TLVs, whether in
reply or request processing, are implementation specific but could
include methods such as SYSLOG.
3. BGP OPERATIONAL message
3.1. BGP OPERATIONAL message capability
A BGP speaker that is willing to exchange BGP OPERATIONAL messages
with a neighbor should advertise the new OPERATIONAL Message
Capability to the neighbor using BGP Capabilities advertisement
[RFC5492]. A BGP speaker may send an OPERATIONAL message to its
neighbor only if it has received the OPERATIONAL message capability
from that neighbor.
The Capability Code for this capability is specified in the IANA
Considerations section of this document.
The Capability Length field of this capability is 2 octets.
+------------------------------+
| Capability Code (1 octet) |
+------------------------------+
| Capability Length (1 octet) |
+------------------------------+
OPERATIONAL message BGP Capability Format
3.2. BGP OPERATIONAL message encoding
The BGP message as defined in [RFC4271] consists of a fixed-size
header containing a marker, a two octet length field and a one octet
type field. The RFC limits the maximum message size to 4096 octets.
As one of the applications of the BGP OPERATIONAL message (through
the MUD (Section 3.4.3.3) TLV) is to be able to carry an entire,
potentially malformed BGP UPDATE, this specification mandates that
when the neighbor has negotiated the BGP OPERATIONAL message
capability, any further BGP message which may be subject to enclosure
within a BGP OPERATIONAL message must be sent with the maximum size
reduced to accommodate the additional wrapping header. This applies
both to the current BGP maximum message size limit and to any future
modification of it.
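As a rough illustration of the size reduction described above, the
following Python sketch computes the largest UPDATE that could still
be wrapped whole in a MUD TLV. It assumes, beyond what this document
states, that the OPERATIONAL message carries only the standard
19 octet BGP header of [RFC4271] and a single MUD TLV. The constant
and function names are hypothetical.

```python
# Sketch only: assumes one MUD TLV per OPERATIONAL message and no
# OPERATIONAL-specific header beyond the RFC 4271 message header.
BGP_MAX_MESSAGE = 4096  # maximum BGP message size (RFC 4271)
BGP_HEADER = 19         # marker (16) + length (2) + type (1)
TLV_HEADER = 4          # Type (2) + Length (2), see Section 3.2
AFI_SAFI = 3            # AFI (2) + SAFI (1) leading the MUD value


def max_enclosable_update(max_message: int = BGP_MAX_MESSAGE) -> int:
    """Largest UPDATE (including its own header) that fits in one MUD."""
    return max_message - BGP_HEADER - TLV_HEADER - AFI_SAFI


print(max_enclosable_update())  # 4096 - 19 - 4 - 3 = 4070
```

Under these assumptions a speaker would cap its UPDATE messages at
4070 octets once the capability has been negotiated.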
For the purpose of the OPERATIONAL message information encoding we
will use one or more Type-Length-Value containers where each TLV will
have the following format:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type (2 octets) | Length (2 octets) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Variable size TLV value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
OPERATIONAL message TLV Format
TYPE: 2 octet value indicating the TLV type
LENGTH: 2 octet value indicating the TLV length in octets
VALUE: Variable length value field depending on the type of the TLVs
carried.
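The TLV container above can be sketched in Python as follows. This
is an illustration rather than a reference implementation: it
assumes network byte order and assumes that LENGTH counts only the
octets of the VALUE field, which this document does not state
explicitly. The function names are hypothetical.

```python
import struct
from typing import List, Tuple

# Type (2 octets) followed by Length (2 octets), network byte order.
TLV_HEADER = struct.Struct("!HH")


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Prefix a value with its 2 octet type and 2 octet length."""
    if len(value) > 0xFFFF:
        raise ValueError("value too long for a 2 octet Length field")
    return TLV_HEADER.pack(tlv_type, len(value)) + value


def decode_tlvs(data: bytes) -> List[Tuple[int, bytes]]:
    """Split an octet string into (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < TLV_HEADER.size:
            raise ValueError("truncated TLV header")
        tlv_type, length = TLV_HEADER.unpack_from(data, offset)
        offset += TLV_HEADER.size
        if len(data) - offset < length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, data[offset:offset + length]))
        offset += length
    return tlvs
```

A receiver would call decode_tlvs() on the body of an OPERATIONAL
message and dispatch on the returned type values.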
To work around continued BGP churn issues, some types of TLVs will
need to contain a sequence number to correlate a request with
associated replies. The sequence number will consist of 8 octets and
will be of the form: (4 octet bgp_router_id) + (local 4 octet
number). When the local 4 octet number reaches 0xFFFFFFFF it should
restart from 0x00000000. The sequence number is only included if the
TLV requires sequencing.
The typical application scenario for use of the sequence number is
for it to be included in a request TLV to be copied into associated
reply messages in order to correlate requests with their associated
replies.
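One possible way to generate such sequence numbers is sketched below
in Python. The class name is hypothetical, and the sketch assumes
that the local number wraps to zero after the largest value that
fits in 4 octets.

```python
import ipaddress
import struct


class SequenceNumbers:
    """Generates 8 octet sequence numbers: router ID + local counter."""

    def __init__(self, bgp_router_id: str, start: int = 0) -> None:
        # The BGP router ID is held as a 4 octet unsigned integer.
        self._router_id = int(ipaddress.IPv4Address(bgp_router_id))
        self._local = start & 0xFFFFFFFF

    def next(self) -> bytes:
        seq = struct.pack("!II", self._router_id, self._local)
        # Assumption: wrap to zero after the largest 4 octet value.
        self._local = (self._local + 1) & 0xFFFFFFFF
        return seq
```

The requester keeps the returned 8 octets alongside the outstanding
request so that a reply carrying the same value can be matched to it.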
3.3. PRI Format
Prefix Reachability Indicators (PRI) are used to represent prefix
NLRI and BGP attributes in a request and only prefix NLRI in a
response, in this draft.
Each PRI is encoded as a 3-tuple of the form <Flags, Payload Type,
Payload> whose fields are described below:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags (1 octet) | Payload Type (1 octet) |
+---------------------------------------------------------------+
| Payload (Variable) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The use and the meaning of these fields are as follows:
a) Flags:
Four bits indicating NLRI Reachability:
aa) R Bit:
The R (Reachable) bit, if set represents that the prefixes were
deemed reachable in the NLRI, else represents that the prefixes
were deemed unreachable. This bit is meaningless in the
context of all currently defined requests and can thus only be
found in a response. If found in a request an implementation
MUST ignore its state.
ab) I Bit:
The I (Adj-RIB-In) bit, if set in a query, indicates that the
requestor wishes for the response to be found in the Adj-RIB-In
of the neighbor representing this session, if cleared indicates
that the Adj-RIB-In of the neighbor representing this session
is not searched. If set in a response, indicates that the Adj-
RIB-In of the neighbor representing this session contained this
information, if cleared it did not.
ac) O Bit:
The O (Adj-RIB-Out) bit, if set in a query, indicates that the
requestor wishes for the response to be found in the Adj-RIB-
Out of the neighbor representing this session, if cleared
indicates that the Adj-RIB-Out of the neighbor representing
this session is not searched. If set in a response, indicates
that the Adj-RIB-Out of the neighbor representing this session
contained this information, if cleared it did not.
ad) L Bit:
The L (Loc-RIB) bit, if set in a query, indicates that the
requestor wishes for the response to be found in the BGP
Loc-RIB of the neighbor, if cleared indicates that the Loc-RIB of
the neighbor is not searched. If set in a response, indicates
that the Loc-RIB of the neighbor contained this information, if
cleared it did not.
The rest of the field is reserved for future use.
b) Payload Type:
This one octet type specifies the type and geometry of the
payload.
ba) Type 0 - NLRI:
The payload contains (perhaps multiple) NLRI, the format of
each NLRI is as defined in the base specification of such NLRI
appropriate for the AFI/SAFI.
bb) Type 1 - Next Hop:
The payload contains a Next Hop address, appropriate for the
AFI/SAFI. When used in an SSQ (Section 3.4.2.7) message the
response is expected to contain prefixes from the selected RIBs
which contain this next-hop in their next-hop attribute.
bc) Type 2 - AS Number:
The payload contains a 16 or 32 bit AS number (as defined in
[RFC4893]), when used in an SSQ message the response is
expected to contain prefixes from the selected RIBs which
contain this AS number in their AS_PATH or AS4_PATH (as
appropriate) attributes.
bd) Type 3 - Standard Community:
The payload contains a standard community (as defined in
[RFC1997]), when used in an SSQ message the response is
expected to contain prefixes from the selected RIBs which
contain this standard community in their communities attribute.
be) Type 4 - Extended Community:
The payload contains an extended community (as defined in
[RFC4360]), when used in an SSQ message the response is
expected to contain prefixes from the selected RIBs which
contain this extended community in their extended communities
attribute.
bf) Types 5-65535 - Reserved:
Types 5-65535 are reserved for future use.
c) Payload:
Contains the actual payload, as defined by the payload type, the
payload is of variable length, to be calculated from the remaining
TLV length.
PRI are used in both request and response modes. A response MUST
only contain an NLRI (type 0) payload, but a request MAY contain
payloads specifying a type to search for. An implementation MUST
validate all PRI it receives in a request against the type of request
which was made.
An implementation MUST NOT send a PRI in response with no NLRI (type
0) payload; this is considered to be invalid. If the implementation
wishes to signal that a request did not yield any valid results, it
MAY respond with an NS TLV (Section 3.4.4.2), using the "Not Found"
subcode, for example.
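The PRI layout and the response rule above can be sketched in Python
as follows. This document names the R, I, O and L bits but does not
assign them positions within the Flags octet, so the bit values
below are purely an assumption made for illustration, as are the
function names. The Payload Type is taken to be one octet, as shown
in the format diagram.

```python
import struct
from enum import IntFlag
from typing import Tuple


class PriFlags(IntFlag):
    # ASSUMED bit positions; this document does not define them.
    R = 0x80  # Reachable
    I = 0x40  # Adj-RIB-In
    O = 0x20  # Adj-RIB-Out
    L = 0x10  # Loc-RIB


PAYLOAD_NLRI = 0  # Payload Type 0 - NLRI


def encode_pri(flags: PriFlags, payload_type: int, payload: bytes) -> bytes:
    """Build <Flags, Payload Type, Payload>."""
    return struct.pack("!BB", int(flags), payload_type) + payload


def decode_pri(data: bytes) -> Tuple[PriFlags, int, bytes]:
    """Split a PRI; reserved flag bits are masked off."""
    if len(data) < 2:
        raise ValueError("PRI shorter than Flags + Payload Type")
    flags, payload_type = struct.unpack_from("!BB", data)
    return PriFlags(flags & 0xF0), payload_type, data[2:]


def check_response_pri(data: bytes) -> None:
    """Enforce the response rule: type 0 with a non-empty NLRI payload."""
    _, payload_type, payload = decode_pri(data)
    if payload_type != PAYLOAD_NLRI or not payload:
        raise ValueError("response PRI must carry an NLRI (type 0) payload")
```

For example, a response PRI for 192.0.2.0/24 found in the Adj-RIB-In
would carry the flags R and I and the NLRI octets 0x18 0xC0 0x00 0x02.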
3.4. BGP OPERATIONAL message TLVs
3.4.1. ADVISE TLVs
ADVISE TLVs convey human readable information to be passed, cross
boundary to operators, to inform them of past or upcoming error
conditions, or provide other relevant, in-band operational
information.
3.4.1.1. Advisory Demand Message (ADM)
TYPE: 1 - ADM
LENGTH: 3 Octets(AFI+SAFI) + Variable value (up to 2K octets)
USE: To carry a message, on demand, comprised of a string of UTF-8
characters (up to 2K octets in size), with no null termination. Upon
reception, the string SHOULD be reported to the host's administrator.
Implementations SHOULD provide their users the ability to transmit a
free form text message generated by user input.
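A minimal Python sketch of building an ADM TLV is given below. It
assumes that "2K octets" means 2048 octets, that AFI is encoded in 2
octets followed by a 1 octet SAFI, and that the TLV Length covers
the AFI, SAFI and text. The function name is hypothetical.

```python
import struct

ADM_TYPE = 1
MAX_TEXT_OCTETS = 2048  # assumption: "2K octets" read as 2048


def build_adm(afi: int, safi: int, text: str) -> bytes:
    """Encode an ADM TLV carrying UTF-8 text with no null terminator."""
    body = text.encode("utf-8")
    if len(body) > MAX_TEXT_OCTETS:
        raise ValueError("ADM text exceeds 2K octets")
    value = struct.pack("!HB", afi, safi) + body
    return struct.pack("!HH", ADM_TYPE, len(value)) + value
```

Note that the limit applies to the encoded octets, not to the number
of characters, since a UTF-8 character may occupy several octets.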
3.4.1.2. Advisory Static Message (ASM)
TYPE: 2 - ASM
LENGTH: 3 Octets(AFI+SAFI) + Variable value (up to 2K octets)
USE: To carry a message, on demand, comprised of a string of UTF-8
characters, with no null termination. Upon reception, the string
SHOULD be stored in the BGP neighbor statistics field within the
router. The string SHOULD be accessible to the operator by executing
CLI commands or any other method (local or remote) to obtain BGP
neighbor statistics (e.g. NETCONF, SNMP).
The expectation is that the last ASM received from a BGP neighbor
will be the message visible to the operator (the most current ASM).
Implementations SHOULD provide their users the ability to transmit a
free form text message generated by user input.
3.4.2. STATE TLVs
STATE TLVs reflect, on demand, the internal state of a BGP neighbor
as seen from the other neighbor's perspective.
3.4.2.1. Reachable Prefix Count Request (RPCQ)
TYPE: 3 - RPCQ
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number
USE: Sent to the neighbor to request that an RPCP (Section 3.4.2.2)
message is generated in response.
3.4.2.2. Reachable Prefix Count Reply (RPCP)
TYPE: 4 - RPCP
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + 4 Octet RX Prefix
Counter (RXC) + 4 Octet TX Prefix Counter (TXC)
USE: Sent in reply to an RPCQ (Section 3.4.2.1) message from a
neighbor, RXC is populated with the number of reachable prefixes
accepted from the peer and TXC with the number of prefixes to be
transmitted to the peer for the AFI/SAFI.
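The RPCP value could be packed and unpacked along the following
lines. The sketch assumes the fields appear on the wire in the order
listed under LENGTH (AFI, SAFI, Sequence Number, RXC, TXC) and in
network byte order. The function names are hypothetical.

```python
import struct

RPCP_TYPE = 4
# AFI (2), SAFI (1), Sequence Number (8), RXC (4), TXC (4).
# The field order is assumed from the LENGTH description.
RPCP_VALUE = struct.Struct("!HB8sII")


def build_rpcp(afi: int, safi: int, sequence: bytes,
               rxc: int, txc: int) -> bytes:
    """Encode an RPCP TLV echoing the sequence number of the RPCQ."""
    if len(sequence) != 8:
        raise ValueError("sequence number must be 8 octets")
    value = RPCP_VALUE.pack(afi, safi, sequence, rxc, txc)
    return struct.pack("!HH", RPCP_TYPE, len(value)) + value


def parse_rpcp_value(value: bytes) -> dict:
    """Decode the value part of an RPCP TLV."""
    if len(value) != RPCP_VALUE.size:
        raise ValueError("unexpected RPCP length")
    afi, safi, sequence, rxc, txc = RPCP_VALUE.unpack(value)
    return {"afi": afi, "safi": safi, "sequence": sequence,
            "rxc": rxc, "txc": txc}
```

With this layout the value is always 19 octets, so any other length
can be treated as malformed.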
3.4.2.3. Adj-Rib-Out Prefix Count Request (APCQ)
TYPE: 5 - APCQ
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number
USE: Sent to the neighbor to request that an APCP (Section 3.4.2.4)
message is generated in response.
APCQ can be used as a simple mechanism when an implementation does
not permit or support the use of RPCQ.
3.4.2.4. Adj-Rib-Out Prefix Count Reply (APCP)
TYPE: 6 - APCP
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + 4 Octet TX Prefix
Counter (TXC)
USE: Sent in reply to an APCQ (Section 3.4.2.3) message from a
neighbor, TXC is populated with the number of prefixes held in the
Adj-Rib-Out for the neighbor for the AFI/SAFI.
3.4.2.5. BGP Loc-Rib Prefix Count Request (LPCQ)
TYPE: 7 - LPCQ
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number
USE: Sent to the peer to request that an LPCP (Section 3.4.2.6)
message is generated in response.
3.4.2.6. BGP Loc-Rib Prefix Count Reply (LPCP)
TYPE: 8 - LPCP
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + 4 Octet Loc-Rib
Counter (LC)
USE: Sent in reply to an LPCQ (Section 3.4.2.5) message from a
neighbor, LC is populated with the number of prefixes held in the
entire Loc-Rib for the AFI/SAFI.
3.4.2.7. Simple State Request (SSQ)
TYPE: 9 - SSQ
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + Single request PRI
(Variable)
USE: Using a PRI as a request form (See Section 3.3), an
implementation can be asked to return information about prefixes
found in various RIBs.
A single, simple PRI is used in the request, containing a single NLRI
or attribute as the PRI payload. RIB response filtering may take
place through the setting of the I, O and L bits in the PRI Flags
field.
An implementation MAY respond to an SSQ TLV with an SSP (See
Section 3.4.3.4) TLV (containing the appropriate data). An
implementation MAY also respond to an SSQ with an NS TLV (with the
appropriate subcode set) indicating why there will not be an SSP TLV
in response. An implementation MAY also not respond at all (See
Section 7).
3.4.3. DUMP TLVs
DUMP TLVs provide data in both structured and unstructured formats in
response to events, for use in debugging scenarios.
3.4.3.1. Dropped Update Prefixes (DUP)
TYPE: 10 - DUP
LENGTH: 3 Octets(AFI+SAFI) + Variable number of dropped UPDATE Prefix
Reachability Indicators (PRI) (See Section 3.3)
USE: To report to a neighbor a structured set of prefix reachability
indicators retrievable from the last dropped UPDATE message, sent in
response to an UPDATE message which was well formed but not accepted
by the neighbor by policy.
For example, if an UPDATE was dropped and the rescued NLRI concerned
a number of both reachable and unreachable prefixes, the DUP would
encapsulate two PRI: one with the R-Bit set (reachable), housing the
rescued reachable NLRI, and the other with the R-Bit cleared
(unreachable), housing the rescued unreachable NLRI as payload.
3.4.3.2. Malformed Update Prefixes (MUP)
TYPE: 11 - MUP
LENGTH: 3 Octets(AFI+SAFI) + Variable number of dropped update Prefix
Reachability Indicators (PRI) (See Section 3.3) due to UPDATE
Malformation.
USE: To report to a neighbor a structured set of prefix reachability
indicators retrievable from the last UPDATE message dropped through
malformation, sent in response to an UPDATE message which was not
well formed and not accepted by the neighbor, where a NOTIFICATION
message was not sent. A MUP TLV may accompany a MUD
(Section 3.4.3.3) TLV.
See the example from Section 3.4.3.1.
3.4.3.3. Malformed Update Dump (MUD)
TYPE: 12 - MUD
LENGTH: 3 Octets(AFI+SAFI) + Variable length representing retrievable
malformed update octet stream.
USE: To report to a peer a copy of the last UPDATE message dropped
through malformation, sent in response to an UPDATE message which was
not well formed and not accepted by the neighbor, where a
NOTIFICATION message was not sent. A MUD TLV may accompany a MUP
(Section 3.4.3.2) TLV.
3.4.3.4. Simple State Response (SSP)
TYPE: 13 - SSP
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + Single Response PRI
(Variable)
USE: Using a PRI as a response form (See Section 3.3), an
implementation uses the SSP TLV to return a response to an SSQ (See
Section 3.4.2.7) TLV which should contain information about prefixes
found in various RIBs. These RIBs should be walked to extract the
information according to local policy.
A single, simple PRI is used in the response, containing multiple
NLRI. The I, O and L bits in the PRI Flags field should be set
indicating which RIBs the prefixes were found in.
An implementation MAY respond to an SSQ TLV with an SSP TLV
(containing the appropriate data). An implementation MAY also
respond to an SSQ with an NS TLV (with the appropriate subcode set)
indicating why there will not be an SSP TLV in response. An
implementation MAY also not respond at all (See Section 7).
If no data is found to satisfy a query which is permitted to be
answered, an implementation MAY respond with an NS TLV with the
subcode "Not Found" to indicate that no data was found in response to
the query. An implementation MUST NOT send a PRI in response with no
NLRI payload, this is considered to be invalid.
3.4.4. CONTROL TLVs
CONTROL TLVs provide control messaging between neighbors. They are
used for functions such as refusing messages and dynamically
signalling OPERATIONAL capabilities to neighbors during operation.
3.4.4.1. Max Permitted (MP)
TYPE: 65534 - MP
LENGTH: 3 Octets(AFI+SAFI) + 2 Octet Value
USE: The Max Permitted TLV is used to signal to the neighbor the
maximum number of OPERATIONAL messages that will be accepted per
second (see Section 7, Security Considerations). An implementation
MUST, on receipt of an MP TLV, ensure that it does not exceed the
rate specified in the MP TLV when sending OPERATIONAL messages to
the neighbor, for the duration of the session.
An implementation MAY send subsequent MP TLVs during the session's
lifetime, updating the maximum acceptable rate.
MP TLVs MAY be rate limited by the receiver as part of OPERATIONAL
rate limiting (see Section 7, Security Considerations).
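A sender could honour a received MP TLV with a limiter such as the
Python sketch below. The sliding one second window is an
implementation choice made for this example and is not mandated by
this document. The class and method names are hypothetical.

```python
import time
from collections import deque


class OperationalSendLimiter:
    """Caps OPERATIONAL messages sent per second to one neighbor."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._max_per_second = None  # no MP TLV received yet
        self._sent = deque()         # send times within the last second

    def on_mp_received(self, max_per_second: int) -> None:
        # A later MP TLV replaces the previously signalled rate.
        self._max_per_second = max_per_second

    def try_send(self) -> bool:
        """Return True and record the send if it stays within the rate."""
        now = self._clock()
        while self._sent and now - self._sent[0] >= 1.0:
            self._sent.popleft()
        limit = self._max_per_second
        if limit is not None and len(self._sent) >= limit:
            return False
        self._sent.append(now)
        return True
```

The caller checks try_send() before transmitting each OPERATIONAL
message and defers or drops the message when it returns False.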
3.4.4.2. Not Satisfied (NS)
TYPE: 65535 - NS
LENGTH: 3 Octets(AFI+SAFI) + Sequence Number + 2 Octet Error Subcode
USE: To respond to a query to indicate that the implementation cannot
or will not answer this query. The following subcodes are defined:
0x01 - Request TLV Malformed: Used to signal to the neighbor that
the request was malformed and will not be processed. A neighbor
on receiving this message MAY re-transmit the request but MUST
increment the sequence number. Implementations SHOULD ensure that
the same request is not retransmitted excessively when repeatedly
receiving this Error Subcode in response.
0x02 - TLV Unsupported for this neighbor: Used to signal to the
neighbor that the request was unsupported and will not be
processed. A neighbor on receiving this message MUST NOT
retransmit the request for the duration of the session.
0x03 - Max query frequency exceeded: Used to signal to the neighbor
that the request has exceeded the rate at which the neighbor is
willing to accept requests from the implementation, see
Section 3.4.4.1 (MP TLV) and Section 7 (Security Considerations)
for more information.
0x04 - Administratively prohibited: Used to signal to the neighbor
that the request was administratively prohibited and will not be
processed. A neighbor on receiving this message MUST NOT
retransmit the request for the duration of the session.
0x05 - Busy: Used to signal to the neighbor that the request will
not be replied to, due to lack of resources estimated to satisfy
the request. It is suggested that, on receipt of this error
subcode a message is logged to inform the operator of this failure
as opposed to automatically attempting to re-try the previous
query.
0x06 - Not Found: Used to signal to the neighbor that the request
would have been replied to but the reply would not contain any data
(i.e., the data was not found). An implementation MUST NOT send a
PRI response with no NLRI payload; this is considered to be invalid.
NS TLVs MAY be rate limited by the receiver as part of OPERATIONAL
rate limiting (see Section 7, Security Considerations).
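The requester-side handling of the subcodes listed above could be
organised as in the following Python sketch. The enumeration values
come from this section; the member names, the action strings and the
function name are hypothetical, and subcodes for which this document
prescribes no requester behaviour are reported as such rather than
guessed at.

```python
from enum import IntEnum


class NsSubcode(IntEnum):
    REQUEST_TLV_MALFORMED = 0x01
    TLV_UNSUPPORTED = 0x02
    MAX_QUERY_FREQUENCY_EXCEEDED = 0x03
    ADMINISTRATIVELY_PROHIBITED = 0x04
    BUSY = 0x05
    NOT_FOUND = 0x06


def requester_action(subcode: int) -> str:
    """Map a received NS subcode to the behaviour described above."""
    code = NsSubcode(subcode)  # raises ValueError for unknown subcodes
    if code is NsSubcode.REQUEST_TLV_MALFORMED:
        # MAY retransmit, but MUST increment the sequence number.
        return "may-retransmit-with-new-sequence"
    if code in (NsSubcode.TLV_UNSUPPORTED,
                NsSubcode.ADMINISTRATIVELY_PROHIBITED):
        # MUST NOT retransmit for the duration of the session.
        return "suppress-for-session"
    if code is NsSubcode.BUSY:
        # Suggested: log for the operator instead of retrying.
        return "log-for-operator"
    return "no-action-specified"
```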
4. On the use of STATE and DUMP TLVs
The STATE TLVs use three classes of counters, defined in this
document: sent counters (TXC), received counters (RXC) and current
table state counters (LC). The table state counters (for example
number of BGP RIB entries) are exchanged only for informational
purposes and they should not be subject to comparison with any local
counter values.
Where a query of the neighbor's RXC is required to be correlated, the
local TXC coupled with the sequence number SHOULD be stored and used
to perform such a correlation. If a discrepancy is detected, an
automated or manual Route Refresh message can be triggered (utilising
Start_of_Refresh and End_of_Refresh markers) that would allow for
purge of any stalled data across two BGP databases.
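The correlation described above might be implemented as in the
following Python sketch, in which the local TXC is recorded against
the sequence number of each outstanding request. The class and
method names are hypothetical, and a mismatch is only a hint that a
Route Refresh may be worthwhile, not proof of an inconsistency.

```python
from typing import Optional


class TxcCorrelator:
    """Stores the local TXC per outstanding request sequence number."""

    def __init__(self) -> None:
        self._pending = {}

    def record_request(self, sequence: bytes, local_txc: int) -> None:
        # Snapshot the local TXC at the time the request is sent.
        self._pending[sequence] = local_txc

    def on_reply(self, sequence: bytes, neighbor_rxc: int) -> Optional[bool]:
        """True if counters agree, False if not, None if unsolicited."""
        local_txc = self._pending.pop(sequence, None)
        if local_txc is None:
            return None
        return local_txc == neighbor_rxc
```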
It is important to note that, as BGP is never stable, it is expected
that the counters will also be subject to continuous value change,
making any comparison of their values questionable.
The DUMP TLVs report information back to an operator about messages
which were not accepted, from machine-readable rescued UPDATE NLRI to
an entire copy of the malformed UPDATE message. These can be used
for troubleshooting purposes when such a message is transmitted and
the implementation gracefully continues (such as treat-as-withdraw).
5. On the use of ADVISE TLVs
The BGP routing protocol is used with external as well as internal
neighbors to propagate route advertisements. In the case of external
BGP sessions, there is typically a demarcation of administrative
responsibility between the two entities. While initial configuration
and troubleshooting of these sessions is handled via offline means
such as email or telephone calls, there is a gap when it comes to
advising a BGP neighbor of a behaviour that is occurring or will
occur momentarily. There is a need for operators to transmit a
message to a BGP neighbor to notify them of a variety of types of
events. These messages typically would include those related to a
planned or unplanned maintenance action. These ADVISE messages could
then be interpreted by the remote party and either parsed via logging
mechanisms or viewed by a human on the remote end via the CLI. This
capability will improve operator NOC-to-NOC communication by
providing a communications medium on an established and trusted BGP
session between two autonomous systems.
The reason that this method is preferred for NOC-to-NOC
communications is that other offline methods do fail for a variety of
reasons. Recipients of emails sent to NOC aliases ahead of a planned
maintenance may have ignored the mail or may not have recorded it
properly within an internal tracking system. Even if the message was
recorded
properly, the staff that are on-duty at the time of the maintenance
event typically are not the same staff who received the maintenance
notice several days prior. In addition, the staff on duty at the
time of the event may not even be able to find the recorded event in
their internal tracking systems. The end result is that during a
planned event, some subset of eBGP peers will respond to a session/
peer down event with additional communications to the operator who is
initiating the maintenance action. This can be via telephone or via
email, but either way, it may result in a sizeable amount of replies
inquiring as to why the session is down.
The result of this is that the NOC responsible for initiating the
maintenance can be inundated with calls/emails from a variety of
parties inquiring as to the status of the BGP session. The NOC
initiating the maintenance may have to further inquire with
engineering staff (if they are not already aware) to find out the
extent of the maintenance and communicate this back to all of the
NOCs calling for additional information. The above scenario outlines
what is typical in a planned maintenance event. In an unplanned
maintenance event (the need for an immediate router upgrade/reload),
the number of calls and emails will dramatically increase as more
parties are unaware of the event.
With the ADVISE TLV set, an operator can transmit an OPERATIONAL
message just prior to initiating the maintenance specifying what
event will happen, what ticket number this event is associated with
and the expected duration of the event. This message would be
received by BGP peers and stored in their logs as well as any
monitoring system if they have this capability. Now, all of the BGP
peers have immediate access to the information about this session,
why it went down, what ticket number this is being tracked under and
how long they should wait before assuming there is an actual problem.
Even smaller networks without the network management capabilities to
correlate BGP events and OPERATIONAL messages would typically have an
operator log in to a router and examine the logs via the CLI.
This draft specifies two types of ADVISE TLV, a DEMAND message (ADM)
and a STATIC message (ASM). It is anticipated that the DEMAND message
will be used to send a message, on demand, to the BGP neighbor to
inform them of realtime events. The STATIC message can be used to
provide continual, "Sticky" information to the neighbor, such as a
contact telephone number or e-mail address, should there be a
requirement to have continual access to this information.
6. Error Handling
An implementation MUST NOT send an OPERATIONAL message to a neighbor
in response to an erroneous or malformed OPERATIONAL message. Any
erroneous or malformed OPERATIONAL message received SHOULD be logged
for the attention of the operator and then MAY be discarded.
7. Security considerations
No new security issues are introduced to the BGP protocol by this
specification.
Where a request type is not supported or allowed by an implementation
for some reason, the implementation MAY send an NS (Section 3.4.4.2)
TLV in response, the Error subcode of this TLV SHOULD be set
according to the reason that this request will not be responded to.
Implementations MUST rate-limit the rate at which they transmit and
receive OPERATIONAL messages. Specifically, an implementation MUST
NOT allow the handling of OPERATIONAL messages to negatively impact
any other functions on a router such as regular BGP message handling
or other routing protocols.
Although an NS error subcode is provided to indicate that a request
was rate-limited, an implementation need not reply to a request at
all; this is the suggested course of action when rate-limiting the
sending of responses to a neighbor.
An implementation MAY send an MP (Section 3.4.4.1) TLV to indicate
the maximum rate at which it will accept OPERATIONAL messages from a
neighbor. Upon receipt of this TLV, the neighbor MUST ensure that it
does not transmit above this rate for the duration of the session.
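One common way to meet a rate-limiting requirement of this kind is a
token bucket. The Python sketch below is an illustration only: this
document does not mandate any particular algorithm, the class and
method names are hypothetical, and expressing the rate in messages
per second is an assumption of the example rather than a statement
about the MP TLV encoding. Time is passed in explicitly so that the
behaviour is deterministic.

```python
class TokenBucket:
    """Hypothetical limiter for OPERATIONAL messages on one session."""

    def __init__(self, rate_per_sec: float, burst: float, now: float = 0.0):
        self.rate = rate_per_sec   # tokens added per second
        self.burst = burst         # maximum tokens held
        self.tokens = burst        # start with a full bucket
        self.last = now            # time of the last refill

    def apply_max_permitted(self, advertised_rate: float) -> None:
        # On receipt of an MP TLV, never transmit above the advertised
        # rate; a lower local rate is left unchanged.
        self.rate = min(self.rate, advertised_rate)

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # Over the limit: the caller drops or defers the message.
        return False
```

A sender would call allow() before each transmission and
apply_max_permitted() when an MP TLV arrives; a receiver could use a
second bucket to decide which incoming messages to process.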
An implementation that considers a request too computationally
expensive MAY reply with the "Busy" NS error subcode to indicate
this, though it need not reply to the request at all.
Implementations MUST provide the operator with a mechanism for
preventing access to information requested by SSQ (Section 3.4.2.7)
messages. Implementations SHOULD ensure that responses concerning
the Loc-RIB (PRI with L-Bit set or responses which would set the
L-Bit) are filtered in the default configuration.
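The two access controls above might be modelled as a small policy
object, sketched here in Python. All names are hypothetical. The
default for Loc-RIB responses follows the SHOULD above; the default
for simple state information is an assumption of the sketch, since
the text only requires that the operator be able to disable it.

```python
from dataclasses import dataclass


@dataclass
class ResponsePolicy:
    """Hypothetical operator-configurable response policy."""

    # Operator knob to prevent access to simple state information.
    # The default of True is an assumption of this sketch.
    allow_simple_state: bool = True
    # Loc-RIB responses are filtered unless the operator enables them.
    allow_loc_rib: bool = False


def may_respond(policy: ResponsePolicy, *, simple_state: bool, loc_rib: bool) -> bool:
    """Return True if a response may be sent under the given policy."""
    if simple_state and not policy.allow_simple_state:
        return False
    if loc_rib and not policy.allow_loc_rib:
        return False
    return True
```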
8. IANA Considerations
IANA is requested to allocate a type code for the OPERATIONAL message
from the BGP Message Types registry, and a code for the new
OPERATIONAL Message Capability from the BGP Capability Codes
registry.
This document requests IANA to define and maintain a new registry
named "OPERATIONAL Message Type Values". The allocation policy is
First Come First Served.
This document makes the following assignments for the OPERATIONAL
Message Type Values:
ADVISE:
* Type 1 - Advisory Demand Message (ADM)
* Type 2 - Advisory Static Message (ASM)
STATE:
* Type 3 - Reachable Prefix Count Request (RPCQ)
* Type 4 - Reachable Prefix Count Response (RPCP)
* Type 5 - Adj-RIB-Out Prefix Count Request (APCQ)
* Type 6 - Adj-RIB-Out Prefix Count Response (APCP)
* Type 7 - Loc-RIB Prefix Count Request (LPCQ)
* Type 8 - Loc-RIB Prefix Count Response (LPCP)
* Type 9 - Simple State Request (SSQ)
DUMP:
* Type 10 - Dropped Update Prefixes (DUP)
* Type 11 - Malformed Update Prefixes (MUP)
* Type 12 - Malformed Update Dump (MUD)
* Type 13 - Simple State Response (SSP)
CONTROL:
* Type 65534 - Max Permitted (MP)
* Type 65535 - Not Satisfied (NS)
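For implementers, the assignments above can be captured as an
enumeration. The Python sketch below copies the values and class
groupings exactly as listed in this section; the enumeration name
and the helper function are hypothetical.

```python
from enum import IntEnum


class OperationalType(IntEnum):
    """OPERATIONAL Message Type Values as listed in this section."""

    ADM = 1
    ASM = 2
    RPCQ = 3
    RPCP = 4
    APCQ = 5
    APCP = 6
    LPCQ = 7
    LPCP = 8
    SSQ = 9
    DUP = 10
    MUP = 11
    MUD = 12
    SSP = 13
    MP = 65534
    NS = 65535


def type_class(value: int) -> str:
    """Return the class heading under which a type value is listed."""
    t = OperationalType(value)  # raises ValueError if unassigned
    if t <= OperationalType.ASM:
        return "ADVISE"
    if t <= OperationalType.SSQ:
        return "STATE"
    if t <= OperationalType.SSP:
        return "DUMP"
    return "CONTROL"
```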
9. Acknowledgements
This memo is based on existing works [I-D.ietf-idr-advisory] and
[I-D.raszuk-bgp-diagnostic-message] which describe a number of
operational message types documented here. The authors would like to
thank Enke Chen, Bruno Decraene, Alton Lo, Tom Scholl, John Scudder
and Richard Steenbergen for their valuable input.
10. References
10.1. Normative References
[RFC1997] Chandrasekeran, R., Traina, P., and T. Li, "BGP
Communities Attribute", RFC 1997, August 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, February 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760,
January 2007.
[RFC4893] Vohra, Q. and E. Chen, "BGP Support for Four-octet AS
Number Space", RFC 4893, May 2007.
[RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement
with BGP-4", RFC 5492, February 2009.
10.2. Informative References
[I-D.ietf-idr-advisory]
Scholl, T., Scudder, J., Steenbergen, R., and D. Freedman,
"BGP Advisory Message", draft-ietf-idr-advisory-00 (work
in progress), October 2009.
[I-D.jasinska-ix-bgp-route-server]
Jasinska, E., Hilliard, N., Raszuk, R., and N. Bakker,
"Internet Exchange Route Server",
draft-jasinska-ix-bgp-route-server-02 (work in progress),
March 2011.
[I-D.nalawade-bgp-inform]
Nalawade, G., Scudder, J., and D. Ward, "BGPv4 INFORM
message", draft-nalawade-bgp-inform-02 (work in progress),
August 2002.
[I-D.nalawade-bgp-soft-notify]
Nalawade, G., "BGPv4 Soft-Notification Message",
draft-nalawade-bgp-soft-notify-01 (work in progress),
July 2005.
[I-D.raszuk-bgp-diagnostic-message]
Raszuk, R., Chen, E., and B. Decraene, "BGP Diagnostic
Message", draft-raszuk-bgp-diagnostic-message-02 (work in
progress), March 2011.
[I-D.retana-bgp-security-state-diagnostic]
Retana, A. and R. Raszuk, "BGP Security State Diagnostic
Message", draft-retana-bgp-security-state-diagnostic-00
(work in progress), March 2011.
[I-D.shakir-idr-ops-reqs-for-bgp-error-handling]
Shakir, R., "Operational Requirements for Enhanced Error
Handling Behaviour in BGP-4",
draft-shakir-idr-ops-reqs-for-bgp-error-handling-01 (work
in progress), February 2011.
Authors' Addresses
David Freedman
Claranet
London
UK
Email: david.freedman@uk.clara.net
Robert Raszuk
Cisco Systems
170 West Tasman Drive
San Jose, CA 95134
US
Email: raszuk@cisco.com
Rob Shakir
Cable&Wireless Worldwide
Email: rob.shakir@cw.com