Guidance to Avoid Carrying RPKI Validation States in Transitive BGP Path Attributes
draft-spaghetti-sidrops-avoid-rpki-state-in-bgp-00

The information below is for an old version of the document.
Document Type: This is an older version of an Internet-Draft whose latest revision state is "Replaced".
Authors: Job Snijders, Tobias Fiebig, Massimiliano Stucchi
Last updated: 2024-02-06
Replaced by: draft-ietf-sidrops-avoid-rpki-state-in-bgp
Network Working Group                                        J. Snijders
Internet-Draft                                                    Fastly
Intended status: Best Current Practice                         T. Fiebig
Expires: 9 August 2024                                           MPI-INF
                                                           M. A. Stucchi
                                                             AS58280.net
                                                         6 February 2024

Guidance to Avoid Carrying RPKI Validation States in Transitive BGP Path
                               Attributes
           draft-spaghetti-sidrops-avoid-rpki-state-in-bgp-00

Abstract

   This document provides guidance to avoid carrying Resource Public Key
   Infrastructure (RPKI) derived Validation States in Transitive Border
   Gateway Protocol (BGP) Path Attributes.  Annotating routes with
   attributes signalling validation state may flood needless BGP UPDATE
   messages through the global Internet routing system when, for
   example, Route Origin Authorizations are issued or revoked, or when
   RPKI-To-Router sessions are terminated.

   Operators SHOULD ensure Validation States are not signalled in
   transitive BGP Path Attributes.  Specifically, Operators SHOULD NOT
   group BGP routes by their Prefix Origin Validation state into
   distinct BGP Communities.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 August 2024.

Snijders, et al.          Expires 9 August 2024                 [Page 1]
Internet-Draft           Avoid RPKI State in BGP           February 2024

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scope
   3.  Risks of Signaling Validation State With Transitive Attributes
     3.1.  Triggers for Large-Scale Validation Changes
       3.1.1.  ROA Issuance
       3.1.2.  ROA Revocation
       3.1.3.  Validator Loss
       3.1.4.  Outage Scenario Summary
     3.2.  Scaling issues
     3.3.  Cascading of BGP UPDATES
     3.4.  Observed data
     3.5.  Lacking Value of Signaling Validation State
   4.  Advantages of Dissociating Validation States and BGP Path
           Attributes
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Resource Public Key Infrastructure (RPKI) [RFC6480] allows for
   validating received routes, e.g., for their Route Origin Validation
   (ROV) state.  Some operators and vendors suggest using distinct BGP
   Communities [RFC1997] [RFC8092] to annotate received routes with
   their validation state.  The claim is that this practice is useful,
   as validation state can be signalled, e.g., to iBGP speakers, without
   requiring each iBGP speaker to perform its own route origin
   validation.

   However, annotating a route with a transitive attribute means that a
   BGP UPDATE message has to be sent to each neighbor whenever that
   attribute changes.  This means that when, for example, Route Origin
   Authorizations [RFC6482] are issued or revoked, or RPKI-To-Router
   [RFC8210] sessions are terminated, a BGP UPDATE message will be sent
   for each affected route that was previously annotated with a BGP
   Community.  Furthermore, given that BGP Communities are a transitive
   attribute, this BGP UPDATE will have to propagate through the whole
   default-free zone (DFZ).
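
   As a minimal sketch of the mechanism described above, the following
   Python model compares the route an AS would advertise before and
   after a validation state change.  The community values (64500:1000
   and 64500:1001), the Route type, and both helper functions are
   hypothetical and exist only for this illustration; they do not
   reflect any particular BGP implementation.

```python
from dataclasses import dataclass

# Hypothetical operator-specific communities (illustration only).
ROV_COMMUNITY = {"valid": (64500, 1000), "notfound": (64500, 1001)}


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    communities: frozenset


def export_route(prefix, as_path, rov_state, tag_with_state):
    """Build the route a (hypothetical) export policy would advertise."""
    communities = set()
    if tag_with_state:
        communities.add(ROV_COMMUNITY[rov_state])
    return Route(prefix, as_path, frozenset(communities))


def update_required(previous, current):
    """A speaker must re-advertise a route if any path attribute differs."""
    return previous != current


path = (64500, 64501)
tagged_before = export_route("192.0.2.0/24", path, "valid", True)
tagged_after = export_route("192.0.2.0/24", path, "notfound", True)
plain_before = export_route("192.0.2.0/24", path, "valid", False)
plain_after = export_route("192.0.2.0/24", path, "notfound", False)

print(update_required(tagged_before, tagged_after))  # True
print(update_required(plain_before, plain_after))    # False
```

   In this model, only the policy that tags routes with their
   validation state turns a change in the RPKI into a change in the
   advertised path attributes.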

   Hence, this document provides guidance to avoid carrying Resource
   Public Key Infrastructure (RPKI) [RFC6480] derived Validation States
   in Transitive Border Gateway Protocol (BGP) Path Attributes
   (Section 5 of [RFC4271]).  Specifically, Operators SHOULD NOT group
   BGP routes by their Prefix Origin Validation state [RFC6811] into
   distinct BGP Communities [RFC1997] [RFC8092].  Not using BGP
   Communities to signal RPKI validation state prevents needless BGP
   UPDATE messages from being flooded through the global Internet
   routing system.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Scope

   This document discusses signalling of RPKI validation state to BGP
   neighbors using transitive BGP attributes.  At the time of writing,
   this pertains to the use of BGP Communities [RFC1997] [RFC8092] to
   signal RPKI ROV using ROAs.  Note that this includes all operator
   specific BGP Communities to signal validation state, as well as any
   current or future documented well-known BGP Communities marking
   validation state, as, e.g., described for extended BGP Communities in
   [RFC8097].

   However, beyond that, this document also applies to all current and
   future transitive BGP attributes that may be implicitly or explicitly
   used to signal validation state to neighbors.  Similarly, it applies
   to all future validation mechanisms of the RPKI, e.g., ASPA
   [I-D.ietf-sidrops-aspa-profile] and any other future validation
   mechanism built upon the RPKI.

3.  Risks of Signaling Validation State With Transitive Attributes

   This section outlines the risks of signalling RPKI Validation State
   using BGP Communities.  While the current description is specific to
   BGP communities, the observations hold similarly for all transitive
   attributes that may be added to a route.  Furthermore, we will
   present data on the measured current impact of BGP Communities being
   used to signal RPKI Validation state.

3.1.  Triggers for Large-Scale Validation Changes

   Here, we describe examples of how a large number of RPKI ROV changes
   may occur in a short time, leading to a large number of BGP UPDATEs
   being sent.

3.1.1.  ROA Issuance

   Large-Scale ROA issuance should be a comparatively rare event for
   individual networks.  However, several cases exist where issuance by
   individual operators or (malicious) coordinated issuance of ROAs by
   multiple operators may lead to a high churn triggering a continuous
   flow of BGP Update messages caused by operators using transitive BGP
   attributes to signal RPKI validation state.

   Specifically:

   *  When one large operator newly starts issuing ROAs for their
      netblocks, possibly by issuing one ROA with a long maxLength
      covering a large number of prefixes.  This may also occur when
      incorrectly migrating to minimally covering ROAs [RFC9319], i.e.,
      when the previous ROA is first revoked (see Section 3.1.2) and the
      new ROAs are only issued after this revocation has been
      propagated, e.g., due to an operational error or bug in the
      issuance pipeline used by the operator.

   *  When multiple smaller operators coordinate to issue new ROAs at
      the same time.

   *  When a CA has been unavailable or unable to publish for some time,
      but then publishes all updates at once, or--as unlikely as it is--
      if a key-rollover encounters issues.
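
   The effect of a single ROA with a long maxLength can be sketched as
   follows.  The function below is a simplified rendering of the
   classification procedure in [RFC6811]; the function name, the VRP
   tuple layout, the documentation prefix 2001:db8::/32, and AS64500
   are assumptions made for this example only.

```python
import ipaddress


def rov_state(prefix, origin_asn, vrps):
    """Classify a route as valid, invalid, or notfound (RFC 6811 style).

    vrps is an iterable of (vrp_prefix, max_length, asn) tuples.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        # Skip VRPs of the other address family or not covering the route.
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        if route.prefixlen <= max_length and asn == origin_asn:
            return "valid"
    return "invalid" if covered else "notfound"


# 256 hypothetical /48 announcements originated by AS64500.
routes = [
    (str(net), 64500)
    for net in ipaddress.ip_network("2001:db8::/40").subnets(new_prefix=48)
]

before = [rov_state(p, asn, []) for p, asn in routes]
new_roa = [("2001:db8::/32", 48, 64500)]  # one ROA, long maxLength
after = [rov_state(p, asn, new_roa) for p, asn in routes]

flipped = sum(1 for old, new in zip(before, after) if old != new)
print(flipped)  # 256
```

   Every one of the announcements changes from NotFound to Valid at
   once, so an AS that encodes the state in a community would have to
   re-advertise all of them.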

3.1.2.  ROA Revocation

   Large-Scale ROA revocation should be a comparatively rare event for
   individual networks.  However, several cases exist where revocations
   by individual operators or (malicious) coordinated revocation of ROAs
   by multiple operators may lead to a high churn triggering a
   continuous flow of BGP Update messages caused by operators using
   transitive BGP attributes to signal RPKI validation state.

   Specifically:

   *  When one large operator revokes all ROAs for their netblocks at
      once, for example, when migrating to minimally covering ROAs
      [RFC9319], or when revoking one ROA with a long maxLength covering
      a large number of prefixes.

   *  When multiple smaller operators coordinate to revoke ROAs at the
      same time.

   *  When a CA becomes unavailable or unable to publish for some time,
      e.g., due to the CA expiring ([CA-Outage1], [CA-Outage2],
      [CA-Outage3], [CA-Outage4]).

3.1.3.  Validator Loss

   Similar to the issuance/revocation of ROAs, the validation pipeline
   of an operator may encounter issues.  For example, any of the
   following events may lead to RTR services used by an operator no
   longer providing validation state to routers, leading to routes
   changing from VALID to UNKNOWN ("NotFound" in the terminology of
   [RFC6811]):

   *  The RTR service may have to be taken offline due to local issues
      ([CVE-2021-3761], [CVE-2021-41531], [CVE-2021-43114]), or, even
      worse, a misconfiguration may lead to the service flapping, e.g.,
      when the system runs out of memory after a few minutes of
      communicating validation state to routers.

   *  Validation state may seemingly lapse due to issues with time
      synchronization if, e.g., the clock of the validator drifts
      significantly, causing it to consider CA certificates invalid.

   *  Multiple operators use one central RTR service hosted by an
      external party, or depend on a similar validator, which becomes
      unavailable, e.g., due to maintenance or an outage, and local
      instances are not able to handle loss of this external service
      without changing validation state, i.e., do not serve from cache.
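
   The difference between serving from cache and flushing validation
   data when the RTR service disappears can be sketched as below.  The
   VrpStore class, its method names, and the 7200 second expiry value
   are assumptions made for this illustration; real router
   implementations differ in how they retain and expire data.

```python
class VrpStore:
    """Router-side view of VRPs learned over an RTR session (simplified)."""

    def __init__(self, expire_interval, serve_from_cache):
        self.expire_interval = expire_interval  # seconds; assumed value
        self.serve_from_cache = serve_from_cache
        self.vrps = frozenset()
        self.last_sync = None

    def sync(self, vrps, now):
        """Record a successful synchronisation with the RTR service."""
        self.vrps = frozenset(vrps)
        self.last_sync = now

    def usable_vrps(self, now, session_up):
        """Return the VRPs that origin validation may use at time now."""
        if session_up:
            return self.vrps
        fresh = (self.last_sync is not None
                 and now - self.last_sync <= self.expire_interval)
        if self.serve_from_cache and fresh:
            return self.vrps
        return frozenset()  # every route falls back to NotFound


vrp = ("2001:db8::/32", 48, 64500)
caching = VrpStore(expire_interval=7200, serve_from_cache=True)
flushing = VrpStore(expire_interval=7200, serve_from_cache=False)
for store in (caching, flushing):
    store.sync([vrp], now=0)

print(len(caching.usable_vrps(now=600, session_up=False)))   # 1
print(len(flushing.usable_vrps(now=600, session_up=False)))  # 0
print(len(caching.usable_vrps(now=8000, session_up=False)))  # 0
```

   In this sketch, the caching store rides out a short outage without
   any change in validation state, whereas the flushing store changes
   the state of every covered route as soon as the session is lost.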

3.1.4.  Outage Scenario Summary

   The above non-exhaustive list suggests that issues in general
   operations, CA operations, and RPKI cache implementations are simply
   unavoidable.  Hence, Operators MUST plan and design accordingly.

3.2.  Scaling issues

   Following an RPKI service-affecting outage (Section 3.1), and
   considering that roughly half the global Internet routing table is
   nowadays covered by RPKI ROAs [NIST], any Autonomous System whose
   local routing policy sets a BGP Community based on the ROV-Valid
   validation state would need to send BGP UPDATE messages for roughly
   half the global Internet routing table if the validation state
   changes to ROV-NotFound.  The reverse holds for every new ROA
   created by an address space holder: a new BGP UPDATE would be
   generated as the validation state changes to ROV-Valid.

   As the global Internet routing table currently contains close to
   1,000,000 prefixes [CIDR_Report], such convergence events represent a
   significant burden.  See [How-to-break] for an elaboration on this
   phenomenon.
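
   The order of magnitude can be worked through with a few lines of
   Python.  The table size and ROA coverage are the approximate figures
   quoted above; the number of neighbors receiving a full table is a
   hypothetical value chosen for the example, and the function name is
   likewise an assumption.

```python
def readvertisements_after_state_flip(table_size, roa_coverage, neighbors):
    """Estimate route re-advertisements caused by a validation state flip.

    Returns (affected_routes, total_readvertisements), assuming every
    neighbor receives every affected route.
    """
    affected_routes = round(table_size * roa_coverage)
    return affected_routes, affected_routes * neighbors


affected, total = readvertisements_after_state_flip(
    table_size=1_000_000,   # approximate size of the global table
    roa_coverage=0.5,       # roughly half covered by ROAs
    neighbors=10,           # hypothetical count of full-table neighbors
)
print(affected)  # 500000
print(total)     # 5000000
```

   These figures count re-advertised routes rather than BGP UPDATE
   messages, since a single UPDATE can carry several prefixes that
   share the same path attributes.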

   Furthermore, adding attributes to routes increases their size and
   memory consumption in the RIB of BGP routers.  Given the continuous
   growth of the global routing table, operators should, in general, be
   conservative regarding the additional information they add to
   routes.

3.3.  Cascading of BGP UPDATES

   The aforementioned issues that may lead to changes in validation
   state for a large number of routes are not confined to singular
   UPDATE events.  Instead, given that routers' view of the RPKI with
   RTR is only eventually consistent, update messages may cascade, i.e.,
   one change in validation state may actually trigger multiple
   subsequent BGP UPDATE storms.  If, for example, AS65536 is a
   downstream of AS65537 (both annotating validation state with BGP
   Communities), and a major CA fails, but the validator cache of
   AS65537 is updated before that of AS65536, then AS65536 will first
   receive updates for all formerly valid routes learned from AS65537
   when the validation state changes there, and will propagate these
   down its cone.  Then, when the cache of AS65536 is updated as well,
   the community set by AS65536 will change for these routes too, and
   they will be propagated down the cone a second time.
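
   The two-step cascade in this example can be sketched as follows.
   The large community values (function 1000, values 1 and 2) and the
   function names are hypothetical; the model only tracks the
   communities attached to one formerly valid route as AS65536
   advertises it to its cone.

```python
# Hypothetical encoding of the validation state in a large community.
STATE_VALUE = {"valid": 1, "notfound": 2}


def communities_seen_in_cone(upstream_state, local_state, tagging):
    """Communities on a route as AS65536 advertises it to its cone."""
    if not tagging:
        return frozenset()
    return frozenset({
        (65537, 1000, STATE_VALUE[upstream_state]),  # set by AS65537
        (65536, 1000, STATE_VALUE[local_state]),     # set by AS65536
    })


def update_waves(timeline, tagging):
    """Count how often the advertised attributes change over the timeline."""
    waves = 0
    previous = None
    for upstream_state, local_state in timeline:
        current = communities_seen_in_cone(upstream_state, local_state, tagging)
        if previous is not None and current != previous:
            waves += 1
        previous = current
    return waves


timeline = [
    ("valid", "valid"),        # before the CA failure
    ("notfound", "valid"),     # AS65537's cache has updated
    ("notfound", "notfound"),  # AS65536's cache has updated as well
]
print(update_waves(timeline, tagging=True))   # 2
print(update_waves(timeline, tagging=False))  # 0
```

   With tagging enabled, the single CA failure produces two separate
   waves of updates for the same route; without tagging, the
   advertised attributes never change.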

3.4.  Observed data

   In February 2024, a data-gathering initiative [Side-Effect] reported
   that between 8% and 10% of BGP updates seen on the Routing
   Information Service (RIS) contained well-known communities from
   large ISPs signalling either ROV-NotFound or ROV-Valid BGP Validation
   states.  The study also demonstrated that the creation or removal of
   a ROA object triggered a chain of updates over a period of
   approximately one hour following the change.

   Such a high percentage of unneeded BGP updates constitutes a
   considerable level of noise, impacting the capacity of the global
   routing system while generating load on router CPUs and occupying
   more RAM than necessary.  Keeping this information within the
   boundaries of a single autonomous system would help reduce the
   burden on the rest of the global routing system, reducing workload
   and noise.

3.5.  Lacking Value of Signaling Validation State

   RTR has been developed to communicate validation information to
   routers.  BGP Attributes are not signed and provide no assurance
   against third parties adding them, apart from BGP communities
   (ideally) being filtered at a network's edge.  So, even in iBGP
   scenarios, their benefit in comparison to using RTR on all BGP
   speakers is limited.

   For eBGP, given that they are not signed, they provide even less
   information to other parties, beyond insight into an AS's internal
   validation mechanics.  Crucially, they provide no actionable
   information for BGP neighbors.  If an AS validates and enforces based
   on RPKI, INVALID routes should never be imported and, hence, never be
   sent to neighbors.  Hence, the argument that adding validation state
   to communities enables, e.g., downstreams to filter RPKI INVALID
   routes is moot, as the only routes a downstream should see are
   UNKNOWN and VALID.  Furthermore, in any case, operators SHOULD run
   their own validation infrastructure and not rely on centralized
   services or attributes communicated by their neighbors.  Everything
   else circumvents the purpose of RPKI.

4.  Advantages of Dissociating Validation States and BGP Path Attributes

   As outlined in Section 3, signalling validation state with transitive
   attributes carries significant risks for the stability of the global
   routing ecosystem.  Not signalling validation state, hence, has
   tangible benefits, specifically:

   *  Reduction of memory consumption on customer/peer facing PE routers
      (fewer BGP communities mean less memory pressure).

   *  No effect on the age of a BGP route when a ROA or ASPA
      [I-D.ietf-sidrops-aspa-profile] is issued or revoked.

   *  Avoids having to resend, e.g., more than 500,000 BGP routes
      towards BGP neighbors (the AS's own cone to peers and upstreams,
      the full table towards customers) if the RPKI cache crashes
      and RTR sessions are terminated, or if flaps in validation are
      caused by other events.

   Hence, operators SHOULD NOT signal RPKI validation state using
   transitive BGP attributes.

5.  Security Considerations

   The use of transitive attributes to signal RPKI validation state may
   enable attackers to cause notable route churn by issuing and
   withdrawing, e.g., ROAs for their prefixes.  DFZ routers may not be
   equipped to handle churn in all directions at global scale,
   especially if said churn cascades or repeats periodically.

   To prevent this, operators SHOULD NOT signal validation state to
   neighbors.  Furthermore, validation state signaling SHOULD NOT be
   accepted from a neighbor AS.  Instead, the validation state of a
   received announcement has only local scope due to issues such as
   scope of trust and RPKI synchrony.
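
   One way to picture this recommendation is an import policy that
   relies only on the locally computed validation state and discards
   any state signalled by the neighbor.  The sketch below is
   illustrative only: the function name and the community values
   treated as validation-state markers are hypothetical, and an
   operator would have to know which values a given neighbor uses.

```python
# Hypothetical communities a neighbor is known to use for ROV signalling.
NEIGHBOR_STATE_COMMUNITIES = frozenset({
    (64500, 1000),  # "valid"
    (64500, 1001),  # "notfound"
    (64500, 1002),  # "invalid"
})


def import_route(communities, local_rov_state,
                 state_communities=NEIGHBOR_STATE_COMMUNITIES):
    """Apply a local-only validation policy to a received route.

    Returns None if the route is rejected; otherwise returns the
    communities to keep, with signalled validation state removed and
    no local validation state added.
    """
    if local_rov_state == "invalid":
        return None
    return frozenset(communities) - state_communities


received = {(64500, 1000), (64500, 50)}
print(import_route(received, "valid") == frozenset({(64500, 50)}))  # True
print(import_route(received, "invalid") is None)                    # True
```

   The locally computed state decides whether the route is accepted,
   and nothing about that decision is written back into the route's
   attributes.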

6.  IANA Considerations

   None.

7.  Acknowledgements

   The authors would like to thank ...  and ...  for their helpful
   review of this document.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

8.2.  Informative References

   [CA-Outage1]
              ARIN, "RPKI Service Notice Update", August 2020,
              <https://www.arin.net/announcements/20200813/>.

   [CA-Outage2]
              RIPE NCC, "Issue affecting rsync RPKI repository
              fetching", April 2021,
              <https://www.ripe.net/ripe/mail/archives/routing-
              wg/2021-April/004314.html>.

   [CA-Outage3]
              Snijders, J., "problemas con el TA de RPKI de LACNIC",
              April 2023, <https://mail.lacnic.net/pipermail/
              lacnog/2023-April/009471.html>.

   [CA-Outage4]
              Snijders, J., "AFRINIC RPKI VRP graph for November 2023 -
              heavy fluctuations affecting 2 members", November 2023,
              <https://lists.afrinic.net/pipermail/
              dbwg/2023-November/000493.html>.

   [CIDR_Report]
              Huston, G., "CIDR REPORT", January 2024,
              <https://www.cidr-report.org/as2.0/>.

   [CVE-2021-3761]
              "OctoRPKI lacks contextual out-of-bounds check when
              validating RPKI ROA maxLength values", September 2021,
              <https://github.com/cloudflare/cfrpki/security/advisories/
              GHSA-c8xp-8mf3-62h9>.

   [CVE-2021-41531]
              NLnet Labs, "Routinator prior to 0.10.0 produces invalid
              RTR payload if an RPKI CA uses too large values in the
              max-length parameter in a ROA", September 2021,
              <https://www.nlnetlabs.nl/downloads/routinator/CVE-
              2021-41531.txt>.

   [CVE-2021-43114]
              FORT project, "FORT Validator versions prior to 1.5.2 will
              crash if an RPKI CA publishes an X.509 EE certificate",
              November 2021, <https://cve.mitre.org/cgi-bin/
              cvename.cgi?name=CVE-2021-43114>.

   [How-to-break]
              Snijders, J., "How to break the Internet: a talk about
              outages that never happened", CERN Academic Training
              Lecture Regular Programme; 2021-2022, March 2022,
              <https://cds.cern.ch/record/2805326>.

   [I-D.ietf-sidrops-aspa-profile]
              Azimov, A., Uskov, E., Bush, R., Snijders, J., Housley,
              R., and B. Maddison, "A Profile for Autonomous System
              Provider Authorization", Work in Progress, Internet-Draft,
              draft-ietf-sidrops-aspa-profile-17, 7 November 2023,
              <https://datatracker.ietf.org/doc/html/draft-ietf-sidrops-
              aspa-profile-17>.

   [NIST]     NIST, "NIST RPKI Monitor", January 2024,
              <https://rpki-monitor.antd.nist.gov/>.

   [RFC1997]  Chandra, R., Traina, P., and T. Li, "BGP Communities
              Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996,
              <https://www.rfc-editor.org/info/rfc1997>.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006,
              <https://www.rfc-editor.org/info/rfc4271>.

   [RFC6480]  Lepinski, M. and S. Kent, "An Infrastructure to Support
              Secure Internet Routing", RFC 6480, DOI 10.17487/RFC6480,
              February 2012, <https://www.rfc-editor.org/info/rfc6480>.

   [RFC6482]  Lepinski, M., Kent, S., and D. Kong, "A Profile for Route
              Origin Authorizations (ROAs)", RFC 6482,
              DOI 10.17487/RFC6482, February 2012,
              <https://www.rfc-editor.org/info/rfc6482>.

   [RFC6811]  Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
              Austein, "BGP Prefix Origin Validation", RFC 6811,
              DOI 10.17487/RFC6811, January 2013,
              <https://www.rfc-editor.org/info/rfc6811>.

   [RFC8092]  Heitz, J., Ed., Snijders, J., Ed., Patel, K., Bagdonas,
              I., and N. Hilliard, "BGP Large Communities Attribute",
              RFC 8092, DOI 10.17487/RFC8092, February 2017,
              <https://www.rfc-editor.org/info/rfc8092>.

   [RFC8097]  Mohapatra, P., Patel, K., Scudder, J., Ward, D., and R.
              Bush, "BGP Prefix Origin Validation State Extended
              Community", RFC 8097, DOI 10.17487/RFC8097, March 2017,
              <https://www.rfc-editor.org/info/rfc8097>.

   [RFC8210]  Bush, R. and R. Austein, "The Resource Public Key
              Infrastructure (RPKI) to Router Protocol, Version 1",
              RFC 8210, DOI 10.17487/RFC8210, September 2017,
              <https://www.rfc-editor.org/info/rfc8210>.

   [RFC9319]  Gilad, Y., Goldberg, S., Sriram, K., Snijders, J., and B.
              Maddison, "The Use of maxLength in the Resource Public Key
              Infrastructure (RPKI)", BCP 185, RFC 9319,
              DOI 10.17487/RFC9319, October 2022,
              <https://www.rfc-editor.org/info/rfc9319>.

   [Side-Effect]
              Stucchi, M., "A BGP Side Effect of RPKI", February 2024,
              <https://labs.ripe.net/author/stucchimax/a-bgp-side-
              effect-of-rpki/>.

Authors' Addresses

   Job Snijders
   Fastly
   Amsterdam
   Netherlands
   Email: job@fastly.com

   Tobias Fiebig
   Max-Planck-Institut fuer Informatik
   Campus E14
   66123 Saarbruecken
   Germany
   Phone: +49 681 9325 3527
   Email: tfiebig@mpi-inf.mpg.de

   Massimiliano Stucchi
   AS58280.net
   CH- Bruettisellen
   Switzerland
   Email: max@stucchi.ch
