BGP Extension for 5G Edge Service Metadata
draft-ietf-idr-5g-edge-service-metadata-07

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Active".
Authors Linda Dunbar , Kausik Majumdar , Haibo Wang , Gyan Mishra , Zongpeng Du
Last updated 2023-08-09 (Latest revision 2023-07-31)
RFC stream Internet Engineering Task Force (IETF)
Stream WG state WG Document
Document shepherd (None)
IESG IESG state I-D Exists
Consensus boilerplate Unknown
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-ietf-idr-5g-edge-service-metadata-07
Network Working Group                                   L. Dunbar
Internet Draft                                          Futurewei
Intended status: Standards Track                       K. Majumdar
Expires: February 9, 2024                              Microsoft
                                                          H. Wang
                                                           Huawei
                                                        G. Mishra
                                                          Verizon
                                                            Z. Du
                                                     China Mobile
                                                   August 9, 2023

             BGP Extension for 5G Edge Service Metadata
             draft-ietf-idr-5g-edge-service-metadata-07

Abstract

   This draft describes a new Metadata Path Attribute and some
   Sub-TLVs for egress routers to advertise the Metadata about
   the attached edge services (ES). The Edge Service Metadata can
   be used by the ingress routers in the 5G Local Data Network to
   make path selections not only based on the routing cost but
   also the running environment of the edge services. The goal is
   to improve latency and performance for 5G edge services.

   The extension enables an edge service at one specific location
   to be preferred over others with the same IP address
   (ANYCAST) for receiving data flows from a specific source,
   such as a specific User Equipment (UE).

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be
   modified, and derivative works of it may not be created,
   except to publish it as an RFC and to translate it into
   languages other than English.

   Internet-Drafts are working documents of the Internet
   Engineering Task Force (IETF), its areas, and its working
   groups.  Note that other groups may also distribute working
   documents as Internet-Drafts.

Dunbar, et al.         Expires February 9, 2024          [Page 1]
Internet-Draft    BGP extension for 5G Edge Services

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-
   Drafts as reference material or to cite them other than as
   "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed
   at http://www.ietf.org/shadow.html

   This Internet-Draft will expire on February 9, 2024.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as
   the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date
   of publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described
   in Section 4.e of the Trust Legal Provisions and are provided
   without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. BGP Protocol Extension for Edge Service Metadata
      3.1. Ingress Node BGP Path Selection Behavior
         3.1.1. Edge Service Metadata Influenced BGP Path Selection
         3.1.2. Ingress Router Forwarding Behavior
         3.1.3. Forwarding Behavior when UEs Move to New 5G Sites
   4. Edge Service Metadata Encoding
      4.1. Metadata Path Attribute
      4.2. The Site Preference Index Sub-TLV format
      4.3. Capacity Availability Index Metadata
         4.3.1. Site Index Associated to Routes
         4.3.2. BGP UPDATE with standalone Site Availability Index
      4.4. Service Delay Prediction Index
         4.4.1. Service Delay Prediction Sub-TLV
         4.4.2. Service Delay Prediction Based on Load Measurement
         4.4.3. Raw Load Measurement Sub-TLV
   5. Service Metadata Influenced Decision Process
      5.1. Integrating Network Delay with the Service Metrics
      5.2. Integrating with BGP decision process
   6. Service Metadata Propagation Scope
   7. Minimum Interval for Metrics Change Advertisement
   8. Validation and Error Handling
   9. Manageability Considerations
   10. Security Considerations
   11. IANA Considerations
      11.1. Metadata Path Attribute
      11.2. Metadata Path Attribute Sub-Types
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Appendix A
      13.1. Example of Flow Affinity
   14. Acknowledgments

1. Introduction

   [5G-Edge-Service] describes the 5G Edge Computing background
   and how BGP can be used to advertise the running status and
   environment of the directly attached 5G edge services. Besides
   the Radio Access, 5G is characterized by having edge services
   closer to the Cell Towers reachable by Local Data Networks
   (LDN) [3GPP TS 23.501]. From an IP network perspective, the 5G
   LDN is a limited domain [RFC8799] with edge services a few
   hops away from the ingress nodes. Only selected UE services
   are considered 5G low latency Edge Services.

   This document describes a new Metadata Path Attribute added to
   a BGP UPDATE message [RFC4271] for egress routers to advertise
   the Metadata about the directly attached edge services. The
   Edge Service Metadata in this document includes the site
   availability index, the site preference, and the service delay
   prediction index, which are further explained in Section 4.

   Note: The proposed Edge Service Metadata is not intended for
   best-effort services reachable via the public internet.
   The Edge Service Metadata can be used by the ingress routers
   to make path selections for selected low latency services
   based on not only the network distance but also the running
   environment of the edge cloud sites. The goal is to improve
   latency and performance for 5G ultra-low latency services.

   The extension is targeted at a single domain with a Route
   Reflector (RR) controlling the propagation of the BGP UPDATE
   messages. The Edge Service Metadata is attached only to the
   services (routes) hosted in the 5G edge cloud sites, which are
   only a small subset of the services initiated from UEs. For
   example, it is not attached to routes for the many internet
   sites that UEs access.

2. Conventions used in this document

   Application Server: An application server is a physical or
               virtual server that hosts the software system for
               the application.

   Application Server Location: represents a cluster of servers
               at one location serving the same application. One
               application may have a Layer 7 Load balancer,
               whose address(es) are reachable from an external
               IP network, in front of a set of application
               servers. From an IP network perspective, this
               whole group of servers is considered as the
               Application Server at the location.

   Edge Application Server: is used interchangeably with
               Application Server throughout this document.

   Edge Hosting Environment: An environment providing the support
               required for Edge Application Server's execution.

               NOTE: The above terms are the same as those used
               in 3GPP TR 23.758.

   Edge DC:    Edge Data Center, which provides the Hosting
               Environment for the edge services. An Edge DC
               might host 5G core functions in addition to the
               frequently used application servers.

   gNB:        next generation Node B

   RTT:        Round-trip Time

   PSA:        PDU Session Anchor (UPF)

   SSC:        Session and Service Continuity

   UE:         User Equipment

   UPF:        User Plane Function

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
   RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
   interpreted as described in BCP 14 [RFC8174] when, and only
   when, they appear in all capitals, as shown here.

3. BGP Protocol Extension for Edge Service Metadata

    The goal of this Edge Service Metadata Path Attribute is for
    egress routers to propagate the metrics about their running
    environment to ingress routers. Here are some examples of the
    metrics propagated by the egress routers:
    - the site capacity availability index,
    - the site preference index, and
    - the service delay prediction index for the attached edge
      services.

    This section specifies how these three types of Metadata
    impact the ingress nodes' path selections.

 3.1. Ingress Node BGP Path Selection Behavior

 3.1.1. Edge Service Metadata Influenced BGP Path Selection

   When an ingress router receives BGP updates for the same IP
   prefix from multiple egress routers, all these egress routers'
   loopback addresses are considered as the next hops for the IP
   prefix. For the selected low latency edge services, the
   ingress router's BGP engine would call an Edge Service
   Management function that can select paths based on the Edge
   Service Metadata received. [5G-Edge-Service] gives an example
   algorithm that computes the weighted path cost based on the
   Edge Service Metadata carried by the Sub-TLV(s) specified in
   this document.

   Section 5 describes the Edge Service Metadata influenced
   optimal path selection in detail.

 3.1.2. Ingress Router Forwarding Behavior

   When the ingress router receives a packet and looks up the
   route in the FIB, it obtains the whole path for the
   destination prefix. It then encapsulates the packet towards
   the optimal egress node.

   For subsequent packets belonging to the same flow, the ingress
   router needs to forward them to the same egress router unless
   the selected egress router is no longer reachable. Keeping
   packets from one flow to the same egress router, a.k.a. Flow
   Affinity, is supported by many commercial routers. Most
   registered EC services have relatively short flows.

   How Flow Affinity is implemented is out of scope for this
   document. Appendix A gives one example of how flow affinity
   can be achieved.
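
   As an illustration only, the flow affinity behavior described
   above can be sketched as a table that pins each flow to the
   egress router first selected for it and re-selects only when
   that egress router is no longer reachable. The class name, the
   5-tuple flow key, and the absence of entry aging are
   assumptions of this sketch; they are not specified by this
   document or by Appendix A.

```python
class FlowAffinityTable:
    """Hypothetical flow table: pins each flow to one egress router."""

    def __init__(self):
        self._flows = {}  # flow key -> pinned egress router

    def egress_for(self, flow_key, best_egress, reachable):
        """Return the egress router to use for a packet of this flow.

        flow_key:    any hashable flow identifier (here a 5-tuple).
        best_egress: egress currently preferred by path selection.
        reachable:   set of egress routers that are reachable now.
        """
        pinned = self._flows.get(flow_key)
        if pinned is not None and pinned in reachable:
            # Keep subsequent packets of the flow on the same egress.
            return pinned
        # New flow, or the pinned egress is gone: (re)pin the flow.
        self._flows[flow_key] = best_egress
        return best_egress
```

   A real implementation would also need to age out idle flow
   entries, which this sketch omits.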

 3.1.3. Forwarding Behavior when UEs Move to New 5G Sites

   When a UE moves to a new 5G gNB that is anchored to the same
   UPF, the packets from the UE traverse the same ingress
   router. Path selection and forwarding behavior are the same as
   before.

   If the UE maintains the same IP address when anchored to a new
   UPF, the directly connected ingress router might use the
   information passed from a neighboring router to derive the
   optimal Next Hop for this route. The detailed algorithm is out
   of the scope of this document.

4. Edge Service Metadata Encoding

4.1. Metadata Path Attribute

   The Metadata Path Attribute is an optional transitive BGP Path
   attribute to carry metrics and metadata about the edge
   services attached to the egress router. The Metadata Path
   Attribute, to be assigned by IANA [RFC2042], consists of a set
   of Sub-TLVs, and each Sub-TLV contains information for
   specific metrics of the edge services.

   Most BGP UPDATE messages don't include the Metadata Path
   Attribute. For the limited set of edge services that need to
   advertise metadata about the services, the Metadata Path
   Attribute can be included in a BGP UPDATE message [RFC4271]
   together with other BGP Path Attributes [IANA-BGP-PARAMS],
   such as Extended Communities [RFC4360], NEXT_HOP, Tunnel
   Encapsulation Path Attribute [RFC9012], etc. The metrics
   Sub-TLVs included in the Metadata Path Attribute apply to all
   the address families carried in the NLRI field of the BGP
   UPDATE message [RFC4271]. For a multi-protocol BGP UPDATE
   message [RFC4760] [RFC7606], the metrics Sub-TLVs included in
   the Metadata Path Attribute apply to all the AFI/SAFI address
   families carried by the MP_REACH_NLRI.

   A BGP UPDATE message that includes the Metadata Path Attribute
   doesn't change the BGP Error Handling procedure specified in
   [RFC7606]. When a value in one of the Sub-TLVs within the
   Metadata Path Attribute is outside its specified range, only
   that Sub-TLV is ignored by the BGP receiver; all other
   Sub-TLVs within the Metadata Path Attribute are still valid.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Attr. Flags |MetadataPathAtt|        Length (2 Octets)      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     |         Value (multiple Metadata Sub-TLVs)                    |
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              Figure 1: Metadata Path Attribute

   Attr. Flags are defined as:
     o The high-order bit (bit 0): set to 1.
     o The second high-order bit (bit 1): set to 0 to indicate
        that the service-metadata is not transitive; it is
        intended only for the receiving router.
     o The third high-order bit (bit 2): same as specified by
        [RFC4271].
     o The fourth high-order bit (bit 3): set to 1 to indicate
        there are two octets for the Length field.

   MetadataPathAtt: Metadata Path Attribute type code: TBD1 (to
   be assigned by IANA).

   Length (2 octets): the total number of octets of the value
   field.

   Value (variable): comprised of multiple Sub-TLVs.

   The Metadata Sub-TLVs specified by this document include the
   following: the Capacity Availability Index Value, the Site
   Preference Index Value, the Service Delay Prediction Index,
   and the Load Measurement. One or more Metadata Sub-TLVs can be
   included in a Metadata Path Attribute in one BGP UPDATE
   message.

   All values in the Sub-TLVs are unsigned 32-bit integers.
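
   The layout in Figure 1 can be sketched in code as follows.
   This is a minimal, non-normative sketch under several
   assumptions: the attribute type code 0xFF is a hypothetical
   placeholder for TBD1, each Sub-TLV is assumed to carry a
   2-octet Sub-Type and a 2-octet Length that counts the octets
   of its value (as Figures 2, 4, and 5 suggest), and the
   function names are illustrative. The generic decoder does not
   apply to the Capacity Availability Index Sub-TLV in Figure 3,
   which carries a Reserved field where the others carry Length.

```python
import struct

# Hypothetical placeholder: the real type code is TBD1 (IANA).
METADATA_PATH_ATTR_TYPE = 0xFF

# Attr. Flags per Section 4.1: bit 0 (Optional) and bit 3
# (Extended Length) are set; bit 1 (Transitive) is left at 0.
FLAG_OPTIONAL = 0x80
FLAG_EXT_LEN = 0x10


def encode_sub_tlv(sub_type: int, value: bytes) -> bytes:
    """2-octet Sub-Type, 2-octet Length (assumed: octets of value)."""
    return struct.pack("!HH", sub_type, len(value)) + value


def encode_metadata_attr(sub_tlvs: list) -> bytes:
    """Flags, type code, 2-octet Length, then the Sub-TLVs."""
    value = b"".join(sub_tlvs)
    flags = FLAG_OPTIONAL | FLAG_EXT_LEN
    header = struct.pack("!BBH", flags, METADATA_PATH_ATTR_TYPE, len(value))
    return header + value


def decode_metadata_attr(data: bytes) -> list:
    """Return (sub_type, value) pairs; raise ValueError if truncated."""
    if len(data) < 4:
        raise ValueError("attribute header truncated")
    _flags, _type, length = struct.unpack("!BBH", data[:4])
    body = data[4:4 + length]
    if len(body) != length:
        raise ValueError("attribute value truncated")
    out, pos = [], 0
    while pos < len(body):
        if pos + 4 > len(body):
            raise ValueError("Sub-TLV header truncated")
        sub_type, sub_len = struct.unpack("!HH", body[pos:pos + 4])
        pos += 4
        if pos + sub_len > len(body):
            raise ValueError("Sub-TLV value truncated")
        out.append((sub_type, body[pos:pos + sub_len]))
        pos += sub_len
    return out
```

   With these assumptions, an attribute carrying one 32-bit
   Sub-TLV value occupies 12 octets: 4 for the attribute header,
   4 for the Sub-TLV header, and 4 for the value.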

4.2. The Site Preference Index Sub-TLV format

   Different services might have different preference index
   values configured for the same site. For example, Service-A
   requires high computing power, Service-B requires high
   bandwidth among its microservices, and Service-C requires high
   volume storage capacity. For a DC with relatively low storage
   capacity but high bisection bandwidth, the preference index
   value is higher for Service-B and lower for Service-C. The
   Site Preference Index can also be used to achieve stickiness
   for some services.

   It is out of the scope of this document how the preference
   index is determined or configured.

   The Preference Index Sub-TLV has the following format:

      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Site-Preference-Index Sub-Type |               Length          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   Preference Index value                      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    Figure 2: Preference Index Sub-TLV

  - Site-Preference-Index Sub-Type = 1 (specified in this
     document).

  - Preference Index value: 1-100, with 1 being the least
     preferred, and 100 being the most preferred.

     When the Preference Index value is outside the range of
     1-100, the value carried in this Sub-TLV is ignored.
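
   A minimal sketch of the range rule above, assuming the Length
   field counts the 4 octets of the Preference Index value. The
   function names are illustrative and not defined by this
   document. The decoder returns None when the received value
   must be ignored.

```python
import struct
from typing import Optional

SITE_PREFERENCE_SUB_TYPE = 1  # Sub-Type value from Section 4.2


def encode_site_preference(value: int) -> bytes:
    """Build the Sub-TLV of Figure 2 for a Preference Index of 1-100."""
    if not 1 <= value <= 100:
        raise ValueError("Preference Index must be in the range 1-100")
    # Assumption: Length = 4, the octets of the value field.
    return struct.pack("!HHI", SITE_PREFERENCE_SUB_TYPE, 4, value)


def decode_site_preference(data: bytes) -> Optional[int]:
    """Return the Preference Index, or None if it must be ignored."""
    if len(data) < 8:
        return None
    sub_type, length, value = struct.unpack("!HHI", data[:8])
    if sub_type != SITE_PREFERENCE_SUB_TYPE or length != 4:
        return None
    # Values outside 1-100 are ignored by the receiver.
    return value if 1 <= value <= 100 else None
```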

4.3. Capacity Availability Index Metadata

   The Capacity Availability Index indicates whether an edge
   site, which can be a building, a floor, a pod, a row of server
   racks, etc., has full capacity, has reduced capacity, or is
   completely out of service. The value ranges from 0 to 100,
   with 100 indicating that the site is fully functional, 0
   indicating that the site is entirely out of service, and 50
   indicating that the site is 50% degraded.

   Cloud Site/Pod failures and degradation include, but are not
   limited to, a site capacity degradation or an entire site
   going down. Causes vary, such as a fiber cut on the links
   connecting to the site or among pods, cooling failures,
   insufficient backup power, cyber attacks, too many changes
   outside of the maintenance window, etc. Fiber cuts are not
   uncommon within a Cloud site or between sites.

   When those failure events happen, the edge (egress) router
   itself may still be running fine. Therefore, the ingress
   routers with paths to the egress router can't use BFD to
   detect the failures.

   When there is a failure occurring at an edge site (or a pod),
   many instances can be impacted. In addition, the routes (i.e.,
   the IP addresses) in the site might not be aggregated nicely.
   Instead of many BGP UPDATE messages to the ingress routers for
   all the instances impacted, the egress router can send one
   single BGP UPDATE indicating the capacity availability of the
   site. The ingress routers can switch all or a portion of the
   instances that are associated with the site depending on how
   much the site is degraded.

   The Capacity Availability Index Sub-TLV:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      CapAvailIdx Sub-Type     |         Reserved              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Site-ID (2 octets)     | Site Availability Percentage  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          Figure 3: Capacity Availability Index Sub-TLV

  - CapAvailIdx (Capacity-Availability-Index) Sub-Type = 2
     (specified in this document).

  - Site ID: identifier for a site, which can be a pod, a row of
     server racks, a floor, or an entire DC. There could be
     multiple sites connected to the egress router (a.k.a. Edge DC
     GW).

  - Site Availability Percentage: represents the percentage of
     the site availability, e.g., 100%, 50%, or 0%. When a site
     goes dark, the Index is set to 0. A value of 50 means 50% of
     the capacity is functioning. When the value is outside the
     0-100 range, the value carried in this Sub-TLV is ignored.
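
   The layout in Figure 3 can be sketched as follows. This is
   an illustration under assumptions: the Site Availability
   Percentage is taken to occupy the 2 octets that follow the
   2-octet Site-ID, the Reserved field is sent as zero, and the
   function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

CAP_AVAIL_SUB_TYPE = 2  # Sub-Type value from Section 4.3


def encode_cap_avail(site_id: int, percent: int) -> bytes:
    """Build the Sub-TLV of Figure 3 (Reserved assumed to be 0)."""
    if not 0 <= site_id <= 0xFFFF:
        raise ValueError("Site-ID must fit in 2 octets")
    if not 0 <= percent <= 100:
        raise ValueError("availability must be in the range 0-100")
    return struct.pack("!HHHH", CAP_AVAIL_SUB_TYPE, 0, site_id, percent)


def decode_cap_avail(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (site_id, percent), or None if it must be ignored."""
    if len(data) < 8:
        return None
    sub_type, _reserved, site_id, percent = struct.unpack("!HHHH", data[:8])
    if sub_type != CAP_AVAIL_SUB_TYPE or percent > 100:
        # Out-of-range percentage: the value is ignored.
        return None
    return site_id, percent
```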

4.3.1. Site Index Associated to Routes

  An egress router must include the Site Capacity Availability
  Index Sub-TLV in a BGP UPDATE message for the registered low
  latency edge services so that the ingress routers can
  associate the site reference identifier (Site-ID) with the
  route in the routing table.

  However, it is unnecessary to include the Site Capacity
  Availability Index for every BGP Update message if there is no
  change to the site-reference identifier or the Capacity
  Availability value for the service instances.

4.3.2. BGP UPDATE with standalone Site Availability Index

  When an ingress router receives a BGP UPDATE message from
  Router-X with the loopback address of Router-X as the prefix
  and the Metadata Path Attribute with the Capacity Availability
  Index Sub-TLV, the new capacity availability index value is
  applied to all routes that meet the following two conditions:
  a) they have Router-X as their next hop, and b) they are
  associated with the Site-ID carried in the Sub-TLV. When there
  are failures or degradation at a site, the corresponding
  egress router can send one BGP UPDATE that carries the
  Capacity Availability Index for the site, with the egress
  router's loopback address as the prefix.
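
  The two matching conditions above can be sketched as follows.
  The Route record and the list-based routing table are
  hypothetical simplifications for illustration; a real ingress
  router would apply the same test to its own RIB structures.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """Hypothetical, simplified routing table entry."""
    prefix: str
    next_hop: str         # loopback address of the egress router
    site_id: int          # Site-ID learned from an earlier UPDATE
    cap_avail: int = 100  # last known Capacity Availability (0-100)


def apply_site_availability(rib, router_x, site_id, percent):
    """Apply a standalone availability update; return routes changed."""
    if not 0 <= percent <= 100:
        return 0  # out-of-range values are ignored
    updated = 0
    for route in rib:
        # Condition a) next hop is Router-X; b) same Site-ID.
        if route.next_hop == router_x and route.site_id == site_id:
            route.cap_avail = percent
            updated += 1
    return updated
```

  In this sketch, a single update for one Site-ID changes every
  matching route without a per-route BGP UPDATE, which is the
  saving that the standalone update is meant to provide.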

   4.4. Service Delay Prediction Index

  It is desirable for an ingress router to select a site with
  the shortest processing time for an ultra-low latency service.
  But it is not easy to predict which site has "the fastest
  processing time" or "the shortest processing delay" for an
  incoming service request because:

  - The given service instance shares the same physical
     infrastructure with many other applications and service
     instances. Service requests from other applications or UEs,
     or the running behavior of other applications, can impact
     the processing time for the given service instance.

  - The given service instance can be served by a cluster of
     servers behind a Load Balancer. To the network, the service
     is identified by one service ID.
  - The service complexity is different. One service may call
     many microservices, need to access multiple backend
     databases, and need to go through sophisticated security
     scrubbing functions, etc. Another service can be processed
     by a few simple steps. Without knowledge of the application's
     internal logic, it is not easy to estimate the processing
     time for future service requests.

   Even though utilization measurements, like those below, are
   collected by most data centers, they cannot indicate which
   site has the shortest processing time. A service request might
   be processed faster on Site-A even if Site-A is overutilized.
     o Server utilization for the server where the instance is
        instantiated.
     o The network utilization for the links to the server where the
        instance is instantiated.
     o The number of databases that the service instance will access.
     o The memory utilization of the databases

   The remaining available resources at a site are a more
   reasonable indication of the processing delay for future
   service requests:
     o The remaining available server resources.
     o The remaining available network capacity on the links to
        the server where the instance is instantiated.
     o The number of databases that the service instance will access.
     o The remaining storage available for the databases.

   The Service Delay Prediction Index is a value that predicts
   processing delays at the site for future service requests. The
   higher the value, the longer the delay.

4.4.1. Service Delay Prediction Sub-TLV

   While out of scope, we assume there is an algorithm that can
   derive the Service Delay Prediction Index that can be assigned
   to the egress router. When the Service Delay Prediction value
   is updated, which can be triggered by changes in the available
   resources, etc., the egress router can attach the updated
   Service Delay Prediction value in a Sub-TLV under the Metadata
   Path Attribute of the BGP UPDATE message to the ingress
   routers.

      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | ServiceDelayPredict Sub-Type  |               Length          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |         Service Delay Prediction Value                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          Figure 4: Service Delay Prediction Index Sub-TLV

  - ServiceDelayPredict (Service Delay Prediction) Sub-Type = 3
     (specified in this document).

  - The Service Delay Prediction Value is an integer 0-100,
     with 0 indicating that the service delay is negligible and
     100 indicating that the site has the most significant delay
     compared to all other sites for the same service. When the
     value is outside the 0-100 range, the value carried in this
     Sub-TLV is ignored.

4.4.2. Service Delay Prediction Based on Load Measurement

   When the detailed running status of the data centers is not
   exposed to the network operator, historic traffic patterns
   through the egress nodes can be utilized to predict the load
   on a specific service. For example, when the traffic volume to
   one service at one data center suddenly increases by a large
   percentage compared with the average of the past 24 hours, it
   is likely caused by a larger than normal demand for the
   service. When this happens, another data center with
   lower-than-average traffic volume for the same service might
   have a shorter processing time for the same service.

   Here are some measurements that can be utilized to derive the
   Service Delay Prediction for a service ID:

     - Total number of packets to the attached service instance
        (ToPackets);

     - Total number of packets from the attached service
        instance (FromPackets);

     - Total number of Bytes to the attached service instance
        (ToBytes);

     - Total number of bytes from the attached service instance
        (FromBytes);

     - The actual load measurement to the service instance
        attached to a CATS-ER can be based on one of the metrics
        above or on all four metrics with different weights
        applied to each, such as:

           LoadIndex =
           w1*ToPackets+w2*FromPackets+w3*ToBytes+w4*FromBytes

        where 0 <= wi <= 1 and w1+w2+w3+w4 = 1.

     - The weights of each metric contributing to the index of
        the service instance attached to a CATS-ER can be
        configured or learned by self-adjusting based on user
        feedback.

   The Service Delay Prediction Index can be derived from
   LoadIndex/24Hour-Average. A higher value means a longer delay
   prediction. The egress router can use the ServiceDelayPredict
   Sub-TLV to convey the delay prediction derived from the
   traffic pattern to the ingress routers.
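
   The weighted LoadIndex and its ratio to the 24-hour average
   can be sketched as follows. The equal default weights and the
   function names are assumptions for illustration. This document
   does not define how the resulting ratio is mapped onto the
   0-100 range of the Service Delay Prediction Value, so the
   sketch stops at the ratio.

```python
def load_index(to_packets, from_packets, to_bytes, from_bytes,
               weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four counters (LoadIndex)."""
    # Each weight must be within 0..1 and the weights must sum to 1.
    if any(not 0 <= w <= 1 for w in weights):
        raise ValueError("each weight must be between 0 and 1")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w1, w2, w3, w4 = weights
    return (w1 * to_packets + w2 * from_packets
            + w3 * to_bytes + w4 * from_bytes)


def load_ratio(current_load_index, avg_24h_load_index):
    """LoadIndex / 24Hour-Average; above 1.0 means heavier than usual."""
    if avg_24h_load_index <= 0:
        raise ValueError("24-hour average must be positive")
    return current_load_index / avg_24h_load_index
```

   For example, a LoadIndex of 300 against a 24-hour average of
   150 gives a ratio of 2.0, suggesting a longer predicted delay
   than a site whose ratio is below 1.0.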

   Note: The proposed IP layer load measurement is only an
   estimate based on the amount of traffic through the egress
   router, which might not truly reflect the load of the servers
   attached to the egress routers. These measurements are listed
   here only for some special deployments where those metrics are
   helpful to the ingress routers in selecting the optimal paths.

4.4.3. Raw Load Measurement Sub-TLV

   When ingress routers have embedded analytics tools that rely
   on the raw measurements, it is useful for the egress router to
   send the raw measurements.

   Raw Load Measurement Sub-TLV has the following format:

     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Raw-Load-Measurement Sub-Type |               Length          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   Measurement Period                          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           total number of packets to the Edge Service         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           total number of packets from the Edge Service       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           total number of bytes to the Edge Service           |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |           total number of bytes from the Edge Service         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                  Figure 5: Raw Load Measurement Sub-TLV

     Raw-Load-Measurement Sub-Type =4 (specified in this
     document): Raw measurements of packets/bytes to/from the
     Edge Service address.

     The receiver nodes can compute the Service Delay Prediction
     for the Service based on the raw measurements sent from the
     egress node and preconfigured algorithms.

     Measurement Period: BGP Update period in Seconds or user-
     specified period.
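
     The layout in Figure 5 can be sketched as follows, assuming
     that the Length field counts the 20 octets that follow it
     (five 32-bit fields) and that each counter fits in 32 bits.
     The RawLoad record and the function names are illustrative
     only.

```python
import struct
from typing import NamedTuple

RAW_LOAD_SUB_TYPE = 4  # Sub-Type value from Section 4.4.3


class RawLoad(NamedTuple):
    """Hypothetical record mirroring the fields of Figure 5."""
    period_s: int      # Measurement Period in seconds
    to_packets: int
    from_packets: int
    to_bytes: int
    from_bytes: int


def encode_raw_load(m: RawLoad) -> bytes:
    """Pack the Sub-TLV; struct.error is raised if a field overflows."""
    # Assumption: Length = 20, the octets after the Length field.
    return struct.pack("!HH5I", RAW_LOAD_SUB_TYPE, 20, *m)


def decode_raw_load(data: bytes) -> RawLoad:
    """Unpack the Sub-TLV; raise ValueError if it is malformed."""
    if len(data) < 24:
        raise ValueError("Raw Load Measurement Sub-TLV truncated")
    sub_type, length, *fields = struct.unpack("!HH5I", data[:24])
    if sub_type != RAW_LOAD_SUB_TYPE or length != 20:
        raise ValueError("not a Raw Load Measurement Sub-TLV")
    return RawLoad(*fields)
```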

5. Service Metadata Influenced Decision Process

5.1. Integrating Network Delay with the Service Metrics

   As the service metrics and network delays are in different
   units, here is an example algorithm for an ingress router to
   compare the cost to reach the service instances at Site-i and
   Site-j.

                     ServD-i * CP-j              Pref-j * NetD-i
   Cost-i = min(w *(----------------) + (1-w) *(-----------------))
                     ServD-j * CP-i              Pref-i * NetD-j

      CP-i: Capacity Availability Index at Site-i. A higher value
      means higher capacity available.

      NetD-i: Network latency measurement (RTT) to the Egress
      Router at the site-i.

      Pref-i: Preference Index for Site-i, a higher value means
      higher preference.

      ServD-i: Service Delay Prediction Index at Site-i for the
      service, i.e., the ANYCAST address [RFC4786] for the
      service.

      w: Weight is a value between 0 and 1. If smaller than 0.5,
      Network latency and the site Preference have more
      influence; otherwise, Service Delay and capacity
      availability have more influence.
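
   The weighted sum inside the formula can be sketched as follows.
   This is an illustration under stated assumptions, not a
   normative implementation: the names SiteMetrics and
   relative_cost are hypothetical, the outer "min" of the formula
   is not modeled, and zero-valued metrics are not handled (they
   would cause a division by zero here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteMetrics:
    serv_d: float  # ServD: Service Delay Prediction Index
    cp: float      # CP: Capacity Availability Index (higher is better)
    pref: float    # Pref: Preference Index (higher is better)
    net_d: float   # NetD: network latency (RTT) to the egress router


def relative_cost(i: SiteMetrics, j: SiteMetrics, w: float) -> float:
    """Weighted cost of Site-i relative to Site-j, per the formula above."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must be between 0 and 1")
    # Service term: lower delay and higher capacity at Site-i lower the cost.
    service_term = (i.serv_d * j.cp) / (j.serv_d * i.cp)
    # Network term: higher preference and lower RTT at Site-i lower the cost.
    network_term = (j.pref * i.net_d) / (i.pref * j.net_d)
    return w * service_term + (1.0 - w) * network_term


site_i = SiteMetrics(serv_d=10.0, cp=80.0, pref=5.0, net_d=4.0)
site_j = SiteMetrics(serv_d=20.0, cp=80.0, pref=5.0, net_d=4.0)
cost_i = relative_cost(site_i, site_j, w=0.5)  # 0.75
```

   Because every term is a ratio between the two sites, two
   identical sites yield a value of 1, and a value below 1 means
   that Site-i is the cheaper of the two.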

5.2. Integrating with BGP Decision Process

  When an ingress router receives BGP updates for the same IP
  address from multiple egress routers, all those egress routers
  are considered as next hops for the IP address. For the
  selected services configured to be influenced by the Edge
  Service Metadata, the ingress router's BGP Decision Process
  [IDR-CUSTOM-DECISION] triggers the Edge Service Management
  function to compute the weight to be applied to the route's
  next hop in the forwarding plane. The decision is influenced
  by the Edge Service Metadata associated with the client
  routes, such as the Capacity Availability Index, Site
  Preference Index, and Service Delay Prediction Index, in
  addition to the attributes used by the traditional BGP
  multipath computation, such as Weight, Local Preference,
  Origin, and MED, as shown below:

                      BGP ANYCAST Update
      +--------+ with Metadata    +---------------+
      | BGP    |----------------->| EdgeServiceMgn|
      |Decision|< - - - - - - - - |               |
      +---^-|--+                  +-------|-------+
          | | BGP ANYCAST                 | Update Anycast
          | | Route                       | Route Nexthops
          | | Multi-path NH install       | with weight
      +---|-V--+                          |
      |   RIB  |                          |
      +----+---+                          |
           |                              |
       +---V------------------------------V-------+
       |               Forwarding Plane           |
       |                                          |
       +------------------------------------------+
            Figure 6: Metadata Influenced Decision

  When any of those metadata values goes to 0, the effect is the
  same as the routes becoming ineligible via the egress router
  that originated the metadata UPDATE. When a metadata value
  merely degrades, the egress router can still remain the
  optimal next hop, although with a lower probability.

  Suppose the destination address aa08::4450 can be reached by
  three next hops (R1, R2, R3). Further, suppose the local BGP
  Decision Process, based on the traditional network-layer
  policies and metrics, identifies R1 as the optimal next hop
  for this destination (aa08::4450). The Edge Service Metadata
  might instead result in R2 as the optimal next hop for the
  prefix and influence the Forwarding Plane accordingly.
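
  The two paragraphs above can be sketched as follows. The
  function select_next_hop, the cost function example_cost, and
  all metadata values are hypothetical; this document does not
  define how the Edge Service Management function scores a next
  hop. The sketch only shows that a next hop with any metadata
  value of 0 is skipped and that the metadata can override the
  next hop preferred by the traditional decision process.

```python
from typing import Callable, Dict, Optional

Metadata = Dict[str, float]


def select_next_hop(bgp_best: str,
                    metadata: Dict[str, Metadata],
                    cost: Callable[[Metadata], float]) -> Optional[str]:
    """Pick the lowest-cost eligible next hop for an edge service route."""
    # Any metadata value of 0 makes the route via that egress ineligible.
    eligible = {nh: md for nh, md in metadata.items()
                if all(value > 0 for value in md.values())}
    if not eligible:
        return None
    # On equal cost, keep the next hop chosen by the traditional process.
    return min(eligible, key=lambda nh: (cost(eligible[nh]), nh != bgp_best))


def example_cost(md: Metadata) -> float:
    """Hypothetical score: lower delay, higher capacity/preference win."""
    return md["serv_d"] / (md["cp"] * md["pref"])


candidates = {
    "R1": {"cp": 20.0, "pref": 5.0, "serv_d": 40.0},
    "R2": {"cp": 80.0, "pref": 5.0, "serv_d": 10.0},
    "R3": {"cp": 0.0, "pref": 5.0, "serv_d": 10.0},  # ineligible
}
chosen = select_next_hop("R1", candidates, example_cost)  # "R2"
```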

  The way the Edge Service Metadata influences next hop
  selection is different from the metric (or weight) to the
  next hop. The metric to a next hop can impact many (sometimes
  tens of thousands of) routes that have the node as their next
  hop, whereas the Edge Service Metadata only impacts the
  optimal next hop selection for the subset of client routes
  that are identified as edge services.

  When the BGP custom decision [IDR-CUSTOM-DECISION] is used,
  the Edge Service Management function would have an algorithm
  to combine the Edge Service Metadata attributes with the
  custom decision to derive the optimal next hop for the edge
  service routes.

   Note: For a BGP UPDATE message that includes the Edge Service
   Path Attribute with the egress router's loopback prefix, the
   Site Capacity Availability Index value is applied to all the
   NLRIs with the Site-ID indicated in the Edge Service Metadata
   Path Attribute.

6. Service Metadata Propagation Scope

   Service Metadata are only distributed to the relevant ingress
   nodes interested in the Service, which can be configured or
   automatically formed.

   For each registered low-latency Service, BGP RT Constrained
   Distribution [RFC4684] can be used to form the Group
   interested in the Service. The "Service ID," an IP address
   prefix, is the Route Target. When an ingress router receives
   the first packet of a flow destined to a Service ID, the
   ingress router sends a BGP UPDATE that advertises the Route
   Target membership NLRI per RFC4684. The ingress router must
   assign a Timer for the Service ID, as the UE that uses the
   Service ID might move away. Upon receiving a packet destined
   for the Service ID, the ingress router must refresh the Timer.
   The ingress router must send a BGP Withdraw UPDATE for the
   Service ID upon expiration of the Timer.
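
   The timer behavior described above can be sketched as a small
   table keyed by Service ID. This is an illustration only: the
   class name, the "advertise" return value, and the timeout are
   assumptions, since this document does not specify the Timer
   duration. Sending the actual BGP UPDATE messages is left to
   the caller.

```python
from typing import Dict, List, Optional


class ServiceInterestTable:
    """Tracks which Service IDs this ingress router is interested in."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._expiry: Dict[str, float] = {}  # Service ID -> expiry time

    def on_packet(self, service_id: str, now: float) -> Optional[str]:
        """Refresh the Timer; signal an advertisement on the first packet."""
        first_packet = service_id not in self._expiry
        self._expiry[service_id] = now + self.timeout_s
        # Caller advertises the Route Target membership NLRI on "advertise".
        return "advertise" if first_packet else None

    def expire(self, now: float) -> List[str]:
        """Return expired Service IDs; the caller withdraws each of them."""
        expired = [sid for sid, t in self._expiry.items() if t <= now]
        for sid in expired:
            del self._expiry[sid]
        return expired


table = ServiceInterestTable(timeout_s=60.0)
first = table.on_packet("2001:db8::1/128", now=0.0)    # "advertise"
second = table.on_packet("2001:db8::1/128", now=30.0)  # None, Timer refreshed
```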

7. Minimum Interval for Metrics Change Advertisement

   Because metric changes can affect path selection, the
   Minimum Interval for Metrics Change Advertisement is
   configured to limit the update frequency and avoid route
   oscillations. The default is 30 seconds.

   Significant load changes at EC data centers can be triggered
   by short-term gatherings of UEs, such as conventions lasting
   a few hours or days, which are too short to justify adjusting
   EC server capacities among DCs. Therefore, the load metrics
   can be expected to change on a timescale of hours or days.
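
   One way an egress router might enforce the Minimum Interval is
   sketched below. The class and method names are hypothetical,
   and holding back the most recent change until the interval
   elapses is an assumption of this sketch rather than behavior
   defined by this document.

```python
from typing import Dict, Optional

Metrics = Dict[str, int]


class MetricsAdvertiser:
    """Suppresses metric advertisements sent closer than the interval."""

    def __init__(self, min_interval_s: float = 30.0) -> None:
        self.min_interval_s = min_interval_s
        self._last_sent: Optional[Metrics] = None
        self._last_sent_at: Optional[float] = None
        self._pending: Optional[Metrics] = None

    def update(self, metrics: Metrics, now: float) -> Optional[Metrics]:
        """Record a change; return the metrics to advertise now, if any."""
        if metrics == self._last_sent:
            self._pending = None  # back to the advertised value
            return None
        if (self._last_sent_at is not None
                and now - self._last_sent_at < self.min_interval_s):
            self._pending = metrics  # hold until the interval elapses
            return None
        return self._send(metrics, now)

    def tick(self, now: float) -> Optional[Metrics]:
        """Flush a held-back change once the interval has elapsed."""
        if self._pending is None:
            return None
        if now - self._last_sent_at < self.min_interval_s:
            return None
        return self._send(self._pending, now)

    def _send(self, metrics: Metrics, now: float) -> Metrics:
        self._last_sent, self._last_sent_at = metrics, now
        self._pending = None
        return metrics


adv = MetricsAdvertiser()
sent_first = adv.update({"cp": 80}, now=0.0)   # advertised immediately
held_back = adv.update({"cp": 70}, now=10.0)   # None, inside the interval
```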

8. Validation and Error Handling

   The Metadata Path Attribute contains a sequence of Sub-TLVs.
   The Metadata Path Attribute's length determines the total
   number of octets for all the Sub-TLVs under the Metadata Path
   Attribute. The sum of the lengths of all the Sub-TLVs under
   the Metadata Path Attribute should equal the length of the
   Metadata Path Attribute. If this is not the case, the
   attribute should be considered malformed, and the
   "Treat-as-withdraw" procedure of [RFC7606] is applied.

   If a Metadata Path Attribute can be parsed correctly but
   contains a Sub-TLV whose type is not recognized by a
   particular BGP speaker, that BGP speaker MUST NOT consider the
   attribute to be malformed. Rather, it MUST interpret the
   attribute as if that Sub-TLV had not been present. If the
   route carrying the Metadata Path Attribute is propagated with
   the attribute, the unrecognized Sub-TLV remains in the
   attribute.
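
   The two rules above can be sketched as a single pass over the
   attribute value. The Sub-TLV header layout used here (a 1-octet
   Sub-Type followed by a 1-octet Length that counts only the
   value octets) is an assumption made for illustration; the
   actual encoding is given by the Sub-TLV figures in this
   document. The function and exception names are hypothetical.

```python
from typing import List, Tuple

# Sub-Types assigned in the IANA Considerations of this document.
KNOWN_SUB_TYPES = {1, 2, 3, 4}

SubTlv = Tuple[int, bytes]


class MalformedAttribute(Exception):
    """Raised when the caller should apply Treat-as-withdraw [RFC7606]."""


def parse_metadata_attribute(value: bytes) -> Tuple[List[SubTlv], List[SubTlv]]:
    """Split the attribute value into recognized and unrecognized Sub-TLVs."""
    recognized: List[SubTlv] = []
    unrecognized: List[SubTlv] = []
    offset = 0
    while offset < len(value):
        # ASSUMED header: 1-octet Sub-Type, 1-octet Length (value only).
        if len(value) - offset < 2:
            raise MalformedAttribute("truncated Sub-TLV header")
        sub_type, length = value[offset], value[offset + 1]
        end = offset + 2 + length
        if end > len(value):
            # Sub-TLV lengths do not add up to the attribute length.
            raise MalformedAttribute("Sub-TLV overruns the attribute")
        body = value[offset + 2:end]
        if sub_type in KNOWN_SUB_TYPES:
            recognized.append((sub_type, body))
        else:
            # Ignored locally, but kept so it can be propagated unchanged.
            unrecognized.append((sub_type, body))
        offset = end
    return recognized, unrecognized


known, unknown = parse_metadata_attribute(bytes([1, 1, 5, 9, 2, 0xAA, 0xBB]))
```

   Walking the Sub-TLVs until the offset lands exactly on the end
   of the value is equivalent to checking that the Sub-TLV lengths
   sum to the attribute length.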

9. Manageability Considerations

   The Edge Service Metadata described in this document are
   intended only for propagation between the ingress and egress
   routers of a single BGP domain, i.e., the 5G Local Data
   Network (LDN), which is a limited domain with edge services a
   few hops away from the ingress nodes. Only selected services
   used by UEs are considered 5G Edge Services. The 5G LDN is
   usually managed by one operator, even though the routers can
   come from different vendors.

10. Security Considerations

   The proposed Edge Service Metadata are advertised within the
   trusted domain of 5G LDN's ingress and egress routers. There
   are no extra security threats compared with iBGP.

11. IANA Considerations

11.1. Metadata Path Attribute

   IANA is requested to assign a new path attribute from the "BGP
   Path Attributes" registry. The symbolic name of the attribute
   is "Metadata", and the reference is [This Document].

    +=======+======================================+=================+
    | Value |             Description              |    Reference    |
    +=======+======================================+=================+
    |  TBD1 |      Metadata Path Attribute         | [this document] |
    +-------+--------------------------------------+-----------------+

11.2. Metadata Path Attribute Sub-Types

   IANA is requested to create a new sub-registry under the
   Metadata Path Attribute registry as follows:

   Name: Sub-TLVs under the "Metadata Path Attribute"

   Registration Procedure: Expert Review [RFC8126].

     Detailed Expert Review procedure will be added per RFC8126.

   Reference: [this document]

     +==========+==========================+=================+
     | Sub-Type |   Description            | Reference       |
     +==========+==========================+=================+
     |        0 | reserved                 | [this document] |
     +----------+--------------------------+-----------------+
     |        1 | Site Preference Index    | [this document] |
     +----------+--------------------------+-----------------+
     |        2 | Site Availability Index  | [this document] |
     +----------+--------------------------+-----------------+
     |        3 | Service Delay Prediction | [this document] |
     +----------+--------------------------+-----------------+
     |        4 | Raw Load Measurement     | [this document] |
     +----------+--------------------------+-----------------+
     |    5-254 | unassigned               | [this document] |
     +----------+--------------------------+-----------------+
     |      255 | reserved                 | [this document] |
     +----------+--------------------------+-----------------+

12. References

12.1. Normative References

   [RFC4271] Y. Rekhter and S. Hares, "A Border Gateway Protocol
             4", RFC4271, Jan. 2006.

   [RFC4360] S. Sangli, et al, "BGP Extended Communities
             Attribute", RFC4360, Feb. 2006.

   [RFC4760] T. Bates, et al, "Multiprotocol Extensions for BGP-
             4", RFC4760, Jan. 2007.

   [RFC4786] J. Abley and K. Lindqvist, "Operation of Anycast
             Services", RFC4786, Dec. 2006.

   [RFC7606] E. Chen, et al, "Revised Error Handling for BGP
             UPDATE Messages", RFC7606, August 2015.

   [RFC8126] M. Cotton, et al., "Guidelines for Writing an IANA
             Considerations Section in RFCs", RFC8126, June 2017.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in
             RFC 2119 Key Words", BCP 14, RFC 8174, DOI
             10.17487/RFC8174, May 2017, <https://www.rfc-
             editor.org/info/rfc8174>.

   [RFC9012] K. Patel, et al, "The BGP Tunnel Encapsulation
             Attribute", RFC9012, April 2021.

12.2. Informative References

   [IANA-BGP-PARAMS] IANA, "BGP Path Attributes",
             https://www.iana.org/assignments/bgp-parameters/

   [RFC2042] B. Manning, "Registering New BGP Attribute Types",
             RFC2042, Jan. 1997.

   [RFC4684] P. Marques, et al, "Constrained Route Distribution
             for Border Gateway Protocol/MultiProtocol Label
             Switching (BGP/MPLS) Internet Protocol (IP) Virtual
             Private Networks (VPNs)", RFC4684, Nov 2006.

   [RFC8799] B. Carpenter and B. Liu, "Limited Domains and
             Internet Protocols", RFC8799, July 2020.

   [3GPP TS 23.501]  3rd Generation Partnership Project;
             Technical Specification Group Services and System
             Aspects; System architecture for the 5G System (5GS)

   [5G-Edge-Service] L. Dunbar, K. Majumdar, H. Wang, and G.
             Mishra, "5G Edge Service use Cases", draft-dunbar-
             cats-edge-service-metrics-01, work-in-progress, July
             2023.

   [IDR-CUSTOM-DECISION] A. Retana, R. White, "BGP Custom
             Decision Process", draft-ietf-idr-custom-decision-
             08, Feb 2017.

13. Appendix A

13.1. Example of Flow Affinity

   Here is one example to illustrate how Flow Affinity can be
   achieved. This illustration is an informational example.

   For the registered EC services, the ingress node keeps a table
   of

   -  Service ID (i.e., IP address)
   -  Flow-ID
   -  Sticky Egress ID (egress router loopback address)
   -  A timer

   The Flow-ID in this table is to identify a flow, initialized
   to NULL. How Flow-ID is constructed is out of the scope for
   this document. Here is one example of constructing the Flow-
   ID:

   -  For IPv6, the Flow-ID can be the Flow Label extracted from
      the IPv6 packet header, with or without the source address.

   -  For IPv4, the Flow-ID can be the combination of the Source
      Address with or without the TCP/UDP Port number.

   The Sticky Egress ID is the egress node address for the same
   flow.

   The Timer is always refreshed when a packet with the matching
   EC Service ID (IP address) is received by the node.

   If there is no Sticky Egress ID present in the table for the
   EC Service ID, the forwarding plane can select a NextHop
   influenced by the Cost Compute Engine. The forwarding plane
   encapsulates the packet with a path to the chosen NextHop. The
   chosen NextHop and the Flow-ID are recorded in the EC Service
   table entry.

   When the selected optimal NextHop (egress router) is no longer
   reachable, the ingress router needs to select another path.
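
   The table and the lookup steps above can be sketched as
   follows. Like the rest of this appendix, the sketch is
   informational. The class names and the timeout are
   hypothetical, and the pick_next_hop callable stands in for the
   Cost Compute Engine, whose algorithm is not defined here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple


@dataclass
class FlowEntry:
    sticky_egress: str   # egress router loopback address
    expires_at: float    # the Timer, stored as an absolute expiry time


class FlowAffinityTable:
    """Keeps packets of one flow on the same egress router."""

    def __init__(self, timeout_s: float,
                 pick_next_hop: Callable[[str, Set[str]], str]) -> None:
        self.timeout_s = timeout_s
        self.pick_next_hop = pick_next_hop  # stand-in for Cost Compute Engine
        self._entries: Dict[Tuple[str, str], FlowEntry] = {}

    def forward(self, service_id: str, flow_id: str,
                reachable: Set[str], now: float) -> str:
        """Return the egress router to use for this packet."""
        key = (service_id, flow_id)
        entry = self._entries.get(key)
        if (entry is not None and entry.expires_at > now
                and entry.sticky_egress in reachable):
            egress = entry.sticky_egress  # keep the flow where it is
        else:
            # No entry, expired Timer, or unreachable egress: choose anew.
            egress = self.pick_next_hop(service_id, reachable)
        # Record the choice and refresh the Timer on every matching packet.
        self._entries[key] = FlowEntry(egress, now + self.timeout_s)
        return egress


def lowest_name(service_id: str, reachable: Set[str]) -> str:
    """Toy selection policy used only for this example."""
    return sorted(reachable)[0]


flows = FlowAffinityTable(timeout_s=60.0, pick_next_hop=lowest_name)
first_hop = flows.forward("svc", "f1", {"R2", "R3"}, now=0.0)         # "R2"
sticky_hop = flows.forward("svc", "f1", {"R1", "R2", "R3"}, now=10.0)  # "R2"
```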

14. Acknowledgments

   Acknowledgements to Adrian Farrel, Alvaro Retana, Robert
   Raszuk, Sue Hares, Shunwan Zhuang, Donald Eastlake, Dhruv
   Dhody, Cheng Li, and Vincent Shi for their suggestions and
   contributions.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Futurewei
   Email: ldunbar@futurewei.com

   Kausik Majumdar
   Microsoft
   Email: kmajumdar@microsoft.com

   Haibo Wang
   Huawei
   Email: rainsword.wang@huawei.com

   Gyan Mishra
   Verizon
   Email: gyan.s.mishra@verizon.com

   Zongpeng Du
   China Mobile
   Email: duzongpeng@foxmail.com

Contributors' Addresses
   Cheng Li
   Huawei
   Email: c.l@huawei.com
