Network Working Group L. Dunbar
Internet Draft Futurewei
Intended status: Standard K. Majumdar
Expires: January 31, 2024 Microsoft
H. Wang
Huawei
G. Mishra
Verizon
Z. Du
China Mobile
July 31, 2023
BGP Extension for 5G Edge Service Metadata
draft-ietf-idr-5g-edge-service-metadata-06
Abstract
This draft describes a new Metadata Path Attribute and some
sub-TLVs for egress routers to advertise the Metadata about
the attached edge services (ES). The Edge Service Metadata can
be used by the ingress routers in the 5G Local Data Network to
make path selections not only based on the routing cost but
also the running environment of the edge services. The goal is
to improve latency and performance for 5G edge services.
The extension enables an edge service at one specific location
to be more preferred than the others with the same IP address
(ANYCAST) to receive data flow from a specific source, like a
specific User Equipment (UE).
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be
modified, and derivative works of it may not be created,
except to publish it as an RFC and to translate it into
languages other than English.
Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.
Dunbar, et al. Expires January 31, 2024 [Page 1]
Internet-Draft BGP extension for 5G Edge Services
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed
at http://www.ietf.org/shadow.html
This Internet-Draft will expire on January 31, 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as
the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date
of publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described
in Section 4.e of the Trust Legal Provisions and are provided
without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction.............................................. 3
2. Conventions used in this document......................... 4
3. BGP Protocol Extension for Edge Service Metadata.......... 5
3.1. Ingress Node BGP Path Selection Behavior............. 6
3.1.1. Edge Service Metadata Influenced BGP Path
Selection.............................................. 6
3.1.2. Ingress Router Forwarding Behavior.............. 6
3.1.3. Forwarding Behavior when UEs moving to new 5G
Sites.................................................. 6
4. Edge Service Metadata Encoding............................ 7
4.1. Metadata Path Attribute.............................. 7
4.2. The Site Preference Index Sub-TLV format............. 8
4.3. Capacity Availability Index Metadata................. 8
4.3.1. Site Index Associated to Routes................ 10
4.3.2. BGP UPDATE with standalone Site Availability
Index................................................. 10
4.4. Service Delay Prediction Index...................... 10
4.4.1. Service Delay Prediction Sub-TLV............... 12
4.4.2. Service Delay Prediction Based on Load
Measurement........................................... 12
4.4.3. Raw Load Measurement Sub-TLV................... 14
5. Service Metadata Influenced Decision Process............. 14
5.1. Integrating Network Delay with the Service Metrics.. 14
5.2. Integrating with BGP decision process............... 15
6. Service Metadata Propagation Scope....................... 17
7. Minimum Interval for Metrics Change Advertisement........ 17
8. Manageability Considerations............................. 18
9. Security Considerations.................................. 18
10. IANA Considerations..................................... 18
10.1. Metadata Path Attribute............................ 18
10.2. Metadata Path Attribute Sub-Types.................. 18
11. References.............................................. 19
11.1. Normative References............................... 19
11.2. Informative References............................. 19
12. Appendix A.............................................. 20
12.1. Example of Flow Affinity........................... 20
13. Acknowledgments......................................... 21
1. Introduction
[5G-Edge-Service] describes the 5G Edge Computing background
and how BGP can be used to advertise the running status and
environment of the directly attached 5G edge services. Besides
the Radio Access, 5G is characterized by having edge services
closer to the Cell Towers reachable by Local Data Networks
(LDN) [3GPP TS 23.501]. From an IP network perspective, the 5G
LDN is a limited domain with edge services a few hops away
from the ingress nodes. Only selected UE services are
considered 5G low-latency Edge Services.
This document describes a new Metadata Path Attribute for
egress routers to advertise the Metadata about the directly
attached edge services. The Edge Service Metadata in this
document includes the site availability index, the site
preference, and the service delay prediction index, which are
further explained in Section 4.
Note: The proposed Edge Service Metadata are not intended for
the best-effort services reachable via the public internet.
The Edge Service Metadata can be used by the ingress routers
to make path selections for selective low latency services
based on not only the network distance but also the running
environment of the edge cloud sites. The goal is to improve
latency and performance for 5G ultra-low latency services.
The extension is targeted for a single domain with RR
controlling the propagation of the BGP UPDATE. The Edge
Service Metadata is only attached to the services (routes)
hosted in the 5G edge cloud sites, which are only a small
subset of services initiated from UEs. For example, it is not
attached to routes for UEs accessing general internet sites.
2. Conventions used in this document
Application Server: An application server is a physical or
virtual server that hosts the software system for
the application.
Application Server Location: represents a cluster of servers
at one location serving the same application. One
application may have a Layer 7 Load balancer,
whose address(es) are reachable from an external
IP network, in front of a set of application
servers. From an IP network perspective, this
whole group of servers is considered as the
Application Server at the location.
Edge Application Server: is used interchangeably with
Application Server throughout this document.
Edge Hosting Environment: An environment providing the support
required for Edge Application Server's execution.
NOTE: The above terminologies are the same as
those used in 3GPP TR 23.758
Edge DC: Edge Data Center, which provides the Hosting
Environment for the edge services. An Edge DC
might host 5G core functions in addition to the
frequently used application servers.
gNB: next generation Node B
RTT: Round-trip Time
PSA: PDU Session Anchor (UPF)
SSC: Session and Service Continuity
UE: User Equipment
UPF: User Plane Function
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in BCP 14 [RFC8174] when, and only
when, they appear in all capitals, as shown here.
3. BGP Protocol Extension for Edge Service Metadata
The goal of this Edge Service Metadata Path Attribute BGP
extension is for egress routers to propagate the metrics
about their running environment to ingress routers. Here are
some examples of the metrics propagated by the egress
routers:
- the site capacity availability index,
- the site preference index, and
- the service delay prediction index for the attached edge
services.
This section specifies how these three types of Metadata
impact the ingress nodes' path selections.
3.1. Ingress Node BGP Path Selection Behavior
3.1.1. Edge Service Metadata Influenced BGP Path Selection
When an ingress router receives BGP updates for the same IP
prefix from multiple egress routers, all these egress routers'
loopback addresses are considered as the next hops for the IP
prefix. For the selected low latency edge services, the
ingress router's BGP engine would call an Edge Service
Management function that can select paths based on the Edge
Service Metadata received. [5G-Edge-Service] has an exemplary
algorithm to compute the weighted path cost based on the Edge
Service Metadata carried by the sub-TLVs specified in this
document.
Section 5 has the detailed description of the Edge Service
Metadata influenced optimal path selection.
3.1.2. Ingress Router Forwarding Behavior
When the ingress router receives a packet and does a lookup on
the route in the FIB, it gets the destination prefix's whole
path. It encapsulates the packet destined towards the optimal
egress node.
For subsequent packets belonging to the same flow, the ingress
router needs to forward them to the same egress router unless
the selected egress router is no longer reachable. Keeping
packets from one flow to the same egress router, a.k.a. Flow
Affinity, is supported by many commercial routers. Most
registered EC services have relatively short flows.
How Flow Affinity is implemented is out of the scope for this
document. Appendix A has one example illustrating how flow
affinity can be achieved.
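As an illustration only (the mechanism itself remains out of scope, and this is not the Appendix A example), the forwarding rule above can be sketched as a flow table keyed by the 5-tuple. The names FlowTable, select_egress, and is_reachable are hypothetical and not defined by this document.

```python
from typing import Callable, Dict, Tuple

# Hypothetical flow key: (src address, dst address, protocol, src port, dst port).
FlowKey = Tuple[str, str, int, int, int]


class FlowTable:
    """Pins each flow to the egress router chosen for its first packet."""

    def __init__(self,
                 select_egress: Callable[[], str],
                 is_reachable: Callable[[str], bool]) -> None:
        # Both callbacks are assumptions standing in for the router's
        # path selection and reachability tracking.
        self._select_egress = select_egress
        self._is_reachable = is_reachable
        self._flows: Dict[FlowKey, str] = {}

    def egress_for(self, flow: FlowKey) -> str:
        egress = self._flows.get(flow)
        # Re-select only for a new flow or when the pinned egress is gone.
        if egress is None or not self._is_reachable(egress):
            egress = self._select_egress()
            self._flows[flow] = egress
        return egress
```

A real implementation would also age out idle flows; that detail is omitted here.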
3.1.3. Forwarding Behavior when UEs moving to new 5G Sites
When a UE moves to a new 5G gNB which is anchored to the same
UPF, the packets from the UE traverse to the same ingress
router. Path selection and forwarding behavior are the same as
before.
If the UE maintains the same IP address when anchored to a new
UPF, the directly connected ingress router might use the
information passed from a neighboring router to derive the
optimal Next Hop for this route. The detailed algorithm is out
of the scope of this document.
4. Edge Service Metadata Encoding
4.1. Metadata Path Attribute
The Metadata Path Attribute is an optional non-transitive BGP
Path attribute to carry metrics and metadata about the edge
services attached to the egress router. The Metadata Path
Attribute (type TBD1) consists of a set of sub-TLVs, and each
sub-TLV contains information for a specific metric of the
edge services.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Attr. Flags  |MetadataPathAtt|       Length (2 Octets)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|              Value (multiple Metadata sub-TLVs)               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1: Metadata Path Attribute
Attr. Flags are defined as:
o The high-order bit (bit 0): set to 1 to indicate that the
attribute is optional.
o The second high-order bit (bit 1): set to 0 to indicate
that the Metadata Path Attribute is non-transitive, i.e.,
only intended for the receiving router.
o The third high-order bit (bit 2): same as specified by
RFC4271.
o The fourth high-order bit (bit 3): set to 1 to indicate
there are two octets for the Length field.
MetadataPathAtt: Metadata Path Attribute: TBD1(assigned by
IANA.)
Length (2 octets): the total number of octets of the value
field.
Value (variable): comprised of multiple sub-TLVs.
The Metadata sub-TLVs specified by this document include the
following: the Capacity Availability Index Value, the Site
Preference Index Value, the Service Delay Prediction Index,
and the Load Measurement.
All values in the sub-TLVs are unsigned 32-bit integers.
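The attribute header described above can be sketched as follows. This is a minimal illustration, not a normative encoder: the type code 255 is a hypothetical placeholder because TBD1 has not been assigned, and the function name is an assumption.

```python
import struct

# Attribute flag bits as described above (bit 0 is the high-order bit).
FLAG_OPTIONAL = 0x80         # bit 0: set to 1
FLAG_TRANSITIVE = 0x40       # bit 1: left at 0 (non-transitive)
FLAG_EXTENDED_LENGTH = 0x10  # bit 3: set to 1, Length is two octets

# Hypothetical placeholder; the real type code (TBD1) is assigned by IANA.
METADATA_ATTR_TYPE_PLACEHOLDER = 255


def encode_metadata_attribute(
        sub_tlvs: bytes,
        type_code: int = METADATA_ATTR_TYPE_PLACEHOLDER) -> bytes:
    """Prefix already-encoded sub-TLVs with flags, type and 2-octet length."""
    if len(sub_tlvs) > 0xFFFF:
        raise ValueError("value does not fit in a two-octet Length field")
    flags = FLAG_OPTIONAL | FLAG_EXTENDED_LENGTH  # 0x90
    return struct.pack("!BBH", flags, type_code, len(sub_tlvs)) + sub_tlvs
```

The Partial bit (bit 2) is left at 0 here, which is what an originating router would send.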
4.2. The Site Preference Index Sub-TLV format
Different services might have different preference index
values configured for the same site. For example, Service-A
requires high computing power, Service-B requires high
bandwidth among its microservices, and Service-C requires high
volume storage capacity. For a DC with relatively low storage
capacity but high bisectional bandwidth, its preference index
value is higher for Service-B and lower for Service-C. The
Site Preference Index can also be used to achieve stickiness
for some services.
It is out of the scope of this document how the preference
index is determined or configured.
The Preference Index sub-TLV has the following format:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Site-Preference Sub-Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Preference Index value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: Preference Index sub-TLV
- Site-Preference Sub-Type =1 (specified in this document).
- Preference Index value: 1-100, with 1 being the least
preferred, and 100 being the most preferred.
When the Preference Index value is outside the range of
1-100, the value carried in this sub-TLV is ignored.
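A sketch of encoding and validating this sub-TLV is given below. It assumes a 2-octet Sub-Type and a 2-octet Length that counts only the value octets; the figure does not state the field widths, so both are assumptions, as are the function names.

```python
import struct
from typing import Optional

SITE_PREFERENCE_SUB_TYPE = 1  # Sub-Type value from the text above


def encode_site_preference(preference: int) -> bytes:
    """Pack Sub-Type, Length and the 32-bit Preference Index value."""
    if not 1 <= preference <= 100:
        raise ValueError("Preference Index must be in 1-100")
    # Assumption: 2-octet Sub-Type, 2-octet Length (value octets only).
    return struct.pack("!HHI", SITE_PREFERENCE_SUB_TYPE, 4, preference)


def decode_site_preference(data: bytes) -> Optional[int]:
    """Return the preference, or None when the value must be ignored."""
    sub_type, length, value = struct.unpack("!HHI", data[:8])
    if sub_type != SITE_PREFERENCE_SUB_TYPE or length != 4:
        return None
    # Values outside 1-100 are ignored, as stated above.
    return value if 1 <= value <= 100 else None
```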
4.3. Capacity Availability Index Metadata
Capacity Availability Index indicates if an edge site, which
can be a building, a floor, a pod, a row of server racks,
etc., has full capacity, reduced capacity, or is completely
out of service. Therefore, the value is 0-100, with 100%
indicating the site is fully functional, 0% indicating the
site is entirely out of service, and 50% indicating the site
is 50% degraded.
Cloud Site/Pod failures and degradation include but are not
limited to, a site capacity degradation or an entire site
going down caused by a variety of reasons, such as fiber cut
connecting to the site or among pods, cooling failures,
insufficient backup power, cyber attacks, too many
changes outside of the maintenance window, etc. Fiber cuts are
not uncommon within a Cloud site or between sites.
When those failure events happen, the edge (egress) router
may still be running normally. Therefore, the ingress routers
with paths to the egress router can't use BFD to detect the
failures.
When there is a failure occurring at an edge site (or a pod),
many instances can be impacted. In addition, the routes (i.e.,
the IP addresses) in the site might not be aggregated nicely.
Instead of many BGP UPDATE messages to the ingress routers for
all the instances impacted, the egress router can send one
single BGP UPDATE indicating the capacity availability of the
site. The ingress routers can switch all or a portion of the
instances that are associated with the site depending on how
much the site is degraded.
The Capacity Availability Index sub-TLV:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CapAvailIdx Sub-Type | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Site-ID (2 octets) | Site Availability Percentage |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Capacity Availability Index Sub-TLV
- CapAvailIdx Sub-Type = 2 (Specified in this document).
- Site ID: identifier for a site, which can be a pod, a row of
server racks, a floor, or an entire DC. There could be
multiple sites connected to the egress router (a.k.a. Edge DC
GW)
- Site Availability Percentage: represents the percentage of
the site capacity that is available, e.g., 100%, 50%, or 0%.
When a site goes dark, the Index is set to 0. A value of 50
means 50% of the capacity is functioning. When the value is
outside the 0-100 range, the value carried in this sub-TLV is
ignored.
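The Capacity Availability Index sub-TLV can be sketched in the same way. The Site-ID width (2 octets) comes from the figure; the widths of the Sub-Type, Reserved, and percentage fields are assumed to be 2 octets each, and the function names are illustrative.

```python
import struct
from typing import Optional, Tuple

CAP_AVAIL_SUB_TYPE = 2  # Sub-Type value from the text above


def encode_capacity_availability(site_id: int, percent: int) -> bytes:
    """Pack Sub-Type, Reserved, Site-ID and Site Availability Percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("availability must be in 0-100")
    # Assumption: four 2-octet fields; Reserved is sent as zero.
    return struct.pack("!HHHH", CAP_AVAIL_SUB_TYPE, 0, site_id, percent)


def decode_capacity_availability(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (site_id, percent), or None when the value must be ignored."""
    sub_type, _reserved, site_id, percent = struct.unpack("!HHHH", data[:8])
    if sub_type != CAP_AVAIL_SUB_TYPE or percent > 100:
        return None
    return site_id, percent
```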
4.3.1. Site Index Associated to Routes
An egress router must append the Site Capacity Availability
Index sub-TLV with a BGP ROUTE UPDATE message for the
registered low latency edge services so that the ingress
routers can associate the Site reference Identifier to the
route in the Routing table.
However, it is unnecessary to include the Site Capacity
Availability Index for every BGP Update message if there is no
change to the site-reference identifier or the Capacity
Availability value for the service instances.
4.3.2. BGP UPDATE with standalone Site Availability Index
When an ingress router receives a BGP UPDATE message from
Router-X with a prefix of the loopback for Router-X and the
Metadata Path Attribute with the Capacity Availability Index
sub-TLV, the new capacity availability index value is applied
to all routes that meet the following two constraints: a) have
Router-X as their next hop, and b) are associated with the
Site-ID. When there are failures or degradation at a site, the
corresponding egress router can send one BGP UPDATE carrying
the Capacity Availability Index with the egress router's
loopback address.
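The two constraints above amount to a simple filter over the ingress router's routes. The following sketch assumes a hypothetical in-memory Route record; a real RIB would be organized differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    """Hypothetical route record kept by the ingress router."""
    prefix: str
    next_hop: str          # loopback address of the egress router
    site_id: int           # Site-ID learned from an earlier UPDATE
    capacity_pct: int = 100


def apply_site_availability(routes: List[Route], egress_loopback: str,
                            site_id: int, capacity_pct: int) -> int:
    """Apply a standalone availability update; return routes touched."""
    updated = 0
    for route in routes:
        # Constraint a) same next hop, constraint b) same Site-ID.
        if route.next_hop == egress_loopback and route.site_id == site_id:
            route.capacity_pct = capacity_pct
            updated += 1
    return updated
```

A single UPDATE therefore adjusts every matching route without one message per service instance.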
4.4. Service Delay Prediction Index
It is desirable for an ingress router to select a site with
the shortest processing time for an ultra-low latency service.
But it is not easy to predict which site has "the fastest
processing time" or "the shortest processing delay" for an
incoming service request because:
- The given service instance shares the same physical
infrastructure with many other applications & service
instances. Service requests by other applications, UEs, or
applications running behavior can impact the processing time
for the given service instance.
- The given service instance can be served by a cluster of
servers behind a Load Balancer. To the network, the service
is identified by one service ID.
- The service complexity is different. One service may call
many microservices, need to access multiple backend
databases, and need to go through sophisticated security
scrubbing functions, etc. Another service can be processed
by a few simple steps. Without the application internal
logic, it is not easy to estimate the processing time for
future service requests.
Even though utilization measurements, like those below, are
collected by most data centers, they cannot indicate which
site has the shortest processing time. A service request might
be processed faster on Site-A even if Site-A is overutilized.
o Server utilization for the server where the instance is
instantiated.
o The network utilization for the links to the server where the
instance is instantiated.
o The number of databases that the service instance will access.
o The memory utilization of the databases
The remaining available resource at a site is a more
reasonable indication of process delay for future service
requests.
o The remaining available Server resources.
o The remaining available network utilization for the links to
the server where the instance is instantiated.
o The number of databases that the service instance will access.
o The remaining storage available for the databases.
The Service Delay Prediction Index is a value that predicts
processing delays at the site for future service requests. The
higher the value, the longer the delay.
4.4.1. Service Delay Prediction Sub-TLV
How the Service Delay Prediction Index is derived is out of
the scope of this document; it is assumed that an algorithm
derives the index and assigns it to the egress router. When
the Service Delay Prediction value is updated, e.g., because
the available resources change, the egress router can attach
the updated Service Delay Prediction value in a sub-TLV under
the Metadata Path Attribute of the BGP UPDATE message sent to
the ingress routers.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ServiceDelayPredict Sub-Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Service Delay Prediction Value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Service Delay Prediction Index Sub-TLV
- ServiceDelayPredict (Service Delay Prediction) Sub-Type = 3
(specified in this document).
- The Service Delay Prediction Value is an integer 0-100,
with 0 indicating that the service delay is negligible and
100 indicating that the site has the most significant delay
compared to all other sites for the same service. When the
value is outside the 0-100 range, the value carried in this
sub-TLV is ignored.
4.4.2. Service Delay Prediction Based on Load Measurement
When data centers' detailed running status is not exposed to
the network operator, historic traffic patterns through the
egress nodes can be utilized to predict the load on a specific
service. For example, when the traffic volume to one service
at one data center suddenly increases by a large percentage
compared with the average of the past 24 hours, it is likely
caused by a larger-than-normal demand for the service. When
this happens,
another data center with lower-than-average traffic volume for
the same service might have a shorter processing time for the
same service.
Here are some measurements that can be utilized to derive the
Service Delay Prediction for a service ID:
- Total number of packets to the attached service instance
(ToPackets);
- Total number of packets from the attached service
instance (FromPackets);
- Total number of Bytes to the attached service instance
(ToBytes);
- Total number of bytes from the attached service instance
(FromBytes);
The actual load measurement to the service instance attached
to a CATS-ER can be based on one of the metrics above or on
all four metrics with a different weight applied to each,
such as:
LoadIndex =
w1*ToPackets+w2*FromPackets+w3*ToBytes+w4*FromBytes
where 0<= wi <=1 and w1+ w2+ w3+ w4 = 1.
The weight of each metric contributing to the index of the
service instance attached to a CATS-ER can be configured or
learned by self-adjusting based on user feedback.
The Service Delay Prediction Index can be derived from
LoadIndex/24Hour-Average. A higher value means a longer delay
prediction. The egress router can use the ServiceDelayPredict
sub-TLV to indicate the delay prediction derived from the
traffic pattern to the ingress routers.
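The weighted LoadIndex and the LoadIndex/24Hour-Average ratio can be sketched as below. The function names are illustrative, and the sketch stops at the ratio: how that ratio is mapped onto the 0-100 Service Delay Prediction Value is not specified in this document.

```python
from typing import Sequence


def load_index(metrics: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum over (ToPackets, FromPackets, ToBytes, FromBytes)."""
    if len(metrics) != 4 or len(weights) != 4:
        raise ValueError("expected four metrics and four weights")
    # Each weight must be in [0, 1] and the weights must sum to 1.
    if any(not 0 <= w <= 1 for w in weights):
        raise ValueError("each weight must be between 0 and 1")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * m for w, m in zip(weights, metrics))


def delay_prediction_ratio(current_load: float, average_24h: float) -> float:
    """LoadIndex divided by its 24-hour average; higher means more delay."""
    if average_24h <= 0:
        raise ValueError("24-hour average must be positive")
    return current_load / average_24h
```

For example, with equal weights of 0.25 and metrics (100, 200, 300, 400), the LoadIndex is 250; against a 24-hour average of 125 the ratio is 2.0, i.e., twice the usual load.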
Note: The proposed IP layer load measurement is only an
estimate based on the amount of traffic through the egress
router, which might not truly reflect the load of the servers
attached to the egress routers. They are listed here only for
some special deployments where those metrics are helpful to
the ingress routers in selecting the optimal paths.
4.4.3. Raw Load Measurement Sub-TLV
When ingress routers have an embedded analytics tool that
relies on the raw measurements, it is useful for the egress
router to send the raw measurements.
Raw Load Measurement sub-TLV has the following format:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Raw-Measurement Sub-Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Measurement Period |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| total number of packets to the Edge Service |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| total number of packets from the Edge Service |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| total number of bytes to the Edge Service |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| total number of bytes from the Edge Service |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5: Raw Load Measurement Sub-TLV
Raw-Measurement Sub-Type =4 (specified in this document):
Raw measurements of packets/bytes to/from the Edge Service
address.
The receiver nodes can derive the Service Delay Prediction
for the Service based on the raw measurements sent from the
egress node.
Measurement Period: BGP Update period in seconds or a user-
specified period.
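The layout in Figure 5 can be sketched as a fixed-size record. The five measurement fields are 32-bit unsigned integers as shown; the 2-octet Sub-Type and 2-octet Length (counting only the 20 value octets) are assumptions, as are the names used.

```python
import struct
from typing import NamedTuple

RAW_MEASUREMENT_SUB_TYPE = 4  # Sub-Type value from the text above

# Assumed layout: 2-octet Sub-Type, 2-octet Length, five 32-bit values.
_RAW_FORMAT = "!HH5I"
_VALUE_LENGTH = 20


class RawLoad(NamedTuple):
    period_s: int
    to_packets: int
    from_packets: int
    to_bytes: int
    from_bytes: int


def encode_raw_load(measurement: RawLoad) -> bytes:
    return struct.pack(_RAW_FORMAT, RAW_MEASUREMENT_SUB_TYPE,
                       _VALUE_LENGTH, *measurement)


def decode_raw_load(data: bytes) -> RawLoad:
    size = struct.calcsize(_RAW_FORMAT)
    sub_type, length, *fields = struct.unpack(_RAW_FORMAT, data[:size])
    if sub_type != RAW_MEASUREMENT_SUB_TYPE or length != _VALUE_LENGTH:
        raise ValueError("not a Raw Load Measurement sub-TLV")
    return RawLoad(*fields)
```

Note that 32-bit counters wrap quickly on busy links; how wraparound is handled is left open here.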
5. Service Metadata Influenced Decision Process
5.1. Integrating Network Delay with the Service Metrics
As the service metrics and network delays are in different
units, here is an exemplary algorithm for an ingress router to
compare the cost to reach the service instances at Site-i or
Site-j.
                     ServD-i * CP-j              Pref-j * NetD-i
   Cost-i = min(w *(----------------) + (1-w) *(------------------))
                     ServD-j * CP-i              Pref-i * NetD-j
CP-i: Capacity Availability Index at Site-i. A higher value
means higher capacity available.
NetD-i: Network latency measurement (RTT) to the Egress
Router at the site-i.
Pref-i: Preference Index for Site-i, a higher value means
higher preference.
ServD-i: Service Delay Prediction Index at Site-i for the
service (i.e., the ANYCAST address for the service).
w: Weight is a value between 0 and 1. If smaller than 0.5,
Network latency and the site Preference have more
influence; otherwise, Service Delay and capacity
availability have more influence.
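The expression inside min() can be sketched for one pair of sites as follows. This is only an illustration of the exemplary algorithm: the SiteMetrics record and function name are assumptions, all inputs are assumed to be positive (a zero ServD or CP would need separate handling), and the set of sites over which the outer minimum is taken is not specified above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteMetrics:
    """Hypothetical per-site inputs, named after the symbols above."""
    serv_d: float  # ServD: Service Delay Prediction Index
    cp: float      # CP: Capacity Availability Index
    pref: float    # Pref: Preference Index
    net_d: float   # NetD: network latency (RTT) to the egress router


def relative_cost(i: SiteMetrics, j: SiteMetrics, w: float) -> float:
    """Cost of Site-i relative to Site-j; below 1 favors Site-i."""
    if not 0 <= w <= 1:
        raise ValueError("w must be between 0 and 1")
    # Assumes all metric values are positive so no division by zero occurs.
    service_term = (i.serv_d * j.cp) / (j.serv_d * i.cp)
    network_term = (j.pref * i.net_d) / (i.pref * j.net_d)
    return w * service_term + (1 - w) * network_term
```

For example, with w = 0.5, a site with ServD 20 and CP 100 compared against a site with ServD 40 and CP 50 (equal Pref and NetD) yields 0.5 * 0.25 + 0.5 * 1 = 0.625, so the first site is preferred.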
5.2. Integrating with BGP decision process
When an ingress router receives BGP updates for the same IP
address from multiple egress routers, all those egress routers
are considered as the next hops for the IP address. For the
selected services configured to be influenced by the Edge
Service Metadata, the ingress router's BGP Decision process
[IDR-CUSTOM-DECISION] would trigger the Edge Service
Management function to compute the weight to be applied to the
route's next hop in the forwarding plane. The decision process
is influenced by the Edge Service Metadata associated with the
client routes, such as Capacity Availability Index, Site
Preference, and Service Delay Prediction Index, in addition to
the traditional BGP multipath computation algorithm, such as
the Weight, Local preference, Origin, MED, etc., shown below:
BGP ANYCAST Update
+--------+ with Metadata +---------------+
| BGP |----------------->| EdgeServiceMgn|
|Decision|< - - - - - - - - | |
+---^-|--+ +-------|-------+
| | BGP ANYCAST | Update Anycast
| | Route | Route Nexthops
| | Multi-path NH install | with weight
+---|-V--+ |
| RIB | |
+----+---+ |
| |
+---V------------------------------V-------+
| Forwarding Plane |
| |
+------------------------------------------+
Figure 6: Metadata Influenced Decision
When any of those metadata values goes to 0, the effect is the
same as the routes via the egress router that originated the
metadata UPDATE becoming ineligible. But when a metadata value
merely degrades, it is still possible, though less likely, for
the egress router to remain the optimal next hop.
Suppose a destination address for aa08::4450 can be reached by
three next hops (R1, R2, R3). Further, suppose the local BGP's
Decision Process based on the traditional network layer
policies & metrics identifies the R1 as the optimal next hop
for this destination (aa08::4450). The Edge Service Metadata
might result in R2 as the optimal next hop for the prefix and
influence the Forwarding Plane.
The Edge Service Metadata influencing next hop selection is
different from the metric (or weight) to the next hop. The
metric to a next hop can impact many (sometimes, tens of
thousands) routes that have the node as their next hop,
whereas the Edge Service Metadata only impacts the optimal
next hop selection for a subset of client routes that are
identified as the edge services.
When the BGP custom decision [IDR-CUSTOM-DECISION] is used,
the Edge Service Management function would have an algorithm to
combine the Edge Service Metadata attributes with the custom
decision to derive the optimal next hop for the Edge service
routes.
Note: For a BGP UPDATE message that includes the Edge Service
Path Attribute with the egress router's loopback prefix, the
Site Capacity Availability Index value is applied to all the
NLRIs with the Site-ID indicated in the Edge Service Metadata
Path Attribute.
6. Service Metadata Propagation Scope
Service Metadata are only distributed to the relevant ingress
nodes interested in the Service, which can be configured or
automatically formed.
For each registered low-latency Service, BGP RT Constrained
Distribution [RFC4684] can be used to form the Group
interested in the Service. The "Service ID," an IP address
prefix, is the Route Target. When an ingress router receives
the first packet of a flow destined to a Service ID, the
ingress router sends a BGP UPDATE that advertises the Route
Target membership NLRI per RFC4684. The ingress router must
assign a Timer for the Service ID, as the UE that uses the
Service ID might move away. Upon receiving a packet destined
for the Service ID, the ingress router must refresh the Timer.
The ingress router must send a BGP Withdraw UPDATE for the
Service ID upon expiration of the Timer.
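The per-Service-ID Timer handling above can be sketched as a small table on the ingress router. The class and method names are hypothetical, the timeout value is a deployment choice not given in this document, and actually sending the Route Target membership advertisement or withdrawal is left to the caller.

```python
from typing import Dict, List


class ServiceInterestTable:
    """Tracks which Service IDs the ingress router is interested in."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s          # assumed, operator-configured
        self._expiry: Dict[str, float] = {}

    def on_packet(self, service_id: str, now: float) -> bool:
        """Refresh the Timer; True means membership must be advertised."""
        is_first_packet = service_id not in self._expiry
        self._expiry[service_id] = now + self.timeout_s
        return is_first_packet

    def expire(self, now: float) -> List[str]:
        """Return Service IDs whose Timer expired and must be withdrawn."""
        expired = [sid for sid, deadline in self._expiry.items()
                   if deadline <= now]
        for sid in expired:
            del self._expiry[sid]
        return expired
```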
7. Minimum Interval for Metrics Change Advertisement
As the metrics change can impact the path selection, the
Minimum Interval for Metrics Change Advertisement is
configured to control the update frequency to avoid route
oscillations. The default is 30 seconds.
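One way an egress router could enforce this interval is sketched below. The class name is hypothetical, and suppressing unchanged values is an added assumption rather than a requirement of this document.

```python
from typing import Optional


class MetricAdvertiser:
    """Suppresses metric advertisements sent closer than the minimum interval."""

    def __init__(self, min_interval_s: float = 30.0) -> None:
        self.min_interval_s = min_interval_s
        self._last_sent_at: Optional[float] = None
        self._last_sent_value: Optional[int] = None

    def should_advertise(self, value: int, now: float) -> bool:
        # Assumption: an unchanged value is never re-advertised.
        if value == self._last_sent_value:
            return False
        # Hold back changes that arrive inside the minimum interval.
        if (self._last_sent_at is not None
                and now - self._last_sent_at < self.min_interval_s):
            return False
        self._last_sent_at = now
        self._last_sent_value = value
        return True
```

A held-back change is simply re-evaluated the next time the metric is sampled after the interval has passed.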
Significant load changes at EC data centers can be triggered
by short-term gatherings of UEs, like conventions, lasting a
few hours or days, which are too short to justify adjusting EC
server capacities among DCs. Therefore, the load metrics
change rate can be in the magnitude of hours or days.
8. Manageability Considerations
The Edge Service Metadata described in this document are only
intended for propagation between ingress and egress routers of
one single BGP domain, i.e., the 5G Local Data Network, which
is a limited domain with edge services a few hops away from
the ingress nodes. Only selected UE services are considered
5G Edge Services. The 5G LDN is usually managed by one
operator, even though the routers can be from different
vendors.
9. Security Considerations
The proposed Edge Service Metadata are advertised within the
trusted domain of 5G LDN's ingress and egress routers. There
are no extra security threats compared with iBGP.
10. IANA Considerations
10.1. Metadata Path Attribute
IANA is requested to assign a new path attribute from the "BGP
Path Attributes" registry. The symbolic name of the attribute
is "Metadata", and the reference is [This Document].
+=======+======================================+=================+
| Value | Description                          | Reference       |
+=======+======================================+=================+
| TBD1  | Metadata Path Attribute              | [this document] |
+-------+--------------------------------------+-----------------+
10.2. Metadata Path Attribute Sub-Types
IANA is requested to create a new sub-registry under the
Metadata Path Attribute registry as follows:
Name: sub-TLVs under the "Metadata Path Attribute"
Registration Procedure: Expert Review [RFC8126].
Detailed Expert Review procedure will be added per RFC8126.
Reference: [this document]
+==========+==========================+=================+
| Sub-Type | Description              | Reference       |
+==========+==========================+=================+
| 0        | reserved                 | [this document] |
+----------+--------------------------+-----------------+
| 1        | Site Preference Index    | [this document] |
+----------+--------------------------+-----------------+
| 2        | Site Availability Index  | [this document] |
+----------+--------------------------+-----------------+
| 3        | Service Delay Prediction | [this document] |
+----------+--------------------------+-----------------+
| 4        | Raw Load Measurement     | [this document] |
+----------+--------------------------+-----------------+
| 5-254    | unassigned               | [this document] |
+----------+--------------------------+-----------------+
| 255      | reserved                 | [this document] |
+----------+--------------------------+-----------------+
11. References
11.1. Normative References
[RFC8126] M. Cotton, et al., "Guidelines for Writing an IANA
Considerations Section in RFCs", RFC 8126, June 2017.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in
RFC 2119 Key Words", BCP 14, RFC 8174, DOI
10.17487/RFC8174, May 2017, <https://www.rfc-
editor.org/info/rfc8174>.
11.2. Informative References
[RFC4684] P. Marques, et al., "Constrained Route Distribution
for Border Gateway Protocol/MultiProtocol Label
Switching (BGP/MPLS) Internet Protocol (IP) Virtual
Private Networks (VPNs)", RFC 4684, November 2006.
[3GPP TS 23.501] 3rd Generation Partnership Project;
Technical Specification Group Services and System
Aspects; System architecture for the 5G System (5GS)
[5G-Edge-Service] L. Dunbar, K. Majumdar, H. Wang, and G.
Mishra, "5G Edge Service use Cases", draft-dunbar-
cats-edge-service-metrics-01, work-in-progress, July
2023.
[IDR-CUSTOM-DECISION] A. Retana, R. White, "BGP Custom
Decision Process", draft-ietf-idr-custom-decision-
08, Feb 2017.
12. Appendix A
12.1. Example of Flow Affinity
The following informational example illustrates how Flow
Affinity can be achieved.
For the registered EC services, the ingress node keeps a table
with the following fields:
- Service ID (i.e., IP address)
- Flow-ID
- Sticky Egress ID (egress router loopback address)
- A timer
The Flow-ID in this table identifies a flow and is initialized
to NULL. How the Flow-ID is constructed is out of scope for
this document. One possible construction is:
- For IPv6, the Flow-ID can be the Flow Label extracted from
the IPv6 packet header, with or without the source address.
- For IPv4, the Flow-ID can be the source address, with or
without the TCP/UDP port number.
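The two constructions above could be expressed as hashable
keys, as in the sketch below. It is illustrative only: the
function names and the tuple layout are hypothetical, the IPv6
value is assumed to be the 20-bit Flow Label, and the port is
assumed to be the source port, which the text above does not
specify.

```python
import ipaddress
from typing import Optional, Tuple


def ipv6_flow_id(flow_label: int, src: Optional[str] = None) -> Tuple:
    """Flow-ID from the IPv6 Flow Label, optionally with the source address."""
    if not 0 <= flow_label < 2 ** 20:  # the Flow Label field is 20 bits wide
        raise ValueError("flow label out of range")
    if src is None:
        return ("v6", flow_label)
    # Normalize the textual form so equal addresses compare equal.
    return ("v6", flow_label, str(ipaddress.IPv6Address(src)))


def ipv4_flow_id(src: str, port: Optional[int] = None) -> Tuple:
    """Flow-ID from the IPv4 source address, optionally with a port number."""
    addr = str(ipaddress.IPv4Address(src))
    if port is None:
        return ("v4", addr)
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    return ("v4", addr, port)  # port assumed to be the TCP/UDP source port
```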
The Sticky Egress ID is the egress node address for the same
flow.
The Timer is always refreshed when a packet with the matching
EC Service ID (IP address) is received by the node.
If there is no Sticky Egress ID present in the table for the EC
Service ID, the forwarding plane can select a NextHop
influenced by the Cost Compute Engine. The forwarding plane
encapsulates the packet with a path to the chosen NextHop. The
chosen NextHop and the Flow-ID are recorded in the EC Service
table entry.
When the selected optimal NextHop (egress router) is no longer
reachable, the ingress router needs to select another path.
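The table and the lookup steps in this example can be sketched
as below. This is a non-normative illustration under several
assumptions: entries are keyed by the (Service ID, Flow-ID)
pair, the Timer is represented by a last-seen timestamp, and
select_next_hop is a hypothetical callback standing in for the
Cost Compute Engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Optional, Set, Tuple


@dataclass
class AffinityEntry:
    sticky_egress: Optional[str] = None  # egress router loopback address
    last_seen: float = 0.0               # stands in for the Timer


class AffinityTable:
    def __init__(self,
                 select_next_hop: Callable[[str, Set[str]], str]) -> None:
        # select_next_hop(service_id, reachable) stands in for the
        # Cost Compute Engine (hypothetical signature).
        self.select_next_hop = select_next_hop
        self.entries: Dict[Tuple[str, Hashable], AffinityEntry] = {}

    def forward(self, service_id: str, flow_id: Hashable,
                now: float, reachable: Set[str]) -> str:
        """Return the egress to use for this packet, keeping flows sticky."""
        entry = self.entries.setdefault((service_id, flow_id), AffinityEntry())
        entry.last_seen = now  # Timer refreshed on every matching packet
        # Select a NextHop only when none is recorded or the recorded
        # egress is no longer reachable; otherwise keep the sticky one.
        if entry.sticky_egress is None or entry.sticky_egress not in reachable:
            entry.sticky_egress = self.select_next_hop(service_id, reachable)
        return entry.sticky_egress
```

In this sketch a flow stays on its recorded egress even if the
computed costs later favor another site, and moves only when
the recorded egress becomes unreachable.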
13. Acknowledgments
The authors thank Adrian Farrel, Alvaro Retana, Robert
Raszuk, Sue Hares, Shunwan Zhuang, Donald Eastlake, Dhruv
Dhody, Cheng Li, and Vincent Shi for their suggestions and
contributions.
This document was prepared using 2-Word-v2.0.template.dot.
Authors' Addresses
Linda Dunbar
Futurewei
Email: ldunbar@futurewei.com
Kausik Majumdar
Microsoft
Email: kmajumdar@microsoft.com
Haibo Wang
Huawei
Email: rainsword.wang@huawei.com
Gyan Mishra
Verizon
Email: gyan.s.mishra@verizon.com
Zongpeng Du
China Mobile
Email: duzongpeng@foxmail.com
Contributors' Addresses
Cheng Li
Huawei
Email: c.l@huawei.com