Network Working Group                                 K. Majumdar
Internet Draft                                          Microsoft
Intended status: Standards Track                       L. Dunbar
Expires: March 18, 2024                               Futurewei
                                                V.Kasiviswanathan
                                                           Arista
                                                    A. Ramchandra
                                                      Microsoft
                                               September 18, 2023


                 Multi-segment SD-WAN via Cloud DCs
               draft-dmk-rtgwg-multisegment-sdwan-02

Abstract
   The document describes the methods to optimize the stitching
   of multiple SD-WAN segments on Cloud DCs Gateways.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet
   Engineering Task Force (IETF), its areas, and its working
   groups.  Note that other groups may also distribute working
   documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of
   six months and may be updated, replaced, or obsoleted by
   other documents at any time.  It is inappropriate to use
   Internet-Drafts as reference material or to cite them other
   than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed
   at http://www.ietf.org/shadow.html

   This Internet-Draft will expire on March 18, 2024.






xxx, et al.             Expires March 18, 2024           [Page 1]


Internet-Draft           Multi-segment SD-WAN


Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as
   the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date
   of publication of this document. Please review these
   documents carefully, as they describe your rights and
   restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1. Introduction..............................................3
   2. Conventions used in this document.........................3
   3. Use Cases.................................................5
      3.1. Multi-segment SD-WAN via Single Cloud GW.............5
      3.2. Multi-segment SD-WAN via Cloud Backbone..............6
      3.3. Analysis of Policy-based Traffic Steering............7
      3.4. End to End Encryption................................8
   4. Data Plane encoding for SD-WAN Transit....................8
      4.1. GENEVE Header Encoding...............................8
      4.2. Multi-Segment SD-WAN Option Class....................9
      4.3. SD-WAN Tunnel Endpoint Sub-TLV.......................9
      4.4. SD-WAN Tunnel Originator Sub-TLV....................10
      4.5. Egress GW Sub-TLV...................................11
      4.6. Include-Transit Sub-TLV.............................11
      4.7. Exclude-Transit Sub-TLV.............................11
   5. IPsec Flow through Cloud GWs Illustration................12
      5.1. Single Hop Cloud GW.................................12
      5.2. Multi-hop Transit GWs...............................13
      5.3. Data Authentication and Integrity Check by Cloud GW.15
   6. Illustration of Traffic from Private VPN to IPsec Tunnel.16
   7. Control Plane considerations.............................18
      7.1. Control Plane for CPEs..............................18
      7.2. Control Plane between CPEs and Cloud GWs............18
   8. Observability Consideration..............................19
   9. Security Considerations..................................19
   10. Manageability Considerations............................21
   11. IANA Considerations.....................................21
   12. References..............................................22
      12.1. Normative References...............................22
      12.2. Informative References.............................23
   13. Acknowledgments.........................................24

1. Introduction

   SD-WAN is widely deployed to connect enterprises' on-premises
   CPEs with services in cloud DCs. As described in [Net2Cloud],
   there are multiple options for enterprises to connect to
   Cloud DCs:

     - Direct Interconnect model,
     - Direct Interconnect model with enterprise's own virtual
        appliances in the Cloud,
     - Indirect Interconnect model via SD-WAN paths, and
     - Managed Hybrid WAN model using Enterprise's existing VPN
        connections.

   For the enterprise branches that have private VPN circuits
   interconnecting with a Cloud GW via IXP (Internet eXchange
   Point), the enterprise can extend into Cloud DC without
   having to set up IPsec paths between their on-premises CPEs
   and the Cloud GWs.

   This document describes a method for a Cloud DCs' gateway
   (GW) to connect multiple SD-WAN segments between the Cloud GW
   and the enterprise's CPEs without the Cloud GW decrypting and
   encrypting the payloads. By integration with Cloud Operators'
   gateways, enterprises can have advanced visibility through
   the Cloud Providers' global network topology, attachment
   level performance metrics, and telemetry data.

2. Conventions used in this document
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
   "MAY", and "OPTIONAL" in this document are to be interpreted as
   described in BCP 14 [RFC2119] [RFC8174] when, and only when,
   they appear in all capitals, as shown here.

   The following acronyms and terms are used in this document:



   Cloud DC:   Off-premises data center, managed by a third
               party, that hosts applications, services, and
               workloads for different organizations or tenants.

   CPE:        Customer (Edge) Premises Equipment.

   IXP:        Internet exchange points (IXes or IXPs) are
               common grounds of IP networking, allowing
               participant Internet service providers (ISPs) to
               exchange data destined for their respective
               networks.
               (https://en.wikipedia.org/wiki/Internet_exchange_
               point).

   OnPrem:     On Premises data centers and branch offices.

   RR:         Route Reflector.

   SD-WAN:     An overlay connectivity service that optimizes
               transport of IP Packets over one or more Underlay
               Connectivity Services by recognizing applications
               (Application Flows) and determining forwarding
               behavior by applying Policies to them. [MEF-70.1]

   VPN:        Virtual Private Network.

3. Use Cases

3.1. Multi-segment SD-WAN via Single Cloud GW

   For enterprise branches that have established SD-WAN paths to
   a Cloud GW for accessing Cloud services, the Cloud GW can be
   utilized to connect those branches, as shown in Figure 1.
   Here are some reasons for connecting those branches via a
   Cloud GW:
  - The public Internet paths among those branches might have
     limited bandwidth or unpredictable performance, or might be
     prone to cyber-attacks. In comparison, the network paths
     from the CPEs to the Cloud GW are more reliable and are
     constantly monitored by sophisticated network functions.
  - It is easier to use Cloud-based security functions, such as
     firewalls, DDoS protection, etc., to enforce consistent
     policies for workloads/services in the Cloud and across
     the branches.
  - Cloud-based tools and SaaS (Software as a Service) can be
     easily used to collect and analyze threats to the traffic.

                          (^^^^^^^^^^^^)
                        (     Cloud     )
                        ( +----+  +----+  )
                 + -----(-|Edge|  | GW |  )
         Direct  |      ( +----+  +/--\+  )
        Connect  |        (^^^^^^^/^^^^\^)
               {-+---}           /      \  SD-WAN Path CPE<->GW
               { VPN }          /        \
               {-+---}         /          IPsec Tunnel
                 +-------+----/------+    \
                         |   /       |     \
                        ++--/+       |    +-\--+
                        |CPE1|       +----+CPE2|
                        +----+            +----+
       Client Route: 11.1.1.x             10.1.1.x
                     21.1.1.x             20.1.1.x
                                          30.1.1.x
   Figure 1 Multi-Segment SD-WAN stitching via a Cloud GW


3.2. Multi-segment SD-WAN via Cloud Backbone

   For geographically distant enterprise branches that have
   established SD-WAN paths to their corresponding Cloud GWs to
   access Cloud services in different geographic locations, the
   Cloud backbone can connect those branches, as shown in Figure
   2. The reasons for using the Cloud backbone to interconnect
   those branches are similar to those for interconnecting
   multiple branches via a single Cloud GW, described in the
   previous section.

                       (^^^^^^^^^^^^^^^)
                      (      Cloud      )
                      ( +----+  +----+  )               +-----+
                 + ---(-|Edge|==| GW1|=================== GW2 |
         Direct  |    ( +----+  +/--\+  )               +--|--+
        Connect  |      (^^^^^^^/^^^^\^)                   |
               {-+---}         /      \                    |
               { VPN }        /        \                 +-----+
               {-+---}       /          IPsec Tunnel     |CPE10|
                 +-------+--/--------+   \               +-----+
                         | /         |    \              10.2.1.x
                        ++/--+       |    +\---+         20.2.1.x
                        |CPE1|       +----+CPE2|         30.2.1.x
                        +----+            +----+
       Client Route: 11.1.1.x             10.1.1.x
                     21.1.1.x             20.1.1.x
                                          30.1.1.x

     Figure 2 Multi-Segment SD-WAN Stitching via Cloud Backbone



3.3. Analysis of Policy-based Traffic Steering

     There are many well-developed methods, such as SRv6 or
     MPLS-TE, to steer traffic through specific nodes. Those
     traffic steering methods are effective when the entire
     network domain is under one administrative control.

     However, the traffic from on-premises CPEs to Cloud GWs via
     the public internet can only be forwarded based on the
     packets' destination addresses.

     SD-WAN allows multiple links (paths), some of which traverse
     the public Internet, to be set up from the same SD-WAN
     branch CPE to a Cloud GW; each link (or path) represents a
     dual tunnel connection from a unique public IP address of
     the SD-WAN CPE to two different instances of the Cloud GW.
     Using the Cloud GW to interconnect those on-premises CPEs
     eliminates the need to manage multiple ISPs' links/paths
     between the CPEs.

3.4. End to End Encryption

   To ensure the confidentiality, integrity, and availability of
   communication among CPEs, the traffic between the CPEs should
   be encrypted by the IPsec SAs if traversing the public
   Internet. When the traffic between the enterprise's CPEs
   doesn't terminate within the Cloud DCs, the processing burden
   on Cloud GWs can be significantly reduced if the Cloud GWs
   don't need to decrypt and re-encrypt transit IPsec encrypted
   traffic among CPEs. This document describes the mechanisms
   for the IPsec encrypted traffic between CPEs to traverse the
   Cloud GWs without being decrypted and re-encrypted by the
   Cloud GWs.

4. Data Plane encoding for SD-WAN Transit

   For Cloud GWs to differentiate the packets destined towards
   their internal hosts/services, which require decryption, and
   transit packets to be forwarded to the respective destination
   branch CPEs, proper marking is needed in the packets' header.
   As the GENEVE Encapsulation [RFC8926] is supported by most
   Cloud Service Providers, GENEVE is chosen as the
   encapsulation header for Cloud GWs to steer IPsec encrypted
   packets among CPEs without decryption.

4.1. GENEVE Header Encoding

   The GENEVE header shown below is specified in [RFC8926]:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Virtual Network Identifier (VNI)       |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    Variable-Length Options                    ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                  Figure 3 GENEVE Header

   VNI (virtual network Identifier) is used to represent the
   Customer Identifier.

   The Protocol Type (16 bits) = 50 (ESP) [RFC4303] indicates
   that IPsec ESP encapsulated data are appended at the end of
   the GENEVE header.
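
   As a minimal sketch of the encoding above, the following Python
   packs and parses the 8-byte GENEVE base header of Figure 3 with
   the Protocol Type set to 50 and the VNI carrying a customer
   identifier. The function names and the example VNI are
   illustrative assumptions, not part of this specification.

```python
import struct

ESP_PROTOCOL_TYPE = 50  # Protocol Type value this document uses for ESP


def pack_geneve_base_header(vni, opt_len_bytes,
                            protocol_type=ESP_PROTOCOL_TYPE,
                            oam=False, critical=False, version=0):
    """Pack the 8-byte GENEVE base header laid out in Figure 3."""
    if opt_len_bytes % 4 or not 0 <= opt_len_bytes <= 252:
        raise ValueError("options must be a multiple of 4 bytes, at most 252")
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI is a 24-bit field")
    # First byte: Ver (2 bits) followed by Opt Len (6 bits, 4-byte units).
    first = (version << 6) | (opt_len_bytes // 4)
    # Second byte: O bit, C bit, then 6 reserved bits left at zero.
    second = (int(oam) << 7) | (int(critical) << 6)
    # Last word: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!BBHI", first, second, protocol_type, vni << 8)


def unpack_geneve_base_header(data):
    """Parse the first 8 bytes of a GENEVE header into a dict."""
    first, second, protocol_type, vni_word = struct.unpack("!BBHI", data[:8])
    return {
        "version": first >> 6,
        "opt_len_bytes": (first & 0x3F) * 4,
        "oam": bool(second & 0x80),
        "critical": bool(second & 0x40),
        "protocol_type": protocol_type,
        "vni": vni_word >> 8,
    }


# Hypothetical customer identifier 0x00ABCD with 16 bytes of options.
header = pack_geneve_base_header(vni=0x00ABCD, opt_len_bytes=16)
fields = unpack_geneve_base_header(header)
```

   A Cloud GW could branch on fields["protocol_type"] to separate
   transit ESP packets from packets that need local processing.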


4.2. Multi-Segment SD-WAN Option Class

   A new GENEVE Option Class (Type value=TBD) is used to
   indicate that the Multi-segment SD-WAN relevant Sub-TLVs are
   encoded in the GENEVE header.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | multi-seg-SD-WAN Option Class |      Type     |R|R|R| Length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                  SD-WAN Tunnel Endpoint Sub-TLV               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~          Optional SD-WAN Tunnel Originator Sub-TLV            ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~          Optional Egress GW Sub-TLV                           ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   //                                                             //
   //         Optional Type Length Value objects (variable)       //
   //                                                             //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         Figure 4 Multi Segment SD-WAN Option Class

   Type indicates the various types of multi-segment SD-WAN.

     Type = 1: Single Hop Transit SD-WAN

     Type = 2: Multi-Hop Transit SD-WAN with explicitly
     specified egress Cloud GW.

     Type = 3: Multi-Hop Transit SD-WAN without specified egress
     Cloud GW.
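
   The option header of Figure 4 can be sketched as follows. The
   Option Class value is still TBD, so the constant below is only a
   placeholder, and the enum and function names are illustrative
   assumptions. The Length field follows the GENEVE convention
   of counting 4-byte words after the 4-byte option header.

```python
import struct
from enum import IntEnum

# Placeholder only: the real Option Class value is TBD in this document.
MULTI_SEG_SDWAN_OPTION_CLASS = 0xFFFF


class MultiSegType(IntEnum):
    SINGLE_HOP = 1                # Single Hop Transit SD-WAN
    MULTI_HOP_WITH_EGRESS_GW = 2  # egress Cloud GW explicitly specified
    MULTI_HOP_NO_EGRESS_GW = 3    # egress Cloud GW not specified


def pack_multi_seg_option(seg_type, sub_tlvs=b""):
    """Prepend the 4-byte option header to already-encoded Sub-TLVs."""
    if len(sub_tlvs) % 4:
        raise ValueError("Sub-TLVs must be padded to a 4-byte boundary")
    length_words = len(sub_tlvs) // 4
    if length_words > 0x1F:
        raise ValueError("Length is a 5-bit field (at most 31 words)")
    # The three R bits share a byte with Length and stay zero here.
    header = struct.pack("!HBB", MULTI_SEG_SDWAN_OPTION_CLASS,
                         int(seg_type), length_words)
    return header + sub_tlvs


# Eight zero bytes stand in for real Sub-TLV content.
option = pack_multi_seg_option(MultiSegType.MULTI_HOP_WITH_EGRESS_GW,
                               b"\x00" * 8)
```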



4.3. SD-WAN Tunnel Endpoint Sub-TLV

   The SD-WAN Tunnel Endpoint Sub-TLV indicates the destination
   CPE of the IPsec tunnel.

   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown
   in Figure 1, the Tunnel Endpoint Sub-TLV in the GENEVE header
   carries CPE2's IP address.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SD-WAN Endpoint| length        |   Reserved    | TTL           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | SD-WAN Dst Addr Family        | Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ (variable)                    +
   ~                                                               ~
   |    SD-WAN end point Address                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              Figure 5 SD-WAN Endpoint Sub-TLV

   TTL is set by the SD-WAN Tunnel Originator, e.g., CPE1. Each
   transit node or transit region/zone (visible to the CPEs)
   SHOULD decrement the TTL so that the destination CPE can
   determine the number of logical transit nodes (cloud regions
   or zones) the packet has traversed. Enterprises can also use
   the TTL to limit the number of transit nodes/regions that a
   packet may traverse.
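
   The following sketch encodes the Sub-TLV of Figure 5 and shows
   the TTL decrement at a transit node. Several details are
   assumptions because this document does not fix them: the Sub-TLV
   type code (1), the use of IANA Address Family Numbers (1 for
   IPv4, 2 for IPv6), a length field that counts the bytes following
   it, zero padding to a 4-byte boundary, and dropping a packet that
   arrives with TTL 0.

```python
import ipaddress
import struct

SUBTLV_ENDPOINT = 1      # hypothetical code point, not assigned here
AF_IPV4, AF_IPV6 = 1, 2  # assumed: IANA Address Family Numbers


def pack_endpoint_subtlv(address, ttl):
    """Encode type, length, reserved, TTL, address family, address."""
    ip = ipaddress.ip_address(address)
    family = AF_IPV4 if ip.version == 4 else AF_IPV6
    body = struct.pack("!BBH", 0, ttl, family) + ip.packed
    # Assumption: pad so the whole Sub-TLV ends on a 4-byte boundary.
    body += b"\x00" * (-(len(body) + 2) % 4)
    # Assumption: length counts the bytes after the length field.
    return struct.pack("!BB", SUBTLV_ENDPOINT, len(body)) + body


def unpack_endpoint_subtlv(data):
    """Return (type, ttl, address) from an encoded Sub-TLV."""
    subtlv_type, _length = struct.unpack("!BB", data[:2])
    _reserved, ttl, family = struct.unpack("!BBH", data[2:6])
    addr_len = 4 if family == AF_IPV4 else 16
    return subtlv_type, ttl, ipaddress.ip_address(data[6:6 + addr_len])


def decrement_ttl(subtlv):
    """Return a copy with TTL reduced by one, or None if TTL is 0."""
    ttl = subtlv[3]
    if ttl == 0:
        return None  # assumed behavior: the transit node drops the packet
    return subtlv[:3] + bytes([ttl - 1]) + subtlv[4:]


# 192.0.2.2 is a documentation address standing in for CPE2.
tlv = pack_endpoint_subtlv("192.0.2.2", ttl=4)
after_one_hop = decrement_ttl(tlv)
```

   With an initial TTL of 4, the destination CPE that receives a TTL
   of 2 can infer that the packet crossed two transit nodes.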


4.4. SD-WAN Tunnel Originator Sub-TLV

   The SD-WAN Tunnel Originator Sub-TLV is an optional Sub-TLV
   inside the multi-seg-SD-WAN Option Class to indicate the
   originating CPE of the IPsec Tunnel.

   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown
   in Figure 1, the Tunnel Originator Sub-TLV inside the Geneve
   Header of the packets indicates CPE1's address.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SDWAN Origin   | length        |   reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | SD-WAN Org Addr Family        | Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ (variable)                    +
   ~                                                               ~
   |    SD-WAN Tunnel Originator Address                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             Figure 6 SD-WAN Tunnel Originator Sub-TLV

   The Tunnel Originator Sub-TLV in the GENEVE header can assist
   Cloud transit nodes in applying appropriate policies when
   forwarding the packet.



4.5. Egress GW Sub-TLV

   For the multi-segment SD-WAN via Cloud Backbone scenario, the
   originator CPE can use the Egress GW Sub-TLV to specify the
   Egress Cloud GW for reaching the destination CPE.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SDWAN EgressGW | length        |   reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Egress GW Addr Family         | Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ (variable)                    +
   ~                                                               ~
   |           Egress GW Address                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
               Figure 7 SD-WAN Egress GW Sub-TLV


   The originator CPE can learn the Egress GW address through
   configuration or through a control plane protocol exchange
   with the destination CPEs. The control plane protocol is out
   of the scope of this document.

4.6. Include-Transit Sub-TLV

   Include-Transit Sub-TLV is an optional Sub-TLV for explicitly
   including a list of Cloud Availability Regions or Zones for
   reasons like:

  - Those regions provide certain OAM and security functions
     for improved visibility.
  - To comply with regulations, etc.

4.7. Exclude-Transit Sub-TLV

   Exclude-Transit Sub-TLV is an optional Sub-TLV for explicitly
   excluding a list of Cloud Availability Regions or Zones for
   reasons like:

  - To comply with regulations,
  - To avoid regions that impose certain risks.
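
   One way a Cloud GW might evaluate these two Sub-TLVs against a
   candidate transit path is sketched below. The semantics are an
   assumption, since the encoding and processing rules are not yet
   defined here: a path is acceptable when it crosses every
   included region and none of the excluded ones. The region names
   are hypothetical.

```python
def path_allowed(path_regions, include=(), exclude=()):
    """Check a candidate path against Include/Exclude-Transit lists."""
    regions = set(path_regions)
    # Any excluded region on the path disqualifies it.
    if regions & set(exclude):
        return False
    # Assumed semantics: every included region must be on the path.
    return set(include) <= regions


candidates = [
    ["region-a", "region-b"],
    ["region-a", "region-c"],
]
usable = [p for p in candidates
          if path_allowed(p, include=["region-a"], exclude=["region-c"])]
```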

5. IPsec Flow through Cloud GWs Illustration
   This section illustrates how Cloud GWs relay traffic flows
   carried in IPsec tunnels.

5.1. Single Hop Cloud GW

     Assume that all CPEs are under one administrative control
     (e.g., one iBGP domain).

     Using Figure 1 as an example:

       - There is a bidirectional IPsec tunnel between CPE1 and
          Cloud GW; with IPsec SA1 for the traffic from the CPE1
          to the Cloud-GW; and IPsec SA2 for the traffic from
          the Cloud-GW to the CPE1.
       - There is a bidirectional IPsec tunnel between CPE2 and
          Cloud GW; with IPsec SA3 for the traffic from the CPE2
          to the Cloud-GW; and IPsec SA4 for the traffic from
          the Cloud-GW to the CPE2.
       - All the CPEs are under one iBGP administrative domain,
          with a Route Reflector (RR) as their controller. The
          CPEs notify their peers of their corresponding Cloud
          GW addresses (which is out of the scope of this
          document).

     When 11.1.1.x and 10.1.1.x need to communicate with each
     other, CPE1 and CPE2 establish a bidirectional IPsec
     Tunnel, with SA5 for the traffic from CPE1 to CPE2 and SA6
     for the traffic from CPE2 to CPE1. Assume the IPsec ESP
     Tunnel Mode is used. A packet from 11.1.1.1 to 10.1.1.2 has
     the following outer header:

     Outer IP header:
         +---------------------------+
         |    protocol = 17(UDP)     |
         |    src = CPE1             |
         |    dst = Cloud GW         |
         +---------------------------+
         |  Source Port =xxxx        |
         |  Dst Port = 6081 (GENEVE) |
         +===========================+
         | GENEVE Header             |
         | multi-seg-SD-WAN Option   |
         |GENEVE Proto = 50 (ESP)    |
         +- - --  -- - - --      - --+
         |SD-WAN EndPt SubTLV (CPE2) |
         +---------------------------+  < ----------+
         |SPI(Security Parameter Idx)|        Authenticated
         +---------------------------+              |
         |    sequence number        |              |
         +---------------------------+   <-+        |
         | payload IP header:        |     |        |
         |  src =  11.1.1.1          |     |        |
         |  dst =  10.1.1.2          |     |        |
         +---------------------------+  Encrypted   |
         |   TCP header +            |     |        |
         ~    payload (variable)     ~     |        |
         |                           |     |        |
         +===========================+   <-+ -------+
         |   Authentication Data     |
         +---------------------------+

     Figure 8 Packet header illustration of traffic to Cloud GWs

5.2. Multi-hop Transit GWs

     Traffic to/from geographically distant CPEs can cross
     multiple Cloud DCs via the Cloud backbone.

     The on-premises CPEs are under one administrative control
     (e.g., iBGP).

     Using Figure 2 as an example:

       - There is a bidirectional IPsec tunnel between CPE1 and
          the Cloud GW1; with IPsec SA1 for the traffic from the
          CPE1 to the Cloud-GW1; and IPsec SA2 for the traffic
          from the Cloud-GW1 to the CPE1.
       - There is a bidirectional IPsec tunnel between CPE10
          and the Cloud GW2; with IPsec SA3 for the traffic from
          the CPE10 to the Cloud-GW2; and IPsec SA4 for the
          traffic from the Cloud-GW2 to the CPE10.
       - All the CPEs are under one iBGP administrative domain,
          with a Route Reflector (RR) as their controller. CPEs
          notify their peers of their corresponding Cloud GW
          addresses.

     When 11.1.1.x and 10.2.1.x need to communicate with each
     other, CPE1 and CPE10 establish a bidirectional IPsec
     Tunnel, with SA5 for the traffic from CPE1 to CPE10 and SA6
     for the traffic from CPE10 to CPE1. Assume the IPsec ESP
     Tunnel Mode is used. A packet from 11.1.1.1 to 10.2.1.2 has
     the following outer header:

     Outer IP header:
         +---------------------------+
         |    proto = 17 (UDP)       |
         |    src = CPE1             |
         |    dst = Cloud GW1        |
         +===========================+
         | GENEVE Header             |
         | multi-seg-SD-WAN Option   |
         |GENEVE Proto = 50 (ESP)    |
         +- - --  -- - - --      - --+
         |SD-WAN EndPt SubTLV (CPE10)|
         +---------------------------+
         |   EgressGW-SubTLV         |
         +---------------------------+  < ----------+
         |SPI(Security Parameter Idx)|        Authenticated
         +---------------------------+              |
         |    sequence number        |              |
         +---------------------------+   <-+        |
         | payload IP header:        |     |        |
         |  src =  11.1.1.1          |     |        |
         |  dst =  10.2.1.2          |     |        |
         +---------------------------+  Encrypted   |
         |   TCP header +            |     |        |
         ~    payload (variable)     ~     |        |
         |                           |     |        |
         +===========================+   <-+ -------+
         |   Authentication Data     |
         +---------------------------+
      Figure 9 GENEVE header encapsulated IPsec packet



5.3. Data Authentication and Integrity Check by Cloud GW

     Because the IPsec SA already encrypts the client payload
     between the CPEs, the Cloud GW doesn't need to decrypt and
     re-encrypt the payload when relaying it to the destination
     CPE. However, data authentication and integrity checks are
     needed because the traffic traverses an untrusted network.

     [RFC2403] and [RFC2404] define HMAC-based authentication
     algorithms used in AH and ESP. SHA-2 (224/256/384/512) is a
     family of cryptographic hash functions that can serve as
     the hash in a Hashed Message Authentication Code (HMAC).
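
     A minimal sketch of an HMAC-SHA-256 check is shown below. It
     assumes a key shared between the CPE and the Cloud GW and a tag
     computed over the GENEVE header plus the ESP packet. Neither the
     key distribution, the bytes covered, nor where the tag is carried
     is defined by this document, so all three are illustrative.

```python
import hashlib
import hmac


def compute_tag(key, geneve_header, esp_packet):
    """HMAC-SHA-256 over the bytes the sender wants protected."""
    return hmac.new(key, geneve_header + esp_packet,
                    hashlib.sha256).digest()


def verify_tag(key, geneve_header, esp_packet, tag):
    """Constant-time comparison of the received and expected tags."""
    expected = compute_tag(key, geneve_header, esp_packet)
    return hmac.compare_digest(expected, tag)


# Placeholder inputs; a real key would come from provisioning.
key = b"k" * 32
tag = compute_tag(key, b"geneve-header", b"esp-packet")
ok = verify_tag(key, b"geneve-header", b"esp-packet", tag)
tampered = verify_tag(key, b"geneve-HEADER", b"esp-packet", tag)
```

     Covering the GENEVE header in the tag lets the Cloud GW detect a
     modified Sub-TLV without decrypting the ESP payload.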

5.4. Packet Header Processing

     In Figure 1, upon receiving a GENEVE encapsulated packet
     with the GENEVE Protocol Type = 50 (ESP), the Cloud GW does
     the following:

      - Authenticate the packet using a preconfigured
         authentication method.
      - Extract the destination CPE address from the SD-WAN
         Endpoint Sub-TLV inside the GENEVE header. Replace the
         outer IP destination address with the destination CPE
         address.
      - Optionally replace the outer IP source address with the
         Cloud GW address.
      - Leave the GENEVE header unchanged.
      - Forward the packet to the destination CPE.

     To prevent unauthorized users from using the Cloud
     services, the Cloud GW SHOULD drop any packet whose source
     address or GENEVE Sub-TLV values are not recognized or
     registered.
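
     The processing steps above can be sketched as follows. The
     OuterPacket type, its field names, and the registered-address
     set are illustrative assumptions. The authentication result is
     taken as an input because the method is preconfigured and not
     defined here.

```python
from dataclasses import dataclass, replace

ESP = 50  # GENEVE Protocol Type value used for transit ESP packets


@dataclass(frozen=True)
class OuterPacket:
    outer_src: str          # outer IP source address
    outer_dst: str          # outer IP destination address
    geneve_protocol: int    # Protocol Type from the GENEVE header
    endpoint: str           # address in the Tunnel Endpoint Sub-TLV
    authenticated: bool     # result of the preconfigured check
    geneve_and_esp: bytes   # GENEVE header and ESP data, left untouched


def relay_transit_packet(pkt, gw_address, registered, rewrite_source=True):
    """Return the packet to forward, or None if it should be dropped."""
    if pkt.geneve_protocol != ESP or not pkt.authenticated:
        return None
    # Drop packets from, or destined to, unregistered addresses.
    if pkt.outer_src not in registered or pkt.endpoint not in registered:
        return None
    # Replace the outer destination with the destination CPE address.
    out = replace(pkt, outer_dst=pkt.endpoint)
    if rewrite_source:
        # Optional step: use the Cloud GW address as the outer source.
        out = replace(out, outer_src=gw_address)
    return out


# Documentation addresses stand in for CPE1, CPE2, and the Cloud GW.
registered = {"198.51.100.1", "203.0.113.2"}
pkt = OuterPacket("198.51.100.1", "192.0.2.10", ESP,
                  "203.0.113.2", True, b"opaque")
forwarded = relay_transit_packet(pkt, "192.0.2.10", registered)
```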

5.5. Error Handling

   As traffic through the Cloud Backbone consumes precious
   resources, the Cloud GW SHOULD drop packets with invalid or
   unregistered source or destination addresses.

   The Cloud GW SHOULD drop packets originating from an unpaid
   (or unregistered) CPE address.

   The Cloud GW SHOULD validate the value of the SD-WAN Endpoint
   Sub-TLV and drop the packet if that value is an unpaid (or
   unregistered) address.

6. Illustration of Traffic from Private VPN to IPsec Tunnel

   This section illustrates a Cloud GW connecting client traffic
   from a branch CPE via a Private VPN to another CPE via an
   IPsec tunnel.

   Using Figure 1 as an example:

       - CPE1 sends traffic via a Private VPN (Direct Connect to
          the Cloud Edge) to the Cloud GW. The traffic is not
          encrypted.
       - There is a bidirectional IPsec tunnel between CPE2 and
          the Cloud GW; with IPsec SA1 for the traffic from the
          CPE2 to the Cloud-GW; and IPsec SA2 for the traffic
          from the Cloud-GW to the CPE2.
       - All the CPEs are under one iBGP administrative domain,
          with a Route Reflector (RR) as their controller. CPEs
          notify their peers of their corresponding Cloud GW
          addresses.

     Assume the IPsec ESP Tunnel Mode is used for the IPsec SA
     between the Cloud GW and CPE2. For a packet from 11.1.1.1
     to 10.1.1.2, CPE1 adds the following header when sending
     it over the Private VPN:

     Outer IP header:
         +---------------------------+
         |    proto = 17 (UDP)       |
         |    src = CPE1             |
         |    dst = Cloud GW         |
         +===========================+
         | GENEVE Header             |
         | multi-seg-SD-WAN Option   |
         |GENEVE Proto =TCP/UDP/etc. |
         +- - --  -- - - --      - --+
         |SD-WAN EndPt SubTLV (CPE2) |
         +---------------------------+  < -+
         | payload IP header:        |     |
         |  src =  11.1.1.1          |     |
         |  dst =  10.1.1.2          |     |
         +---------------------------+  Not Encrypted
         |   TCP header +            |     |
         ~    payload (variable)     ~     |
         |                           |     |
         +===========================+   <-+
    Figure 10 Illustration of packet through VPN

   Upon receiving the GENEVE encapsulated packet with the
   "Multi-Segment-SD-WAN" option, the Cloud GW extracts the
   destination CPE from the GENEVE header and encrypts the
   packet with the IPsec SA2 to forward to the destination
   (i.e., CPE2). The GENEVE Header is carried to the CPE2.

      Outer IP header:
         +---------------------------+
         |    proto = 17 (UDP)       |
         |    src = Cloud GW         |
         |    dst = CPE2             |
         +===========================+
         | GENEVE Header             |
         | multi-seg-SD-WAN Option   |
         |GENEVE Proto =50 (ESP)     |
         +- - --  -- - - --      - --+
         |SD-WAN EndPt SubTLV (CPE2) |
         +---------------------------+  < ----------+
         |SPI(Security Parameter Idx)|        Authenticated
         +---------------------------+              |
         |    sequence number        |              |
         +---------------------------+   <-+        |
         | payload IP header:        |     |        |
         |  src =  11.1.1.1          |     |        |
         |  dst =  10.2.1.2          |     |        |
         +---------------------------+  Encrypted   |
         |   TCP header +            |     |        |
         ~    payload (variable)     ~     |        |
         |                           |     |        |
         +===========================+   <-+ -------+
         |   Authentication Data     |
         +---------------------------+
 Figure 11 Illustration of packet from the Egress Cloud GW
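
   As a rough illustration of the Cloud GW processing described
   before Figure 11, the Python sketch below walks the GENEVE
   option list using the fixed header and option layout from
   [RFC8926] and returns the body of the Multi-Segment SD-WAN
   option. It is only a sketch under stated assumptions: the
   option class value 0xFFFF is a hypothetical placeholder
   (the real value is still "tbd"), the function names are
   illustrative, and the Sub-TLVs inside the option body are
   left unparsed.

```python
import struct

GENEVE_FIXED_LEN = 8             # fixed GENEVE header size (RFC 8926)
MULTI_SEG_OPTION_CLASS = 0xFFFF  # hypothetical placeholder; real value is "tbd"


def parse_geneve_options(packet: bytes):
    """Return (protocol_type, options, payload_offset) for a GENEVE packet.

    `options` is a list of (option_class, option_type, option_data) tuples.
    """
    if len(packet) < GENEVE_FIXED_LEN:
        raise ValueError("truncated GENEVE header")
    first, _flags, protocol_type = struct.unpack_from("!BBH", packet, 0)
    if first >> 6 != 0:
        raise ValueError("unsupported GENEVE version")
    # Opt Len is the low 6 bits of the first byte, counted in 4-byte words.
    end = GENEVE_FIXED_LEN + (first & 0x3F) * 4
    if len(packet) < end:
        raise ValueError("truncated GENEVE options")

    options = []
    offset = GENEVE_FIXED_LEN
    while offset < end:
        opt_class, opt_type, rsvd_len = struct.unpack_from("!HBB", packet, offset)
        # Option length is the low 5 bits, in 4-byte words, excluding the
        # 4-byte option header itself.
        data_len = (rsvd_len & 0x1F) * 4
        data_start = offset + 4
        if data_start + data_len > end:
            raise ValueError("option overruns the GENEVE header")
        options.append((opt_class, opt_type, packet[data_start:data_start + data_len]))
        offset = data_start + data_len
    return protocol_type, options, end


def find_multi_seg_option(packet: bytes) -> bytes:
    """Return the body of the first option in the placeholder option class."""
    _proto, options, _payload_offset = parse_geneve_options(packet)
    for opt_class, _opt_type, data in options:
        if opt_class == MULTI_SEG_OPTION_CLASS:
            return data
    raise LookupError("no Multi-Segment SD-WAN option present")
```

   The returned payload offset marks where the inner packet
   starts, so a Cloud GW could hand those bytes to its IPsec
   stack without touching the GENEVE header.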

7. Control Plane considerations

7.1. Control Plane for CPEs

   The control plane enables SD-WAN edges to discover each
   other's properties and attached routes. The on-premises CPEs
   and their vCPEs (or Virtual Appliances in Cloud DCs) can be
   controlled by one iBGP instance. [SD-WAN-Edge-Discovery]
   describes the mechanism for SD-WAN edges to discover each
   other's properties. The IPsec key exchange between the
   on-premises CPEs and the vCPE is carried in iBGP UPDATE
   messages through the Route Reflector (RR)
   [SD-WAN-Edge-Discovery].

7.2. Control Plane between CPEs and Cloud GWs

   It is common to have eBGP sessions between enterprise CPEs
   and the Cloud GWs. An enterprise-owned vCPE can establish an
   eBGP session with the Cloud VPN GW for accessing the
   workloads hosted in the Cloud DCs. If an IPsec tunnel is
   required between the Cloud DC GW and the vCPE, the full
   IPsec IKEv2 exchange must take place between the vCPE and
   the Cloud GW.

8. Observability Consideration
   This section is intended to describe metrics that
   enterprises can obtain from Cloud providers for their
   transit traffic. To be added.

9. Security Considerations
9.1. Threat Analysis
   As shown in Figure 3, the information carried by the GENEVE
   Header is not encrypted and is therefore susceptible to
   Man-in-the-Middle (MitM) attacks. An attacker can intercept
   and potentially alter the information in the GENEVE header
   between the branch CPEs and the Cloud GWs without the
   knowledge or consent of the enterprise or the Cloud
   provider. The threats posed by MitM attacks between CPEs and
   Cloud GWs are analyzed below:

  a) Eavesdropping: Attackers can learn the enterprise's branch
     locations and their respective contracted Cloud GWs. As
     the payload between the CPEs is encrypted, attackers can't
     read any data exchanged between CPEs. This threat is no
     different from that of direct IPsec SAs between two CPEs.
  b) Data Manipulation: Attackers alter the content (Sub-TLVs)
     in the GENEVE header. As packets with unrecognized source
     addresses or invalid values in the Sub-TLVs of the GENEVE
     header are dropped by Cloud GWs, such tampering can raise
     the packet drop rate between the CPEs.
     Packet loss is not a new problem; transport protocols such
     as TCP or QUIC handle it well.

  c) Potential stealing of Cloud Backbone bandwidth:
     A threat actor might want to leverage Cloud Backbones to
     transport its own traffic between two locations without
     paying for the services. For example, a legitimate Cloud
     subscriber pays for the Cloud Backbone transport services
     for traffic between CPE-A and CPE-B. The attacker, who has
     two locations far apart (say Node-A and Node-B), can use
     CPE-A's address as the source address and CPE-B as the
     value in the SD-WAN Endpoint Sub-TLV for a packet from
     Node-A to Node-B before it reaches the ingress Cloud GW.
     When the packet is sent from the egress Cloud GW via the
     Internet towards CPE-B, the actor can change the source
     address back to Node-A and the destination address to
     Node-B. By doing so, Node-A and Node-B can maintain the
     IPsec tunnel via the Cloud Backbone without paying for the
     service.
     Therefore, it is necessary to have some level of data
     integrity and authentication for traffic between CPEs and
     Cloud GWs, even though it is not necessary for Cloud GWs to
     decrypt and re-encrypt the payload between CPEs.

9.2. HMAC-based Integrity and Authentication
   HMAC (Hash-Based Message Authentication Code) can be used to
   ensure the integrity and authenticity of data between CPEs
   and Cloud GWs, verifying that the GENEVE header has not been
   tampered with during transmission via the public Internet.

   The basic idea behind HMAC is to combine a secret key and a
   hash function to produce a fixed-size authentication code for
   the GENEVE header between CPEs and the Cloud GW. This
   authentication code is then sent along with the data itself.
   When the Cloud GW and the destination CPEs receive the data
   and the authentication code, they can independently compute
   the HMAC using the same key and hash function. If the
   computed HMAC matches the received authentication code, it
   indicates that the data has not been altered, as long as the
   secret key remains confidential.

   The HMAC authentication code can be carried by an HMAC Sub-
   TLV in the GENEVE Header, as specified below:









    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |MultiSDWAN-HMAC| length        |   reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   |    HMAC Authentication Code for entire GENEVE Header          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
         Figure 12 Multi Segment SD-WAN HMAC Sub-TLV

   The HMAC Authentication Code, a.k.a. the HMAC hash value, is
   computed over all the bytes of the GENEVE header, with the
   MultiSDWAN-HMAC value field set to 0.
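
   The computation can be sketched in Python as follows. This is
   a minimal illustration, not a normative procedure: the choice
   of HMAC-SHA-256 (and hence a 32-byte code), the function
   names, and the value_offset parameter (the byte offset of the
   HMAC value field inside the GENEVE header, which the caller
   is assumed to know) are assumptions, since this document does
   not fix a hash function or field offsets.

```python
import hashlib
import hmac

TAG_LEN = 32  # assumption: HMAC-SHA-256, so the code is 32 bytes long


def compute_header_hmac(key: bytes, geneve_header: bytes, value_offset: int) -> bytes:
    """HMAC over the whole GENEVE header with the HMAC value field zeroed."""
    if value_offset < 0 or value_offset + TAG_LEN > len(geneve_header):
        raise ValueError("HMAC value field lies outside the GENEVE header")
    zeroed = bytearray(geneve_header)
    zeroed[value_offset:value_offset + TAG_LEN] = bytes(TAG_LEN)
    return hmac.new(key, bytes(zeroed), hashlib.sha256).digest()


def sign_header(key: bytes, geneve_header: bytes, value_offset: int) -> bytes:
    """Sender side: return a copy of the header with the HMAC value filled in."""
    tag = compute_header_hmac(key, geneve_header, value_offset)
    signed = bytearray(geneve_header)
    signed[value_offset:value_offset + TAG_LEN] = tag
    return bytes(signed)


def verify_header(key: bytes, geneve_header: bytes, value_offset: int) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = compute_header_hmac(key, geneve_header, value_offset)
    received = geneve_header[value_offset:value_offset + TAG_LEN]
    return hmac.compare_digest(received, expected)
```

   Zeroing the value field before hashing lets the sender and
   the receiver run the same computation over the same bytes,
   and the constant-time comparison avoids leaking how many
   leading bytes of a forged code were correct.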

9.3. AH based Integrity and Authentication
   Enterprises or Cloud providers concerned about secret HMAC
   keys being compromised can add another layer of AH
   authentication [RFC4301] or ESP-NULL [RFC2410] [RFC6071] on
   top of the IPsec encryption between the two CPEs. Both AH
   and ESP-NULL require pairwise IPsec key management between
   Cloud GWs and the CPEs, and therefore require more
   processing on Cloud GWs and CPEs. In addition, AH-protected
   packets can't traverse NAT because the outer IP addresses
   change.

10. Manageability Considerations

     To be added.


11. IANA Considerations

   IANA is requested to assign a new GENEVE Option Class from
   the IETF Review range as shown below:

      Option
      Class   Description           Assignee/Contact  Reference
      ------  --------------------  ----------------  ---------------
      tbd     Multi Segment SD-WAN  IETF              [this document]


   IANA is requested to create a registry as below with the
   initial values shown in the Multi Segment SD-WAN Geneve
   Option Class registry group:



      Registry:  Multi Segment SD-WAN Sub-TLVs
      Assignment Policy:  IETF Review
      Reference:  [this document]

      Sub-TLV Type       Description             Reference
      ------------  ----------------------    ---------------
             0      Reserved
             1      SD-WAN Endpoint           [this document]
             2      SD-WAN Originator         [this document]
             3      SD-WAN Egress GW          [this document]
             4      Multi SD-WAN-HMAC         [this document]
         5-254      Unassigned
           255      Reserved
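
   For implementers, the initial registry contents above could
   be mirrored in code along the following lines. This is an
   illustrative Python sketch only; the enum and function names
   are hypothetical, and the numeric values are copied from the
   table above.

```python
from enum import IntEnum


class MultiSegSubTlvType(IntEnum):
    """Initial Multi Segment SD-WAN Sub-TLV types requested from IANA."""
    SDWAN_ENDPOINT = 1
    SDWAN_ORIGINATOR = 2
    SDWAN_EGRESS_GW = 3
    MULTI_SDWAN_HMAC = 4


RESERVED_TYPES = frozenset({0, 255})


def classify_subtlv_type(value: int) -> str:
    """Map a Sub-TLV type code to its name, 'reserved', or 'unassigned'."""
    if not 0 <= value <= 255:
        raise ValueError("Sub-TLV type must fall in the 0-255 registry range")
    if value in RESERVED_TYPES:
        return "reserved"
    try:
        return MultiSegSubTlvType(value).name
    except ValueError:
        # Codes 5-254 have no assignment in the initial registry.
        return "unassigned"
```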


12. References


12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed.,
             "A Border Gateway Protocol 4 (BGP-4)", RFC 4271,
             DOI 10.17487/RFC4271, January 2006,
             <https://www.rfc-editor.org/info/rfc4271>.

   [RFC4301] S. Kent and K. Seo, "Security Architecture for the
             Internet Protocol", RFC4301, Dec. 2005.

   [RFC4303] S. Kent, "IP Encapsulating Security Payload (ESP)",
             RFC4303, Dec. 2005.

   [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
             "Multiprotocol Extensions for BGP-4", RFC 4760, DOI
             10.17487/RFC4760, January 2007, <https://www.rfc-
             editor.org/info/rfc4760>.

   [RFC7296] C. Kaufman, et al, "Internet Key Exchange Protocol
             Version 2 (IKEv2)", RFC7296, Oct. 2014.




   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in
             RFC 2119 Key Words", BCP 14, RFC 8174, DOI
             10.17487/RFC8174, May 2017,
             <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8926] J. Gross, et al, "Geneve: Generic Network
             Virtualization Encapsulation", RFC8926, Nov 2020.

   [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J.
             Scudder, "The BGP Tunnel Encapsulation Attribute",
             RFC 9012, DOI 10.17487/RFC9012, April 2021,
             <https://www.rfc-editor.org/info/rfc9012>.


12.2. Informative References

   [RFC2410] R. Glenn and S. Kent, "The NULL Encryption
             Algorithm and Its Use With IPsec", RFC2410, Nov.
             1998.

   [RFC6071] S. Frankel and S. Krishnan, "IP Security (IPsec)
             and Internet Key Exchange (IKE) Document Roadmap",
             RFC6071, Feb. 2011.

   [RFC8192] S. Hares, et al, "Interface to Network Security
             Functions (I2NSF) Problem Statement and Use Cases",
             RFC8192, July 2017.

   [RFC5512] P. Mohapatra, E. Rosen, "The BGP Encapsulation
             Subsequent Address Family Identifier (SAFI) and the
             BGP Tunnel Encapsulation Attribute", RFC5512, April
             2009.

   [RFC9061] Marin-Lopez, R., Lopez-Millan, G., and F.
             Pereniguez-Garcia, "A YANG Data Model for IPsec
             Flow Protection Based on Software-Defined
             Networking (SDN)", RFC 9061, DOI 10.17487/RFC9061,
             July 2021, <https://www.rfc-
             editor.org/info/rfc9061>.






   [CONTROLLER-IKE] D. Carrel, et al, "IPsec Key Exchange using
             a Controller", draft-carrel-ipsecme-controller-ike-
             01, work-in-progress.

   [MEF-70.1] MEF 70.1 SD-WAN Service Attributes and Service
             Framework. Nov. 2021.

   [Net2Cloud] L. Dunbar and A. Malis, "Dynamic Networks to
             Hybrid Cloud DCs Problem Statement", draft-ietf-
             rtgwg-net2cloud-problem-statement-29, Aug, 2023.

   [SD-WAN-Edge-Discovery] L. Dunbar, et al, "BGP UPDATE for SD-
             WAN Edge Discovery", draft-ietf-idr-sdwan-edge-
             discovery-10, June 2023.

13. Acknowledgments

   Thanks to Donald Eastlake, Aseem Choudh, and Stephen Farrell
   for their reviews and suggestions.

   This document was prepared using 2-Word-v2.0.template.dot.
























Authors' Addresses


   Linda Dunbar
   Futurewei
   Email: ldunbar@futurewei.com

   Kausik Majumdar
   Microsoft
   Email: kmajumdar@microsoft.com


   Venkit Kasiviswanathan
   Arista
   Email: venkit@arista.com

   Ashok Ramchandra
   Microsoft
   Email: aramchandra@microsoft.com

Contributors' Addresses
























