Internet Engineering Task Force                            Working Group
Internet Draft                          P. Pan, E. Hahne, H. Schulzrinne
draft-pan-bgrp-framework-00.txt                     Bell Labs/Columbia U.
January 14, 2000
Expires: July, 2000


          BGRP: A Framework for Scalable Resource Reservation

STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.


Abstract

   Resource reservation needs to accommodate the rapidly growing size
   and increasing service diversity of the Internet. This memo first
   defines the scaling problem in today's Internet backbone, and briefly
   discusses several existing resource management approaches. Then we
   will present a distributed approach and introduce a protocol, called
   the Border Gateway Reservation Protocol (BGRP), for inter-domain
   resource reservation that can scale in terms of message processing
   load, state storage and control message bandwidth.

   The main idea of our approach is to build a sink tree for each domain
   network.  Each sink tree aggregates reservations from all data
   sources in the network. Sink tree initiation, maintenance and
   termination involve only backbone border routers. Within each domain,
   the network service providers manage network resource and direct user
   traffic independently. At the border routers, the service providers
   can use BGRP to setup domain-level reservation trunks base on bi-
   lateral agreement. Since routers only maintain the sink tree
   information, the total number of reservation states at each router



P. Pan, E. Hahne, H. Schulzrinne                              [Page 1]


Internet Draft               bgrp-framework             January 14, 2000


   scales, in the worst case, linearly with the number of domains in the
   Internet.

   For bandwidth reservation, BGRP relies on differentiated services for
   data forwarding. As a result, the number of packet classifier entries
   is small.  To reduce the protocol message traffic, routers may
   reserve domain bandwidth beyond the current load so that sources can
   join or leave the tree or change their reservation without having to
   send messages all the way to the root for every such change.



1 Introduction

   Resource reservation had originally been defined to support end-to-
   end QoS guarantees for a range of QoS-sensitive applications
   including multimedia-on-demand and teleconferencing. Recently,
   service providers have started to use the same reservation mechanisms
   to provide customer-level VPNs and to dynamically provision network
   resource. To support this, the RSVP [1] has been modified [2] to
   carry MPLS information [3,4] to setup LSP's (Label Switched Path)
   across the Internet. Similarly, it can also be used to set up optical
   crossconnects (OXCs) for optical devices [5].

   Hence, resource reservation schemes must scale well with the rapidly
   growing size of the Internet. A router may be able to handle tens of
   thousands of simultaneous reservations [6], but not hundreds of
   thousands, and certainly not millions. Today's traffic volume is bad
   enough:  as we will show in Table 1 of Sec. 2 below, we have measured
   hundreds of thousands to millions of flows at the MAE-West network
   access point; if many of these flows were to request resource
   reservations, the protocol overhead would swamp the router.  But
   projected future traffic growth is an even more serious problem.  The
   overhead of the current protocol RSVP [7,1] grows like N**2, where N
   is the number of Internet end hosts.  Data in [8] shows the growth of
   N over the last six years, from 2 million to 60 million. This means
   that N**2 grew from 4x(10**12) to 4x(10**15) during that time! With
   no end in sight, N**2 is growing much faster than improvements in
   processing speeds or memory sizes.

   Therefore, we will have to find a reservation scheme that scales
   better than conventional RSVP. In this paper we will propose a
   protocol, called the Border Gateway Reservation Protocol (BGRP), that
   fixes this scaling problem in two ways. First, BGRP overhead scales
   linearly with the size of the Internet; i.e., N**2 is reduced to N.
   Second, BGRP uses "a smaller N". The overhead of the basic BGRP
   protocol is proportional to the number of Internet carrier domains
   (also called Autonomous Systems (AS)), while an enhanced version of



P. Pan, E. Hahne, H. Schulzrinne                              [Page 2]


Internet Draft               bgrp-framework             January 14, 2000


   BGRP has overhead proportional to the number of IP networks (i.e.,
   the number of announced IP address prefixes).

2 Problem Definition

   What exactly is the resource reservation "scaling" problem?  Since
   resource reservation is not yet widely used, we have to extrapolate
   the likely volume of reservation state from other observations. To
   that end, we collected a 90-second traffic trace from the MAE-West
   network access point (NAP). We categorized about 3 million IP packet
   headers [9] from June 1, 1999, according to their transport-layer
   port, IP address, IP network prefix and BGP Autonomous System (AS)
   number. Table 1 shows the results; for example, if we use RSVP, the
   total number of reservations can range from 20,857 if we reserve
   source-destination AS pairs up to 339,245 if every flow identified by
   a unique 5-tuple gets its own reservation.


            Granularity  flow discriminators            flows
            __________________________________________________
            Application  source address, port         143,243
                         dest. address, port, proto.  208,559
                         5-tuple                      339,245
            __________________________________________________
            IP Host      source address                56,935
                         dest. address                 40,538
                         source-dest. pairs           131,009
            __________________________________________________
            Network      source network                13,917
                         dest. network                 20,887
                         source-dest. pairs            79,786
            __________________________________________________
            AS           source AS                      2,244
                         dest. AS                       2,891
                         source-dest. pairs            20,857


   Table 1: Flows and aggregations based on a 90-sec packet trace from
   MAE-West


   Can network routers handle hundreds of thousands of reservations?
   After all, telephone switches handle tens of thousands of
   simultaneous calls.  In a recent study [6], we showed that a low-end
   router can set up 900 new RSVP reservations or maintain up to 1600
   reservations per second, allowing it to sustain about 45,000 flows.
   To sustain that rate, the router has to suspend routing computation
   and packet forwarding due to its hardware and CPU constraints. While



P. Pan, E. Hahne, H. Schulzrinne                              [Page 3]


Internet Draft               bgrp-framework             January 14, 2000


   backbone routers have more CPU power than the low-end router used for
   the measurements, other results [10] indicate that frequent routing
   computation due to route instability may already tax the CPU. Thus,
   we, along with some RSVP developers we talked to, believe that in
   many networking environments, routers do not have enough CPU power to
   sustain hundreds of thousands of reservations.  Developers from
   several high-end router vendors have acknowledged that RSVP
   processing could consume between 30% to 50% of router CPU cycles.

   From Table 1, we observe that there are about 21,000 unique source-
   destination AS pairs which a backbone router should be able to
   handle. However, this number is artificially low due to the small
   90-second window. Over the span of a month (May 1999), MAE-West saw
   4,908 unique source ASes, 5,001 unique destination ASes and 7,900,362
   unique AS pairs, out of the 25 million possible combinations. The
   measurement result is consistent with Bates's statistics [11]. Thus,
   unless edge routers tear down AS-level trunks frequently, there may
   be too many AS-level reservations to sustain in backbone routers.

   The table also indicates that the number of source and destination
   AS's and networks is relatively small. Bates' statistics[11] show
   that there were approximately 5,000 autonomous systems and fewer than
   60,000 network prefixes in the Internet in June 1999. Hence, if we
   set up reservations based on either source or destination AS's or
   network prefixes, we can readily keep the reservation count at levels
   sustainable by today's routers.

   Today's inter-domain routing protocol, BGP [12], establishes "virtual
   edges" by using reachability as a definition for the existence of a
   link in the graph. In case of paths of equal weight, current practice
   dictates that the router forward all packets over only one path. This
   practice guarantees that routing always follow sink trees. Hence, if
   reservations are make along routes chosen by the BGP routing
   algorithms, it is natural to aggregate these reservations along the
   sink trees formed by routing. We will discuss a sink-tree reservation
   approach in detail in Section 6.

2.1 What are we reserving?

   It has been argued that since network link bandwidth is finite, it is
   unlikely that a link would see thousands of reservations.  On the one
   hand, if we assume that a voice call is the finest granularity of
   reservation, an OC-192 link carrying 16 kb/s voice flows could
   support up to 600,000 such calls. On the other hand, our traces
   indicate that the MAE-West link carries only several hundred high-
   volume flows[13]. The latter seems to lead to a conclusion that
   reserving resources for high-volume flows does not pose scaling
   problems.



P. Pan, E. Hahne, H. Schulzrinne                              [Page 4]


Internet Draft               bgrp-framework             January 14, 2000


   However, with the deployment of IP telephony, the number of real-time
   traffic flows is likely to increase. At the same time, network
   bandwidth will likely rapidly increase due to the deployment of
   optical switches into the Internet backbone. Both factors may result
   to a large number of bandwidth reservations in the network.

   More importantly, some of the recent IETF activities on MPLS and
   Traffic Engineering[14] have suggested of using the reservation
   signaling protocols to set up MPLS LSP (Label Switched Path) tunnels
   and to provide service differentiation among users. MPLS LSP's have
   very similar characteristics as ATM VP's.  Each LSP connects two
   network nodes and the connection in turn carries datagram traffic. We
   envision that it is quite possible to have many LSP's on a given link
   in the backbone.

   In conclusion, we believe that reservation signaling protocols must
   be able to carry control information for QoS (such as bandwidth)
   reservation and for MPLS label setup.

3 Related Solutions

   Recently, several authors have addressed scalable resource
   reservation, using either a server-based or a router-based approach.

   In server-based approaches, each domain has a bandwidth broker (or
   agent) which is responsible for selecting and setting up the
   aggregated reservation sessions. This approach has the advantage of
   removing the message processing and storage burden from routers.
   However, synchronizing reservation information among the bandwidth
   brokers and the routers may be complex. No aggregation takes place,
   so that each broker still has to deal with the requests of individual
   flows. Also, care has to be taken so that the broker does not become
   a single point of failure for the domain. Variations of the server-
   based approach have been described by Blake et al. [15], Schelen and
   Pink [16], Berson et al. [17], and Reichmeyer et al. [18]. The
   latter proposal suggests a two-tier system where, within each domain,
   hosts use intra-domain reservation protocols such as RSVP to set up
   reserved flows. Inter-domain reservation protocols set up coarsely-
   measured reserved flows between domains. However, the proposal leaves
   the actual mechanism undefined.

   Awduche et al. [2], Guerin et al. [19] and Baker et al. [20] have
   proposed a router-based approach by modifying RSVP to support
   scalable reservation.  (Awduche's LSP tunnels [2] are designed to
   support intra -domain traffic engineering, but may also      be used to
   set up trunks crossing multiple domains.) These proposals aggregate
   per-application reservation requests into reservation "trunks"
   between pairs of domains, by modifying sender template and session



P. Pan, E. Hahne, H. Schulzrinne                              [Page 5]


Internet Draft               bgrp-framework             January 14, 2000


   objects in RSVP to carry address and mask ("CIDR blocks") or
   autonomous system (AS) numbers instead of 5-tuples (sender IP
   address, sender port, receiver IP address, receiver port, protocol).
   However, this implies that routers in the backbone may have to
   maintain reservation state proportional to the square of the number
   of CIDR blocks or autonomous systems. Since the number of AS is
   currently about 6,000, the number of AS pairs exceeds 36,000,000.  As
   we argued in Sections 2, this is excessive.  A more aggregated
   reservation scheme is needed.

   Finally, Feher et al. [21] have recently proposed a stateless
   reservation mechanism called Boomerang, where end users send
   reservation requests and refresh messages to set up and maintain
   reservations. No per-flow state is stored at routers.  However, the
   scalability of the control message processing is an issue.

4 Terminology

   We define the following terms in the memo.

        Domain: The term "domain" has the same meaning as the one being
             used in BGP [12]: each domain has an unique AS (autonomous
             system) number, and can exchange user traffic with its
             peers.  Each domain is under a single common
             administration.

        Border Router (BR): A domain connects to a number of other
             domains via border routers. We assume all border routers
             use BGP4 for inter-domain routing.  For simplicity reason,
             we only consider the EBGP (External-BGP) border routers at
             present time.

        Downstream and upstream: We define the directional terms
             "upstream" and "downstream" with respect to the direction
             of data flow.  The traffic direction from source to
             destination is "downstream"; destination to source,
             "upstream".

        Reservation sender and receiver: A reservation sender is an
             upstream border router that originates reservation
             messages.  A reservation receiver is a downstream border
             router that terminates reservations.

        Reservation aggregation: Reservation aggregation occurs if
             multiple reservations coming from different sources but
             going toward the same destination can be "added" together
             to create a single reservation.




P. Pan, E. Hahne, H. Schulzrinne                              [Page 6]


Internet Draft               bgrp-framework             January 14, 2000


                                      _______________
                            B     (a)|               | (c)
                   ( S1 ) ---------->| >             |
                                     |    >          |
                                     |      >        |
                            2B    (b)|        >      | (d)   3B
                ( S2,S3 ) ---------->| >-------->--> |--------------> ( D1 )
                                     |_______________|


                      Figure 1: Bandwidth reservation aggregation.




             Figure 1 illustrates the concept of bandwidth reservation
             aggregation.  (a), (b), (c) and (d) are the interfaces for
             a router.  Reservations from data source S1, S2 and S3 are
             all going to destination D1.  In the example, each source
             needs (B) units of bandwidth.  At (d), all individual
             reservations are aggregated together.  The amount of
             reservation at (d) is (3*B) as a result.

5 Requirements and Assumptions

        Message processing: The cost of processing reservation messages
             depends on the complexity of handling each message and the
             frequency of reservation messages. Instead of setting up
             reservations across domains as application flows arrive, we
             rely on pre-computed reservations made in advance. This
             also implies that each reservation would last for
             relatively long period of time (typically hours or days).

        State storage overhead: Routers need to store reservation
             control information and packet classifier tables. To reduce
             the former, we propose a scheme for aggregating reservation
             flows. To reduce the latter, we rely on diff-serv [15] to
             eliminate per-flow queuing and processing, so that the
             number of queues is likely to be no larger than a few
             dozen.

        Bandwidth overhead: The bandwidth consumed by setting up
             reservations should be small compared to the link
             bandwidth, both in steady state and with routing
             transients. This bandwidth overhead is typically
             proportional to the number of flows kept in routers, and
             thus minimizing the state storage overhead also helps here.




P. Pan, E. Hahne, H. Schulzrinne                              [Page 7]


Internet Draft               bgrp-framework             January 14, 2000


        Intra-domain vs. inter-domain reservation: It is desirable that
             each domain can manage its own network resources and
             enforce its own internal traffic engineering policies. This
             implies that a domain only reveals simple delivery
             commitments to its peering domains in terms of bi-lateral
             agreement. The inter-domain reservation then uses these
             delivery commitments to establish a reservation path
             through multiple domains. Each domain sets up transit
             reservation flows using its preferred intra-domain
             reservation mechanism.

        Routing interface: Reservation protocols do not do routing and
             only rely on the routing information to set up reservations
             along the data forwarding path. More importantly,
             reservation protocols must not interfere with route
             aggregation and effect routing protocol scaling properties.

6 Architecture

   The Internet backbone consists of a number of domains, each of which
   has at least one border router (BR). Through BGP4, each BR learns
   about the other BR's within its own domain, and the directly
   connected BR's in the adjacent domains. Through out-of-band means,
   the BR's know of bi-lateral (or multi-lateral) agreements with the
   peering domains.  Figure-2 illustrates a network structure.

   Typically, the bi-lateral agreement specifies the inter-carrier
   policy information such as route filtering and route preference [22].
   In the future, we envision that the bi-lateral agreement may also
   include policies for QoS guarantees between peering domains. It's
   worth noting here that the bi-lateral agreement applies only between
   two adjacent domains, and users do not always have the knowledge or
   the guarantees on which downstream domains their traffic will
   traverse through.

   As being advocated in the IETF Traffic Engineering Working Group, the
   ISP's set up border-to-border (or edge-to-edge) intra-domain
   "virtual" trunks between border routers. At each NAP (Network Access
   Point) or POP (Point-of-Presence), the ISP's set up similar "virtual"
   trunks to interconnect with external domains. The goal here is to
   optimize the use network resource and traffic performance. Example of
   "virtual" trunks, as represented by the lines between BR's in
   Figure-2, includes Frame Relay, ATM, MPLS and DiffServ.








P. Pan, E. Hahne, H. Schulzrinne                              [Page 8]


Internet Draft               bgrp-framework             January 14, 2000


         Domain-1                     Domain-2
    +---------------+            +---------------+
    |               |            |               |
    |  Net1   Net2  |            |  Net3   Net4  |
    |    \     /    |            |    \     /    |
    |     \   /     |            |     \   /     |
    |     BR-A      |            |     BR-B      |
    +---------------+            +---------------+
             \\                         //
              \\                       //
               \\      Domain-3       //
            +-----------------------------+
            |  BR-C                 BR-D  |
            |   | \\                 ||   |
            |   |  \\                ||   |          Domain-4
            |   |   \\=========\\    ||   |     +---------------+
            |   |               \\   ||   |     |               |
            |   |                \\  ||   |     |      Net5     |
            |   |                 \\ ||   |     |       |       |
            |   |                  \\||   |     |       |       |
            |  BR-E                 BR-F  |     |      BR-G     |
            +-----------------------------+     +---------------+
                |                   ||                 //
                |                   ||===||  ||=======//
                |                        ||  ||
           +----------+                  ||  ||
           |   BR-H   |                  ||  ||
           |     |    |           +------------------+
           |     |    |           |       BR-K       |
           |     |    |           |         |        |
           |     BR-I |-----------| BR-J ---Net6     |
           +----------+           |                  |
             Domain-5             +------------------+
                                        Domain-6


      Figure-2: An inter-domain network example.




6.1 Sink Tree Model

   Current BGP practice guarantees that routing paths form sink trees.
   In the example, the border router BR-K advertises the Net6
   reachability to the rest of the network. Let's assume that per
   routing policy, BR-C always favors the routes advertised by BR-F.
   Thus all data traffic going to Net6 forms a sink tree with branches



P. Pan, E. Hahne, H. Schulzrinne                              [Page 9]


Internet Draft               bgrp-framework             January 14, 2000


   (BR-A, BR-C, BR-F, BR-K), (BR-B, BR-D, BR-F, BR-K), and (BR-G, BR-K).
   The sink is BR-K.

   To create inter-domain reservations, one obvious choice is to build
   reservations along the sink trees. This approach has a very desirable
   scaling property: the routers only need to remember the reservation
   "sinks".  We can further improve the scalability by allowing only the
   BGP border routers to participate in the reservation process.

6.2 How to Create Sink Trees?

   There are many ways to establish reservation sink trees inside the
   backbone.

   One solution is the following: reservation senders can always query a
   central database somewhere to get the precise routing path prior to
   each reservation. A similar idea has been introduced in the RSVP LSP
   extension[2], in which an Explict-Route Object (ERO) is computed or
   queried by the reservation senders before each request. In addition,
   we had proposed a router-server tunnel data exchange mechanism using
   COPS [23].  This is a reasonable solution for intra-domain resource
   management, but it may not be applicable in inter-domain environment
   since the centralized database can be very difficult and costly to
   maintain and can have security implications.

   Another solution is to allow the routing protocols, in this case,
   BGP4, to setup sink-trees at route advertising and RIB (Route-
   Information-Base) processing time. However this approach may effect
   route aggregation and cause routing scalability problems.

   Here we propose a two-phase distributed reservation solution.  The
   reservation senders send querying messages to the network.  The
   queries will follow the data path and are delivered between BGP hops.
   At each BGP-hop, a routing path is selected based on the bi-lateral
   agreement.  However, the routers do not pin down the reservation path
   and do not store the query data.

   The reservation receivers keep track of all the queries, and
   construct sink-tree graphs from the information. The receivers send
   reservation request messages upstream to set up the actual
   reservations. At each border router, the router aggregates the
   reservations for each sink tree.

   In the next section, we will describe a protocol, BGRP, based on the
   above ideas.

7 BGRP: Border Gateway Reservation Protocol




P. Pan, E. Hahne, H. Schulzrinne                             [Page 10]


Internet Draft               bgrp-framework             January 14, 2000


   The Border Gateway Reservation Protocol (BGRP) is an inter-Autonomous
   System reservation protocol.  It is to set up aggregated reservations
   over multiple Autonomous Systems (AS). The reservation originators
   and the terminators are at the BGP-speaking border routers.
   Reservation aggregation is at the AS level. Only BGP-speaking routers
   in the network participate in reservation process.

   All traffic destining to a particular AS can be aggregated in a
   sink-tree fashion. Since backbone routers only maintain the sink tree
   information, the total number of reservations at each router scales,
   in the worst case, linearly with the number of AS's in the Internet.

   BGRP runs over TCP and thus eliminates the need to implement
   fragmentation, retransmission and sequencing.  BGRP uses "soft-state"
   to manage reservations, i.e., periodic refresh, to protect against
   events such as link failure.

   BGRP is not involved in setting up and managing intra-domain
   reservations. We envision that the ISP's may use RSVP Traffic
   Engineering extension, LDP [24] or other means to manage internal
   network resource.  BGRP only becomes useful when reservations need to
   be established across multiple AS's.

   The current version of BGRP does not provide support for multicast
   traffic.  This is because, first, the majority of the Internet
   traffic is unicast.  It is not likely to change in the near future.
   Second, most of the multicast traffic are carried in unicast
   "tunnels", where BGRP should treat such traffic transparently.

   BGRP requires two independent phases to set up a reservation:  path
   discovery and reservation aggregation. We describe these phases
   below.

7.1 Path Discovery

   The reservation senders send PROBE messages to discover the
   reservation path. The PROBE messages traverse BR's hop-by-hop until
   reaching the destination domain.  Each BGP router in the transit AS's
   must insert the associated AS number (and its own IP address) in the
   PROBE messages, but doesn't need to keep any state information.

   PROBE messages must travel the path that is used for actual
   reservation.  The PROBE message forwarding decision is made base on
   bi-lateral agreement between ISP's and some policy constraints at
   each BGP hop. The BGP hop can be found from the BGP Next-Hop
   attributes at each border.

   The border routers send rejection messages back to the senders if the



P. Pan, E. Hahne, H. Schulzrinne                             [Page 11]


Internet Draft               bgrp-framework             January 14, 2000


   destination is unreachable or reservation loops are detected.

7.2 Reservation Aggregation

   A BR at the destination domain may receive PROBE messages from
   multiple reservation sources. It uses the information carried in the
   PROBE messages to construct an AS-level graph from which a loop-free
   sink tree is formed.

   The BR sends reservation request (GRAFT) messages upstream toward the
   reservation sender. GRAFT messages traverse exactly the reversed list
   of BR's in the PROBE messages. Upon reception, transit routers can
   interface with intra-domain traffic-engineering protocols (such as
   RSVP, RSVP-LSP, or LDP) to set up a reservation within their AS's.

   The transit BR's aggregate the reservations going to the same
   reservation receiver.  The operation of reservation aggregation is
   illustrated in Section 4.

7.3 Reservation Management

   Each BGRP-enabled router periodically sends REFRESH messages to its
   adjacent BGRP routers. If a particular reservation has not been
   refreshed within a period of time, it will be deleted and the
   associated resource will be freed. Each REFRESH message must contains
   a list of all reservation states at a BR. Each reservation state is
   compressed.  A similar mechanism has been proposed for RSVP [25].

   In case of route changes, BGP must up-call BGRP to re-adjust
   reservations.  To reduce the effect of routing flapping, it may
   require some dampening mechanism to be applied on BGRP routers during
   reservation adjustment.

7.4 BGRP Enhancements

   BGRP sends at least one PROBE message and one GRAFT message between
   leaf and root for each new reservation. Since these messages consume
   processing CPU and bandwidth, one would like to reducing the control
   message volume, and thereby add another dimension of scalability to
   BGRP.  This can be done by making the following enhancements to the
   protocol.

        CIDR labeling: Sink trees can be labeled with the CIDR prefixes
             associated with the tree root. An advancing PROBE message
             can recognize when it has reached the reservation tree it
             would like to join, and no need to propagate further.

        Over-reservation: Sink tree nodes can reserve more bandwidth



P. Pan, E. Hahne, H. Schulzrinne                             [Page 12]


Internet Draft               bgrp-framework             January 14, 2000


             between themselves and the tree root than is currently
             required. (One can think of this as aggregated advance
             reservations on behalf of unknown parties.) Each router
             keeps track of the actual reserved resources and the
             reservation it made downstream. It only forwards the PROBE,
             with a step increase in the reservation, if the actual
             committed resources reach the higher reservation level.

8 Relationship with Other Protocols

8.1 Comparing BGRP with RSVP

   BGRPs approach is similar to RSVP, with BGRP's PROBE and GRAFT
   messages playing similar roles to RSVP's PATH and RESV messages.
   However, there are some of important differences between the two
   protocols:

        RSVP PATH vs. BGRP PROBE: RSVP's PATH message primarily installs
             routing state at intermediate routers to guide RESV
             messages to the data senders. Routers must therefore keep
             both reservation sender and destination information. In a
             network with N nodes, this may require O(N2) entries. In
             BGRP, routers only store the reservation information for
             the O(N) sinks, but no source information.

        Reservation aggregation: RSVP offers per-source and shared
             reservation styles. In the latter, multiple multicast
             senders take turns sharing a single reservation [26]. BGRP
             aggregates reservations by adding them together and
             propagating them downstream.

8.2 Interface with MPLS

   RSVP and LDP are two protocols being designed to distribute MPLS
   labels and thus provide intra-domain level traffic engineering.  BGRP
   interfaces with them at network border to trigger the establishment
   of MPLS LSP's.

8.3 Interface with End Users

   We assume the ISP's set up multi-domain reservations in advance.  End
   users use application-driven reservation protocols such as YESSIR [6]
   and RSVP [1] to request resource from the network. As the end-to-end
   reservation requests are received at the network border, the border
   routers simply aggregate the requests into the pre-computed trunks.

9 Example BGRP Scenarios




P. Pan, E. Hahne, H. Schulzrinne                             [Page 13]


Internet Draft               bgrp-framework             January 14, 2000


   In this section, we illustrate the operation of BGRP with the network
   configuration in Figure-2.

9.1 Path Discovery

   Initially, Net1 needs to reserve a path to Net6. The reservation can
   be characterized in terms of bandwidth, or a label path. We denote
   the reservation as R16. When the network gateway BR-A has been
   notified, it sends out a PROBE message with the following
   information:


         Reservation Source:  BR-A
         Network Destination: Net6
         Reservation Amount:  R16



   The PROBE message is encapsulated in TCP and delivered to an adjacent
   BGP next-hop router, BR-C.

   Upon receiving the PROBE message, BR-C checks with the routing
   database and the bi-lateral agreements, and determines that the
   reservation should go through one of the BGP Next-Hops, BR-F. BR-C
   will insert its IP address into the PROBE and forward the PROBE to
   BR-F via TCP.

   In turn, BR-F examines the PROBE against own resource database,
   insert its IP address, and forwards the PROBE to its BGP Next-Hop,
   BR-K.

   At BR-K, the PROBE message looks like the following:


         Reservation Source:  BR-A
         Network Destination: Net6
         Reservation Amount:  R16
         Route Record:        {BR-C, BR-F}



   After checking with own RIB (Routing Information Base), BR-I finds
   out that Net6 is an advertised IGP route, thus terminate the probing
   process.

9.2 Reservation Aggregation

   BR-K can finish up the reservation process by sending a GRAFT message



P. Pan, E. Hahne, H. Schulzrinne                             [Page 14]


Internet Draft               bgrp-framework             January 14, 2000


   upstream.  The GRAFT message contains the following information.


         Sink-Tree ID:        BR-K
         Reservation Amount:  R16
         Reservation Path:    {BR-F, BR-C, BR-A}



   Before sending the GRAFT message, BR-K needs to reserve resource on
   the link to BR-F. The Reservation Path in GRAFT describes the
   explicit reservation path through the backbone. The GRAFT message is
   sent to the first reservation hop, BR-F.

   On receiving the GRAFT message, BR-F first sets up an intra-domain
   reservation trunk between BR-F and the next reservation hop, BR-C.
   The ISP for Domain-3 can use any of the existing MPLS and resource
   reservation signaling protocols to setup such a trunk. After the
   trunk has been established, BR-F forwards the GRAFT to BR-C and
   stores the following information:



      Tree-ID      Total Resv    Next-Hop    Previous-Hop   Branch Resv
     ---------------------------------------------------------------------
       BR-K           R16          BR-K         BR-C            R16




   Similarly, BR-C sets up a reservation to BR-A. After getting the
   GRAFT message, BR-A terminates the reservation and direct all Net1 to
   Net6 user traffic through the new inter-domain reservation trunk.

   Here let's assume that BR-B had probed the network for a reservation,
   R36, to Net6 on behalf of Net3. The final probed path is (BR-K, BR-F,
   BR-D, BF-B).  BR-K sends a GRAFT message with the following
   information:


         Sink-Tree ID:        BR-K
         Reservation Amount:  R36
         Reservation Path:    {BR-F, BR-D, BR-B}



   Since this reservation shares the same sink-tree id as the previously
   described reservation, the reservations will aggregate at BR-F.  The



P. Pan, E. Hahne, H. Schulzrinne                             [Page 15]


Internet Draft               bgrp-framework             January 14, 2000


   end result is stored at the router, BR-F:



      Tree-ID      Total Resv    Next-Hop    Previous-Hop   Branch Resv
     ---------------------------------------------------------------------
       BR-K        [R16 + R36]     BR-K         BR-C            R16
                                                BR-D            R36




   The example shows the routers need to only store sink-tree
   information. Given that the average AS length in the Internet is
   somewhere between 4 and 5 [27], the storage and process gain
   introduced here can be very significant [13].

9.3 State Maintenance

   BGRP uses "soft-state" to manage reservations. BGRP routers must send
   refresh messages to next-hop and previous hop routers periodically.
   When no refresh messages are received from a peer for a period of
   time, the BGRP router will delete the reserved resource.

   In the example, if the BR-C to BR-F is broken, BR-C will stop sending
   refreshes to BR-F. Some time later, BR-F will adjust the reservation
   and update the local router information:



      Tree-ID      Total Resv    Next-Hop    Previous-Hop   Branch Resv
     ---------------------------------------------------------------------
       BR-K          R36           BR-K         BR-D            R36




   BR-F will propagate the reduced reservation information downstream in
   its next refresh cycle.

10 Security Considerations

   In the BGRP model, we always assume some level of trust between BGRP
   routers. The reservation information is delivered domain by domain.
   Without proper authentication, this will enable denial of service
   attacks. Integrity information is required for each BGRP message.

11 Acknowledgments



P. Pan, E. Hahne, H. Schulzrinne                             [Page 16]


Internet Draft               bgrp-framework             January 14, 2000


   Sean McCreary of NLANR/CAIDA helped us to collect and analyze the AS
   traces.  Craig Labovitz advised us on various aspects of network
   operations that are relevant to BGRP. Tony Przygienda and Rohit Dube
   commented on the interaction between BGP and routing aggregation.  We
   also thank Fred Baker, George Swallow, Roch Guerin, Geoff Huston and
   Andreas Terzis for discussions leading to this work.

12 Bibliography

   [1] R. Braden, Ed., L. Zhang, S. Berson, S. Herzog, and S. Jamin,
   "Resource ReSerVation protocol (RSVP) -- version 1 functional
   specification," Request for Comments (Proposed Standard) 2205,
   Internet Engineering Task Force, Sept. 1997.

   [2] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V.
   Srinivasan, "Extensions to RSVP for LSP tunnels," Internet Draft,
   Internet Engineering Task Force, Mar. 1999.  Work in progress.

   [3] R. Callon, N. Feldman, A. Fredette, G. Swallow, and A.
   Viswanathan, "A framework for multiprotocol label switching,"
   Internet Draft, Internet Engineering Task Force, June 1999.  Work in
   progress.

   [4] R. Callon, A. Viswanathan, and E. Rosen, "Multiprotocol label
   switching architecture," Internet Draft, Internet Engineering Task
   Force, Apr. 1999.  Work in progress.

   [5] D. Awduche, Y. Rekhter, J. Drake, and R. Coltun, "Multi-protocol
   lambda switching: Combining MPLS traffic engineering control with
   optical crossconnects," Internet Draft, Internet Engineering Task
   Force, Nov.  1999.  Work in progress.

   [6] P. Pan and H. Schulzrinne, "YESSIR: a simple reservation
   mechanism for the Internet," ACM Computer Communication Review, vol.
   29, pp. 89--101, Apr. 1999.

   [7] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala,
   "RSVP: a new resource reservation protocol," in Proceedings of the
   International Networking Conference  (INET) , (San Francisco,
   California), pp. BCB--1, Internet Society, Aug. 1993.

   [8] Internet Software Consortium, "Internet domain survey."
   http://www.isc.org/ds/.

   [9] NLANR, "Nlanr network traffic packet header traces."
   http://moat.nlanr.net/Traces/.

   [10] C. Labovitz, G. R. Malan, and F. Jahanian, "Internet routing



P. Pan, E. Hahne, H. Schulzrinne                             [Page 17]


Internet Draft               bgrp-framework             January 14, 2000


   instability," in SIGCOMM Symposium on Communications Architectures
   and Protocols , (Cannes,     France), Sept. 1997.

   [11] T. Bates, "The cidr report."
   http://www.employees.org/~tbates/cidr-report.html.

   [12] T. Li and Y. Rekhter, "A border gateway protocol 4 (BGP-4),"
   Internet Draft, Internet Engineering Task Force, Aug. 1998.  Work in
   progress.

   [13] P. Pan, E. Hahne, and H. Schulzrinne, "BGRP: A tree-based
   aggregation protocol for inter-domain reservations," Technical Report
   CUCS-029-99, Columbia University, New York, New York, Dec. 1999.
   http://www.cs.columbia.edu/~library/
   TR-repository/reports/reports-1999/cucs-029-99.pdf.

   [14] J. Agogbua, D. Awduche, J. Malcolm, J. McManus, and M. O'Dell,
   "Requirements for traffic engineering over MPLS," Internet Draft,
   Internet Engineering Task Force, June 1999.  Work in progress.

   [15] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W.
   Weiss, "An architecture for differentiated service," Request for
   Comments (Informational) 2475, Internet Engineering Task Force, Dec.
   1998.

   [16] O. Schelen and S. Pink, "Aggregating resource reservation over
   multiple routing domains," in Proc. of       Fifth IFIP International
   Workshop on Quality  of Service (IwQOS) , (Cambridge, England), June
   1998.

   [17] S. Berson and S. Vincent, "Aggregation of internet integrated
   services state," in submitted to IWQOS , 1998.

   [18] F. Reichmeyer, L. Ong, A. Terzis, L. Zhang, and R. Yavatkar, "A
   two-tier resource management model for differentiated services
   networks," Internet Draft, Internet Engineering Task Force, Nov.
   1998.  Work in progress.

   [19] R. Guerin, S. Herzog, and S. Blake, "Aggregating RSVP-based QoS
   requests," Internet Draft, Internet Engineering Task Force, Nov.
   1997.  Work in progress.

   [20] F. Baker, "Aggregation of RSVP for IP4 and IP6 reservations,"
   Internet Draft, Internet Engineering Task Force, June 1999.  Work in
   progress.

   [21] G. Feher, K. Nemeth, M. Maliosz, I. Cselenyi, J. Bergkvist, D.
   Ahlard, and T. Engborg, "Boomerang - a simple protocol for resource



P. Pan, E. Hahne, H. Schulzrinne                             [Page 18]


Internet Draft               bgrp-framework             January 14, 2000


   reservation in ip networks," in IEEE Workshop on QoS Support for
   Real-Time Internet Applications , (Vancouver, Canada), June 1999.

   [22] Sprint, "Sprintlink policies."
   http://www.sprintlink.net/policies.htm.

   [23] B. Suter and P. Pan, "COPS extension for intra-domain traffic
   engineering," Internet Draft, Internet Engineering Task Force, June
   1999.  Work in progress.

   [24] N. Feldman, P. Doolan, L. Andersson, and A. Fredette, "LDP
   specification," Internet Draft, Internet Engineering Task Force, Dec.
   1997.  Work in progress.

   [25] L. Berger, D. Gan, G. Swallow, and P. Pan, "RSVP refresh
   reduction extensions," Internet Draft, Internet Engineering Task
   Force, July 1999.  Work in progress.

   [26] D. J. Mitzel and S. Shenker, "Asymptotic resource consumption in
   multicast reservation styles," in SIGCOMM Symposium on Communications
   Architectures and Protocols  , (London, UK), pp. 226--233, Sept. 1994.

   [27] NLANR, "Nlanr as path lengths." http://moat.nlanr.net/ASPL/.

13 Authors



      Ping Pan
      Bell Labs, Lucent
      101 Crawfords Corner Road
      Holmdel, NJ 07733
      USA
      electronic mail: pingpan@research.bell-labs.com

      Ellen Hahne
      Bell Labs, Lucent
      600 Mountain Ave.
      Murray Hill, NJ 07974
      USA
      electronic mail: hahne@bell-labs.com

      Henning Schulzrinne
      Dept. of Computer Science
      Columbia University
      1214 Amsterdam Avenue
      New York, NY 10027
      USA



P. Pan, E. Hahne, H. Schulzrinne                             [Page 19]


Internet Draft               bgrp-framework             January 14, 2000


      electronic mail: schulzrinne@cs.columbia.edu







                           Table        of Contents



   1          Introduction ........................................    2
   2          Problem Definition ..................................    3
   2.1        What are we reserving?  .............................    4
   3          Related Solutions ...................................    5
   4          Terminology .........................................    6
   5          Requirements and Assumptions ........................    7
   6          Architecture ........................................    8
   6.1        Sink Tree Model .....................................    9
   6.2        How to Create Sink Trees?  ..........................   10
   7          BGRP: Border Gateway Reservation Protocol ...........   10
   7.1        Path Discovery ......................................   11
   7.2        Reservation Aggregation .............................   12
   7.3        Reservation Management ..............................   12
   7.4        BGRP Enhancements ...................................   12
   8          Relationship with Other Protocols ...................   13
   8.1        Comparing BGRP with RSVP ............................   13
   8.2        Interface with MPLS .................................   13
   8.3        Interface with End Users ............................   13
   9          Example BGRP Scenarios ..............................   13
   9.1        Path Discovery ......................................   14
   9.2        Reservation Aggregation .............................   14
   9.3        State Maintenance ...................................   16
   10         Security Considerations .............................   16
   11         Acknowledgments .....................................   16
   12         Bibliography ........................................   17
   13         Authors .............................................   19

H:.tex











P. Pan, E. Hahne, H. Schulzrinne                             [Page 20]