SIPPING Working Group                                            V. Hilt
Internet-Draft                                                I. Widjaja
Expires: September 4, 2007                      Bell Labs/Alcatel-Lucent
                                                                D. Malas
                                                  Level 3 Communications
                                                          H. Schulzrinne
                                                     Columbia University
                                                           March 3, 2007


           Session Initiation Protocol (SIP) Overload Control
                     draft-hilt-sipping-overload-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 4, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when
   SIP servers have insufficient resources to handle all SIP messages
   they receive.  Even though the SIP protocol provides a limited
   overload control mechanism through its 503 response code, SIP servers



Hilt, et al.            Expires September 4, 2007               [Page 1]


Internet-Draft              Overload Control                  March 2007


   are still vulnerable to overload.  This document proposes several new
   overload control mechanisms for the SIP protocol.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Design Considerations  . . . . . . . . . . . . . . . . . . . .  4
     3.1.  System Model . . . . . . . . . . . . . . . . . . . . . . .  4
     3.2.  Hop-by-Hop vs. End-to-End  . . . . . . . . . . . . . . . .  5
     3.3.  Topologies . . . . . . . . . . . . . . . . . . . . . . . .  7
     3.4.  Overload Control Method  . . . . . . . . . . . . . . . . .  9
       3.4.1.  Rate-based Overload Control  . . . . . . . . . . . . .  9
       3.4.2.  Loss-based Overload Control  . . . . . . . . . . . . . 10
       3.4.3.  Window-based Overload Control  . . . . . . . . . . . . 10
     3.5.  Overload Control Algorithms  . . . . . . . . . . . . . . . 11
       3.5.1.  Increase Algorithm . . . . . . . . . . . . . . . . . . 12
       3.5.2.  Decrease Algorithm . . . . . . . . . . . . . . . . . . 12
     3.6.  Load Status  . . . . . . . . . . . . . . . . . . . . . . . 12
     3.7.  SIP Mechanism  . . . . . . . . . . . . . . . . . . . . . . 13
     3.8.  Backwards Compatibility  . . . . . . . . . . . . . . . . . 13
     3.9.  Interaction with Local Overload Control  . . . . . . . . . 14
   4.  SIP Application Considerations . . . . . . . . . . . . . . . . 14
     4.1.  How to Calculate Load Levels . . . . . . . . . . . . . . . 14
     4.2.  Responding to an Overload Indication . . . . . . . . . . . 15
     4.3.  Emergency Services Requests  . . . . . . . . . . . . . . . 15
     4.4.  Operations and Management  . . . . . . . . . . . . . . . . 16
   5.  SIP Load Header Field  . . . . . . . . . . . . . . . . . . . . 16
     5.1.  Generating the Load Header . . . . . . . . . . . . . . . . 16
     5.2.  Determining the Load Header Value  . . . . . . . . . . . . 17
     5.3.  Determining the Throttle Parameter Value . . . . . . . . . 17
     5.4.  Processing the Load Header . . . . . . . . . . . . . . . . 18
     5.5.  Using the Load Header Value  . . . . . . . . . . . . . . . 19
     5.6.  Using the Throttle Parameter Value . . . . . . . . . . . . 19
     5.7.  Rejecting Requests . . . . . . . . . . . . . . . . . . . . 20
   6.  Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 21
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 22
   Appendix A.  Acknowledgements  . . . . . . . . . . . . . . . . . . 22
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 22
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 23
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23
   Intellectual Property and Copyright Statements . . . . . . . . . . 25


1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP) [2]
   server can suffer from overload when the number of SIP messages it
   receives exceeds the number of messages it can process.  SIP server
   overload can pose a serious problem.  During periods of overload, the
   throughput of a SIP network can be significantly degraded.  In
   particular, SIP server overload may lead to a situation in which the
   throughput drops to a small fraction of the original capacity of the
   network.  This is often called congestion collapse.

   The SIP protocol provides a limited mechanism for overload control
   through its 503 response code.  However, this mechanism cannot
   prevent SIP server overload and it cannot prevent congestion collapse
   in a network of SIP servers.  In fact, the 503 response code
   mechanism may cause traffic to move back and forth between SIP
   servers and thereby worsen an overload condition.  A detailed
   discussion of the SIP overload problem, the 503 response code and the
   requirements for a SIP overload control solution can be found in [5].

   Overload is said to occur if a SIP server does not have sufficient
   resources to process all incoming SIP messages.  These resources may
   include CPU processing capacity, memory, network bandwidth, input/
   output, or disk resources.  Generally speaking, overload occurs if a
   SIP server can no longer process or respond to all incoming SIP
   messages.

   We only consider failure cases where SIP servers cannot process all
   incoming SIP requests.  There are other failure cases where the SIP
   server can process, but not fulfill, requests.  These are beyond the
   scope of this document since SIP provides other response codes for
   these cases and overload control MUST NOT be used to handle these
   scenarios.  For example, a PSTN gateway that runs out of trunk lines
   but still has plenty of capacity to process SIP messages should
   reject incoming INVITEs using a 488 (Not Acceptable Here) response
   [4].  Similarly, a SIP registrar that has lost connectivity to its
   registration database but is still capable of processing SIP messages
   should reject REGISTER requests with a 500 (Server Internal Error)
   response [2].

   This specification is structured as follows: Section 3 discusses
   general design principles of a SIP overload control mechanism.
   Section 4 discusses general considerations for applying SIP overload
   control.  Section 5 defines a SIP protocol extension for overload
   control and Section 6 introduces the syntax of this extension.
   Section 7 and Section 8 discuss security and IANA considerations
   respectively.


2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
   RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as
   described in BCP 14, RFC 2119 [1] and indicate requirement levels for
   compliant implementations.


3.  Design Considerations

   This section discusses key design considerations for a SIP overload
   control mechanism.  The goal of this mechanism is to prevent
   upstream servers from sending SIP messages to an overloaded
   downstream server, rather than having the overloaded server reject
   messages that have already been sent.

3.1.  System Model

   The model shown in Figure 1 identifies fundamental components of a
   SIP overload control system:

   o  SIP Processor: component that processes SIP messages.  The SIP
      processor is the component that is protected by overload control.
   o  Monitor: component that monitors the current load of the SIP
      processor on the receiving entity.  The monitor implements the
      mechanisms needed to measure the current usage of resources
      relevant for the SIP processor.  It reports load samples (S) to
      the Control Function.
   o  Control Function: component that implements the actual overload
      control mechanism on the receiving and sending entity.  The
      control function uses the load samples (S) provided by the
      monitor.  It determines whether overload has occurred and whether
      a throttle (T) needs to be set to adjust the load sent to the SIP
      processor on the receiving entity.  The control function on the
      receiving entity sends load feedback (F) to the control function
      on the sending entity.
   o  Actuator: component that acts on the throttles (T) generated by
      the control function and adjusts the load forwarded to the
      receiving entity accordingly.  For example, a throttle may
      instruct the actuator to reduce the load destined to the receiving
      entity by 10%.  The actuator decides how the load reduction is
      achieved (e.g., by redirecting or rejecting requests).

   The type of feedback (F) conveyed from the receiving to the sending
   entity depends on the overload control method used (i.e., rate-based
   vs. window-based overload control; see Section 3.4) as well as other
   design parameters (e.g., whether load status information is included
   or not).  In any case, the feedback (F) informs the sending entity
   that overload has occurred and that the traffic forwarded to the
   receiving entity needs to be reduced to a lower rate.

          Sending                Receiving
           Entity                  Entity
     +----------------+      +----------------+
     |    Server A    |      |    Server B    |
     |  +----------+  |      |  +----------+  |    -+
     |  | Control  |  |  F   |  | Control  |  |     |
     |  | Function |<-+------+--| Function |  |     |
     |  +----------+  |      |  +----------+  |     |
     |     T |        |      |       ^        |     | Overload
     |       v        |      |       | S      |     | Control
     |  +----------+  |      |  +----------+  |     |
     |  | Actuator |  |      |  | Monitor  |  |     |
     |  +----------+  |      |  +----------+  |     |
     |       |        |      |       ^        |    -+
     |       v        |      |       |        |    -+
     |  +----------+  |      |  +----------+  |     |
   <-+--|   SIP    |  |      |  |   SIP    |  |     |  SIP
   --+->|Processor |--+------+->|Processor |--+->   | System
     |  +----------+  |      |  +----------+  |     |
     +----------------+      +----------------+    -+


                Figure 1: System Model for Overload Control
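
   As a non-normative illustration, the components of Figure 1 can be
   sketched in Python as follows.  The class names, the 0.9 overload
   threshold and the fixed 10% reduction are assumptions made for this
   sketch only; this document does not define them.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """Measures the load of the SIP processor and reports samples (S)."""
    utilization: float = 0.0  # hypothetical: fraction of capacity in use

    def sample(self) -> float:
        return self.utilization


@dataclass
class ReceiverControl:
    """Control function on the receiving entity: turns S into feedback (F)."""
    overload_threshold: float = 0.9  # assumed value, not from this document

    def feedback(self, sample: float) -> dict:
        return {"overloaded": sample > self.overload_threshold}


@dataclass
class SenderControl:
    """Control function on the sending entity: turns F into a throttle (T)."""
    reduction: float = 0.10  # assumed: shed 10% of the load when overloaded

    def throttle(self, feedback: dict) -> float:
        return self.reduction if feedback["overloaded"] else 0.0


@dataclass
class Actuator:
    """Applies the throttle (T) to the load forwarded downstream."""

    def admitted(self, offered: int, throttle: float) -> int:
        # How the reduction is achieved (redirect, reject) is left open here.
        return round(offered * (1.0 - throttle))


monitor = Monitor(utilization=0.95)
f = ReceiverControl().feedback(monitor.sample())
t = SenderControl().throttle(f)
print(Actuator().admitted(offered=200, throttle=t))  # 180
```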

3.2.  Hop-by-Hop vs. End-to-End

   A SIP request is often processed by more than one SIP server.  Thus,
   overload control can in theory be applied hop-by-hop, i.e.,
   individually between each pair of servers, or end-to-end as a single
   control loop that stretches across the entire path from UAC to UAS
   (see Figure 2).

               +---------+             +-------+----------+
      +------+ |         |             |       ^          |
      |      | |        +---+          |       |         +---+
      v      | v    //=>| C |          v       |     //=>| C |
   +---+    +---+ //    +---+       +---+    +---+ //    +---+
   | A |===>| B |                   | A |===>| B |
   +---+    +---+ \\    +---+       +---+    +---+ \\    +---+
               ^    \\=>| D |          ^       |     \\=>| D |
               |        +---+          |       |         +---+
               |         |             |       v          |
               +---------+             +-------+----------+

      (a) hop-by-hop loop              (b) end-to-end loop

    ==> SIP request flow
    <-- Load feedback loop


                    Figure 2: Hop-by-Hop vs. End-to-End

   In the hop-by-hop model, a separate overload control loop is
   instantiated between each pair of neighboring SIP servers on the path
   of a SIP request.  Each SIP server provides load feedback to its
   upstream neighbors, which then adjust the amount of traffic they are
   forwarding to the SIP server.  However, the neighbors do not forward
   the received feedback information further upstream.  Instead, they
   act on the feedback and resolve the overload condition if needed, for
   example, by re-routing or rejecting traffic.

   The upstream neighbor of a server can, and should, use a separate
   overload control loop with its upstream neighbors.  If the neighbor
   becomes overloaded, it will report this problem to its upstream
   neighbors, which again take action based on the reported feedback.
   Thus, in hop-by-hop overload control, overload is resolved by the
   direct upstream neighbors of the overloaded server without the need
   to involve entities that are located multiple SIP hops away.

   Hop-by-hop overload control can effectively reduce the impact of
   overload on a SIP network and, in particular, can avoid congestion
   collapse.  In addition, hop-by-hop overload control is simple and
   scales well to networks with many SIP entities.  It does not require
   a SIP entity to aggregate a large number of load status values or
   keep track of the load status of SIP servers it is not communicating
   with.

   End-to-end overload control implements an overload control loop along
   the entire path of a SIP request, from UAC to UAS.  An end-to-end
   overload control mechanism needs to consider load information from
   all SIP servers on the way (including all proxies and the UAS).  It
   has to be able to frequently collect the load status of all servers
   on the potential path(s) to a destination and combine this data into
   meaningful load feedback.  A UA or SIP server should not throttle its
   load unless it knows that all potential paths to the destination are
   overloaded.

   Overall, the main problem of end-to-end path overload control is its
   inherent complexity since a UAC or SIP server would need to monitor
   all potential paths to a destination in order to know when to
   throttle.  Therefore, end-to-end overload control is likely to work
   only if a UA or server sends many requests to the exact same
   destination.

3.3.  Topologies

   A simple topology for overload control is a SIP server that receives
   traffic from a single source (as shown in Figure 3(a)).  A load
   balancer is a typical example of this configuration.  In a more
   complex topology, a SIP server receives traffic from multiple
   upstream sources.  This is shown in Figure 3(b), where SIP servers A,
   B and C forward traffic to server D.  It is important to note that
   each of these servers may contribute a different amount of load to
   the overall load of D.  This load mix may vary over time.  If server
   D becomes overloaded, it generates feedback to reduce the amount of
   traffic it receives from its upstream neighbors (i.e., A in
   Figure 3(a), or A, B and C in Figure 3(b)).

   If a SIP server (server D) becomes overloaded, it needs to decide how
   overload control feedback is balanced across upstream neighbors.
   This decision needs to account for the actual amount of traffic
   received from an upstream neighbor.  The decision may need to be re-
   adjusted as the load contributed by each upstream neighbor varies
   over time.  A server may use a local policy to decide how much load
   it wants to receive from each upstream neighbor.  For example, a
   server may throttle all upstream sources equally (e.g., all sources
   need to reduce traffic forwarded by 10%) or prefer some servers over
   others.  For example, it may want to throttle a less preferred
   upstream neighbor earlier than a preferred neighbor or first
   throttle the neighbor that sends the most traffic.  Since this
   decision is made by the receiving entity (i.e., server D), all
   senders for this entity are governed by the same overload control
   algorithm.
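
   Two of the local policies mentioned above (throttling all sources
   equally, and throttling the heaviest sender first) could look as
   follows.  This is a non-normative Python sketch; the function names
   and the notion of an "excess" rate that server D wants to shed are
   assumptions of the example, not part of this document.

```python
def equal_throttle(rates, excess):
    """Every neighbor sheds the same fraction of its traffic."""
    total = sum(rates.values())
    fraction = min(1.0, excess / total) if total else 0.0
    return {n: fraction for n in rates}


def heaviest_first(rates, excess):
    """Throttle the neighbors that send the most traffic first."""
    throttles = {n: 0.0 for n in rates}
    remaining = excess
    for n, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        if remaining <= 0 or rate <= 0:
            break
        shed = min(rate, remaining)
        throttles[n] = shed / rate   # fraction this neighbor must drop
        remaining -= shed
    return throttles


rates = {"A": 60.0, "B": 30.0, "C": 10.0}   # messages per second
print(equal_throttle(rates, excess=20.0))   # every neighbor sheds 20%
print(heaviest_first(rates, excess=20.0))   # only A is throttled
```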

   In many network configurations, upstream servers (A, B and C) have
   alternative servers (server E) to which they can redirect excess
   messages if the primary target (server D) is overloaded (see
   Figure 3(c)).  Servers D and E may differ in their processing
   capacity.  When redirecting messages, the upstream servers need to
   ensure that these messages do not overload the alternate server.  An
   overload control mechanism should enable upstream servers to choose
   only alternative servers that have enough capacity to handle the
   redirected requests.


                   +---+              +---+
                /->| D |              | A |-\
               /   +---+              +---+  \
              /                               \   +---+
       +---+-/     +---+              +---+    \->|   |
       | A |------>| E |              | B |------>| D |
       +---+-\     +---+              +---+    /->|   |
              \                               /   +---+
               \   +---+              +---+  /
                \->| F |              | C |-/
                   +---+              +---+

     (a) load balancer w/           (b) multiple upstream
      alternate servers                   neighbors

       +---+
       | A |---\                        a--\
       +---+=\  \---->+---+                 \
              \/----->| D |             b--\ \--->+---+
       +---+--/\  /-->+---+                 \---->|   |
       | B |    \/                      c-------->| D |
       +---+===\/\===>+---+                       |   |
               /\====>| E |            ...   /--->+---+
       +---+--/   /==>+---+                 /
       | C |=====/                      z--/
       +---+

     (c) multiple upstream         (d) very large number of
   neighbors w/ alternate server      upstream neighbors


                           Figure 3: Topologies

   Overload control that is based on throttling the message rate is not
   suited for servers that receive requests from a very large population
   of senders, each of which sends requests only infrequently, as shown
   in Figure 3(d).  An edge proxy that is connected to many UAs is an
   example of such a configuration.  Since each UA typically
   contributes only a single request to an overload condition, it cannot
   decrease its message rate to resolve the overload.

   In such a configuration, a SIP server can gradually reduce its load
   by rejecting a percentage of the requests it receives with 503
   responses.  Since there are many upstream neighbors that contribute
   to the overall load, sending 503 to a fraction of them gradually
   reduces load without entirely stopping the incoming traffic and helps
   to resolve the overload condition in this scenario.

3.4.  Overload Control Method

   The method used by an overload control mechanism to curb the amount
   of traffic forwarded to an element is a key aspect of the design.
   Three different types of overload control methods exist: rate-based,
   loss-based and window-based overload control.

3.4.1.  Rate-based Overload Control

   The key idea of rate-based overload control is to indicate the
   message rate that an upstream element is allowed to send to the
   downstream neighbor.  If overload occurs, a SIP server instructs each
   upstream neighbor to send at most X messages per second.  This rate
   cap ensures that the offered load for a SIP server never increases
   beyond the sum of the rate caps granted to all upstream neighbors and
   can protect a SIP server from overload even during extreme load
   spikes.

   A common technique to implement a rate cap of a given number of
   messages per second X is message gapping.  After transmitting a
   message to a downstream neighbor, a server waits for 1/X seconds
   before it transmits the next message to the same neighbor.  Messages
   that arrive during the waiting period are not forwarded and are
   either redirected, rejected or buffered.
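
   A minimal, non-normative Python sketch of message gapping is shown
   below.  The class name, method name and the injectable clock are
   assumptions of this example; what happens to a message that may not
   be forwarded (redirect, reject or buffer) is left to the caller.

```python
import time


class MessageGapper:
    """Allows at most `rate` messages per second by enforcing a 1/rate gap."""

    def __init__(self, rate, clock=time.monotonic):
        self.gap = 1.0 / rate
        self.clock = clock
        self.next_allowed = float("-inf")

    def try_send(self):
        """Return True if a message may be forwarded now."""
        now = self.clock()
        if now < self.next_allowed:
            return False          # caller redirects, rejects or buffers
        self.next_allowed = now + self.gap
        return True


# Usage with a fake clock so that the behavior is repeatable.
t = [0.0]
g = MessageGapper(rate=4, clock=lambda: t[0])   # gap of 0.25 seconds
print(g.try_send())   # True
t[0] = 0.1
print(g.try_send())   # False
t[0] = 0.25
print(g.try_send())   # True
```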

   The main drawback of this mechanism is that it requires a SIP server
   to assign a certain rate cap to each of its upstream neighbors based
   on its overall capacity.  Effectively, a server assigns a share of
   its capacity to each upstream neighbor.  The server needs to ensure
   that the sum of all rate caps assigned to upstream neighbors is not
   (significantly) higher than its actual processing capacity.  This
   requires a SIP server to continuously evaluate the amount of load it
   receives from an upstream neighbor and assign a rate cap that is
   suitable for this neighbor.  For example, in a non-overloaded
   situation, it could assign a rate cap that is 10% higher than the
   current rate from this neighbor.  The rate cap needs to be adjusted
   if the load offered by upstream neighbors changes, new upstream
   neighbors appear, or an existing neighbor stops transmitting.  If the
   cap assigned to an upstream neighbor is too high, the server may
   still experience overload.  However, if the cap is too low, the
   upstream neighbors will reject messages even though they could be
   processed by the server.  Thus, rate-based overload control is likely
   to work well only if the number of upstream servers is small and
   constant, e.g., as shown in the example in Figure 3(b).
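
   One way a server might derive rate caps is sketched below in Python.
   The sketch is non-normative and rests on assumptions: each neighbor
   is given its observed rate plus 10% headroom, and all caps are scaled
   down proportionally if their sum would exceed the server's capacity.
   The function name and the proportional scaling are illustrative
   choices, not requirements of this document.

```python
def assign_rate_caps(observed, capacity, headroom=0.10):
    """Give each neighbor its observed rate plus headroom, within capacity."""
    caps = {n: r * (1.0 + headroom) for n, r in observed.items()}
    total = sum(caps.values())
    if total > capacity:
        # Shrink all caps by the same factor so that they sum to capacity.
        scale = capacity / total
        caps = {n: c * scale for n, c in caps.items()}
    return caps


caps = assign_rate_caps({"A": 50.0, "B": 30.0}, capacity=80.0)
print({n: round(c, 1) for n, c in caps.items()})   # {'A': 50.0, 'B': 30.0}
```

   The caps have to be recomputed whenever the observed rates or the set
   of upstream neighbors change.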

3.4.2.  Loss-based Overload Control

   A loss percentage enables a SIP server to ask its upstream neighbor
   to reduce the amount of traffic it would normally forward to this
   server by a percentage X.  For example, a SIP server can ask its
   upstream neighbors to lower the traffic they would forward to it by
   10%.  The upstream neighbor then redirects or rejects X percent of
   the traffic that is destined for this server.

   A loss percentage can be implemented in the upstream entity, for
   example, by drawing a random number between 1 and 100 for each
   request to be forwarded.  The request is not forwarded to the server
   if the random number is less than or equal to X.
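
   The random draw described above can be sketched in Python as follows.
   This is a non-normative illustration; the function name is an
   assumption, and the fixed seed is used only to make the example
   repeatable.

```python
import random


def should_forward(loss_percent, rng=random):
    """Draw a number in 1..100; drop the request if it is <= loss_percent."""
    return rng.randint(1, 100) > loss_percent


rng = random.Random(1)      # fixed seed only to make the sketch repeatable
forwarded = sum(should_forward(10, rng) for _ in range(10_000))
print(forwarded / 10_000)   # close to 0.90
```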

   A server does not need to track the message rate it receives from
   each upstream neighbor.  To reduce load, a server can ask each
   upstream neighbor to lower traffic by a certain percentage, which can
   be determined independently of the actual message rate contributed by
   each neighbor.  The new loss percentage depends on the loss
   percentage currently used by the upstream servers and the current
   system load of the server.  For example, if the server load
   approaches 90% and the current loss percentage is set to a 50% load
   reduction, then the server may decide to increase the loss percentage
   to 55% in order to get back to a system load of 80%.  Similarly, the
   server can lower the loss percentage if permitted by the system
   utilization.  This requires that system load can be accurately
   measured and that these measurements are reasonably stable.
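
   The numbers in the example above can be reproduced with a short
   calculation.  The Python sketch below is non-normative and assumes
   that system load is proportional to the fraction of traffic that is
   admitted, which is a simplification; the function name is likewise
   an assumption.

```python
def next_loss_percent(current_loss, current_load, target_load):
    """Loss percentage that would bring load to target_load.

    Assumes load is proportional to the fraction of traffic admitted,
    which is a simplification made for this sketch.  current_load must
    be greater than zero.
    """
    admitted = 1.0 - current_loss / 100.0
    new_admitted = admitted * target_load / current_load
    new_admitted = max(0.0, min(1.0, new_admitted))
    return 100.0 * (1.0 - new_admitted)


print(round(next_loss_percent(50, 90, 80), 1))   # 55.6
```

   Under this assumption, moving from a load of 90% to 80% at a current
   loss percentage of 50% calls for a loss percentage of roughly 55%.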

   The main drawback of percentage throttling is that the throttle
   percentage needs to be adjusted to the offered load, in particular,
   if the load fluctuates quickly.  For example, if a SIP server sets a
   throttle value of 10% at time t1 and the offered load increases by
   20% between time t1 and t2 (t1<t2), then the traffic reaching the
   server at time t2 is still about 8% higher than the unthrottled load
   at time t1 (0.9 x 1.2 = 1.08).  This is true even though all upstream
   neighbors reduced traffic by 10% as told.  Thus, percentage
   throttling requires the quick adjustment of the throttling percentage
   and may not always be able to prevent a server from encountering
   brief periods of overload in extreme cases.

3.4.3.  Window-based Overload Control

   The key idea of window-based overload control is to allow an entity
   to transmit a certain number of messages before it needs to receive a
   confirmation for the messages in transit.  Each sender maintains an
   overload window that limits the number of messages that can be in
   transit without being confirmed.

   Each sender maintains an unconfirmed message counter for each
   downstream neighbor it is communicating with.  For each message sent
   to the downstream neighbor, the counter is increased by one.  For
   each confirmation received, the counter is decreased by one.  The
   sender stops transmitting messages to the downstream neighbor when
   the unconfirmed message counter has reached the current window size.

   A crucial parameter for the performance of window-based overload
   control is the window size.  The window size together with the
   round-trip time between sender and receiver determines the effective
   message rate that can be achieved.  Each sender has an initial window
   size it uses when first sending a request.  This window size can
   change based on the feedback it receives from the receiver.  The
   receiver can require a decrease in window size to throttle the sender
   or permit an increase to allow a higher message rate.

   The sender adjusts its window size as soon as it receives the
   corresponding feedback from the receiver.  If the new window size is
   smaller than the current unconfirmed message counter, the sender MUST
   stop transmitting messages until more messages are confirmed and the
   current unconfirmed message counter is less than the window size.
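
   The sender-side bookkeeping described above can be sketched in Python
   as follows.  The sketch is non-normative; the class and method names
   are assumptions, and one instance is meant to be kept per downstream
   neighbor.

```python
class WindowedSender:
    """Tracks unconfirmed messages sent to one downstream neighbor."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.unconfirmed = 0

    def can_send(self):
        return self.unconfirmed < self.window_size

    def on_send(self):
        if not self.can_send():
            raise RuntimeError("window full; wait for confirmations")
        self.unconfirmed += 1

    def on_confirmation(self):
        self.unconfirmed = max(0, self.unconfirmed - 1)

    def on_feedback(self, new_window_size):
        # Takes effect immediately; sending resumes only once the counter
        # has dropped below the new window size.
        self.window_size = new_window_size


s = WindowedSender(window_size=3)
for _ in range(3):
    s.on_send()
print(s.can_send())      # False: three messages are unconfirmed
s.on_feedback(2)         # receiver shrinks the window
s.on_confirmation()
print(s.can_send())      # False: 2 unconfirmed, window is 2
s.on_confirmation()
print(s.can_send())      # True: 1 unconfirmed, window is 2
```

   With a window of W messages and a round-trip time of RTT seconds, a
   sender driven this way can sustain at most roughly W/RTT messages per
   second.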

   A sender should not treat the reception of a 100 Trying response as
   an implicit confirmation for a message.  100 Trying responses are
   often created by a SIP server very early in the process and do not
   indicate that a message has been successfully processed and cleared
   from the input buffer.  If the downstream neighbor is a stateless
   proxy, it will not create 100 Trying responses at all and instead
   pass through 100 Trying responses created by the next stateful
   server.  Also, 100 Trying responses are typically only created for
   INVITE requests.  Explicit message confirmations in a load feedback
   report do not have these problems.

   The behavior and issues of window-based overload control are similar
   to rate-based overload control, in that the total available receiver
   buffer space needs to be divided among all upstream neighbors.
   However, unlike rate-based overload control, it can ensure that the
   receiver buffer never overflows.  The transmission of messages by
   senders is effectively clocked by message confirmations received from
   the receiver.

3.5.  Overload Control Algorithms

   This section describes algorithms that govern the behavior of
   entities that implement the overload control mechanism.

      OPEN ISSUE: the following overload control algorithms are strawman
      proposals.  They depend on the overload control methods.

3.5.1.  Increase Algorithm

   A SIP server that is starting to transmit messages to a downstream
   neighbor should slowly increase the message rate until it reaches the
   rate allowed by the receiver.  In particular, servers that are
   restarted or added to the network should execute this algorithm.  The
   slow increase prevents a server from overloading its downstream
   neighbor by suddenly and significantly increasing load.  A server
   should, however, not reject requests during this period.  Instead, it
   should redirect or buffer requests it cannot forward right away.
   This prevents calls from being rejected in an empty network because
   of the slow increase algorithm.

   A slow increase should also be used when the receiver has recovered
   from an overload condition and is increasing the allowed transmission
   rate.  In this case, the sender may reject messages during the slow
   increase.  Slow increase after overload prevents the receiver from
   becoming overloaded again when all senders suddenly increase the
   message rate.

      OPEN ISSUE: this algorithm seems to be useful for loss-based
      overload control but may not be needed for other types of overload
      control.
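
   A possible shape of such a slow increase is sketched below in Python.
   The sketch is non-normative: the text above only asks for a gradual
   increase, so the starting rate, the multiplicative step and the
   function name are all assumptions of this example.  The factor has to
   be greater than 1 and the starting rate greater than 0.

```python
def slow_increase(allowed_rate, start_rate=1.0, factor=2.0):
    """Yield the rate to use in each successive interval up to allowed_rate.

    The multiplicative step is an assumption of this sketch; the text only
    asks for a gradual increase.
    """
    rate = min(start_rate, allowed_rate)
    while rate < allowed_rate:
        yield rate
        rate = min(rate * factor, allowed_rate)
    yield allowed_rate


print(list(slow_increase(10.0)))   # [1.0, 2.0, 4.0, 8.0, 10.0]
```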

3.5.2.  Decrease Algorithm

   A sender MUST immediately reduce its transmission rate when it
   receives an indication that the receiver has reached an overload
   condition.

3.6.  Load Status

   It may be useful for a SIP server to frequently provide its current
   load status to upstream neighbors.  The load status indicates the
   degree to which the resources needed by a SIP server to process SIP
   messages are utilized.  SIP servers can use the load status to
   balance load between alternative proxies and to find under-utilized
   servers.  It should be noted, however, that reporting load is not
   intended to replace specialized load balancing mechanisms.

      OPEN ISSUE: reporting load status seems related but somewhat
      orthogonal to overload control.  It might therefore be better to
      handle overload control and load reporting/balancing in separate
      mechanisms.

3.7.  SIP Mechanism

   A SIP mechanism needs to convey load feedback from the receiving to
   the sending SIP entity.  A number of alternatives exist to realize
   such a mechanism.

   In principle, it would be possible to define a new SIP request that
   can be used to convey load status reports from the receiving to the
   sending entity.  However, sending separate load status requests from
   the receiving to the sending entity would create additional messaging
   overhead, which is undesirable during periods of overload.  It would
   also require each SIP server to keep track of all potential upstream
   neighbors.

   Similarly, it would be possible to define an event package for
   subscriptions to overload status.  This package would enable a
   sending entity to subscribe to the load status of its downstream
   neighbor and receive status updates in NOTIFY messages.  However, it
   would require each sending entity to set up a subscription to all
   entities it may forward traffic to and manage these subscriptions
   over time for a varying set of downstream neighbors.  Setting up
   subscriptions and sending overload feedback in notifications creates
   an undesirable messaging overhead.

   Another approach is to define a new SIP header field for load
   information that can be inserted into SIP responses.  This approach
   has the advantage that it automatically provides load feedback to all
   upstream SIP entities that are currently forwarding traffic to a SIP
   server with very little overhead.

   Hop-by-hop overload control requires that the distribution of load
   feedback is limited to the next upstream SIP server.  This can be
   achieved by adding the address of the next hop server, that is, the
   destination of the load report, to the load header.  Conveying load
   feedback in a SIP response header requires that SIP traffic is
   flowing between the sending and the receiving entity.  This is
   usually not a problem when regulating traffic between SIP servers.
   Even in an overload situation that requires 100% throttling, an
   upstream server can forward an occasional request or send an OPTIONS
   request to probe the load status of the downstream neighbor.

3.8.  Backwards Compatibility

   An important requirement for an overload control mechanism is that it
   can be gradually introduced into a network and that it functions
   properly if only a fraction of the servers support it.

   Hop-by-hop overload control does not require that all SIP entities in
   a network support it.  It can be used effectively between two
   adjacent SIP servers if both servers support this extension and does
   not depend on the support from any other server or user agent.  The
   more SIP servers in a network support this mechanism, the more
   effective it is since it includes more of the servers in the load
   reporting and offloading process.

   In topologies such as the ones depicted in Figure 3(b) and (c), a SIP
   server has multiple neighbors, of which only some may support
   overload control.  If a server simply used this extension for
   overload control, only those neighbors that support it would throttle
   their load.  Others would keep sending at the full rate and benefit
   from the throttling by other servers supporting this extension.  In
   other words, upstream neighbors that do not support overload control
   would be better off than those that do.

   A SIP server should therefore use 5xx responses towards upstream
   neighbors that do not support this specification.  The server should
   reject the same fraction of requests with 5xx responses that would
   otherwise be rejected/redirected by the upstream neighbor if it
   supported overload control.  For example, if the server has throttled
   the load by 10%, it should reject 10% of the requests with a 5xx
   response for this neighbor.
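
   As an illustration of the rule above, the following Python sketch
   rejects a fixed share of the requests arriving from a neighbor that
   does not support the extension.  The function name, the per-neighbor
   request counter and the use of a deterministic counter instead of a
   random draw are assumptions made for this example; they are not part
   of this specification.

```python
def reject_legacy_request(request_index: int, throttle_percent: int) -> bool:
    """Return True if this request from a non-supporting neighbor
    should be answered with a 5xx response.

    request_index is a running per-neighbor counter (0, 1, 2, ...).
    throttle_percent is the percentage the server would have asked a
    supporting neighbor to shed (hypothetical input for this sketch).
    """
    if not 0 <= throttle_percent <= 100:
        raise ValueError("throttle_percent must be between 0 and 100")
    # Spread rejections evenly: reject whenever the running quota
    # floor(n * p / 100) advances between request n and request n + 1.
    before = (request_index * throttle_percent) // 100
    after = ((request_index + 1) * throttle_percent) // 100
    return after > before
```

   With a throttle of 10, this sketch rejects one request in every ten,
   which matches the share a supporting neighbor would have been asked
   to reject or redirect itself.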

3.9.  Interaction with Local Overload Control

   Servers may want to protect themselves against overload by rejecting
   incoming messages with minimal effort when a server is overloaded.
   We refer to such mechanisms as local overload control.

   Local overload control can be used in conjunction with the mechanisms
   defined in this specification and provides an additional layer of
   protection against overload, for example, in cases where upstream
   servers do not support overload control.  In general, servers should
   start to throttle upstream neighbors before using local overload
   control mechanisms to reject messages, i.e., at a lower level of
   load.


4.  SIP Application Considerations

4.1.  How to Calculate Load Levels

   Calculating an element's load level is dependent on the limiting
   resource for this element.  There can be several contributing
   factors, such as CPU, memory, queue depth, calls per second, and
   application threads.  The element should consider itself to have
   reached a load level of 100% at the point at which it cannot reliably
   process any more messages.  If an element knows what its limiting
   resource is, it can calculate its load level based on this resource
   or use a combination of resources.  The element should recognize
   percentages of load prior to hitting 100% based on the limit at which
   the element becomes 100% loaded.
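
   One possible way to combine several resources is to report the
   utilization of the most constrained one.  The Python sketch below
   assumes that each resource is already expressed as a fraction of its
   own limit.  The function name, the resource names and the choice of
   the maximum as the combining rule are assumptions for illustration
   only.

```python
from typing import Mapping


def load_level(utilization: Mapping[str, float]) -> int:
    """Map per-resource utilization to a load level between 0 and 100.

    Each value is the fraction (0.0 to 1.0) of that resource's limit
    currently in use.  The most heavily used resource is treated as the
    limiting one (an assumption of this sketch).
    """
    if not utilization:
        raise ValueError("at least one resource is required")
    worst = max(utilization.values())
    # Clamp so that measurement noise never yields a value outside 0..100.
    return max(0, min(100, round(worst * 100)))
```

   For example, load_level({"cpu": 0.5, "memory": 0.3}) yields 50, since
   the CPU is the limiting resource in that case.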

4.2.  Responding to an Overload Indication

   An element may receive a load header indicating that it needs to
   reduce the traffic it sends to its downstream neighbor.  An element
   can accomplish this task by sending some of the requests that would
   have gone to the overloaded element to a different destination or by
   altering its load-balancing algorithm to lower the number of calls it
   will offer to the overloaded resource.  It can also buffer requests
   in the hope that the overload condition will resolve quickly and the
   requests still can be forwarded in time.  Finally, it can reject
   these requests.

   The algorithm to reduce load is open to element and vendor
   implementation.  (Note: the load reduction algorithm does not specify
   the quantity of SIP messages allowed by the SIP server; it simply
   specifies a treatment of new requests based on a specified load
   level.)  However, the algorithm should provide for tunable parameters
   to allow the network operator to optimize over time.  The
   settings for these parameters could be different for a carrier
   network versus an enterprise or VSP network.  The goal of the
   algorithm is to alleviate load throughout the network.  This means
   avoidance of propagating load from one element to another.  This also
   means trying to keep the network as full as possible without reaching
   100% on any given element, except when all elements approach 100%.
   Balancing load across a network is the responsibility of the
   operator, but the load header may help the operator to adjust the
   balancing in a more dynamic fashion by allowing the load-balancing
   algorithm to react to bursts or outages.

4.3.  Emergency Services Requests

   It is generally recommended that proxy servers attempt to balance
   all SIP requests, and relative resources, to a maximum load of 80%.
   In doing so, the servers are proactively tuned to allow an emergency
   services request [7] to be placed to any available upstream or
   downstream SIP device for immediate processing and delivery to the
   intended emergency services provider.

   In some cases, the load will increase beyond this point and a server
   will need to begin to lower the number of requests forwarded.  When
   the SIP server receives an emergency services request, it should not
   be treated by alleviation methods and should be processed
   immediately.  In some cases, a SIP server may receive more emergency
   services requests than it is allowed to forward.  This may happen,
   for example, to a SIP server that is serving an emergency service
   center.  In these cases, and after rejecting/redirecting all
   non-emergency service requests, a SIP server should also include
   emergency service requests in the alleviation treatment to prevent
   the downstream server from becoming overloaded.
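
   The ordering described above can be sketched in Python as follows.
   The sketch assumes that requests arrive in batches, that each request
   already carries a flag marking it as an emergency services request,
   and that the server knows how many requests it may forward in the
   current interval ("budget").  All of these, as well as the function
   name, are assumptions of this example.

```python
from typing import List, Tuple


def select_for_forwarding(requests: List[Tuple[str, bool]],
                          budget: int) -> Tuple[List[str], List[str]]:
    """Split requests into (forwarded, alleviated).

    Each request is a (request_id, is_emergency) pair.  budget is the
    number of requests the server may forward in this interval.
    """
    emergency = [rid for rid, is_emerg in requests if is_emerg]
    regular = [rid for rid, is_emerg in requests if not is_emerg]
    # Emergency requests are served first; regular requests only get
    # whatever budget is left over.  Emergency requests are alleviated
    # only once every regular request has already been alleviated.
    ordered = emergency + regular
    budget = max(0, budget)
    return ordered[:budget], ordered[budget:]
```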

4.4.  Operations and Management

   The load header information can be captured within Call Detail
   Records (CDRs) and SNMP traps for use in service reports.  These
   service reports could be used for future network optimization.


5.  SIP Load Header Field

   This section defines a new SIP header field for overload control, the
   Load header.  This header field follows the above design
   considerations for an overload control mechanism.

      Note: the Load header field is an initial proposal for an overload
      feedback mechanism.  The design of this header depends on many of
      the design aspects discussed above (in particular the overload
      control method as discussed in Section 3.4).

5.1.  Generating the Load Header

   A SIP server compliant to this specification SHOULD frequently
   provide load feedback to its upstream neighbors in a timely manner.
   It does so by inserting a Load header field into the SIP responses it
   is forwarding or creating.  The Load header is a new header field
   defined in this specification.  The Load header can be inserted into
   all response types, including provisional, success and failure
   response types.  A SIP server SHOULD insert a Load header into all
   responses.

   A SIP server MAY choose to insert Load headers less frequently, for
   example, once every x milliseconds.  This may be useful for SIP
   servers that receive a very high number of messages from the same
   upstream neighbor or servers with a very low variability of the load
   measure.  In any case, a SIP server SHOULD insert a Load header into
   a response well before the previous Load header sent to the same
   upstream neighbor expires.  Only SIP servers that frequently insert
   Load headers into responses are protected against overload.

   The Load header is only defined in SIP responses and MUST NOT be used
   in SIP requests.  The Load header is only useful to the upstream
   neighbor of a SIP server since this is the entity that can offload
   traffic by redirecting/rejecting new requests.  If requests are
   forwarded in both directions between two SIP servers (i.e., the roles
   of upstream/downstream neighbors change), there are also responses
   flowing in both directions which allow the two SIP servers to
   exchange load information.  While Load headers in requests may
   increase the frequency with which load information is exchanged in
   these scenarios, this increase will rarely provide benefits and does
   not seem to justify the added overhead and complexity needed.

   A SIP server MUST insert the address of its upstream neighbor into
   the "target" parameter of the Load header.  It SHOULD use the address
   of the upstream neighbor found in the topmost Via header of the
   response for this purpose.

   The "target" parameter enables the receiver of a Load header to
   determine if it should process the Load header (since it was
   generated by its downstream neighbor) or if the Load header needs to
   be ignored (since it was passed along by an entity that does not
   support this extension).  Effectively, the "target" parameter
   implements the hop-by-hop semantics and prevents the use of load
   status information beyond the next hop.

      OPEN ISSUE: instead of using the address in the Via header it
      might make sense to use an application identifier similar to the
      one defined for SigComp [6].

   A SIP server SHOULD add a "validity" parameter to the Load header.
   The "validity" parameter defines the time in milliseconds during
   which the Load header should be considered valid.  The default value
   of the "validity" parameter is 500.  A SIP server SHOULD use a
   shorter "validity" time if its load status varies quickly and MAY use
   a longer "validity" time if the current load level is more stable.
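
   A minimal Python sketch of how a server might assemble the header
   described in this section is shown below.  The function name, the
   argument names and the order in which the optional parameters are
   written are assumptions of this example; the "throttle" parameter is
   described in Section 5.3.

```python
def build_load_header(load: int, target: str,
                      throttle: int = 0, validity_ms: int = 500) -> str:
    """Format a Load header line for insertion into a SIP response.

    target is the address of the upstream neighbor, for example taken
    from the topmost Via header of the response.  The defaults mirror
    the default parameter values given in the text.
    """
    for name, value in (("load", load), ("throttle", throttle)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    if validity_ms <= 0:
        raise ValueError("validity_ms must be positive")
    return (f"Load: {load};target={target}"
            f";throttle={throttle};validity={validity_ms}")
```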

5.2.  Determining the Load Header Value

   The value of the Load header contains the current load status of the
   SIP server generating this header.  Load header values range from 0
   (idle) to 100 (overloaded) and MUST reflect the current level in the
   usage of SIP message processing resources.  For example, a SIP server
   that is processing SIP messages at a rate that corresponds to 50% of
   its maximum capacity must set the Load header value to 50.

5.3.  Determining the Throttle Parameter Value

   The value of the "throttle" parameter specifies the percentage by
   which the load forwarded to this SIP server should be reduced.
   Possible values range from 0 (the load forwarded is reduced by 0%,
   i.e., all traffic is forwarded) to 100 (the load forwarded is reduced
   by 100%, i.e., no traffic forwarded).  The default value of the
   "throttle" parameter is 0.  The "throttle" parameter value is
   determined by the control function of the SIP server generating the
   Load header.

      OPEN ISSUE: this parameter depends on the overload control method
      used (e.g., whether rate-based or window-based overload control is
      used).

5.4.  Processing the Load Header

   A SIP entity compliant to this specification MUST remove all Load
   headers from the SIP messages it receives before forwarding the
   message.  A SIP entity may, of course, insert its own Load header
   into a SIP message.

   A SIP entity MUST ignore all Load headers that were not addressed to
   it.  It MUST compare its own addresses with the address in the
   "target" parameter of the Load header.  If none of its addresses
   match, it MUST ignore the Load header.  This ensures that a SIP
   entity only processes Load headers that were generated by its direct
   neighbors.

   A SIP server MUST store the information received in Load headers from
   a downstream neighbor in a server load table.  Each time a SIP server
   receives a response with a Load header from a downstream neighbor, it
   MUST overwrite the current value it has stored for this neighbor
   with the one received.  Each entry in the server load table has the
   following elements:

   o  Address of the server from which the Load header was received.
   o  Time when the header was received.
   o  Load header value.
   o  Throttle parameter value (default value if not present).
   o  Validity parameter value (default value if not present).

   A SIP entity SHOULD slowly fade out the contents of Load headers that
   have exceeded their expiration time by additively decreasing the Load
   header and throttle parameter values until they reach zero.  This is
   achieved by using the following equation to access stored Load header
   and "throttle" parameter values.  Note that this equation is only
   used to access Load header and "throttle" parameter values and the
   result is not written back into the table.

      result = value - ((cur_t - rec_t) DIV validity) * 20

   If the result is negative, zero is used instead.  Value is the stored
   value of the Load header/the "throttle" parameter.  Cur_t is the
   current time in milliseconds, rec_t is the time the Load header was
   received.  Validity is the "validity" parameter value.  DIV is a
   function that returns the integer portion of a division.

   The idea behind this equation is to subtract 20 from the value for
   each validity period that has passed since the header was received.
   A value of 100, for example, will be reduced to 80 after the first
   validity period and it will be completely removed after 5 * validity
   milliseconds.

   A stored Load header is removed from the table when the above
   equation returns zero for both the Load header and "throttle"
   parameter values.
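
   The server load table entry and the fade-out equation can be sketched
   in Python as shown below.  The class and function names are
   assumptions of this example.  The sketch also assumes a "validity"
   value greater than zero and treats a receive time in the future as
   zero elapsed time.

```python
from dataclasses import dataclass

FADE_STEP = 20  # amount subtracted per elapsed validity period


@dataclass
class LoadEntry:
    address: str         # server the Load header was received from
    received_ms: int     # time the header was received (rec_t), in ms
    load: int            # Load header value
    throttle: int = 0    # "throttle" parameter (default 0)
    validity: int = 500  # "validity" parameter in ms (default 500)


def faded(value: int, entry: LoadEntry, now_ms: int) -> int:
    """Apply result = value - ((cur_t - rec_t) DIV validity) * 20.

    The result is only read; it is never written back into the entry.
    """
    elapsed = max(0, now_ms - entry.received_ms)
    periods = elapsed // entry.validity
    return max(0, value - periods * FADE_STEP)


def is_expired(entry: LoadEntry, now_ms: int) -> bool:
    """An entry can be removed once both faded values are zero."""
    return (faded(entry.load, entry, now_ms) == 0
            and faded(entry.throttle, entry, now_ms) == 0)
```

   For an entry received at time 1000 with a Load value of 100 and a
   validity of 500, faded() returns 100 up to time 1499, 80 at time
   1500, and 0 from time 3500 onwards.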

5.5.  Using the Load Header Value

   A SIP entity MAY use the Load header value to balance load or to find
   an underutilized SIP server.

5.6.  Using the Throttle Parameter Value

   A SIP entity compliant to this specification MUST honor "throttle"
   parameter values when forwarding SIP messages to a downstream SIP
   server.

   A SIP entity applies the usual SIP procedures to determine the next
   hop SIP server as, e.g., described in [2] and [3].  After selecting
   the next hop server, the SIP entity MUST determine if it has a stored
   Load header from this server that has not yet fully expired.  If it
   has a Load header and the header contained a throttle parameter that
   is non-zero, the SIP entity MUST determine if it can or cannot
   forward the current request within the current throttle conditions.

   The SIP entity MAY use the following algorithm to determine if it can
   forward the request.  The SIP entity draws a random number between 1
   and 100 for the current request.  If the random number is less than
   or equal to the throttle value, the request is not forwarded.
   Otherwise, the request is forwarded as usual.  Another algorithm for
   SIP entities that process a large number of requests is to reject/
   redirect the first X of every 100 requests processed, where X is the
   throttle value.  Other algorithms that lead to the same result may be
   used as well.
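
   Both algorithms mentioned above can be sketched in Python as follows.
   The function and class names are assumptions of this example, and
   the random number generator is passed in explicitly so that the
   behavior can be reproduced in tests.

```python
import random


def forward_random(throttle: int, rng: random.Random) -> bool:
    """Random-draw variant: draw a number between 1 and 100 and do not
    forward the request if the draw is less than or equal to throttle."""
    draw = rng.randint(1, 100)
    return draw > throttle


class CountingThrottle:
    """Counting variant: reject/redirect the first X of every 100
    requests, where X is the throttle value."""

    def __init__(self) -> None:
        self._position = 0  # index of the next request in its block of 100

    def forward(self, throttle: int) -> bool:
        allowed = self._position >= throttle
        self._position = (self._position + 1) % 100
        return allowed
```

   The random-draw variant spreads rejections evenly over time but only
   meets the throttle value on average.  The counting variant meets it
   exactly for every block of 100 requests, at the cost of rejecting
   requests in bursts.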

   The treatment of SIP requests that cannot be forwarded to the
   selected SIP Server is a matter of local policy.  A SIP entity MAY
   try to find an alternative target or it MAY reject the request (see
   Section 5.7).

5.7.  Rejecting Requests

   A SIP server that rejects a request because of overload MUST reject
   this request with a 520 response code. 520 is a new SIP response code
   defined in this specification.  This response code is used for
   requests that are rejected because the SIP server has received an
   indication that its downstream neighbor is overloaded or because the
   SIP server itself is under overload.

      OPEN ISSUE: does it make sense to use a new response code or
      should 500 be used?  Is 520 a reasonable code to use?

   A SIP server may determine that an upstream neighbor does not support
   this extension.  If a SIP server is under overload, it SHOULD use 520
   responses to reject a fraction of requests from upstream neighbors
   that do not support this extension.  This fraction SHOULD be
   equivalent to the fraction of requests the upstream server would
   reject/redirect if it did support this extension.  This is to ensure
   that SIP entities that do not support this extension do not receive
   an unfair advantage over those that do.

   A SIP server that has reached overload (i.e., a load close to 100)
   SHOULD start using 520 responses in addition to the throttle
   parameter in the Load header for all upstream neighbors.  If the
   proxy has reached a load close to 100, it needs to protect itself
   against overload.  Also, it is likely that upstream proxies have
   ignored the increasing load status reports and thus do not support
   this extension.


6.  Syntax

   This section defines the syntax of a new SIP response header, the
   Load header.  The Load header field is used to advertise the current
   load status information of a SIP entity to its upstream neighbor.

   The value of the Load header is an integer between 0 and 100 with the
   value of 0 indicating that the proxy is least overloaded and the
   value of 100 indicating that the proxy is most overloaded.

   The "target" parameter is mandatory and contains the URI of the next
   hop SIP entity for the response, i.e., the SIP entity the response
   is forwarded to.  This is the entity that will process the Load
   header.

   The "throttle" parameter is optional and contains a number between 0
   and 100.  It describes the percentage by which the load forwarded by
   the "target" SIP entity to the SIP server generating this header
   should be reduced.

   The "validity" parameter is optional and contains the time in
   milliseconds during which the Load header should be considered valid,
   i.e., an indication of how long the reporting proxy is likely to
   remain in the given load status.

   The syntax of the Load header field is:

     Load              = "Load" HCOLON loadStatus
     loadStatus        = percentage SEMI serverID *( SEMI loadParam )
     loadParam         = throttleRate / validMS / generic-param
     serverID          = "target" EQUAL ( SIP-URI / SIPS-URI )
     throttleRate      = "throttle" EQUAL percentage
     validMS           = "validity" EQUAL delta-ms
     percentage        = 1*3DIGIT ; 0 to 100
     delta-ms          = 1*DIGIT

   The BNF for SIP-URI, SIPS-URI and generic-param is defined in [2].

   Table 1 is an extension of Tables 2 and 3 in [2].

     Header field       where   proxy ACK BYE CAN INV OPT REG
     ________________________________________________________
     Load                 r       ar   -   o   o   o   o   o
                   Table 1: Load Header Field

   Example:

      Load: 80;target=sip:p1.example.com;throttle=20;validity=500
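
   A simplified Python sketch of parsing a Load header such as the
   example above is shown below.  It is not a full implementation of
   the grammar: it accepts parameters in any order, splits on every
   semicolon (so a target URI carrying its own URI parameters would be
   mis-parsed) and ignores unknown parameters.  The function name and
   the dictionary layout of the result are assumptions of this example.

```python
def parse_load_header(line: str) -> dict:
    """Parse a Load header line into its value and parameters."""
    name, _, rest = line.partition(":")
    if name.strip().lower() != "load":
        raise ValueError("not a Load header")
    parts = [p.strip() for p in rest.split(";")]
    # Defaults follow the text: throttle 0, validity 500 ms.
    result = {"load": int(parts[0]), "target": None,
              "throttle": 0, "validity": 500}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        key = key.strip().lower()
        if key == "target":
            result["target"] = value.strip()
        elif key in ("throttle", "validity"):
            result[key] = int(value)
        # Any other parameter (generic-param) is ignored in this sketch.
    if result["target"] is None:
        raise ValueError("missing mandatory target parameter")
    if not 0 <= result["load"] <= 100:
        raise ValueError("load must be between 0 and 100")
    if not 0 <= result["throttle"] <= 100:
        raise ValueError("throttle must be between 0 and 100")
    return result
```

   A receiver would then compare the parsed "target" value with its own
   addresses before using the load or throttle values.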


7.  Security Considerations

   Overload control mechanisms can be used by an attacker to conduct a
   denial-of-service attack on a SIP entity if the attacker can pretend
   that the SIP entity is overloaded.  When such a forged overload
   indication is received by an upstream SIP entity, it will stop
   sending traffic to the victim.  Thus, the victim is subject to a
   denial-of-service attack.

   An attacker can create forged load status reports by inserting itself
   into the communication between the victim and its upstream neighbors.
   The attacker would need to add status reports indicating a high load
   to the responses passed from the victim to its upstream neighbor.
   Proxies can prevent this attack by communicating via TLS.  Since load
   status reports have no meaning beyond the next hop, there is no need
   to secure the communication over multiple hops.

   Another way to conduct an attack is to send a message containing a
   high load status value through a proxy that does not support this
   extension.  Since this proxy does not remove the load status
   information, it will reach the next upstream proxy.  If the attacker
   can make the recipient believe that the load status was created by
   its direct downstream neighbor (and not by the attacker further
   downstream), the recipient stops sending traffic to the victim.  A
   precondition for this attack is that the victim proxy does not
   support this extension since it would not pass through load status
   information otherwise.  The attack also does not work if there is a
   stateful proxy between the attacker and the victim and only 100
   (Trying) responses are used to convey the Load header.

   A malicious SIP entity could gain an advantage by pretending to
   support this specification but never reducing the load it forwards to
   the downstream neighbor.  If its downstream neighbor receives traffic
   from multiple sources which correctly implement overload control, the
   malicious SIP entity would benefit since all other sources to its
   downstream neighbor would reduce load.

      OPEN ISSUE: the solution to this problem depends on the overload
      control algorithm.  For a fixed message rate and window-based
      overload control, it is very easy for a downstream entity to
      monitor if the upstream neighbor throttles load as directed.  For
      percentage throttling this is not always obvious since the load
      forwarded depends on the load received by the upstream neighbor.


8.  IANA Considerations

   [TBD.]


Appendix A.  Acknowledgements

   Many thanks to Rich Terpstra and Jonathan Rosenberg for their
   contributions to this specification.


9.  References

9.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [3]  Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol
        (SIP): Locating SIP Servers", RFC 3263, June 2002.

   [4]  Schulzrinne, H. and J. Polk, "Communications Resource Priority
        for the Session Initiation Protocol (SIP)", RFC 4412,
        February 2006.

9.2.  Informative References

   [5]  Rosenberg, J., "Requirements for Management of Overload in the
        Session Initiation Protocol",
        draft-rosenberg-sipping-overload-reqs-02 (work in progress),
        October 2006.

   [6]  Liu, Z., "Applying Signaling Compression (SigComp) to the
        Session Initiation Protocol  (SIP)",
        draft-ietf-rohc-sigcomp-sip-04 (work in progress),
        November 2006.

   [7]  Rosen, B., "Framework for Emergency Calling in Internet
        Multimedia", draft-ietf-ecrit-framework-00 (work in progress),
        October 2006.


Authors' Addresses

   Volker Hilt
   Bell Labs/Alcatel-Lucent
   101 Crawfords Corner Rd
   Holmdel, NJ  07733
   USA

   Email: volkerh@bell-labs.com


   Indra Widjaja
   Bell Labs/Alcatel-Lucent
   600-700 Mountain Avenue
   Murray Hill, NJ  07974
   USA

   Email: iwidjaja@alcatel-lucent.com


   Daryl Malas
   Level 3 Communications
   1025 Eldorado Blvd.
   Broomfield, CO
   USA

   Email: daryl.malas@level3.com


   Henning Schulzrinne
   Columbia University/Department of Computer Science
   450 Computer Science Building
   New York, NY  10027
   USA

   Phone: +1 212 939 7004
   Email: hgs@cs.columbia.edu
   URI:   http://www.cs.columbia.edu


Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).
