SIPPING Working Group V. Hilt Internet-Draft I. Widjaja Expires: September 4, 2007 Bell Labs/Alcatel-Lucent D. Malas Level 3 Communications H. Schulzrinne Columbia University March 3, 2007 Session Initiation Protocol (SIP) Overload Control draft-hilt-sipping-overload-01 Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on September 4, 2007. Copyright Notice Copyright (C) The IETF Trust (2007). Abstract Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive. Even though the SIP protocol provides a limited overload control mechanism through its 503 response code, SIP servers Hilt, et al. Expires September 4, 2007 [Page 1]
Internet-Draft Overload Control March 2007 are still vulnerable to overload. This document proposes several new overload control mechanisms for the SIP protocol. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Design Considerations . . . . . . . . . . . . . . . . . . . . 4 3.1. System Model . . . . . . . . . . . . . . . . . . . . . . . 4 3.2. Hop-by-Hop vs. End-to-End . . . . . . . . . . . . . . . . 5 3.3. Topologies . . . . . . . . . . . . . . . . . . . . . . . . 7 3.4. Overload Control Method . . . . . . . . . . . . . . . . . 9 3.4.1. Rate-based Overload Control . . . . . . . . . . . . . 9 3.4.2. Loss-based Overload Control . . . . . . . . . . . . . 10 3.4.3. Window-based Overload Control . . . . . . . . . . . . 10 3.5. Overload Control Algorithms . . . . . . . . . . . . . . . 11 3.5.1. Increase Algorithm . . . . . . . . . . . . . . . . . . 12 3.5.2. Decrease Algorithm . . . . . . . . . . . . . . . . . . 12 3.6. Load Status . . . . . . . . . . . . . . . . . . . . . . . 12 3.7. SIP Mechanism . . . . . . . . . . . . . . . . . . . . . . 13 3.8. Backwards Compatibility . . . . . . . . . . . . . . . . . 13 3.9. Interaction with Local Overload Control . . . . . . . . . 14 4. SIP Application Considerations . . . . . . . . . . . . . . . . 14 4.1. How to Calculate Load Levels . . . . . . . . . . . . . . . 14 4.2. Responding to an Overload Indication . . . . . . . . . . . 15 4.3. Emergency Services Requests . . . . . . . . . . . . . . . 15 4.4. Operations and Management . . . . . . . . . . . . . . . . 16 5. SIP Load Header Field . . . . . . . . . . . . . . . . . . . . 16 5.1. Generating the Load Header . . . . . . . . . . . . . . . . 16 5.2. Determining the Load Header Value . . . . . . . . . . . . 17 5.3. Determining the Throttle Parameter Value . . . . . . . . . 17 5.4. Processing the Load Header . . . . . . . . . . . . . . . . 18 5.5. Using the Load Header Value . 
. . . . . . . . . . . . . . 19 5.6. Using the Throttle Parameter Value . . . . . . . . . . . . 19 5.7. Rejecting Requests . . . . . . . . . . . . . . . . . . . . 20 6. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 7. Security Considerations . . . . . . . . . . . . . . . . . . . 21 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 22 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 9.1. Normative References . . . . . . . . . . . . . . . . . . . 22 9.2. Informative References . . . . . . . . . . . . . . . . . . 23 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23 Intellectual Property and Copyright Statements . . . . . . . . . . 25 Hilt, et al. Expires September 4, 2007 [Page 2]
1. Introduction

As with any network element, a Session Initiation Protocol (SIP) [2] server can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. SIP server overload can pose a serious problem. During periods of overload, the throughput of a SIP network can be significantly degraded. In particular, SIP server overload may lead to a situation in which the throughput drops to a small fraction of the original capacity of the network. This is often called congestion collapse.

The SIP protocol provides a limited mechanism for overload control through its 503 response code. However, this mechanism cannot prevent SIP server overload and it cannot prevent congestion collapse in a network of SIP servers. In fact, the 503 response code mechanism may cause traffic to move back and forth between SIP servers and thereby worsen an overload condition. A detailed discussion of the SIP overload problem, the 503 response code and the requirements for a SIP overload control solution can be found in [5].

Overload is said to occur if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU processing capacity, memory, network bandwidth, input/output, or disk resources. Generally speaking, overload occurs if a SIP server can no longer process or respond to all incoming SIP messages.

We only consider failure cases where SIP servers cannot process all incoming SIP requests. There are other failure cases where the SIP server can process, but not fulfill, requests. These are beyond the scope of this document since SIP provides other response codes for these cases and overload control MUST NOT be used to handle these scenarios. For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [4].
Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [2].

This specification is structured as follows: Section 3 discusses general design principles of a SIP overload control mechanism. Section 4 discusses general considerations for applying SIP overload control. Section 5 defines a SIP protocol extension for overload control and Section 6 introduces the syntax of this extension. Section 7 and Section 8 discuss security and IANA considerations, respectively.
2. Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [1] and indicate requirement levels for compliant implementations.

3. Design Considerations

This section discusses key design considerations for a SIP overload control mechanism. The goal for this mechanism is to prevent upstream servers from sending SIP messages to an overloaded downstream server, rather than rejecting messages at the overloaded server after they have already been sent.

3.1. System Model

The model shown in Figure 1 identifies fundamental components of a SIP overload control system:

o SIP Processor: component that processes SIP messages. The SIP processor is the component that is protected by overload control.

o Monitor: component that monitors the current load of the SIP processor on the receiving entity. The monitor implements the mechanisms needed to measure the current usage of resources relevant for the SIP processor. It reports load samples (S) to the Control Function.

o Control Function: component that implements the actual overload control mechanism on the receiving and sending entity. The control function uses the load samples (S) provided by the monitor. It determines if overload has occurred and a throttle (T) needs to be set to adjust the load sent to the SIP processor on the receiving entity. The control function on the receiving entity sends load feedback (F) to the control function on the sending entity.

o Actuator: component that acts on the throttles (T) generated by the control function and adjusts the load forwarded to the receiving entity accordingly. For example, a throttle may instruct the actuator to reduce the load destined to the receiving entity by 10%. The actuator decides how the load reduction is achieved (e.g., by redirecting or rejecting requests).
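As an illustration outside the draft itself, the four components above might be wired together as in the following Python sketch. All class names, the 80% target utilization, and the proportional throttle formula are assumptions made for this sketch, not part of this specification:

```python
class SIPProcessor:
    """The protected component; exposes its current utilization (0.0-1.0)."""
    def __init__(self):
        self.utilization = 0.0

class Monitor:
    """Measures the SIP processor and reports load samples (S)."""
    def __init__(self, processor):
        self.processor = processor

    def sample(self):
        return self.processor.utilization

class ControlFunction:
    """Turns load samples (S) into load feedback (F), i.e., a throttle (T)."""
    def __init__(self, target=0.8):
        self.target = target  # assumed target utilization

    def feedback(self, sample):
        # Shed the fraction of traffic by which utilization exceeds the target.
        if sample <= self.target:
            return 0.0
        return min(1.0, (sample - self.target) / sample)

class Actuator:
    """Acts on the throttle (T): admits or sheds each request."""
    def __init__(self, throttle=0.0):
        self.throttle = throttle

    def admit(self, draw):
        # draw is a uniform random number in [0, 1) for this request;
        # "shedding" can mean redirecting or rejecting the request.
        return draw >= self.throttle
```

In this sketch the receiving entity runs the monitor and control function, while the sending entity runs the actuator on the feedback it receives.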
The type of feedback (F) conveyed from the receiving to the sending entity depends on the overload control method used (i.e., rate-based, loss-based or window-based overload control; see Section 3.4) as well as other design parameters (e.g., whether load status information is
included or not). In any case, the feedback (F) informs the sending entity that overload has occurred and that the traffic forwarded to the receiving entity needs to be reduced to a lower rate.

        Sending                Receiving
        Entity                 Entity
    +----------------+      +----------------+
    |    Server A    |      |    Server B    |
    |  +----------+  |      |  +----------+  |  -+
    |  | Control  |  |  F   |  | Control  |  |   |
    |  | Function |<-+------+--| Function |  |   |
    |  +----------+  |      |  +----------+  |   |
    |      | T       |      |       ^        |   | Overload
    |      v         |      |       | S      |   | Control
    |  +----------+  |      |  +----------+  |   |
    |  | Actuator |  |      |  | Monitor  |  |   |
    |  +----------+  |      |  +----------+  |   |
    |      |         |      |       ^        |  -+
    |      v         |      |       |        |  -+
    |  +----------+  |      |  +----------+  |   |
  <-+--|   SIP    |  |      |  |   SIP    |  |   | SIP
  --+->|Processor |--+------+->|Processor |--+-> | System
    |  +----------+  |      |  +----------+  |   |
    +----------------+      +----------------+  -+

        Figure 1: System Model for Overload Control

3.2. Hop-by-Hop vs. End-to-End

A SIP request is often processed by more than one SIP server. Thus, overload control can in theory be applied hop-by-hop, i.e., individually between each pair of servers, or end-to-end as a single control loop that stretches across the entire path from UAC to UAS (see Figure 2).
Internet-Draft Overload Control March 2007 +---------+ +-------+----------+ +------+ | | | ^ | | | | +---+ | | +---+ v | v //=>| C | v | //=>| C | +---+ +---+ // +---+ +---+ +---+ // +---+ | A |===>| B | | A |===>| B | +---+ +---+ \\ +---+ +---+ +---+ \\ +---+ ^ \\=>| D | ^ | \\=>| D | | +---+ | | +---+ | | | v | +---------+ +-------+----------+ (a) hop-by-hop loop (b) end-to-end loop ==> SIP request flow <-- Load feedback loop Figure 2: Hop-by-Hop vs. End-to-End In the hop-by-hop model, a separate overload control loop is instantiated between each pair of neighboring SIP servers on the path of a SIP request. Each SIP server provides load feedback to its upstream neighbors, which then adjust the amount of traffic they are forwarding to the SIP server. However, the neighbors do not forward the received feedback information further upstream. Instead, they act on the feedback and resolve the overload condition if needed, for example, by re-routing or rejecting traffic. The upstream neighbor of a server can, and should, use a separate overload control loop with its upstream neighbors. If the neighbor becomes overloaded, it will report this problem to its upstream neighbors, which again take action based on the reported feedback. Thus, in hop-by-hop overload control, overload is resolved by the direct upstream neighbors of the overloaded server without the need to involve entities that are located multiple SIP hops away. Hop-by-hop overload control can effectively reduce the impact of overload on a SIP network and, in particular, can avoid congestion collapse. In addition, hop-by-hop overload control is simple and scales well to networks with many SIP entities. It does not require a SIP entity to aggregate a large number of load status values or keep track of the load status of SIP servers it is not communicating with. End-to-end overload control implements an overload control loop along the entire path of a SIP request, from UAC to UAS. 
An end-to-end overload control mechanism needs to consider load information from
all SIP servers on the way (including all proxies and the UAS). It has to be able to frequently collect the load status of all servers on the potential path(s) to a destination and combine this data into meaningful load feedback. A UA or SIP server should not throttle its load unless it knows that all potential paths to the destination are overloaded. Overall, the main problem of end-to-end overload control is its inherent complexity, since a UAC or SIP server would need to monitor all potential paths to a destination in order to know when to throttle. Therefore, end-to-end overload control is likely to work only if a UA/server sends many requests to the exact same destination.

3.3. Topologies

A simple topology for overload control is a SIP server that receives traffic from a single source (as shown in Figure 3(a)). A load balancer is a typical example for this configuration. In a more complex topology, a SIP server receives traffic from multiple upstream sources. This is shown in Figure 3(b), where SIP servers A, B and C forward traffic to server D. It is important to note that each of these servers may contribute a different amount of load to the overall load of D. This load mix may vary over time. If server D becomes overloaded, it generates feedback to reduce the amount of traffic it receives from its upstream neighbors (i.e., server A in Figure 3(a), or servers A, B and C in Figure 3(b)).

If a SIP server (server D) becomes overloaded, it needs to decide how overload control feedback is balanced across upstream neighbors. This decision needs to account for the actual amount of traffic received from an upstream neighbor. The decision may need to be re-adjusted as the load contributed by each upstream neighbor varies over time. A server may use a local policy to decide how much load it wants to receive from each upstream neighbor.
For example, a server may throttle all upstream sources equally (e.g., all sources need to reduce the traffic they forward by 10%) or prefer some servers over others. For example, it may want to throttle a less preferred upstream neighbor earlier than a preferred neighbor, or first throttle the neighbor that sends the most traffic. Since this decision is made by the receiving entity (i.e., server D), all senders for this entity are governed by the same overload control algorithm.

In many network configurations, upstream servers (A, B and C) have alternative servers (server E) to which they can redirect excess messages if the primary target (server D) is overloaded (see Figure 3(c)). Servers D and E may differ in their processing capacity. When redirecting messages, the upstream servers need to
ensure that these messages do not overload the alternate server. An overload control mechanism should enable upstream servers to choose only those alternative servers that have enough capacity to handle the redirected requests.

+---+ +---+ /->| D | | A |-\ / +---+ +---+ \ / \ +---+ +---+-/ +---+ +---+ \->| | | A |------>| E | | B |------>| D | +---+-\ +---+ +---+ /->| | \ / +---+ \ +---+ +---+ / \->| F | | C |-/ +---+ +---+ (a) load balancer w/ (b) multiple upstream alternate servers neighbors +---+ | A |---\ a--\ +---+=\ \---->+---+ \ \/----->| D | b--\ \--->+---+ +---+--/\ /-->+---+ \---->| | | B | \/ c-------->| D | +---+===\/\===>+---+ | | /\====>| E | ... /--->+---+ +---+--/ /==>+---+ / | C |=====/ z--/ +---+ (c) multiple upstream (d) very large number of neighbors w/ alternate server upstream neighbors

Figure 3: Topologies

Overload control that is based on throttling the message rate is not suited for servers that receive requests from a very large population of senders, which only infrequently send requests, as shown in Figure 3(d). An edge proxy that is connected to many UAs is an example of such a configuration. Since each UA typically only contributes a single request to an overload condition, it cannot decrease its message rate to resolve the overload.

In such a configuration, a SIP server can gradually reduce its load
by rejecting a percentage of the requests it receives with 503 responses. Since there are many upstream neighbors that contribute to the overall load, sending 503 to a fraction of them gradually reduces load without entirely stopping the incoming traffic and helps to resolve the overload condition in this scenario.

3.4. Overload Control Method

The method used by an overload control mechanism to curb the amount of traffic forwarded to an element is a key aspect of the design. Three different types of overload control methods exist: rate-based, loss-based and window-based overload control.

3.4.1. Rate-based Overload Control

The key idea of rate-based overload control is to indicate the message rate that an upstream element is allowed to send to the downstream neighbor. If overload occurs, a SIP server instructs each upstream neighbor to send at most X messages per second. This rate cap ensures that the offered load for a SIP server never increases beyond the sum of the rate caps granted to all upstream neighbors and can protect a SIP server from overload even during extreme load spikes.

A common technique to implement a rate cap of X messages per second is message gapping. After transmitting a message to a downstream neighbor, a server waits for 1/X seconds before it transmits the next message to the same neighbor. Messages that arrive during the waiting period are not forwarded and are either redirected, rejected or buffered.

The main drawback of this mechanism is that it requires a SIP server to assign a certain rate cap to each of its upstream neighbors based on its overall capacity. Effectively, a server assigns a share of its capacity to each upstream neighbor. The server needs to ensure that the sum of all rate caps assigned to upstream neighbors is not (significantly) higher than its actual processing capacity.
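The message-gapping technique described above might be sketched as follows. This is an illustrative Python sketch, not part of the specification; the class name and the injectable clock are assumptions made for testability:

```python
import time

class MessageGapper:
    """Enforces a rate cap of X messages per second toward one downstream
    neighbor by spacing transmissions at least 1/X seconds apart."""
    def __init__(self, rate_cap, clock=time.monotonic):
        self.gap = 1.0 / rate_cap   # minimum spacing between messages
        self.clock = clock
        self.next_allowed = 0.0     # earliest time the next send may occur

    def try_send(self):
        now = self.clock()
        if now < self.next_allowed:
            return False            # caller must redirect, reject or buffer
        self.next_allowed = now + self.gap
        return True
```

With a rate cap of 10 messages per second, a second message offered within 0.1 seconds of the first is held back.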
This requires a SIP server to continuously evaluate the amount of load it receives from an upstream neighbor and assign a rate cap that is suitable for this neighbor. For example, in a non-overloaded situation, it could assign a rate cap that is 10% higher than the current rate from this neighbor. The rate cap needs to be adjusted if the load offered by upstream neighbors changes, if new upstream neighbors appear or if an existing neighbor stops transmitting. If the cap assigned to an upstream neighbor is too high, the server may still experience overload. However, if the cap is too low, the upstream neighbors will reject messages even though they could be processed by the server. Thus, rate-based overload control is likely
to work well only if the number of upstream servers is small and constant, e.g., as in the configurations shown in Figure 3(a) and (b).

3.4.2. Loss-based Overload Control

A loss percentage enables a SIP server to ask its upstream neighbor to reduce the amount of traffic it would normally forward to this server by a percentage X. For example, a SIP server can ask its upstream neighbors to lower the traffic they would forward to it by 10%. The upstream neighbor then redirects or rejects X percent of the traffic that is destined for this server. A loss percentage can be implemented in the upstream entity, for example, by drawing a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.

A server does not need to track the message rate it receives from each upstream neighbor. To reduce load, a server can ask each upstream neighbor to lower traffic by a certain percentage, which can be determined independently of the actual message rate contributed by each server. The loss percentage depends on the loss percentage currently used by the upstream servers and the current system load of the server. For example, if the server load approaches 90% and the current loss percentage is set to a 50% load reduction, then the server may decide to increase the loss percentage to 55% in order to get back to a system load of 80%. Similarly, the server can lower the loss percentage if permitted by the system utilization. This requires that system load can be accurately measured and that these measurements are reasonably stable.

The main drawback of percentage throttling is that the throttle percentage needs to be adjusted to the offered load, in particular, if the load fluctuates quickly.
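As an illustration, the random-number implementation of a loss percentage described above, together with a crude version of the loss-percentage adjustment from the 90%-load example, might look like this Python sketch. The function names, the 5-point adjustment step and the 80% target are assumptions, not part of the specification:

```python
import random

def forward_with_loss(requests, loss_percent, rng=None):
    """Apply a loss-based throttle: for each request, draw a random number
    between 1 and 100; shed the request if the draw is <= loss_percent."""
    rng = rng or random.Random()
    forwarded, shed = [], []
    for req in requests:
        if rng.randint(1, 100) <= loss_percent:
            shed.append(req)        # redirect or reject this request
        else:
            forwarded.append(req)
    return forwarded, shed

def adjust_loss_percent(current_loss, load, target=0.80, step=5):
    """Nudge the loss percentage toward a target utilization (sketch)."""
    if load > target:
        return min(100, current_loss + step)   # shed more traffic
    if load < target:
        return max(0, current_loss - step)     # shed less traffic
    return current_loss
```

With a loss percentage of 10, roughly one in ten requests is shed; the exact count varies with the random draws.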
For example, if a SIP server sets a throttle value of 10% at time t1 and the offered load increases by 20% between time t1 and t2 (t1<t2), then the server will still see a load increase of roughly 8% between time t1 and t2 (1.2 x 0.9 = 1.08). This is true even though all upstream neighbors reduced traffic by 10% as told. Thus, percentage throttling requires the quick adjustment of the throttling percentage and may not always be able to prevent a server from encountering brief periods of overload in extreme cases.

3.4.3. Window-based Overload Control

The key idea of window-based overload control is to allow an entity to transmit a certain number of messages before it needs to receive a confirmation for the messages in transit. Each sender maintains an overload window that limits the number of messages that can be in
transit without being confirmed.

Each sender maintains an unconfirmed message counter for each downstream neighbor it is communicating with. For each message sent to the downstream neighbor, the counter is increased by one. For each confirmation received, the counter is decreased by one. The sender stops transmitting messages to the downstream neighbor when the unconfirmed message counter has reached the current window size.

A crucial parameter for the performance of window-based overload control is the window size. The window size together with the round-trip time between sender and receiver determines the effective message rate that can be achieved. Each sender has an initial window size it uses when first sending a request. This window size can change based on the feedback the sender receives from the receiver. The receiver can require a decrease in window size to throttle the sender or permit an increase to allow a higher message rate. The sender adjusts its window size as soon as it receives the corresponding feedback from the receiver. If the new window size is smaller than the current unconfirmed message counter, the sender MUST stop transmitting messages until more messages are confirmed and the current unconfirmed message counter is less than the window size.

A sender should not treat the reception of a 100 Trying response as an implicit confirmation for a message. 100 Trying responses are often created by a SIP server very early in the process and do not indicate that a message has been successfully processed and cleared from the input buffer. If the downstream neighbor is a stateless proxy, it will not create 100 Trying responses at all and instead pass through the 100 Trying responses created by the next stateful server. Also, 100 Trying responses are typically only created for INVITE requests. Explicit message confirmations in a load feedback report do not have these problems.
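The counter and window behavior described above might be sketched per downstream neighbor as follows. This Python sketch is illustrative only; the class and method names are assumptions:

```python
class OverloadWindow:
    """Window-based control toward one downstream neighbor: at most
    `window` messages may be unconfirmed (in transit) at any time."""
    def __init__(self, window):
        self.window = window
        self.unconfirmed = 0

    def can_send(self):
        return self.unconfirmed < self.window

    def on_send(self):
        # Called when a message is transmitted to the downstream neighbor.
        self.unconfirmed += 1

    def on_confirm(self):
        # Called when an explicit confirmation is received.
        self.unconfirmed -= 1

    def on_feedback(self, new_window):
        # Receiver-driven resize; if the new size is smaller than the
        # unconfirmed counter, can_send() stays False until enough
        # confirmations arrive.
        self.window = new_window
```

Transmission is thus clocked by confirmations: once the window is full, each new send must wait for a confirmation.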
The behavior and issues of window-based overload control are similar to those of rate-based overload control, in that the total available receiver buffer space needs to be divided among all upstream neighbors. However, unlike rate-based overload control, it can ensure that the receiver buffer never overflows. The transmission of messages by senders is effectively clocked by message confirmations received from the receiver.

3.5. Overload Control Algorithms

This section describes algorithms that govern the behavior of entities that implement the overload control mechanism.
OPEN ISSUE: the following overload control algorithms are strawman proposals. They depend on the overload control methods.

3.5.1. Increase Algorithm

A SIP server that is starting to transmit messages to a downstream neighbor should slowly increase the message rate until it reaches the rate allowed by the receiver. In particular, servers that are restarted or added to the network should execute this algorithm. The slow increase prevents a server from overloading its downstream neighbor by suddenly and significantly increasing load. A server should, however, not reject requests during this period. Instead, it should redirect or buffer requests it cannot forward right away. This prevents calls from being rejected in an empty network merely because of the slow increase algorithm.

A slow increase should also be used when the receiver has recovered from an overload condition and is increasing the allowed transmission rate. In this case, the sender may reject messages during the slow increase. A slow increase after overload prevents the receiver from becoming overloaded again when all senders suddenly increase their message rates.

OPEN ISSUE: this algorithm seems to be useful for loss-based overload control but may not be needed for other types of overload control.

3.5.2. Decrease Algorithm

A sender MUST immediately reduce its transmission rate when it receives an indication that the receiver has reached an overload condition.

3.6. Load Status

It may be useful for a SIP server to frequently provide its current load status to upstream neighbors. The load status indicates to which degree the resources needed by a SIP server to process SIP messages are utilized. SIP servers can use the load status to balance load between alternative proxies and to find under-utilized servers. It should be noted, however, that reporting load is not intended to replace specialized load balancing mechanisms.
OPEN ISSUE: reporting load status seems related but somewhat orthogonal to overload control. It might therefore be better to handle overload control and load reporting/balancing in separate mechanisms.
3.7. SIP Mechanism

A SIP mechanism needs to convey load feedback from the receiving to the sending SIP entity. A number of alternatives exist to realize such a mechanism.

In principle, it would be possible to define a new SIP request that can be used to convey load status reports from the receiving to the sending entity. However, sending separate load status requests from the receiving to the sending entity would create additional messaging overhead, which is undesirable during periods of overload. It would also require each SIP server to keep track of all potential upstream neighbors.

Similarly, it would be possible to define an event package for subscriptions to overload status. This package would enable a sending entity to subscribe to the load status of its downstream neighbor and receive status updates in NOTIFY messages. However, it would require each sending entity to set up a subscription to all entities it may forward traffic to and manage these subscriptions over time for a varying set of downstream neighbors. Setting up subscriptions and sending overload feedback in notifications creates an undesirable messaging overhead.

Another approach is to define a new SIP header field for load information that can be inserted into SIP responses. This approach has the advantage that it automatically provides load feedback to all upstream SIP entities that are currently forwarding traffic to a SIP server, with very little overhead. Hop-by-hop overload control requires that the distribution of load feedback is limited to the next upstream SIP server. This can be achieved by adding the address of the next-hop server, that is, the destination of the load report, to the Load header.

Conveying load feedback in a SIP response header requires that SIP traffic is flowing between the sending and the receiving entity. This is usually not a problem when regulating traffic between SIP servers.
Even in an overload situation that requires 100% throttling, an upstream server can forward an occasional request or send an OPTIONS request to probe the load status of the downstream neighbor.

3.8. Backwards Compatibility

An important requirement for an overload control mechanism is that it can be gradually introduced into a network and that it functions properly if only a fraction of the servers support it.

Hop-by-hop overload control does not require that all SIP entities in
a network support it. It can be used effectively between two adjacent SIP servers if both servers support this extension, and it does not depend on support from any other server or user agent. The more SIP servers in a network support this mechanism, the more effective it is, since it includes more of the servers in the load reporting and offloading process.

In topologies such as the ones depicted in Figure 3(b) and (c), a SIP server has multiple neighbors, of which only some may support overload control. If a server simply used this extension for overload control, only the neighbors that support it would throttle their load. The others would keep sending at the full rate and benefit from the throttling by the servers that support this extension. In other words, upstream neighbors that do not support overload control would be better off than those that do. A SIP server should therefore use 5xx responses towards upstream neighbors that do not support this specification. The server should reject the same fraction of requests with 5xx responses that would otherwise be rejected/redirected by the upstream neighbor if it supported overload control. For example, if the server has throttled the load by 10%, it should reject 10% of the requests from this neighbor with a 5xx response.

3.9. Interaction with Local Overload Control

Servers may want to protect themselves against overload by rejecting incoming messages with minimal effort when a server is overloaded. We refer to such mechanisms as local overload control. Local overload control can be used in conjunction with the mechanisms defined in this specification and provides an additional layer of protection against overload, for example, in cases where upstream servers do not support overload control. In general, servers should start to throttle upstream neighbors before using local overload control mechanisms to reject messages, i.e., at a lower level of load.

4.
SIP Application Considerations

4.1. How to Calculate Load Levels

Calculating an element's load level depends on the limiting resource for this element. There can be several contributing factors, such as CPU, memory, queue depth, calls per second or application threads. The element should consider itself to have reached a load level of 100% at the point at which it cannot reliably process any
more messages. If an element knows what its limiting resource is, it can calculate its load level based on this resource or use a combination of resources. The element should derive intermediate load levels relative to the point at which it becomes 100% loaded.

4.2. Responding to an Overload Indication

An element may receive a Load header indicating that it needs to reduce the traffic it sends to its downstream neighbor. An element can accomplish this task by sending some of the requests that would have gone to the overloaded element to a different destination, or by altering its load-balancing algorithm to lower the number of calls it will offer to the overloaded resource. It can also buffer requests in the hope that the overload condition will resolve quickly and the requests can still be forwarded in time. Finally, it can reject these requests.

The algorithm used to reduce load is open to element and vendor implementation. (Note: the load reduction algorithm does not specify the quantity of SIP messages allowed by the SIP server; it simply specifies a treatment of new requests based on a reported load level.) However, the algorithm should provide tunable parameters to allow the network operator to optimize it over time. The settings for these parameters could be different for a carrier network versus an enterprise or VSP network.

The goal of the algorithm is to alleviate load throughout the network. This means avoiding the propagation of load from one element to another. This also means trying to keep the network as full as possible without reaching 100% on any given element, except when all elements approach 100%. Balancing load across a network is the responsibility of the operator, but the Load header may help the operator to adjust the balancing in a more dynamic fashion by allowing the load-balancing algorithm to react to bursts or outages.

4.3.
Emergency Services Requests

It is generally recommended that proxy servers attempt to balance all SIP requests, and the associated resources, to a maximum load of 80%. In doing so, the servers are proactively tuned to allow an emergency services request [7] to be placed to any available upstream or downstream SIP device for immediate processing and delivery to the intended emergency services provider. In some cases, the load will increase beyond this point and a server will need to begin lowering the number of requests it forwards.

When a SIP server receives an emergency services request, the request should not be subjected to alleviation methods and should be processed immediately. In some cases, a SIP server may receive more emergency services requests than it is allowed to forward. This may happen, for example, to a SIP server that is serving an emergency service center. In these cases, and after rejecting/redirecting all non-emergency service requests, a SIP server should also include emergency service requests in the alleviation treatment to prevent the downstream server from becoming overloaded.

4.4. Operations and Management

The Load header information can be captured within Call Detail Records (CDRs) and SNMP traps for use in service reports. These service reports could be used for future network optimization.

5. SIP Load Header Field

This section defines a new SIP header field for overload control, the Load header. This header field follows the above design considerations for an overload control mechanism. Note: the Load header field is an initial proposal for an overload feedback mechanism. The design of this header depends on many of the design aspects discussed above (in particular, the overload control method discussed in Section 3.4).

5.1. Generating the Load Header

A SIP server compliant with this specification SHOULD provide load feedback to its upstream neighbors in a timely manner. It does so by inserting a Load header field into the SIP responses it forwards or creates. The Load header is a new header field defined in this specification. The Load header can be inserted into all response types, including provisional, success and failure responses. A SIP server SHOULD insert a Load header into all responses. A SIP server MAY choose to insert Load headers less frequently, for example, once every x milliseconds. This may be useful for SIP servers that receive a very high number of messages from the same upstream neighbor, or for servers with very low variability in their load measure.
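The insertion policy above (insert into every response, but optionally rate-limit per upstream neighbor) could be sketched as follows. This is a non-normative illustration; the class and parameter names are invented, responses are modeled as plain strings, and the minimum spacing is a local configuration choice rather than something this draft mandates:

```python
import time

class LoadHeaderInserter:
    """Inserts a Load header into outgoing responses, at most once every
    min_interval_ms per upstream neighbor (hypothetical sketch)."""

    def __init__(self, min_interval_ms=100):
        self.min_interval_ms = min_interval_ms
        self.last_sent = {}  # upstream address -> time of last insertion (ms)

    def maybe_insert(self, response, upstream, load, throttle, validity_ms=500):
        now = time.monotonic() * 1000
        if now - self.last_sent.get(upstream, float("-inf")) < self.min_interval_ms:
            return response  # a Load header was sent recently; skip this response
        self.last_sent[upstream] = now
        # Parameter order follows the example in Section 6.
        header = "Load: %d;throttle=%d;validity=%d;target=%s" % (
            load, throttle, validity_ms, upstream)
        return response + header + "\r\n"
```

A server with stable load could raise min_interval_ms (together with a longer "validity"), while a server whose load varies quickly would insert into every response.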
In any case, a SIP server SHOULD insert a Load header into a response well before the previous Load header sent to the same upstream neighbor expires. Only SIP servers that frequently insert Load headers into responses are protected against overload.

The Load header is only defined in SIP responses and MUST NOT be used in SIP requests. The Load header is only useful to the upstream neighbor of a SIP server, since this is the entity that can offload traffic by redirecting/rejecting new requests. If requests are forwarded in both directions between two SIP servers (i.e., the roles of upstream/downstream neighbors change), then responses also flow in both directions, which allows the two SIP servers to exchange load information. While Load headers in requests might increase the frequency with which load information is exchanged in these scenarios, this increase will rarely provide benefits and does not seem to justify the added overhead and complexity.

A SIP server MUST insert the address of its upstream neighbor into the "target" parameter of the Load header. It SHOULD use the address of the upstream neighbor found in the topmost Via header of the response for this purpose. The "target" parameter enables the receiver of a Load header to determine whether it should process the Load header (because it was generated by its downstream neighbor) or ignore it (because it was passed along by an entity that does not support this extension). Effectively, the "target" parameter implements the hop-by-hop semantics and prevents the use of load status information beyond the next hop.

OPEN ISSUE: instead of using the address in the Via header, it might make sense to use an application identifier similar to the one defined for SigComp [6].

A SIP server SHOULD add a "validity" parameter to the Load header. The "validity" parameter defines the time in milliseconds during which the Load header should be considered valid. The default value of the "validity" parameter is 500. A SIP server SHOULD use a shorter "validity" time if its load status varies quickly and MAY use a longer "validity" time if its current load level is more stable.

5.2. Determining the Load Header Value

The value of the Load header contains the current load status of the SIP server generating this header.
Load header values range from 0 (idle) to 100 (overloaded) and MUST reflect the current usage level of SIP message processing resources. For example, a SIP server that is processing SIP messages at a rate corresponding to 50% of its maximum capacity must set the Load header value to 50.

5.3. Determining the Throttle Parameter Value

The value of the "throttle" parameter specifies the percentage by which the load forwarded to this SIP server should be reduced. Possible values range from 0 (the load forwarded is reduced by 0%, i.e., all traffic is forwarded) to 100 (the load forwarded is reduced by 100%, i.e., no traffic is forwarded). The default value of the "throttle" parameter is 0. The "throttle" parameter value is determined by the control function of the SIP server generating the Load header.

OPEN ISSUE: this parameter depends on the overload control method used (e.g., whether rate-based or window-based overload control is used).

5.4. Processing the Load Header

A SIP entity compliant with this specification MUST remove all Load headers from the SIP messages it receives before forwarding them. A SIP entity may, of course, insert its own Load header into a SIP message.

A SIP entity MUST ignore all Load headers that are not addressed to it. It MUST compare its own addresses with the address in the "target" parameter of the Load header. If none of its addresses match, it MUST ignore the Load header. This ensures that a SIP entity only processes Load headers that were generated by its direct neighbors.

A SIP server MUST store the information received in Load headers from a downstream neighbor in a server load table. Each time a SIP server receives a response with a Load header from a downstream neighbor, it MUST overwrite the value it currently has stored for this neighbor with the one received. Each entry in the server load table has the following elements:

o Address of the server from which the Load header was received.
o Time when the header was received.
o Load header value.
o Throttle parameter value (default value if not present).
o Validity parameter value (default value if not present).

A SIP entity SHOULD slowly fade out the contents of Load headers that have exceeded their expiration time by additively decreasing the Load header and "throttle" parameter values until they reach zero. This is achieved by using the following equation to access stored Load header and "throttle" parameter values.
Note that this equation is only used to access the Load header and "throttle" parameter values; the result is not written back into the table.

result = value - ((cur_t - rec_t) DIV validity) * 20

If the result is negative, zero is used instead. Value is the stored value of the Load header or the "throttle" parameter. Cur_t is the current time in milliseconds and rec_t is the time the Load header was received. Validity is the "validity" parameter value. DIV is a function that returns the integer portion of a division.

The idea behind this equation is to subtract 20 from the value for each validity period that has passed since the header was received. A value of 100, for example, will be reduced to 80 after the first validity period and will be completely removed after 5 * validity milliseconds. A stored Load header is removed from the table when the above equation returns zero for both the Load header value and the "throttle" parameter value.

5.5. Using the Load Header Value

A SIP entity MAY use the Load header value to balance load or to find an underutilized SIP server.

5.6. Using the Throttle Parameter Value

A SIP entity compliant with this specification MUST honor "throttle" parameter values when forwarding SIP messages to a downstream SIP server. A SIP entity applies the usual SIP procedures to determine the next hop SIP server as described, e.g., in [2] and [3]. After selecting the next hop server, the SIP entity MUST determine whether it has a stored Load header from this server that has not yet fully expired. If it has such a Load header and the header contained a non-zero "throttle" parameter, the SIP server MUST determine whether it can forward the current request within the current throttle conditions.

The SIP entity MAY use the following algorithm to make this determination: it draws a random number between 1 and 100 for the current request. If the random number is less than or equal to the throttle value, the request is not forwarded. Otherwise, the request is forwarded as usual. Another algorithm, for SIP entities that process a large number of requests, is to reject/redirect the first X of every 100 requests processed.
Other algorithms that lead to the same result may be used as well. The treatment of SIP requests that cannot be forwarded to the selected SIP server is a matter of local policy. A SIP entity MAY try to find an alternative target or it MAY reject the request (see Section 5.7).
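As a non-normative sketch, the server load table of Section 5.4, the fade-out equation, and the random-number throttle algorithm above might look like this in code (all class and function names are invented for illustration):

```python
import random
import time

def now_ms():
    return time.monotonic() * 1000

class ServerLoadTable:
    """Stores the most recent Load header per downstream neighbor and
    fades stored values out after their validity period (sketch)."""

    def __init__(self):
        self.entries = {}  # address -> (received_ms, load, throttle, validity_ms)

    def update(self, address, load, throttle=0, validity_ms=500):
        # A new Load header overwrites the stored value for this neighbor.
        self.entries[address] = (now_ms(), load, throttle, validity_ms)

    def _faded(self, value, received_ms, validity_ms):
        # result = value - ((cur_t - rec_t) DIV validity) * 20, floored at 0
        periods = int((now_ms() - received_ms) // validity_ms)
        return max(0, value - periods * 20)

    def throttle(self, address):
        entry = self.entries.get(address)
        if entry is None:
            return 0
        received_ms, load, throttle, validity_ms = entry
        if (self._faded(load, received_ms, validity_ms) == 0
                and self._faded(throttle, received_ms, validity_ms) == 0):
            del self.entries[address]  # both values faded to zero: drop entry
            return 0
        return self._faded(throttle, received_ms, validity_ms)

def may_forward(table, next_hop):
    """Random-drop algorithm: forward unless a random number in 1..100
    is less than or equal to the current throttle value."""
    return random.randint(1, 100) > table.throttle(next_hop)
```

With a throttle value of 20, roughly 20% of requests fail the random draw and must be redirected, rejected with the new response code, or sent to an alternative target, per local policy.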
5.7. Rejecting Requests

A SIP server that rejects a request because of overload MUST reject this request with a 520 response code. 520 is a new SIP response code defined in this specification. This response code is used for requests that are rejected because the SIP server has received an indication that its downstream neighbor is overloaded, or because the SIP server itself is overloaded.

OPEN ISSUE: does it make sense to use a new response code or should 500 be used? Is 520 a reasonable code to use?

A SIP server may determine that an upstream neighbor does not support this extension. If a SIP server is overloaded, it SHOULD use 520 responses to reject a fraction of the requests from upstream neighbors that do not support this extension. This fraction SHOULD be equivalent to the fraction of requests the upstream server would reject/redirect if it did support this extension. This ensures that SIP entities that do not support this extension do not gain an unfair advantage over those that do.

A SIP server that has reached overload (i.e., a load close to 100) SHOULD start using 520 responses in addition to the "throttle" parameter in the Load header for all upstream neighbors. If the proxy has reached a load close to 100, it needs to protect itself against overload. Also, it is likely that upstream proxies have ignored the increasing load status reports and thus do not support this extension.

6. Syntax

This section defines the syntax of a new SIP response header, the Load header. The Load header field is used to advertise the current load status of a SIP entity to its upstream neighbor. The value of the Load header is an integer between 0 and 100, with 0 indicating that the proxy is least loaded and 100 indicating that the proxy is most overloaded.

The "target" parameter is mandatory and contains the URI of the next hop SIP entity for the response, i.e., the SIP entity the response is forwarded to. This is the entity that will process the Load header.

The "throttle" parameter is optional and contains a number between 0 and 100. It describes the percentage by which the load forwarded by the "target" SIP entity to the SIP server generating this header should
be reduced. The "validity" parameter is optional and contains an indication of how long the reporting proxy is likely to remain in the given load status.

The syntax of the Load header field is:

Load         = "Load" HCOLON loadStatus
loadStatus   = 0-100 SEMI serverID *( SEMI loadParam )
loadParam    = throttleRate | validMS | generic-param
serverID     = "target" EQUAL SIP-URI | SIPS-URI
throttleRate = "throttle" EQUAL 0-100
validMS      = "validity" EQUAL delta-ms
delta-ms     = 1*DIGIT

The BNF for SIP-URI, SIPS-URI and generic-param is defined in [2]. Table 1 is an extension of Tables 2 and 3 in [2].

Header field    where   proxy   ACK  BYE  CAN  INV  OPT  REG
____________________________________________________________
Load              r      ar      -    o    o    o    o    o

Table 1: Load Header Field

Example:

Load: 80;throttle=20;validity=500;target=p1.example.com

7. Security Considerations

Overload control mechanisms can be used by an attacker to conduct a denial-of-service attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded. When such a forged overload indication is received by an upstream SIP entity, it will stop sending traffic to the victim. Thus, the victim is subject to a denial-of-service attack.

An attacker can create forged load status reports by inserting itself into the communication between the victim and its upstream neighbors. The attacker would need to add status reports indicating a high load to the responses passed from the victim to its upstream neighbor. Proxies can prevent this attack by communicating via TLS. Since load status reports have no meaning beyond the next hop, there is no need to secure the communication over multiple hops.

Another way to conduct an attack is to send a message containing a
high load status value through a proxy that does not support this extension. Since this proxy does not remove the load status information, it will reach the next upstream proxy. If the attacker can make the recipient believe that the load status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim. A precondition for this attack is that the victim proxy does not support this extension, since it would otherwise not pass load status information through. The attack also does not work if there is a stateful proxy between the attacker and the victim and only 100 (Trying) responses are used to convey the Load header.

A malicious SIP entity could gain an advantage by pretending to support this specification but never reducing the load it forwards to its downstream neighbor. If its downstream neighbor receives traffic from multiple sources that correctly implement overload control, the malicious SIP entity would benefit, since all other sources to its downstream neighbor would reduce load.

OPEN ISSUE: the solution to this problem depends on the overload control algorithm. For fixed message rate and window-based overload control, it is very easy for a downstream entity to monitor whether the upstream neighbor throttles load as directed. For percentage throttling this is not always obvious, since the load forwarded depends on the load received by the upstream neighbor.

8. IANA Considerations

[TBD.]

Appendix A. Acknowledgements

Many thanks to Rich Terpstra and Jonathan Rosenberg for their contributions to this specification.

9. References

9.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[3] Rosenberg, J. and H. Schulzrinne, "Session Initiation Protocol (SIP): Locating SIP Servers", RFC 3263, June 2002.

[4] Schulzrinne, H. and J. Polk, "Communications Resource Priority for the Session Initiation Protocol (SIP)", RFC 4412, February 2006.

9.2. Informative References

[5] Rosenberg, J., "Requirements for Management of Overload in the Session Initiation Protocol", draft-rosenberg-sipping-overload-reqs-02 (work in progress), October 2006.

[6] Liu, Z., "Applying Signaling Compression (SigComp) to the Session Initiation Protocol (SIP)", draft-ietf-rohc-sigcomp-sip-04 (work in progress), November 2006.

[7] Rosen, B., "Framework for Emergency Calling in Internet Multimedia", draft-ietf-ecrit-framework-00 (work in progress), October 2006.

Authors' Addresses

Volker Hilt
Bell Labs/Alcatel-Lucent
101 Crawfords Corner Rd
Holmdel, NJ 07733
USA

Email: volkerh@bell-labs.com

Indra Widjaja
Bell Labs/Alcatel-Lucent
600-700 Mountain Avenue
Murray Hill, NJ 07974
USA

Email: iwidjaja@alcatel-lucent.com
Daryl Malas
Level 3 Communications
1025 Eldorado Blvd.
Broomfield, CO
USA

Email: daryl.malas@level3.com

Henning Schulzrinne
Columbia University/Department of Computer Science
450 Computer Science Building
New York, NY 10027
USA

Phone: +1 212 939 7004
Email: hgs@cs.columbia.edu
URI: http://www.cs.columbia.edu
Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.
Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).