Network Working Group G. Choudhury, Ed.
Request for Comments: 4222 AT&T
BCP: 112 October 2005
Category: Best Current Practice
Prioritized Treatment of Specific OSPF Version 2
Packets and Congestion Avoidance
Status of This Memo
This document specifies an Internet Best Current Practices for the
Internet Community, and requests discussion and suggestions for
improvements. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract
This document recommends methods that are intended to improve the
scalability and stability of large networks using the Open Shortest
Path First (OSPF) Version 2 protocol. The methods include processing
OSPF Hellos and Link State Advertisement (LSA) Acknowledgments at a
higher priority than other OSPF packets, and other congestion
avoidance procedures.
Table of Contents
1. Introduction
2. Recommendations
3. Security Considerations
4. Acknowledgments
5. Normative References
6. Informative References
Appendix A. LSA Storm: Causes and Impact
Appendix B. List of Variables and Values
Appendix C. Other Recommendations and Suggestions
Choudhury, Ed. Best Current Practice [Page 1]
RFC 4222 Prioritized Treatment October 2005
1. Introduction
In this document, OSPF refers to OSPFv2 [Ref1]. The scalability and
stability improvement techniques described here may also apply to
OSPFv3 [Ref2], but that will require further study and operational
experience.
A large network running the OSPF protocol may occasionally experience
the simultaneous or near-simultaneous update of a large number of
link state advertisements, or LSAs. This is particularly true if the
OSPF traffic engineering extension [Ref3] is used, which may
significantly increase the number of LSAs in the network. We call
this event an LSA storm. It may be initiated by an unscheduled
failure or a scheduled maintenance event. The failure may be
hardware, software, or procedural in nature.
An LSA storm causes high CPU and memory utilization at the router,
which causes incoming packets to be delayed or dropped.
Acknowledgments delayed beyond the retransmission timer value result
in retransmissions, and Hello packets delayed beyond the router-dead
interval result in neighbor adjacencies being declared down. The
retransmissions and additional LSA originations result in further CPU
and memory usage, essentially causing a positive feedback loop,
which, in the extreme case, may drive the network to an unstable
state.
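The feedback effect described above can be illustrated with a toy
queueing model. The sketch below is not taken from this document or
from [Ref13]; the function name drain_burst, the fixed per-second
processing capacity, and the assumption that every still-unprocessed
LSA is retransmitted exactly when a global timer fires are all
simplifications introduced here for illustration.

```python
from collections import deque


def drain_burst(burst, capacity, rxmt_interval, max_steps=1000):
    """Toy model of one router absorbing a burst of LSAs.

    Assumptions (illustrative only): the router processes at most
    `capacity` packets per second in FIFO order, and every LSA that is
    still unprocessed when the retransmission timer fires is sent
    again and joins the back of the queue.

    Returns (seconds_to_drain, packets_processed); seconds_to_drain is
    None if the queue has not emptied within max_steps seconds.
    """
    queue = deque(range(burst))    # LSA ids waiting to be processed
    unacked = set(range(burst))    # LSAs not yet processed (and acked)
    processed = 0
    for t in range(1, max_steps + 1):
        for _ in range(min(capacity, len(queue))):
            unacked.discard(queue.popleft())
            processed += 1         # duplicates still cost CPU time
        if not queue:
            return t, processed
        if t % rxmt_interval == 0:
            # Neighbors retransmit everything not yet acknowledged.
            queue.extend(sorted(unacked))
    return None, processed
```

Under these assumptions, a burst of 100 LSAs at 30 packets per second
drains in 4 seconds with no wasted work, whereas a burst of 400 LSAs
takes 25 seconds and 750 processed packets, because 350 of them are
retransmitted duplicates. The model omits Hello loss and new LSA
originations, which would add further load in a real network.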
The default value of the retransmission timer is 5 seconds and that
of the router-dead interval is 40 seconds. However, there has
recently been considerable interest in significantly reducing OSPF
convergence time. As part of that effort, much shorter (sub-second)
Hello and router-dead intervals have been proposed [Ref4]. In such a
scenario, Hello packets are more likely to be delayed beyond the
router-dead interval during network congestion caused by an LSA
storm.
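The effect of shortening these timers can be made concrete with a
small check. The following is a minimal sketch: the class OspfTimers
and both helper functions are hypothetical names chosen for this
example, the defaults are the 5-second and 40-second values quoted
above, and the 0.9-second router-dead interval is only an assumed
stand-in for a "sub-second" setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OspfTimers:
    """Timer values in seconds; defaults are those quoted in the text."""
    rxmt_interval: float = 5.0
    router_dead_interval: float = 40.0


def ack_is_retransmitted(ack_delay, timers=OspfTimers()):
    """True if an acknowledgment delayed by ack_delay seconds arrives
    after the sender's retransmission timer has already expired."""
    return ack_delay > timers.rxmt_interval


def adjacency_declared_down(hello_gap, timers=OspfTimers()):
    """True if no Hello has been processed for hello_gap seconds and
    that gap exceeds the router-dead interval."""
    return hello_gap > timers.router_dead_interval


# A 2-second processing stall is harmless with the default timers ...
default_timers = OspfTimers()
stall = 2.0
harmless = not (ack_is_retransmitted(stall, default_timers)
                or adjacency_declared_down(stall, default_timers))

# ... but drops the adjacency under an assumed sub-second dead interval.
fast_timers = OspfTimers(router_dead_interval=0.9)
drops_adjacency = adjacency_declared_down(stall, fast_timers)
```

In this sketch, harmless and drops_adjacency both evaluate to True,
which is the sensitivity the paragraph above describes.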
In order to improve the scalability and stability of networks, we
recommend steps for prioritizing critical OSPF packets and avoiding
congestion. The details of the recommendations are given in Section
2. A simulation study reported in [Ref13] quantifies the congestion
phenomenon and its impact. It also studies several of the
recommendations and shows that they indeed improve the scalability
and stability of networks using the OSPF protocol. [Ref13] is
available on request by contacting the editor or one of the authors.
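As a rough illustration of what prioritizing critical OSPF packets
could look like, the sketch below serves Hello and Link State
Acknowledgment packets ahead of other OSPF packets and keeps arrival
order within each class. The packet type numbers are those of OSPFv2
[Ref1]; the class OspfInputQueue and its two-level priority scheme
are assumptions made for this example, not a mechanism specified by
this document.

```python
import heapq
from itertools import count

# OSPFv2 packet type codes.
HELLO, DB_DESCRIPTION, LS_REQUEST, LS_UPDATE, LS_ACK = 1, 2, 3, 4, 5

# Packet types given preferential treatment in this sketch.
HIGH_PRIORITY_TYPES = {HELLO, LS_ACK}


class OspfInputQueue:
    """Two-level priority queue for received OSPF packets (hypothetical)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def push(self, packet_type, payload=None):
        priority = 0 if packet_type in HIGH_PRIORITY_TYPES else 1
        entry = (priority, next(self._seq), packet_type, payload)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the next (packet_type, payload) to process."""
        _, _, packet_type, payload = heapq.heappop(self._heap)
        return packet_type, payload

    def __len__(self):
        return len(self._heap)


# Usage: Hellos and acknowledgments overtake queued LS Updates.
q = OspfInputQueue()
for ptype in (LS_UPDATE, LS_UPDATE, HELLO, LS_ACK, LS_REQUEST):
    q.push(ptype)
service_order = [q.pop()[0] for _ in range(len(q))]
```

Here service_order is [HELLO, LS_ACK, LS_UPDATE, LS_UPDATE,
LS_REQUEST]. A real implementation would also need to classify
packets before they reach a shared input buffer, since a packet that
has already been dropped cannot be prioritized.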
Appendix A explains LSA storm scenarios and their impact in more
detail and points out a few real-life examples of control-message
storms. Appendix B provides a list of variables used in the
recommendations and their example values. Appendix C provides some
further recommendations and suggestions with similar goals.
2. Recommendations
The recommendations below are intended to improve the scalability and