                                                             E. Rescorla
                                                              RTFM, Inc.
INTERNET-DRAFT                                                       IAB
draft-iab-model-01.txt                  May 2004 (Expires November 2004)

                        Writing Protocol Models

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference mate-
   rial or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or
   ftp.isi.edu (US West Coast).


Abstract

   The IETF process depends on peer review. However, IETF documents are
   generally written to be useful for implementors, not for reviewers.
   In particular, while great care is generally taken to provide a com-
   plete description of the state machines and bits on the wire, this
   level of detail tends to get in the way of initial understanding.
   This document describes an approach for providing protocol "models"
   that allow reviewers to quickly grasp the essence of a system.

1. Introduction

   The IETF process depends on peer review. However, in many cases, the
   documents submitted for publication are extremely difficult to
   review. Since reviewers have only limited amounts of time, this leads
   to extremely long review times, inadequate reviews, or both. In my
   view, a large part of the problem is that most documents fail to pre-
   sent an architectural model for how the protocol operates, opting
   instead to simply describe the protocol and let the reviewer figure it
   out.

Rescorla                                                         [Page 1]

   This is acceptable when documenting a protocol for implementors,
   because they need to understand the protocol in any case, but dramat-
   ically increases the strain on reviewers. Reviewers necessarily need
   to get the big picture of the system and then focus on particular
   points. They simply do not have time to give the entire document the
   attention an implementor would.

   One way to reduce this load is to present the reviewer with a
   MODEL--a short description of the system in overview form. This pro-
   vides the reviewer with the context to identify the important or dif-
   ficult pieces of the system and focus on them for review. As a side
   benefit, if the model is done first, it can serve as an aid to the
   detailed protocol design and a focus for early review prior to proto-
   col completion. The intention is that the model would either be the
   first section of the protocol document or be a separate document pro-
   vided with the protocol.

2. The Purpose of a Protocol Model

   A protocol model needs to answer three basic questions:

     1. What problem is the protocol trying to solve?
     2. What messages are being transmitted and what do they
        mean?
     3. What are the important but un-obvious features of the
        protocol?

   The basic idea is to provide enough information that the reader could
   design a protocol which was roughly isomorphic to the protocol being
   described. This doesn't, of course, mean that the protocol would be
   identical, but merely that it would share most important features.
   For instance, the decision to use a KDC-based authentication model is
   an essential feature of Kerberos [KERBEROS]. By contrast, the use of
   ASN.1 is a simple implementation decision. S-expressions--or XML, had
   it existed at the time--would have served equally well.

3. Basic Principles

   In this section we discuss basic principles that should guide your
   presentation.

3.1. Less is more

   Humans are only capable of keeping a very small number of pieces of
   information in their head at once. Since we're interested in ensuring
   that people get the big picture, we therefore have to dispense with a
   lot of detail. That's good, not bad. The simpler you can make things
   the better.

3.2. Abstraction is good

   A key technique for representing complex systems is to try to
   abstract away pieces. For instance, maps are better than photographs
   for finding out where you want to go because they provide an
   abstract, stylized, view of the information you're interested in.
   Don't be afraid to compress multiple protocol elements into a single
   abstract piece for pedagogical purposes.

3.3. A few well-chosen details sometimes help

   The converse of the previous principle is that sometimes details
   help to bring a description into focus. Many people work better when
   given examples. Thus, it's often a good approach to talk about the
   material in the abstract and then provide a concrete description of
   one specific piece to bring it into focus. Authors should focus on
   the normal path. Error cases and corner cases should only be dis-
   cussed where they help illustrate some important point.

4. Writing Protocol Models

   Our experience indicates that it's easiest to grasp protocol models
   when they're presented in visual form. We recommend a presentation
   format that is centered around a few key diagrams with explanatory
   text for each. These diagrams should be simple and typically consist
   of "boxes and arrows"--boxes representing the major components,
   arrows representing their relationships and labels indicating impor-
   tant features.

   We recommend a presentation structured in three parts to match the
   three questions mentioned in the previous sections. Each part should
   contain 1-3 diagrams intended to illustrate the relevant points.

4.1. Describe the problem you're trying to solve

     First, figure out what you are trying to do (this is good
     advice under most circumstances, and it is especially apropos here).
     --NNTP Installation Guide

   The absolutely most critical task that a protocol model must perform
   is to explain what the protocol is trying to achieve. This provides
   crucial context for understanding how the protocol works and whether
   it meets its goals. Given the desired goals, in most cases an experi-
   enced reviewer will have an idea of how they would approach the prob-
   lem and be able to compare that to the approach taken by the protocol
   under review.

   The "Problem" section of the model should start out with a short
   statement of the environments in which the protocol is expected to be
   used. This section should describe the relevant entities and the
   likely scenarios under which they participate in the protocol. The
   Problem section should feature a diagram showing the major communi-
   cating parties and their inter-relationships. It is particularly
   important to lay out the trust relationships between the various par-
   ties as these are often un-obvious.

4.1.1. Example: STUN (RFC 3489)

   Network Address Translation (NAT) makes it difficult to run a number
   of classes of service from behind the NAT gateway. This is a particu-
   lar problem when protocols need to advertise address/port pairs as
   part of the application layer protocol. Although the NAT can be con-
   figured to accept data destined for that port, address translation
   means that the address that the application knows about is not the
   same as the one that it is reachable on.

   Consider the scenario represented in the figure below. A SIP client
   is initiating a session with a SIP server in which it wants the SIP
   server to send it some media. In its Session Description Protocol
   (SDP) [SDP] request it provides the IP and port on which it is lis-
   tening. However, unbeknownst to the client, a NAT is in the way. It
   translates the IP address in the header, but unless it is SIP aware,
   it doesn't change the address in the request. The result is that the
   media goes into a black hole.

                   +-----------+
                   |    SIP    |
                   |  Server   |
                   |           |
                   +-----------+
                        | [FROM:]
                        | [MSG: SEND MEDIA TO]
                   +-----------+
                   |           |
                   |    NAT    |
     --------------+  Gateway  +----------------
                   |           |
                   +-----------+
                        | [FROM:]
                        | [MSG: SEND MEDIA TO]
                   +-----------+
                   |    SIP    |
                   |  Client   |
                   |           |
                   +-----------+

   The purpose of STUN [STUN] is to allow clients to detect this
   situation and determine the address mapping. They can then place the
   appropriate address in their application-level messages. This is done
   by making use of an external STUN server. That server is able to
   determine the translated address and tell the STUN client, as shown
   below.

                               +-----------+
                               |   STUN    |
                               |  Server   |
                               |           |
                               +-----------+
                                 ^      |
                [IP HDR FROM:]   |      | [IP HDR TO:]
                                 |      v
                               +-----------+
                               |           |
                               |    NAT    |
                 --------------+  Gateway  +----------------
                               |           |
                               +-----------+
                                 ^      |
                [IP HDR FROM:]   |      | [IP HDR TO:]
                                 |      v
                               +-----------+
                               |    SIP    |
                               |  Client   |
                               |           |
                               +-----------+
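
   The detection step can be sketched in a few lines of Python. This is
   an illustrative model only, not the STUN wire protocol: the function
   names are hypothetical and the addresses are documentation examples
   chosen for this sketch.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int


def nat_translate(src: Endpoint, public_ip: str, public_port: int) -> Endpoint:
    """Hypothetical NAT: replaces the source address in the IP header."""
    return Endpoint(public_ip, public_port)


def stun_server_reply(observed_src: Endpoint) -> Endpoint:
    """Hypothetical STUN server: reports the source address it observed."""
    return observed_src


def behind_nat(local: Endpoint, mapped: Endpoint) -> bool:
    """The client is behind a NAT if the reported address differs from its own."""
    return local != mapped


local = Endpoint("192.0.2.10", 5060)                   # address the client knows
on_wire = nat_translate(local, "203.0.113.7", 40000)   # address after the NAT
mapped = stun_server_reply(on_wire)                    # address the server saw
# The client advertises the mapped address when a NAT is detected.
advertised = mapped if behind_nat(local, mapped) else local
```

   The point of the sketch is that the client never inspects the NAT
   itself; it only compares what it believes with what an outside party
   observed.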

4.2. Describe the protocol in broad overview

   Once you've described the problem, the next task is to describe the
   protocol in broad overview. This means showing, either in "ladder
   diagram" or "boxes and arrows" form, the protocol messages that flow
   between the various networking agents. This diagram should be accom-
   panied with explanatory text that describes the purpose of each message
   and the MAJOR data elements.

   This section SHOULD NOT contain detailed descriptions of the protocol
   messages or of each data element. In particular, bit diagrams, ASN.1
   modules and XML schema SHOULD NOT be shown. The purpose of this sec-
   tion is explicitly not to provide a complete description of the pro-
   tocol. Instead, it is to provide enough of a map so that a person
   reading the full protocol document can see where each specific piece
   fits.
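
   To show how little machinery a ladder diagram needs, the following
   Python sketch renders a list of messages in a two-column layout
   similar to the diagrams in this document. The function name ladder()
   and its parameters are assumptions made for this example and do not
   refer to any existing tool.

```python
def ladder(left: str, right: str, messages: list[tuple[str, str]],
           width: int = 50) -> str:
    """Render (direction, label) pairs as a two-party ladder diagram.

    direction is "->" (left party sends) or "<-" (right party sends).
    """
    mid = width // 2
    lines = [
        # Party names, left- and right-aligned, followed by underlines.
        left.ljust(width - len(right)) + right,
        ("-" * len(left)).ljust(width - len(right)) + "-" * len(right),
    ]
    for direction, label in messages:
        if direction == "->":
            lines.append(label.ljust(mid) + "->")
        elif direction == "<-":
            lines.append(" " * mid + "<-" + label.rjust(width - mid - 2))
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return "\n".join(lines)


diagram = ladder("Client", "Server",
                 [("->", "Request"), ("<-", "Response")], width=40)
```

   Printing diagram yields one line per message, with the sender's
   label on the sender's side of the arrow.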

4.3. State Machines

   In certain cases, it may be helpful to provide a state machine
   description of the behavior of network elements. However, such state
   machines should be kept as minimal as possible. Remember that the
   purpose is to promote high-level comprehension, not complete
   understanding.

4.4. Example: DCCP

   Although DCCP [DCCP] is datagram oriented like UDP, it is stateful
   like TCP. Connections go through the following phases:

   1. Initiation
   2. Feature negotiation
   3. Data transfer
   4. Termination

4.4.1. Initiation

   As with TCP, the initiation phase of DCCP involves a three-way hand-
   shake, shown in Figure 1.

   Client                                      Server
   ------                                      ------
   DCCP-Request            ->
   [Ports, Service,
                           <-           DCCP-Response
   DCCP-Ack                ->

                       Figure 1: DCCP 3-way handshake

   In the DCCP-Request message, the client tells the server the name of
   the service it wants to talk to and the ports it wants to communicate
   on. Note that ports are not tightly bound to services the way they
   are in TCP or UDP common practice. It also starts feature negotia-
   tion. For pedagogical reasons, we will present feature negotiation
   separately in the next section. However, realize that the early
   phases of feature negotiation happen concurrently with initiation.

   In the DCCP-Response message, the server tells the client that it is
   willing to accept the connection and continues feature negotiation.
   In order to prevent SYN-flood style DoS attacks, DCCP incorporates an
   IKE-style cookie exchange. The server can provide the client with a
   cookie that contains all the negotiation state. This cookie must be
   echoed by the client in the DCCP-Ack, thus removing the need for the
   server to keep state.

   In the DCCP-Ack message, the client acknowledges the DCCP-Response
   and returns the cookie to permit the server to complete its side of
   the connection. As indicated in Figure 1, this message may also
   include feature negotiation messages.
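
   The idea of a cookie that carries the server's state can be sketched
   as follows. This is a minimal illustration under stated assumptions,
   not the DCCP cookie format: the JSON encoding, the use of HMAC-SHA256
   and the names make_cookie() and open_cookie() are all hypothetical.

```python
import hashlib
import hmac
import json
import os
from typing import Optional

SERVER_SECRET = os.urandom(32)            # hypothetical per-server secret
TAG_LEN = hashlib.sha256().digest_size    # length of the MAC in bytes


def make_cookie(state: dict) -> bytes:
    """Serialize the negotiation state and append a MAC over it."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SERVER_SECRET, body, hashlib.sha256).digest()
    return body + tag


def open_cookie(cookie: bytes) -> Optional[dict]:
    """Return the state if the MAC verifies, otherwise None."""
    if len(cookie) < TAG_LEN:
        return None
    body, tag = cookie[:-TAG_LEN], cookie[-TAG_LEN:]
    expected = hmac.new(SERVER_SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(body)


# The server sends the cookie and then forgets the state.
state = {"service": "example", "cc": 2}
cookie = make_cookie(state)
# When the client echoes the cookie, the server recovers the state.
recovered = open_cookie(cookie)
```

   Because the state travels with the client, the server commits no
   memory until a valid cookie comes back.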

4.4.2. Feature Negotiation

   In DCCP, feature negotiation is performed by attaching options to
   other DCCP packets. Thus feature negotiation can be piggybacked on
   any other DCCP message. This allows feature negotiation during con-
   nection initiation as well as feature renegotiation during data flow.

   Somewhat unusually, DCCP features are one-sided. Thus, it's possible
   to have a different congestion control regime for data sent from
   client to server than from server to client.

   Feature negotiation is done with three options:

   1. Change
   2. Prefer
   3. Confirm

   A Change message says to the peer "change this option setting on your
   side". The peer can either respond with a Confirm, meaning "I've
   changed it" or a Prefer, containing a list of other settings that the
   peer would like. Multiple exchanges of Change and Prefer may occur as
   the peers attempt to sort out what options they have in common. Some
   sample exchanges (partly cribbed from the DCCP spec) follow:

   Client                                      Server
   ------                                      ------
   Change(CC,2)            ->
                           <-           Confirm(CC,2)

   In this exchange, the peers agree to set CC equal to 2.

   Client                                      Server
   ------                                      ------
   Change(CC,3,4)           ->
                            <-       Prefer(CC,1,2,5)
   Change(CC,5)             ->
                            <-           Confirm(CC,5)

   In this exchange, the client requests CC values 3 and 4. Note that
   the client can offer multiple values. The server doesn't like any of
   these and offers 1, 2, and 5. The client chooses 5 and the server
   confirms.

   Since features are one-sided, if a party wants to change one of his
   own options, he must ask the peer to issue a Change. This is done
   using a Prefer, as shown below, where the client gets the server to
   request that the client change the value of CC to 3.

   Client                                      Server
   ------                                      ------
   Prefer(CC,3,4)           ->
                            <-           Change(CC,3)
   Confirm(CC,3)            ->
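
   The Change/Prefer/Confirm exchanges above can be modelled in a few
   lines. The sketch below is a simplification under assumptions: the
   function names are hypothetical, values are plain integers, and the
   rule "confirm the first offered value that is supported" is chosen
   for the example rather than taken from the DCCP specification.

```python
def respond_to_change(offered: list[int],
                      supported: list[int]) -> tuple[str, list[int]]:
    """Peer's reply to Change(offered): Confirm one value or Prefer others."""
    for value in offered:
        if value in supported:
            return ("Confirm", [value])
    return ("Prefer", list(supported))


def negotiate(wanted: list[int], acceptable: list[int],
              peer_supported: list[int], max_rounds: int = 4):
    """Run Change/Prefer rounds; return (agreed value or None, transcript)."""
    transcript = []
    offer = list(wanted)
    for _ in range(max_rounds):
        transcript.append(("Change", offer))
        kind, values = respond_to_change(offer, peer_supported)
        transcript.append((kind, values))
        if kind == "Confirm":
            return values[0], transcript
        # The peer sent Prefer: keep only the values we can accept.
        offer = [v for v in values if v in acceptable]
        if not offer:
            return None, transcript
    return None, transcript


# Second exchange from the text: the client asks for 3 or 4, the
# server prefers 1, 2 or 5, and the two settle on 5.
agreed, transcript = negotiate([3, 4], [3, 4, 5], [1, 2, 5])
```

   The transcript for this call reproduces the four messages of the
   second sample exchange.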

4.4.3. Data Transfer

   Rather than have a single congestion control regime as in TCP, DCCP
   offers a variety of negotiable congestion control regimes. The DCCP
   documents describe two congestion control regimes: additive increase,
   multiplicative decrease (CCID-2 [CCID2]) and TCP-friendly rate con-
   trol (CCID-3 [CCID3]). CCID-2 is intended for applications which want
   maximum throughput. CCID-3 is intended for real-time applications
   which want smooth response to congestion.

   CCID-2

   CCID-2's congestion control is extremely similar to that of TCP. The
   sender maintains a congestion window and sends packets until that
   window is full. Packets are Acked by the receiver. Dropped packets
   and ECN [ECN] are used to indicate congestion. The response to con-
   gestion is to halve the congestion window. One subtle difference
   between DCCP and TCP is that the Acks in DCCP must contain the
   sequence numbers of all received packets (within a given window) not
   just the highest sequence number as in TCP.

   CCID-3

   CCID-3 is an equation based form of rate control which is intended to
   provide smoother response to congestion than CCID-2. The sender main-
   tains a "transmit rate". The receiver sends ACK packets which also
   contain information about the receiver's estimate of packet loss. The
   sender uses this information to update its transmit rate. Although
   CCID-3 behaves somewhat differently from TCP in its short term con-
   gestion response, it is designed to operate fairly with TCP over the
   long term.
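
   The window behavior of CCID-2, described earlier in this section,
   can be sketched as a simple update rule. This is an assumption-laden
   illustration: the growth of one packet per uncongested round and the
   names aimd_step() and run() are chosen for the example, and slow
   start is omitted.

```python
def aimd_step(cwnd: float, congested: bool, min_cwnd: float = 1.0) -> float:
    """One update per round: halve on congestion, otherwise add one packet."""
    if congested:
        return max(min_cwnd, cwnd / 2)
    return cwnd + 1.0


def run(events, cwnd: float = 1.0):
    """Apply a sequence of congestion signals and record the window."""
    history = [cwnd]
    for congested in events:
        cwnd = aimd_step(cwnd, congested)
        history.append(cwnd)
    return history


# Three quiet rounds, one congestion signal, one more quiet round.
history = run([False, False, False, True, False])
```

   The resulting sawtooth (grow, halve, grow) is the abrupt response
   that CCID-3 is designed to smooth out.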

4.4.4. Termination

   Connection termination in DCCP is initiated by sending a Close mes-
   sage. Either side can send a Close message. The peer then responds
   with a Reset message, at which point the connection is closed. The
   side that sent the Close message must quietly preserve the socket in
   TIMEWAIT state for 2MSL.

   Client                                      Server
   ------                                      ------
   Close                    ->
                            <-                  Reset
   [Remains in TIMEWAIT]

   Note that the server may wish to close the connection but not remain
   in TIMEWAIT (e.g., due to a desire to minimize server-side state.) In
   order to accomplish this, the server can elicit a Close from the
   client by sending a CloseReq message and thus keeping the TIMEWAIT
   state on the client.
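
   The two termination paths differ only in who ends up holding
   TIMEWAIT. The following sketch makes that explicit; the function
   close_connection() and its avoid_timewait flag are hypothetical names
   introduced for this illustration.

```python
def close_connection(initiator: str, avoid_timewait: bool = False):
    """Return (messages, party left in TIMEWAIT) for a DCCP-style close.

    messages is a list of (sender, message name) pairs.
    """
    if initiator not in ("client", "server"):
        raise ValueError(f"unknown party: {initiator!r}")
    peer = "server" if initiator == "client" else "client"
    if initiator == "server" and avoid_timewait:
        # The server elicits a Close from the client, so the client
        # sends Close and is the one that keeps TIMEWAIT state.
        messages = [("server", "CloseReq"),
                    ("client", "Close"),
                    ("server", "Reset")]
        return messages, "client"
    # Normal case: whoever sends Close keeps TIMEWAIT state.
    return [(initiator, "Close"), (peer, "Reset")], initiator


messages, holder = close_connection("server", avoid_timewait=True)
```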

5. Describe any important protocol features

   The final section (if there is one) should contain an explanation of
   any important protocol features which are not obvious from the previ-
   ous sections. In the best case, all the important features of the
   protocol would be obvious from the message flow. However, this isn't
   always the case. This section is an opportunity for the author to
   explicate those features. Authors should think carefully before writ-
   ing this section. If there are no important points to be made they
   should not populate this section.

   Examples of the kind of feature that belongs in this section include:
   high-level security considerations, congestion control information
   and overviews of the algorithms that the network elements are
   intended to follow. For instance, if you have a routing protocol you
   might use this section to sketch out the algorithm that the router
   uses to determine the appropriate routes from protocol messages.

5.1. Example: WebDAV COPY and MOVE

   WebDAV [WEBDAV] includes both a COPY method and a MOVE method. While
   a MOVE can be thought of as a COPY followed by DELETE, COPY+DELETE
   and MOVE aren't entirely equivalent.

   The use of COPY+DELETE as a MOVE substitute is problematic because of
   the creation of the intermediate file. Consider the case where the
   user is approaching some quota boundary. A COPY+DELETE should be for-
   bidden because it would temporarily exceed the quota. However, a
   simple rename should work in this situation.
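
   The quota argument can be made concrete with a toy resource store.
   This sketch does not model WebDAV itself: the Store class, its
   methods and the byte-count quota are assumptions introduced only to
   show why the intermediate copy matters.

```python
class QuotaExceeded(Exception):
    """Raised when an operation would push usage past the quota."""


class Store:
    """Hypothetical resource store with a byte quota."""

    def __init__(self, quota: int):
        self.quota = quota
        self.files: dict[str, bytes] = {}

    def used(self) -> int:
        return sum(len(data) for data in self.files.values())

    def copy(self, src: str, dst: str) -> None:
        # The copy exists alongside the original, so it needs extra room.
        if self.used() + len(self.files[src]) > self.quota:
            raise QuotaExceeded(dst)
        self.files[dst] = self.files[src]

    def delete(self, name: str) -> None:
        del self.files[name]

    def move(self, src: str, dst: str) -> None:
        # A rename needs no additional space.
        self.files[dst] = self.files.pop(src)


store = Store(quota=100)
store.files["a"] = b"x" * 80     # the user is close to the quota
```

   With this setup, store.copy("a", "b") raises QuotaExceeded, while
   store.move("a", "b") succeeds and leaves usage unchanged.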

   The second issue is permissions. The WebDAV permissions model allows
   the server to grant users permission to rename files but not to cre-
   ate new ones--this is unusual in ordinary filesystems but nothing
   prevents it in WebDAV. This is clearly not possible if a client uses
   COPY+DELETE to do a MOVE.

   Finally, a COPY+DELETE does not produce the same logical result as
   would be expected with a MOVE. Because COPY creates a new resource,
   it is permitted (but not required) to use the time of new file cre-
   ation as the creation date property. By contrast, the expectation for
   MOVE is that the renamed file will have the same properties as the
   original.
6. Formatting Issues

   The requirement that Internet-Drafts and RFCs be renderable in ASCII
   is a significant obstacle when writing the sort of graphics-heavy
   document being described here. Authors may find it more convenient to
   do a separate protocol model document in Postscript or PDF and simply
   make it available at review time--though an archival version would
   certainly be handy.

7. A Complete Example: Internet Key Exchange (IKE)

7.1. Operating Environment

   Internet Key Exchange (IKE) [IKE] is a key establishment and parame-
   ter negotiation protocol for Internet protocols. Its primary applica-
   tion is for establishing security associations (SAs) [IPSEC] for
   IPsec AH [AH] and ESP [ESP].

   +--------------------+                       +--------------------+
   |                    |                       |                    |
   |   +------------+   |                       |   +------------+   |
   |   |    Key     |   |         IKE           |   |    Key     |   |
   |   | Management | <-+-----------------------+-> | Management |   |
   |   |  Process   |   |                       |   |  Process   |   |
   |   +------------+   |                       |   +------------+   |
   |         ^          |                       |         ^          |
   |         |          |                       |         |          |
   |         v          |                       |         v          |
   |   +------------+   |                       |   +------------+   |
   |   |   IPsec    |   |        AH/ESP         |   |   IPsec    |   |
   |   |   Stack    | <-+-----------------------+-> |   Stack    |   |
   |   |            |   |                       |   |            |   |
   |   +------------+   |                       |   +------------+   |
   |                    |                       |                    |
   |                    |                       |                    |
   |     Initiator      |                       |     Responder      |
   +--------------------+                       +--------------------+

   The general deployment model for IKE is shown in Figure 2. The IPsec
   engines and IKE engines typically are separate modules. When a packet
   needs to be processed (either sent or received) for which no security
   association exists, the IPsec engine contacts the IKE engine and asks
   it to establish an appropriate SA. The IKE engine contacts the appro-
   priate peer and uses IKE to establish the SA. Once the IKE handshake
   is finished it registers the SA with the IPsec engine.
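
   The division of labor between the two engines can be sketched as an
   on-demand lookup. The class and method names below are hypothetical,
   and the handshake is replaced by a stand-in that merely counts
   invocations.

```python
class IkeEngine:
    """Hypothetical key management process."""

    def __init__(self):
        self.handshakes = 0

    def establish_sa(self, peer: str) -> dict:
        # Stand-in for the real IKE handshake with the peer.
        self.handshakes += 1
        return {"peer": peer, "sa_id": self.handshakes}


class IpsecEngine:
    """Hypothetical IPsec stack that asks IKE for missing SAs."""

    def __init__(self, ike: IkeEngine):
        self.ike = ike
        self.sa_table: dict[str, dict] = {}

    def process_packet(self, peer: str, payload: bytes) -> dict:
        sa = self.sa_table.get(peer)
        if sa is None:
            # No SA exists: ask the IKE engine to establish one,
            # then register it for later packets.
            sa = self.ike.establish_sa(peer)
            self.sa_table[peer] = sa
        return sa


ike = IkeEngine()
ipsec = IpsecEngine(ike)
first = ipsec.process_packet("198.51.100.1", b"data")
second = ipsec.process_packet("198.51.100.1", b"more data")
```

   Only the first packet to a given peer triggers a handshake; later
   packets reuse the registered SA.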

   In addition, IKE traffic between the peers can be used to refresh
   keying material or adjust operating parameters such as algorithms.

7.1.1. Initiator and Responder

   Although IPsec is basically symmetrical, IKE is not. The party who
   sends the first message is called the INITIATOR. The other party is
   called the RESPONDER. In the case of TCP connections the INITIATOR
   will typically be the peer doing the active open (i.e. the client).

7.1.2. Perfect Forward Secrecy

   One of the major concerns in IKE design was that traffic be protected
   even if the keying material of the nodes was later compromised, pro-
   vided that the session in question had terminated and so the session-
   specific keying material was gone. This property is often called
   PERFECT FORWARD SECRECY (PFS).

7.1.3. Denial of Service Resistance

   Since IKE allows arbitrary peers to initiate computationally expen-
   sive cryptographic operations, it potentially allows resource con-
   sumption denial of service attacks to be mounted against the IKE
   engine. IKE includes countermeasures designed to minimize this risk.

7.1.4. Keying Assumptions

   Because Security Associations are essentially symmetric, both sides
   must in general be authenticated. Because IKE needs to be able to
   establish SAs between a broad range of peers with various kinds of
   prior relationships, IKE supports a very flexible keying model. Peers
   can authenticate via shared keys, digital signatures (typically from
   keys vouched for by certificates), or encryption keys.

7.1.5. Identity Protection

   Although IKE requires the peers to authenticate to each other, it was
   considered desirable by the working group to provide some identity
   protection for the communicating peers. In particular, the peers
   should be able to hide their identity from passive observers and one
   peer should be able to require the other to authenticate before they
   self-identify. In this case, the designers chose to make the party
   who speaks first (the INITIATOR) identify first.

7.2. Protocol Overview

   At a very high level, there are two kinds of IKE handshake:
     (1) Those which establish an IKE security association.
     (2) Those which establish an AH or ESP security association.

   When two peers which have never communicated before need to establish
   an AH/ESP SA, they must first establish an IKE SA. This allows them
   to exchange an arbitrary amount of protected IKE traffic. They can
   then use that SA to do a second handshake to establish SAs for AH and
   ESP. This process is shown in schematic form below. The notation
   E(SA,XXXX) is used to indicate that traffic is encrypted under a
   given SA.

   Initiator                                  Responder
   ---------                                  ---------

   Handshake MSG            ->                           \
                            <-            Handshake MSG   \ Establish IKE
                                                          / SA (IKEsa)
                           [...]                         /

   E(IKEsa, Handshake MSG)  ->                           \  Establish AH/ESP
                            <-  E(IKEsa, Handshake MSG)  /  SA

   IKE terminology is somewhat confusing, referring under different cir-
   cumstances to "phases" and "modes". For maximal clarity we will refer
   to the Establishment of the IKE SA as "Stage 1" and the Estab-
   lishment of AH/ESP SAs as "Stage 2". Note that it's quite possible
   for there to be more than one Stage 2 handshake, once Stage 1 has
   been finished. This might be useful if you wanted to establish multi-
   ple AH/ESP SAs with different cryptographic properties.

   The Stage 1 and Stage 2 handshakes are actually rather different,
   because the Stage 2 handshake can of course assume that its traffic
   is being protected with an IKE SA. Accordingly, we will first discuss
   Stage 1 and then Stage 2.

7.2.1. Stage 1

   There are a large number of variants of the IKE Stage 1 handshake,
   necessitated by use of different authentication mechanisms. However,
   broadly speaking they fall into one of two basic categories: MAIN
   MODE, which provides identity protection and DoS resistance, and
   AGGRESSIVE MODE, which does not. We will cover MAIN MODE first.

   Main Mode

   Main Mode is a six message (3 round trip) handshake which offers
   identity protection and DoS resistance. An overview of the handshake
   is below.

   Initiator                                   Responder
   ---------                                   ---------
   CookieI, Algorithms      ->                           \  Parameter
                            <-       CookieR, Algorithms /  Establishment

   Nonce, Key Exchange      ->
                            <-        Nonce, Key Exchange\  Establish
                                                         /  Shared key

   E(IKEsa, Auth Data)      ->
                            <-        E(IKEsa, Auth data)\  Authenticate
                                                         /      Peers

   In the first round trip, the Initiator offers a set of algorithms and
   parameters. The Responder picks out the single set that it likes and
   responds with that set. It also provides CookieR, which will be used
   to prevent DoS attacks. At this point, there is no security association
   but the peers have tentatively agreed upon parameters. These parame-
   ters include a Diffie-Hellman group, which will be used in the second
   round trip.

   In the second round trip, the Initiator sends the key exchange infor-
   mation. This generally consists of the Initiator's Diffie-Hellman
   public share (Yi). He also supplies CookieR, which was provided by
   the responder. The Responder replies with his own DH share (Yr). At
   this point, both Initiator and Responder can compute the shared DH
   key (ZZ). However, there has been no authentication and so they don't
   know with any certainty that the connection hasn't been attacked.
   Note that as long as the peers generate fresh DH shares for each
   handshake then PFS will be provided.
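
   The key agreement in the second round trip can be illustrated with a
   toy Diffie-Hellman computation. The group parameters below (P = 23,
   G = 5) are deliberately tiny and are an assumption made for
   readability; they are not an IKE group and offer no security.

```python
import secrets

P = 23   # toy prime modulus, for illustration only
G = 5    # a primitive root modulo 23


def dh_keypair():
    """Return (private exponent, public share) for a fresh handshake."""
    x = secrets.randbelow(P - 2) + 1      # private value in [1, P-2]
    return x, pow(G, x, P)


def dh_shared(private: int, peer_share: int) -> int:
    """Compute the shared key from our private value and the peer's share."""
    return pow(peer_share, private, P)


xi, yi = dh_keypair()        # Initiator sends Yi
xr, yr = dh_keypair()        # Responder sends Yr
zz_i = dh_shared(xi, yr)     # Initiator's view of ZZ
zz_r = dh_shared(xr, yi)     # Responder's view of ZZ
```

   Both sides arrive at the same ZZ without ever sending it. Calling
   dh_keypair() anew for each handshake and discarding the private
   values afterwards is what gives the PFS property mentioned above.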

   Before we move on, let's take a look at the cookie exchange. The
   basic anti-DoS measure used by IKE is to force the peer to demon-
   strate that they can receive traffic from you. This foils blind
   attacks like SYN floods [SYNFLOOD] and also makes it somewhat easier
   to track down attackers. The cookie exchange serves this role in IKE.
   The Responder can verify that the Initiator supplied a valid CookieR
   before doing the expensive DH key agreement. This does not totally
   eliminate DoS attacks, since an attacker who was willing to reveal
   his location could still consume server resources, but it does pro-
   tect against a certain class of blind attack.

   In the final round trip, the peers establish their identities. Since
   they share an (unauthenticated) key, they can send their identities
   encrypted, thus providing identity protection from eavesdroppers. The
   exact method of proving identity depends on what form of credential
   is being used (signing key, encryption key, shared secret, etc.), but
   in general you can think of it as a signature over some subset of the
   handshake messages. So, each side would supply its certificate and
   then sign using the key associated with that certificate. If shared
   keys are used, the authentication data would be a key id and a MAC.
   Authentication using public key encryption follows similar principles
   but is more complicated. Refer to the IKE document for more details.

   At the end of the Main Mode handshake, the peers share:
   (1) A set of algorithms for encryption of further IKE traffic.
   (2) Traffic encryption and authentication keys.
   (3) Mutual knowledge of the peer's identity.

   Aggressive Mode

   Although IKE Main Mode provides the required services, there was con-
   cern that the large number of round trips it requires added excessive
   latency. Accordingly, an Aggressive Mode was defined. Aggressive Mode
   packs more data into fewer messages and thus reduces latency. How-
   ever, it provides neither DoS protection nor identity protection.

   Initiator                                   Responder
   ---------                                   ---------
   Algorithms, Nonce,
   Key Exchange             ->
                            <-         Algorithms, Nonce,
                                  Key Exchange, Auth Data
   Auth Data                ->

   After the first round trip, the peers have all the required proper-
   ties except that the Initiator has not authenticated to the Respon-
   der. The third message closes the loop by authenticating the Initia-
   tor. Note that since the authentication data is sent in the clear, no
   identity protection is provided, and since the Responder does the DH
   key agreement without a round trip to the Initiator, there is no DoS
   protection.

7.2.2. Stage 2

   Stage 1 on its own isn't very useful. The purpose of IKE, after all,
   is to establish associations to be used to protect other traffic, not
   just to establish IKE SAs. Stage 2 (what IKE calls "Quick Mode") is
   used for this purpose. The basic Stage 2 handshake is shown below.

      Initiator                                    Responder
      ---------                                    ---------
      AH/ESP parameters,
      Algorithms, Nonce,
      Handshake Hash          ->

                               <-          AH/ESP parameters,
                                           Algorithms, Nonce,
                                               Handshake Hash
      Handshake Hash           ->

   As with Aggressive Mode, the first two messages establish the algo-
   rithms and parameters, while the final message is a check over the
   previous messages. In this case, the parameters also include the
   transforms to be applied to the traffic (AH or ESP) and the kinds of
   traffic that are to be protected. Note that no key exchange informa-
   tion is shown in these messages.

   In this version of Quick Mode, the peers use the pre-existing Stage 1
   keying material to derive fresh keying material for traffic protec-
   tion, with the nonces ensuring freshness. Quick Mode also allows for
   a new Diffie-Hellman exchange, which provides perfect forward secrecy
   (PFS) for the per-traffic keys. In that case, the first two messages
   shown above would also include Key Exchange payloads, as shown below.

      Initiator                                    Responder
      ---------                                    ---------
      AH/ESP parameters,
      Algorithms, Nonce,
      Key Exchange,            ->
      Handshake Hash

                               <-          AH/ESP parameters,
                                           Algorithms, Nonce,
                                                Key Exchange,
                                               Handshake Hash
      Handshake Hash           ->
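
   The idea of deriving fresh traffic keys from the Stage 1 keying
   material and the two nonces can be illustrated with a short sketch.
   This is a simplified stand-in and not the key derivation that IKE
   actually specifies: HMAC-SHA256 as the keyed function, the counter
   construction, and the names derive_traffic_keys, stage1_secret, and
   dh_shared are assumptions made for the example.

```python
import hashlib
import hmac


def derive_traffic_keys(stage1_secret: bytes, nonce_i: bytes, nonce_r: bytes,
                        dh_shared: bytes = b"", length: int = 64) -> bytes:
    """Expand Stage 1 keying material into `length` bytes of traffic keys.

    dh_shared is empty in the basic exchange and carries the new
    Diffie-Hellman result when the optional Key Exchange payloads are used.
    """
    seed = dh_shared + nonce_i + nonce_r
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        # Chain each block on the previous one plus a one-byte counter.
        block = hmac.new(stage1_secret, block + seed + bytes([counter]),
                         hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]
```

   Since both nonces feed the derivation, two Stage 2 exchanges run
   under the same Stage 1 keys still produce different traffic keys.
   When dh_shared is supplied, the traffic keys also depend on a secret
   that cannot be recomputed from the Stage 1 keying material alone.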

7.3. Other Considerations

   There are a number of features of IKE that deserve special considera-
   tion. These are discussed here.

7.3.1. Cookie Generation

   As mentioned previously, IKE uses cookies as a partial defense
   against DoS attacks. When the responder receives Main Mode message 3
   containing the Key Exchange data and the cookie, it verifies that the
   cookie is correct. However, this verification must not depend on
   keeping a list of valid cookies. Otherwise, an attacker could
   consume arbitrary amounts of memory by repeatedly requesting cookies
   from a responder. The recommended way to generate a cookie, suggested
   by Phil Karn, is to keep a single master secret and compute a hash of
   that secret and the initiator's address information. The responder
   can then verify a cookie by recomputing its value from the informa-
   tion in the third message and checking that it matches.
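
   A stateless cookie of this kind might look like the following
   sketch. The text above only calls for a hash of the secret and the
   address information; the choice of HMAC-SHA256 as that hash, the
   truncation to 8 bytes, the use of IP address and port as the address
   information, and the names make_cookie and cookie_is_valid are
   assumptions made for the example.

```python
import hashlib
import hmac
import os

# Hypothetical master secret, generated once when the responder starts.
MASTER_SECRET = os.urandom(32)


def make_cookie(secret: bytes, initiator_ip: str, initiator_port: int) -> bytes:
    """Compute a cookie from the secret and the initiator's address."""
    address_info = f"{initiator_ip}|{initiator_port}".encode()
    # Keyed hash of the address information, truncated to 8 bytes.
    return hmac.new(secret, address_info, hashlib.sha256).digest()[:8]


def cookie_is_valid(secret: bytes, cookie: bytes,
                    initiator_ip: str, initiator_port: int) -> bool:
    """Recompute the cookie from the packet's source address and compare."""
    expected = make_cookie(secret, initiator_ip, initiator_port)
    return hmac.compare_digest(expected, cookie)
```

   The responder stores nothing per initiator: the only state is the
   master secret, so the memory cost of handing out cookies does not
   grow with the number of requests.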

7.3.2. Endpoint Identities

   So far we have been rather vague about what sorts of endpoint identi-
   ties are used. In principle, there are three ways a peer might be
   identified: by a shared key, by a pre-configured public key, or by a
   certificate.

Shared Key

   In a shared key scheme, the peers share some symmetric key. This key
   is associated with a key identifier which is known to both parties.
   It is assumed that the party verifying that identity also has some
   sort of table that indicates what sorts of traffic (e.g. what
   addresses) that identity is allowed to negotiate SAs for.

Pre-configured Public Key

   A pre-configured public key scheme is the same as a shared key scheme
   except that the verifying party has the authenticating party's public
   key instead of a shared key.

Certificate

   In a certificate scheme, the authenticating party presents a
   certificate containing its public key. It's straightforward to
   establish that the certificate matches the authentication data
   provided by the peer. What's less straightforward is to determine
   whether a given peer is entitled to negotiate for a given class of
   traffic. In theory, one might be able to determine this from the name
   in the certificate (e.g., the subject name contains an IP address
   that matches the ostensible IP address). In practice, this is not
   clearly specified in IKE and therefore not really interoperable. The
   more likely case at the moment is that there is a configuration table
   mapping certificates to policies, as with the other two authentica-
   tion schemes.
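
   The configuration table that all three schemes rely on can be
   pictured as a mapping from an authenticated identity to the traffic
   that identity may negotiate SAs for. The sketch below is only an
   illustration: the identity strings, the address ranges (taken from
   the documentation prefixes), and the names POLICY_TABLE and
   may_negotiate are hypothetical, and IKE does not define this table.

```python
import ipaddress

# Hypothetical table: authenticated identity -> networks it may protect.
POLICY_TABLE = {
    "key-id:branch-office": [ipaddress.ip_network("192.0.2.0/24")],
    "cert:CN=gw.example.com": [ipaddress.ip_network("198.51.100.0/25")],
}


def may_negotiate(identity: str, requested: str, table=POLICY_TABLE) -> bool:
    """Return True if `identity` may negotiate SAs covering `requested`."""
    requested_net = ipaddress.ip_network(requested)
    allowed = table.get(identity, [])  # unknown identities get nothing
    return any(
        net.version == requested_net.version and requested_net.subnet_of(net)
        for net in allowed
    )
```

   Note that the lookup happens only after the peer has proven its
   identity; the table answers what that identity is allowed to do, not
   whether the peer is who it claims to be.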

References

  [AH]       Kent, S., and Atkinson, R., "IP Authentication Header",
             RFC 2402, November 1998.

  [CCID2]    Floyd, S., Kohler, E., "Profile for DCCP Congestion Control ID 2:
             TCP-like Congestion Control", draft-ietf-dccp-ccid2-04.txt,
             October 2003.

  [CCID3]    Floyd, S., Kohler, E., Padhye, J., "Profile for DCCP Congestion
             Control ID 3: TFRC Congestion Control",
             draft-ietf-dccp-ccid3-05.txt, February 2004.

  [DCCP]     Kohler, E., Handley, M., Floyd, S., "Datagram Congestion
             Control Protocol (DCCP)", draft-ietf-dccp-spec-06.txt,
             February 2004.

  [ECN]      Ramakrishnan, K., Floyd, S., Black, D., "The Addition of
             Explicit Congestion Notification (ECN) to IP",
             RFC 3168, September 2001.

  [ESP]      Kent, S., and Atkinson, R., "IP Encapsulating Security
             Payload (ESP)", RFC 2406, November 1998.

  [IKE]      Harkins, D., Carrel, D., "The Internet Key Exchange (IKE)",
             RFC 2409, November 1998.

  [IPSEC]    Kent, S., Atkinson, R., "Security Architecture for the Internet
             Protocol", RFC 2401, November 1998.

  [KERBEROS] Kohl, J., Neuman, C., "The Kerberos Network Authentication
             Service (V5)", RFC 1510, September 1993.

  [SDP]      Handley, M., Jacobson, V., "SDP: Session Description Protocol",
             RFC 2327, April 1998.

  [STUN]     Rosenberg, J., Weinberger, J., Huitema, C., Mahy, R.,
             "STUN - Simple Traversal of User Datagram Protocol (UDP)
             Through Network Address Translators (NATs)",
             RFC 3489, March 2003.

  [WEBDAV]   Goland, Y., Whitehead, E., Faizi, A., Carter, S., Jensen, D.,
             "HTTP Extensions for Distributed Authoring -- WEBDAV",
             RFC 2518, February 1999.

Security Considerations

   This document does not define any protocols and therefore has no
   security considerations.

Author's Address
Eric Rescorla <ekr@rtfm.com>
RTFM, Inc.
2064 Edgewood Drive
Palo Alto, CA 94303
Phone: (650)-320-8549

Internet Architecture Board <iab@iab.org>

Appendix A. IAB Members at the time of this writing

Bernard Aboba
Harald Alvestrand
Rob Austein
Leslie Daigle
Patrik Faltstrom
Sally Floyd
Jun-ichiro Itojun Hagino
Mark Handley
Bob Hinden
Geoff Huston
Eric Rescorla
Pete Resnick
Jonathan Rosenberg
