Network Working Group                                            H. Song
Internet-Draft                                                  H. Zheng
Intended status: Standards Track                                X. Jiang
Expires: January 8, 2009                                          Huawei
                                                            July 7, 2008


                Diagnose P2PSIP Overlay Network Failures
                     draft-zheng-p2psip-diagnose-02

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 8, 2009.

Abstract

   This document describes a simple and efficient mechanism that can be
   used to detect and localize failures in a P2PSIP overlay network.  It
   consists of two parts: the information carried in the P2PSIP "Echo
   request" and "Echo response" messages for the purpose of fault
   detection and localization, and the mechanisms for processing those
   messages.








Song, et al.             Expires January 8, 2009                [Page 1]


Internet-Draft  Diagnose P2PSIP Overlay Network Failures       July 2008


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Usage Scenarios  . . . . . . . . . . . . . . . . . . . . .  3
   2.  Overview of Functions  . . . . . . . . . . . . . . . . . . . .  4
   3.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   4.  Motivation . . . . . . . . . . . . . . . . . . . . . . . . . .  4
   5.  Packet Formats . . . . . . . . . . . . . . . . . . . . . . . .  6
     5.1.  Message Header . . . . . . . . . . . . . . . . . . . . . .  6
     5.2.  Message Attributes . . . . . . . . . . . . . . . . . . . .  6
       5.2.1.  Response Attribute . . . . . . . . . . . . . . . . . .  7
       5.2.2.  Echo Attribute . . . . . . . . . . . . . . . . . . . .  8
       5.2.3.  Respond Peer Info Attribute  . . . . . . . . . . . . . 10
   6.  Message  . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
     6.1.  Echo request . . . . . . . . . . . . . . . . . . . . . . . 12
     6.2.  Echo response  . . . . . . . . . . . . . . . . . . . . . . 12
       6.2.1.  Echo response from the terminator peer . . . . . . . . 13
       6.2.2.  Echo response from the intermediate peer . . . . . . . 14
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 15
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 15
   9.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     9.1.  P2PSIP Ping  . . . . . . . . . . . . . . . . . . . . . . . 16
     9.2.  P2PSIP Traceroute  . . . . . . . . . . . . . . . . . . . . 17
   10. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 18
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 18
     11.2. Informative References . . . . . . . . . . . . . . . . . . 19
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20
   Intellectual Property and Copyright Statements . . . . . . . . . . 22


1.  Introduction

   P2P systems are self-organizing and ideally require no network
   management in the traditional sense to set up and to configure
   individual P2P nodes.  P2P service providers may however contemplate
   usage scenarios where some diagnostics are required.  We present a
   simple connectivity test that may be used in such diagnostics.

1.1.  Usage Scenarios

   The common usage scenarios for P2P diagnostics can be broadly
   categorized into three classes:

   a.  Automatic diagnostics built into the P2P overlay routing
   protocol.  Nodes perform periodic checks of known neighbors and
   remove those nodes from the routing tables that fail to respond to
   connectivity checks [Handling Churn in a DHT].  The unresponsive
   nodes may however be only temporarily disabled due to some local
   cryptographic processing overload, disk processing overload or link
   overload.  It is therefore useful to repeat the connectivity checks
   to see if such nodes have recovered and can be again placed in the
   routing tables.  This process is known as 'failed node recovery' and
   it can be optimized as described in the reference [Handling Churn in
   a DHT].

   b.  P2P system diagnostics to check the overall health of the P2P
   overlay network, the consumption of network bandwidth and problem
   links, and to check for abusive or malicious nodes.  This is not a
   trivial problem and has been studied in detail for content and
   streaming P2P overlays, for example in [Diagnostic Framework].

   Similar work has been reported more recently for P2PSIP overlays as
   applied to the P2PP protocol [Diagnostics and NAT traversal in P2PP].

   c.  Diagnostics for a particular node to follow up an individual user
   complaint.  In this case a technical support person may use a desktop
   sharing application with the permission of the user to determine
   remotely the health and possible problems with the malfunctioning
   node.  Part of the remote diagnostics may consist of simple
   connectivity tests with other nodes in the P2PSIP overlay.  The
   simple connectivity tests are not dependent on the type of P2PSIP
   overlay and they are the topic of this memo.  Note however that other
   tests may be required as well, such as checking the health and
   performance of the user's computer or mobile device and also checking
   the link bandwidth connecting the user to the Internet.


2.  Overview of Functions

   The P2PSIP diagnostics protocol is mainly used to detect and localize
   failures in a P2PSIP overlay network.  It provides mechanisms to
   detect and localize malfunctioning or badly behaving peers, including
   disabled peers, congested peers and misrouting peers.  It also
   provides a mechanism to check connectivity to a specified peer, a
   mechanism to check the availability of specified resource records,
   and a mechanism to discover the P2PSIP overlay topology and the
   underlay topology.

   The P2PSIP diagnostics protocol described here reuses the P2PSIP peer
   protocol [I-D.jiang-p2psip-sep]: it follows the P2PSIP peer protocol
   specification and introduces one new type of message (the Echo
   message).  The P2PSIP diagnostics protocol strictly follows the
   P2PSIP peer protocol specification for message routing, transport,
   NAT traversal and so on.  The diagnostic method itself is, however,
   independent of any particular P2PSIP protocol.


3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The other concepts used in this document are compatible with
   "Concepts and Terminology for Peer to Peer SIP"
   [I-D.ietf-p2psip-concepts] and the P2PSIP peer protocol SEP
   [I-D.jiang-p2psip-sep].


4.  Motivation

   In the last few years, overlay networks have rapidly evolved and
   emerged as a promising platform to deploy new applications and
   services in the Internet.  One of the reasons overlay networks are
   seen as an excellent platform for large scale distributed systems is
   their resilience in the presence of failures.  This resilience has
   three aspects: data replication, routing recovery, and static
   resilience.  Routing recovery algorithms are used to repopulate the
   routing table with live nodes when failures are detected.  Static
   resilience measures the extent to which an overlay can route around
   failures even before the recovery algorithm repairs the routing
   table.  Both routing recovery and static resilience rely on
   accurate and timely detection of failures.

   As described in "P2PSIP Security Analysis and Evaluation"
   [I-D.song-p2psip-security-eval], "Security requirements in P2PSIP"
   [I-D.matuszewski-p2psip-security-requirement] and "Security
   Mechanisms for Peer to Peer SIP"
   [I-D.jennins-p2psip-security-mechanisms], a P2PSIP overlay may
   contain malfunctioning or badly behaving peers.  Those peers may be
   disabled, congested or misrouting, and their impact on the overlay
   network is degradation or interruption of the services provided
   collectively by the peers in the overlay.  It is desirable to
   identify malfunctioning or badly behaving peers with diagnostics
   tools, and to exclude or reject them from the P2PSIP system.  Besides
   those faults, node failures may be caused by failures in the
   underlying network.  For example, when IP layer routing failover
   after a link failure is very slow, recovery from the incorrect
   overlay topology may also be slow.  Moreover, if a backbone link
   fails and the failover is slow, the network may be partitioned, which
   may lead to partitions of the overlay topology and inconsistent
   routing results between the different partitions.

   Some keep-alive algorithms based on periodic probes and
   acknowledgments enable accurate and timely detection of failures of
   a peer's neighbors [Overlay-Failure-Detection].  However, those
   algorithms can only detect disabled neighbors, and only
   periodically, which may not be enough for service providers
   operating the overlay network.

   A general P2PSIP overlay diagnostics protocol that supports both
   periodic and on-demand detection of node failures and network
   failures is therefore desirable.  This document describes such a
   protocol.  It is useful for P2PSIP peer protocols and complements
   the keep-alive algorithms in the P2P or P2PSIP overlay itself.

   In this document, we mainly describe how to detect and localize
   failures, including disabled peers, congested peers, misrouting
   behaviors and underlying network faults, in a P2PSIP overlay network
   through a simple and efficient mechanism.  This mechanism is modeled
   after the ping/traceroute paradigm: ping (ICMP echo request [RFC792])
   is used for connectivity checks, and traceroute is used for
   hop-by-hop fault localization as well as path tracing.  This document
   specifies a "ping" mode and a "traceroute" mode for diagnosing a
   P2PSIP overlay network.

   The basic idea is to transmit a P2PSIP peer protocol request message
   (the Echo request message) along the same path that all other P2PSIP
   peer protocol request messages would traverse.  In "Ping" mode, an
   Echo request message is forwarded by the intermediate peers along
   the path and then terminated by the responsible peer; after local
   diagnostics, the responsible peer returns an Echo response message.
   In "Traceroute" mode, an Echo request message is received and
   processed by each peer along the routing path, and each of those
   peers returns an Echo response message with local diagnostics
   information, including the result and, if any, the causes.

   One way to use these tools is as follows.  Once the overlay network
   receives alarms about overlay service degradation or interruption, a
   P2PSIP Ping operation is used to check the connectivity to the
   specified peer or the availability of the specified resource-record.
   If the ping fails, one can then send a P2PSIP Traceroute to determine
   where the fault lies.
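
   The ping-then-traceroute procedure can be illustrated with a minimal
   sketch.  The function name and the two callables `ping` and
   `traceroute` are hypothetical stand-ins for the real operations and
   are not defined by this document; only the failure codes are taken
   from Section 5.2.1.

```python
from typing import Callable, List, Optional, Tuple

# Failure response codes introduced in Section 5.2.1 of this document.
FAILURE_CODES = {414, 415, 416, 417, 419}

# Hypothetical shape of one traceroute result: (peer_id, response_code).
Hop = Tuple[str, int]


def localize_fault(destination: str,
                   ping: Callable[[str], bool],
                   traceroute: Callable[[str], List[Hop]]
                   ) -> Optional[Tuple[Optional[str], Optional[int]]]:
    """Return None if the ping succeeds, else a hint about the fault.

    The hint is (peer_id, code) for the first hop that reported a
    failure code, or (last_responding_peer, None) if no hop did.
    """
    if ping(destination):
        return None
    last_ok = None
    for peer_id, code in traceroute(destination):
        if code in FAILURE_CODES:
            return (peer_id, code)
        last_ok = peer_id
    # No hop reported a failure: the fault lies beyond the last responder.
    return (last_ok, None)
```

   Returning the last responding peer when no failure code is seen is a
   design choice of this sketch: it gives the operator the closest known
   point to a silent (for example, disabled) peer.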


5.  Packet Formats

   This document reuses the P2PSIP peer protocol to carry diagnostics
   information.  To meet the specific needs of diagnostics, this
   document extends the P2PSIP peer protocol by introducing one new type
   of message and some attributes.

5.1.  Message Header

   The mechanism defined in this document follows the P2PSIP peer
   protocol specification.  The message introduced here, whether a
   request or a response, uses the same message format as existing
   P2PSIP peer protocol messages.  Depending on their type, messages
   carry different TLV objects following the common message header.
   Those objects are called "Attributes".  Please refer to the P2PSIP
   peer protocol [I-D.jiang-p2psip-sep] for the detailed format of the
   message header.

   This document introduces one new type of message:

   Message Type          Name
   11                    Echo

5.2.  Message Attributes

   As in the P2PSIP peer protocol, a P2PSIP diagnostics protocol message
   contains zero, one or multiple attributes that describe the
   specified contents.  All attributes follow the P2PSIP peer protocol
   specification and adopt the TLV style.  Please refer to the P2PSIP
   peer protocol [I-D.jiang-p2psip-sep] for the detailed format of
   message attributes.

   This document introduces two new types of attributes:


   Attribute Type          Name
   15                      Echo
   16                      Respond Peer Info

   In addition to the newly introduced attributes, this document
   extends the Response attribute defined in the P2PSIP peer protocol
   specification.

5.2.1.  Response Attribute

   This document extends the Response attribute defined in the P2PSIP
   peer protocol specification to describe the result of diagnostics, as
   shown in Figure 1.

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |M|  Reserved   |Attribute Type |           Length              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Response code         |       Response sub-code       |
       +-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+

                      Figure 1 Response Attribute Format

   M-flag: the flag is set (M=1);

   Reserved (7 bits): these bits are reserved and ignored;

   Attribute Type (8 bits): the value is 7 (0x07) for the Response
   attribute;

   Length (16 bits): the length in bytes of this attribute;

   Response Code (16 bits): the response code, determined by the
   responder; this field is required in every Response attribute;

   Response Sub-Code (16 bits): the response sub-code, determined by
   the responder; this field is optional.
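
   As an illustration of the layout in Figure 1, the following minimal
   sketch packs and unpacks a Response attribute with Python's `struct`
   module.  It rests on two assumptions that this document does not
   settle: that the Length field counts the whole attribute including
   its 4-byte header, and that an absent sub-code is encoded as zero.
   The function names are illustrative only.

```python
import struct

RESPONSE_ATTR_TYPE = 7      # Attribute Type value given in the text
_FMT = "!BBHHH"             # flags, type, length, code, sub-code (network order)
_SIZE = struct.calcsize(_FMT)   # 8 bytes


def pack_response_attr(code: int, sub_code: int = 0) -> bytes:
    """Encode a Response attribute as laid out in Figure 1."""
    flags = 0x80            # M-flag set, 7 reserved bits left at zero
    # Assumption: Length covers the whole attribute, header included.
    return struct.pack(_FMT, flags, RESPONSE_ATTR_TYPE, _SIZE, code, sub_code)


def unpack_response_attr(data: bytes) -> dict:
    """Decode a Response attribute; reserved bits are ignored."""
    flags, attr_type, length, code, sub_code = struct.unpack(_FMT, data[:_SIZE])
    if attr_type != RESPONSE_ATTR_TYPE:
        raise ValueError("not a Response attribute")
    return {"m_flag": bool(flags & 0x80), "length": length,
            "code": code, "sub_code": sub_code}
```

   For example, `pack_response_attr(414, 1)` encodes "Underlay
   Destination Unreachable" with sub-code "host unreachable".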

   This document introduces the following new response codes:


   Response Code       Meaning
   414                 Underlay Destination Unreachable
   415                 Underlay Time Exceeded
   416                 Upstream Misrouting
   417                 Loop detected
   419                 TTL hops exceeded

   This document introduces the following response sub-codes for
   response code 414:

   Response Sub-Code     Meaning
         0              net unreachable
         1              host unreachable
         2              protocol unreachable
         3              port unreachable
         4              fragmentation needed
         5              source route failed

5.2.2.  Echo Attribute

   This document introduces the Echo attribute to describe diagnostics
   control information, including but not limited to: the routing mode
   of the Echo message, the number of hops that the Echo message
   traverses, the reply rule for generating the Echo response message,
   the timestamp of initiating the Echo request message, the timestamp
   of receiving the Echo request message, and the expiration time of
   the Echo request message.

   The Echo attribute format is shown in Figure 2:


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |M|U|P|Reserved |Attribute Type |           Length              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Routing Mode |  Hop  Counter |   Reply rule  | Underlay TTL  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                  TimeStamp Initiated (seconds)                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                TimeStamp Initiated (microseconds)             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                  TimeStamp Received (seconds)                 |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                TimeStamp Received (microseconds)              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   Expiration time (seconds)                   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                  Expiration time (microseconds)               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 2 Echo Attribute Format

   M-flag: the flag is set;

   U-flag: indicates whether the receiver of the Echo request message
   needs to carry information about its immediate upstream peer in the
   resulting Echo response message.  If set (U=1), the Echo response
   message must carry the immediate upstream peer information, such as
   the Peer-ID;

   P-flag: indicates whether an intermediate peer continues to forward
   the Echo request message when it detects misrouting behavior of its
   immediate upstream peer for this Echo request message.  If set (P=1),
   the intermediate peer continues to forward the Echo request message
   upon detecting misrouting behavior of its immediate upstream peer;
   otherwise the intermediate peer stops forwarding.  In any case, the
   intermediate peer should stop forwarding a received Echo request
   message once it detects a loop, even when the P-flag is set;

   Reserved (5 bits): those bits are reserved and ignored;

   Attribute Type (8 bits): the value is 15 (0x0F);

   Length (16 bits): the length in bytes of this attribute;

   Routing Mode (8 bits): indicates the routing mode of the Echo message
   in the overlay;

   Hop Counter (8 bits): this field is ignored in Echo requests.  In
   Echo responses, this field must be copied exactly from the TTL field
   of the message header in the received Echo request.  This
   information is sent back to the request initiator so that it can
   compute the number of hops that the message traversed in the
   overlay;

   Reply rule (8 bits): indicates the policy, specified by the
   initiator, for processing the Echo request;

   Underlay TTL (8 bits): indicates the underlay TTL that an
   intermediate peer must use when forwarding the Echo request; it is
   specified by the initiator;

   TimeStamp Initiated (64 bits): the time-of-day (in seconds and
   microseconds, according to the sender's clock) in NTP format
   [RFC2030] when the P2PSIP Overlay Echo request is sent.  It first
   appears in the Echo request message and can be carried back in the
   Echo response message from the receiver;

   TimeStamp Received (64 bits): carried in an Echo response message,
   the time-of-day (according to the receiver's clock) in NTP format
   [RFC2030] at which the corresponding P2PSIP Overlay Echo request was
   received;

   Expiration time (64 bits): the expiration time of the Echo request
   message, expressed as a time-of-day in NTP format [RFC2030].
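
   The initiator-side arithmetic implied by the Hop Counter and
   timestamp fields can be sketched as follows.  This is an
   illustration under stated assumptions: each overlay hop decrements
   the header TTL by one, the initiator remembers the TTL it originally
   sent, and timestamps are handled as (seconds, microseconds) pairs as
   drawn in Figure 2.  The function names are hypothetical.

```python
from typing import Tuple

# (seconds, microseconds), following the two 32-bit words in Figure 2.
Timestamp = Tuple[int, int]


def hops_traversed(initial_ttl: int, hop_counter: int) -> int:
    """Overlay hops taken, assuming the TTL drops by one per hop."""
    if not 0 <= hop_counter <= initial_ttl:
        raise ValueError("Hop Counter must lie between 0 and the initial TTL")
    return initial_ttl - hop_counter


def elapsed_seconds(start: Timestamp, end: Timestamp) -> float:
    """Difference between two timestamps, in seconds."""
    return (end[0] - start[0]) + (end[1] - start[1]) / 1_000_000
```

   A round-trip time computed from TimeStamp Initiated and the
   initiator's own clock needs no clock synchronization.  A one-way
   delay computed from TimeStamp Initiated and TimeStamp Received is
   only meaningful if the two peers' clocks are synchronized.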

   This document defines the following routing modes:

   Routing mode       Meaning
          0           Recursive
          1           Iterative
          2           Semi-recursive
          3           Overlay native

   This document defines the following reply rules:

   Reply rule       Meaning
          1         Do not reply except destination peer
          2         Immediately reply
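
   A minimal sketch of encoding and decoding the Echo attribute of
   Figure 2 is given below.  The class and field names are illustrative
   and not part of this specification.  The sketch assumes that the
   Length field counts the whole 32-byte attribute and that each
   timestamp is carried as two 32-bit words (seconds, microseconds) as
   drawn in the figure.

```python
import struct
from dataclasses import dataclass

ECHO_ATTR_TYPE = 15
# flags, type, length, 4 one-byte fields, then 3 timestamps of 2 words each.
_FMT = "!BBH4B6I"
_SIZE = struct.calcsize(_FMT)   # 32 bytes

M_FLAG, U_FLAG, P_FLAG = 0x80, 0x40, 0x20


@dataclass
class EchoAttr:
    routing_mode: int            # 0..3, see the routing mode table
    reply_rule: int              # 1 or 2, see the reply rule table
    underlay_ttl: int
    initiated: tuple = (0, 0)    # (seconds, microseconds)
    received: tuple = (0, 0)
    expiration: tuple = (0, 0)
    hop_counter: int = 0         # ignored in requests
    u_flag: bool = False
    p_flag: bool = False

    def pack(self) -> bytes:
        flags = M_FLAG | (U_FLAG if self.u_flag else 0) \
                       | (P_FLAG if self.p_flag else 0)
        return struct.pack(_FMT, flags, ECHO_ATTR_TYPE, _SIZE,
                           self.routing_mode, self.hop_counter,
                           self.reply_rule, self.underlay_ttl,
                           *self.initiated, *self.received, *self.expiration)

    @classmethod
    def unpack(cls, data: bytes) -> "EchoAttr":
        (flags, attr_type, _length, mode, hops, rule, ttl,
         i_s, i_us, r_s, r_us, e_s, e_us) = struct.unpack(_FMT, data[:_SIZE])
        if attr_type != ECHO_ATTR_TYPE:
            raise ValueError("not an Echo attribute")
        return cls(mode, rule, ttl, (i_s, i_us), (r_s, r_us), (e_s, e_us),
                   hops, bool(flags & U_FLAG), bool(flags & P_FLAG))
```

   A traceroute-style request would, for instance, be built with
   `EchoAttr(routing_mode=0, reply_rule=2, underlay_ttl=64)`.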

5.2.3.  Respond Peer Info Attribute

   This document introduces the Respond Peer Info attribute to describe
   peer information such as the Peer-ID.

   The Respond Peer Info attribute is a composite attribute.  Like the
   Source Peer Info attribute and the Destination Peer Info attribute,
   it may comprise a Peer-ID attribute, a Peer Service Capability
   attribute and several Peer Address Info attributes.  Of these, the
   Peer-ID attribute and at least one Peer Address Info attribute are
   required.

   The Respond Peer Info attribute format is shown in Figure 3.

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |M|U|D|Reserved |Attribute Type |           Length              |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                 Peer-ID                                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Peer service capability                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Peer Address Info - 1                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   ............                                |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Peer Address Info - N                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 3 Respond Peer Info attribute format

   M-flag: the value is 1;

   U-flag: indicates whether this attribute describes the immediate
   upstream peer of the peer generating this attribute.  If set (U=1),
   the attribute describes the immediate upstream peer on the path;

   D-flag: indicates whether this attribute describes the immediate
   downstream peer of the peer generating this attribute (e.g., the
   next-hop peer in the overlay forwarding path).  If set (D=1), the
   attribute describes the immediate downstream peer on the path.  If
   U=0 and D=0, the attribute describes the peer itself (i.e., the
   attribute generator);

   Reserved (5 bits): those bits are reserved and ignored;

   Attribute Type (8 bits): the value is 16 (0x10);

   Length (16 bits): the length in bytes of this attribute.
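
   The meaning of the U and D flags can be summarized with a short
   sketch that maps the first byte of a Respond Peer Info attribute to
   the role of the peer it describes.  The function name is
   hypothetical, and rejecting U=1 together with D=1 is an assumption of
   this sketch, since the text does not define that combination.

```python
U_FLAG = 0x40   # second bit of the first byte in Figure 3
D_FLAG = 0x20   # third bit of the first byte in Figure 3


def respond_peer_role(flags: int) -> str:
    """Return which peer a Respond Peer Info attribute describes."""
    upstream = bool(flags & U_FLAG)
    downstream = bool(flags & D_FLAG)
    if upstream and downstream:
        # Assumption: the text does not define U=1 with D=1.
        raise ValueError("U and D flags are both set")
    if upstream:
        return "upstream"
    if downstream:
        return "downstream"
    return "self"
```

   The M-flag (0x80) and the reserved bits do not influence the result.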


6.  Message

   All P2PSIP peer protocol requests and responses use the common
   message header, followed by zero, one or more TLV-style attributes.

   This document introduces the new Echo message to detect and localize
   failures in a P2PSIP overlay network.

6.1.  Echo request

   An Echo request message is used to detect possible failures along
   the specified path of a P2PSIP overlay network, including disabled
   peers, congested peers, misrouting behavior and underlying network
   faults.  An Echo request message is also used to discover the
   topology of the specified path and to check the reachability of the
   specified peer or the availability of the specified resource-record.

   An Echo request is a normal P2PSIP peer protocol message; it can be
   initiated by any peer in the P2PSIP overlay network that supports
   the P2PSIP peer protocol specification.

   An Echo request must contain a message header and an Echo attribute.

   Echo request =
                  Message Header
                  Echo Attribute
                  Source Peer Info

6.2.  Echo response

   An Echo response message is used to convey local diagnostics
   information, including the result, the causes and possibly other
   auxiliary information.

   An Echo response message must contain a message header, a Response
   attribute, an Echo attribute and one or more Respond Peer Info
   attributes.  It may contain a Resource Info attribute and a Status
   attribute.  If the responding peer is an intermediate peer, the Echo
   response message must contain three Respond Peer Info attributes
   describing the responding peer itself, its immediate upstream peer
   and its next-hop peer, respectively.  If the responding peer is the
   last peer, which terminates the Echo request message, the Echo
   response message must contain two Respond Peer Info attributes
   describing itself and its immediate upstream peer.  The TTL in the
   received Echo request must be copied to the Hop Counter field in the
   Echo response.  In the following sections, the last peer, which
   terminates the Echo request message, is called the "terminator
   peer", as distinct from an "intermediate peer" and the "initiator
   peer" or "initiator".

   One way for an implementation to estimate whether a peer is disabled
   is for the initiator to use a local timer.  The timer starts when
   the initiator issues an Echo request message to the specified peer
   in the P2PSIP overlay network, and the initiator considers the
   specified peer disabled if it does not receive the expected Echo
   response message before the timer expires.  In "Traceroute" mode,
   this local timer can be updated in the specified interval by the
   Echo response messages from the intermediate peers.

   Echo response =
                  Message Header
                  Response Attribute
                  Echo Attribute
                  Respond Peer Info Attribute
                  [Resource Info Attribute]
                  [Status Attribute]
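
   The timer-based estimate described in this section can be sketched
   as below.  The class is hypothetical, and restarting the full
   timeout whenever an intermediate peer responds is only one possible
   reading of "updated"; this document does not fix the update rule.
   The current time is passed in explicitly so that the logic does not
   depend on any particular clock source.

```python
class EchoTimer:
    """Initiator-side timer for one outstanding Echo request."""

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout
        self.deadline = now + timeout     # started when the request is sent

    def on_intermediate_response(self, now: float) -> None:
        # Traceroute mode: a response from an intermediate peer shows
        # progress along the path, so the deadline is pushed out
        # (assumed update rule).
        self.deadline = now + self.timeout

    def expired(self, now: float) -> bool:
        # True once the peer should be considered disabled.
        return now >= self.deadline
```

   In "Ping" mode only `expired` is used; in "Traceroute" mode each
   intermediate Echo response also calls `on_intermediate_response`.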

6.2.1.  Echo response from the terminator peer

   When an Echo request message arrives at a peer, if the peer's
   responsible ID space covers the destination ID of the Echo request
   message, or the peer finds that the destination ID is unreachable in
   the P2PSIP overlay (e.g., it detects a loop), then the peer acts as
   follows.  When the Reply rule field of the received Echo attribute
   is not zero, the peer constructs and returns an Echo response
   message using the Routing Mode indicated by the Echo request
   message.  When the Reply rule field is zero, the peer does not give
   any response.

   The Echo response must carry a Response attribute, a Respond Peer
   Info attribute describing the receiver of the Echo request message,
   and an Echo attribute containing the TimeStamp Received field and
   the TimeStamp Initiated field copied from the received Echo request
   message.

   The returned Echo response message must also carry a Resource
   attribute when the resource-record for which the peer is responsible
   exists in the peer.  If the Echo response message does not carry any
   Resource attribute, it means that the resource-record whose
   Resource-ID is equal to the destination ID of the Echo request
   message does not exist in the peer.

   If the peer finds that it is busy or congested, the returned Echo
   response message must carry a Status attribute.

   If the peer finds that its immediate upstream peer is misrouting,
   the returned Echo response message must carry a Response attribute
   with the response code 416 "Upstream Misrouting" and a Respond Peer
   Info attribute describing its immediate upstream peer.
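
   The rules of this section can be gathered into a small sketch that
   lists which attributes a terminator peer places in its Echo
   response.  The function name, argument names and attribute labels are
   illustrative only.  The sketch follows the text of this section
   literally, so the upstream Respond Peer Info attribute is added only
   when misrouting is detected, and the response code is left unset in
   the normal case because this document does not define a success
   code.

```python
from typing import Dict, List, Optional

UPSTREAM_MISROUTING = 416   # response code from Section 5.2.1


def terminator_response(reply_rule: int,
                        has_resource: bool,
                        busy_or_congested: bool,
                        upstream_misrouting: bool
                        ) -> Optional[Dict[str, object]]:
    """Describe the Echo response of a terminator peer, or None."""
    if reply_rule == 0:
        return None                       # no response is given
    attributes: List[str] = ["Response", "Echo", "Respond Peer Info (self)"]
    if has_resource:
        attributes.append("Resource")
    if busy_or_congested:
        attributes.append("Status")
    response_code = None                  # success code is not defined here
    if upstream_misrouting:
        response_code = UPSTREAM_MISROUTING
        attributes.append("Respond Peer Info (upstream)")
    return {"response_code": response_code, "attributes": attributes}
```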



6.2.2.  Echo response from the intermediate peer

   When an Echo request arrives at a peer, if the peer's responsible ID
   space does not cover the destination ID of the Echo request, then the
   peer continues to forward this Echo request according to the Routing
   Mode field specified in the received Echo request.

   The peer should return an Echo response carrying a Response attribute
   with the response code 414 "Underlay Destination Unreachable" when it
   receives an ICMP message with "Destination Unreachable" information
   after forwarding the received Echo request.

   The peer should return an Echo response carrying a Response attribute
   with the response code 415 "Underlay Time Exceeded" when it receives
   an ICMP message with "Time Exceeded" information after forwarding the
   received Echo request.
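
   The mapping from received ICMP messages to response codes can be
   sketched as follows.  The ICMP type values 3 (Destination
   Unreachable) and 11 (Time Exceeded) come from [RFC792].  Reusing the
   ICMP code directly as the response sub-code is an assumption of this
   sketch, based on the sub-code table in Section 5.2.1 listing the same
   six meanings in the same order; the function name is hypothetical.

```python
from typing import Optional, Tuple

ICMP_DEST_UNREACHABLE = 3    # ICMP type, RFC 792
ICMP_TIME_EXCEEDED = 11      # ICMP type, RFC 792

# Sub-codes defined for response code 414 in Section 5.2.1.
SUB_CODES = {
    0: "net unreachable",
    1: "host unreachable",
    2: "protocol unreachable",
    3: "port unreachable",
    4: "fragmentation needed",
    5: "source route failed",
}


def icmp_to_response(icmp_type: int,
                     icmp_code: int) -> Optional[Tuple[int, Optional[int]]]:
    """Map an ICMP error to (response code, sub-code), or None."""
    if icmp_type == ICMP_DEST_UNREACHABLE:
        # Assumption: the sub-code equals the ICMP code when it is listed.
        sub_code = icmp_code if icmp_code in SUB_CODES else None
        return (414, sub_code)
    if icmp_type == ICMP_TIME_EXCEEDED:
        return (415, None)
    return None    # other ICMP messages are not covered by this section
```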

   When an Echo request arrives at a peer, if the peer's responsible ID
   space does not cover the destination ID of the Echo request message
   and the value of the received Reply rule field is 2, then the peer
   must construct and return an Echo response and continue to forward
   the Echo request.
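
   The forwarding and reply decisions of an intermediate peer, combining
   the Reply rule with the P-flag defined in Section 5.2.2, can be
   sketched as below.  The names are hypothetical.  The sketch assumes
   that a peer which stops forwarding always reports why, even under
   Reply rule 1; this document does not state that case explicitly.

```python
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    forward: bool                  # keep forwarding the Echo request?
    reply: bool                    # return an Echo response?
    response_code: Optional[int]   # 416, 417 or None


def intermediate_decision(reply_rule: int, p_flag: bool,
                          upstream_misrouting: bool,
                          loop_detected: bool) -> Decision:
    if loop_detected:
        code = 417                 # Loop detected
    elif upstream_misrouting:
        code = 416                 # Upstream Misrouting
    else:
        code = None
    # A loop always stops forwarding; misrouting stops it unless P=1.
    forward = not loop_detected and (p_flag or not upstream_misrouting)
    # Reply rule 2 means "Immediately reply"; a peer that stops
    # forwarding is assumed to reply as well.
    reply = reply_rule == 2 or not forward
    return Decision(forward, reply, code)
```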

   The Echo response must carry a Response attribute, a Respond Peer
   Info attribute describing the receiver of the Echo request message, a
   Respond Peer Info attribute describing the immediate downstream peer
   (i.e. next hop to forward the Echo request message in the P2PSIP
   overlay network), an Echo attribute containing TimeStamp Received
   field and TimeStamp Initiated field copied from the received Echo
   request.

   The returning Echo response must carry a Resource attribute when the
   responsible resource-record exists in the peer.  If the Echo response
   does not carry any Resource attribute, it means that the resource-
   record whose Resource-ID is equal with the destination ID of the Echo
   request message does not exist in the peer.

   If the peer finds that it is bush or congested, the returning Echo
   response message must carry a Status attribute.

   If the peer detects that its immediate upstream peer is misrouting,
   the returned Echo response must carry a Response attribute with the
   response code 416 "Upstream Misrouting" and a Respond Peer Info
   attribute describing that immediate upstream peer.
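
   The following non-normative Python sketch shows how a peer might
   assemble the Echo response described above when the Reply Rule value
   is 2.  The function name, the dictionary keys, and the "normal_code"
   parameter are hypothetical: this section does not define the code of
   a normal response, so it is left as a caller-supplied placeholder.
   The sketch also assumes that the TimeStamp Received value is the
   local time at which the request was received.

```python
REPLY_IMMEDIATELY = 2          # Reply Rule value "Immediately reply"
UPSTREAM_MISROUTING = 416      # response code "Upstream Misrouting"


def build_echo_response(request, peer, received_at, normal_code=None):
    """Return the Echo response of an intermediate peer, or None.

    `request` and `peer` are plain dicts standing in for the real
    message and the local peer state; all keys are hypothetical.
    """
    if request["reply_rule"] != REPLY_IMMEDIATELY:
        return None  # this sketch only covers Reply Rule value 2
    response = {
        "response_code": normal_code,  # placeholder, not defined here
        "respond_peer_info": [peer["self_info"], peer["next_hop_info"]],
        "echo": {
            "timestamp_initiated": request["timestamp_initiated"],
            "timestamp_received": received_at,  # assumption: local time
        },
    }
    record = peer["records"].get(request["destination_id"])
    if record is not None:
        response["resource"] = record          # Resource attribute
    if peer.get("busy_or_congested"):
        response["status"] = peer["status"]    # Status attribute
    if peer.get("upstream_misrouting"):
        response["response_code"] = UPSTREAM_MISROUTING
        response["respond_peer_info"].append(peer["upstream_info"])
    return response
```

   The caller remains responsible for forwarding the Echo request to
   the next hop; the sketch only builds the response.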


7.  Security Considerations

   A P2PSIP Traceroute implementation based on the "Reply Rule" value 2
   "Immediately reply" (Section 9.2) may expose the initiator to a DoS
   attack, because every peer along the path returns an Echo response
   to the initiator.  This implementation is nevertheless more
   efficient than the traditional Internet Traceroute operation, which
   paces ICMP messages.

   It is advisable to use the efficient Traceroute operation in an
   administered P2PSIP overlay and the pacing-style Traceroute operation
   in an untrustworthy P2PSIP overlay network.  The probability of this
   type of DoS attack is in any case very low, because the overlay is
   distributed and it is very hard for an attacker to know the accurate
   Peer-IDs and to attack most of the peers simultaneously.


8.  IANA Considerations

   Message Type: this document introduces one new message type:

   Message Type       Name
   11                 Echo

   Attribute Type: this document introduces two new attribute types:

   Attribute Type     Name
   15                 Echo
   16                 Respond Peer Info

   Response Code: this document introduces the following new response
   codes:

   Response Code      Name
   414                Underlay Destination Unreachable
   415                Underlay Time Exceeded
   416                Upstream Misrouting
   417                Loop detected
   419                TTL hops exceeded

   Response Sub-Code: this document defines response sub-codes for the
   response code 414 "Underlay Destination Unreachable" as below:

   Response Sub-Code       Meaning
         0              net unreachable
         1              host unreachable
         2              protocol unreachable
         3              port unreachable
         4              fragmentation needed
         5              source route failed


9.  Examples

9.1.  P2PSIP Ping

   Any peer supporting the P2PSIP diagnostics protocol can use the
   P2PSIP Ping operation to check the reachability of a specified peer
   in the overlay or the availability of a specified resource-record.

   In the normal P2PSIP Ping operation, a peer constructs and issues an
   Echo request message to the specified destination ID.  The
   destination ID of the Echo request message is the specified Peer-ID
   or Resource-ID, and the source ID is the Peer-ID of the initiator.
   The "Reply Rule" value must be 1 "Do not reply except last peer";
   the initiator itself determines the "Routing Mode" and "Underlay
   TTL" of the Echo request message.  An intermediate peer simply
   forwards the message to its next hop in the overlay without
   otherwise processing it.  The message eventually arrives at the
   terminator peer, which is either the responsible peer or a peer that
   finds the destination ID unreachable, and the terminator peer
   returns an Echo response message.

   Figure 4 shows an example of a P2PSIP Ping operation:

    Peer-1              Peer-2               Peer-3               Peer-4
      |                    |                    |                    |
      | (1).Echo Request   |                    |                    |
      |------------------->|                    |                    |
      |                    | (2).Echo Request   |                    |
      |                    |------------------->|                    |
      |                    |                    | (3).Echo Request   |
      |                    |                    |------------------->|
      |                    |                    |                    |
      |                    |                    | (4).Echo Response  |
      |<-------------------|--------------------|--------------------|
      |                    |                    |                    |

                           Figure 4 P2PSIP Ping example

   The overlay network operator may use the P2PSIP Ping operation to
   measure the message transmission delay and jitter between two
   specified peers.
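
   As a rough, non-normative sketch, delay and jitter could be derived
   from repeated Ping operations as shown below.  The function name is
   hypothetical.  The sketch assumes that the initiator has collected a
   list of round-trip times (for example, in milliseconds), one per
   Echo request/response pair, and it takes jitter to be the mean
   absolute difference between consecutive samples, which is only one
   of several possible definitions.

```python
from statistics import mean


def delay_and_jitter(rtts):
    """Return (mean delay, jitter) for a list of round-trip samples.

    Jitter is computed as the mean absolute difference between
    consecutive samples (an assumption, not defined by this document).
    """
    if not rtts:
        raise ValueError("need at least one round-trip sample")
    diffs = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return mean(rtts), jitter
```

   With a single sample the jitter is reported as 0.0, since no
   variation can be observed.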

9.2.  P2PSIP Traceroute

   Any peer supporting the P2PSIP diagnostics protocol can use the
   P2PSIP Traceroute operation to detect and localize malfunctioning or
   badly behaving peers (including disabled, congested, and misrouting
   peers), to detect and localize network failures, or to discover the
   topology of a specified path in the overlay network.

   In one possible P2PSIP Traceroute operation, a peer constructs and
   issues an Echo request message to the specified destination ID.  The
   destination ID in the Echo request message is the specified Peer-ID
   or Resource-ID, and the source ID is the Peer-ID of the initiator.
   The value of the "Reply Rule" field must be 2 "Immediately reply";
   the initiator itself determines the "Routing Mode" and "Underlay
   TTL" of the Echo request message.  Each intermediate peer processes
   the Echo request message, i.e., it forwards the message to its next
   hop in the overlay and then returns an Echo response message.  The
   terminator peer for the Echo request message is either the
   destination peer or a peer that finds the destination ID
   unreachable; the terminator peer also returns an Echo response
   message.

   Figure 5 shows an example of a P2PSIP Traceroute operation:

    Peer-1              Peer-2               Peer-3               Peer-4
      |                    |                    |                    |
      | (1).Echo Request   |                    |                    |
      |------------------->|                    |                    |
      |                    | (2).Echo Request   |                    |
      |                    |------------------->|                    |
      | (3).Echo Response  |                    |                    |
      |<-------------------|                    |                    |
      |                    |                    | (4).Echo Request   |
      |                    |                    |------------------->|
      |                    | (5).Echo Response  |                    |
      |<-------------------|--------------------|                    |
      |                    |                    | (6).Echo Response  |
      |<-------------------|--------------------|--------------------|
      |                    |                    |                    |

                       Figure 5 P2PSIP Traceroute example
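
   The following non-normative Python sketch illustrates how an
   initiator might use the collected Echo responses to reconstruct the
   path and localize a failing peer.  The function name and the data
   layout are hypothetical: the sketch assumes that each response has
   been reduced to a mapping from the responder's Peer-ID to the
   Peer-ID it reported as its immediate downstream peer (None for the
   terminator peer).

```python
def localize_failure(first_hop, responses):
    """Walk the responder -> next-hop chain reported in Echo responses.

    Returns (path, suspect): `path` lists the peers that replied, in
    order; `suspect` is the first peer that was named as a next hop but
    did not reply (or that would close a loop), or None if the chain
    ends at a terminator peer.
    """
    path = []
    seen = set()
    current = first_hop
    while current is not None:
        if current in seen:
            return path, current      # the reported path loops back
        if current not in responses:
            return path, current      # named as next hop, never replied
        path.append(current)
        seen.add(current)
        current = responses[current]
    return path, None
```

   In the exchange of Figure 5, Peer-2, Peer-3, and Peer-4 all reply,
   so the sketch would return the full path and no suspect.  If the
   response from Peer-4 were missing, Peer-4 would be reported as the
   suspect.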


10.  Acknowledgments

   Thanks to Jiang Haifeng for his valuable comments.  We would also
   like to thank Henry Sinnreich for contributing to the usage
   scenarios in the Introduction.


Authors' Addresses

   Song Haibin
   Huawei
   Baixia Road No.91
   Nanjing, Jiangsu Province  210001
   PRC

   Phone: +86-25-84565081
   Fax:   +86-25-84565070
   Email: melodysong@huawei.com


   Zheng Hewen
   Huawei
   Baixia Road No. 91
   Nanjing, Jiangsu Province  210001
   PRC

   Email: hwzheng@huawei.com


   Jiang Xingfeng
   Huawei
   Baixia Road No.91
   Nanjing, Jiangsu Province  210001
   PRC

   Phone: +86-25-84565079
   Fax:   +86-25-84565070
   Email: jiang.x.f@huawei.com


Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.