P2PSIP Working Group J. Peng
Internet-Draft China Mobile
Intended status: Standards Track K. Feng
Expires: January 3, 2012 Beijing University of Posts and
Telecommunications
L. Le
China Mobile
July 2, 2011
One Hop Lookups Algorithm Plugin for RELOAD
draft-peng-p2psip-one-hop-plugin-00
Abstract
This document proposes an implementation of overlay plugin algorithm
which is called one hop lookups to provide examples and references
for the research of one hop based RELOAD. With the development of
the real time communications, there are high demands for the
improvement of routing efficiency. In the one hop algorithm, each
peer maintains complete membership information which can guarantee
one hop lookups to improve the routing efficiency. For the RELOAD,
using the one hop lookups algorithm to construct Topology Plugin with
the same RELOAD core can have a better support for VoIP applications,
and the implementation of one hop lookups algorithm plugin is based
on the methods provided by RELAOD.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 3, 2012.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
Peng, et al. Expires January 3, 2012 [Page 1]
Internet-Draft One Hop Lookups Plugin July 2011
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Hash functions . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Peer data structure . . . . . . . . . . . . . . . . . . . . . 7
5. Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. Joining . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
7. Updates . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.1. Update messages definition . . . . . . . . . . . . . . . . 12
7.2. Primary event notification messages . . . . . . . . . . . 17
7.3. Actions after receiving update messages . . . . . . . . . 18
8. Stabilization . . . . . . . . . . . . . . . . . . . . . . . . 20
9. Leaving . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
10. Leader choosing strategies . . . . . . . . . . . . . . . . . . 22
11. Replication . . . . . . . . . . . . . . . . . . . . . . . . . 23
12. Fault tolerance . . . . . . . . . . . . . . . . . . . . . . . 24
12.1. First hop fail . . . . . . . . . . . . . . . . . . . . . . 24
12.2. Leaders fail . . . . . . . . . . . . . . . . . . . . . . . 24
13. Security Considerations . . . . . . . . . . . . . . . . . . . 26
14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27
15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28
16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29
16.1. Normative References . . . . . . . . . . . . . . . . . . . 29
16.2. Informative References . . . . . . . . . . . . . . . . . . 29
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30
Peng, et al. Expires January 3, 2012 [Page 2]
Internet-Draft One Hop Lookups Plugin July 2011
1. Introduction
RELOAD [I-D.ietf-p2psip-base] is a peer-to-peer (P2P) signaling
protocol for use on the Internet. The Architecture of RELOAD has a
routing layer which is called the Topology Plugin. It is responsible
for implementing the specific overlay algorithm being used. In the
RELOAD base protocol, the Topology Plugin is described by Chord, and
we define another DHT algorithm in order to provide a high rouging
efficiency which can be used in real time communications for a better
performance.
The real time communication has a high demand on the routing
efficiency, for example, the VoIP usage on the application level of
RELOAD, it is no doubt that the routing speed or efficiency must be
very important, and it depends on the looking up efficiency of the
Overlay Topology. There are many kinds of structured peer-to-peer
overlay network algorithms like Chord, Pastry, CAN and so on. Most
of them have one in common is that they have a routing table which
can maintain a small amount of routing state. But all these
algorithms need nearly O (log N) steps to find one peer or resource,
if there are a large number of peers in the overlay, it costs a lot
time to locate peer or resource which may have a bad influence on the
application level of RELOAD. The one hop lookups algorithm gives one
way to maintain complete membership information at each node in order
to increase the routing efficiency even though the system is large
and membership is changing rapidly.
In the one hop algorithm, each peer maintains a complete description
of system memberships, and it presents techniques for the peers to
maintain this information accurately with lower costs of
communications. Different one hop algorithm implementations have
different ways to keep the accuracy of routing states. For example,
the one hop lookups for peer-to-peer overlay [One-Hop-Lookups] uses
event notification mechanism without broadcast to update all the
peers' routing table during several seconds; the SandStone
[SandStone] gives a way to collect update information by SPM (Super
Peer Maintenance) which is called SPM assisted one hop enhancement.
The SandStone overlay is typically deployed as a two-layered DHT,
including a global DHT and several regional DHT. There is at least
one SPM node in each region, and once one peer's state change is
detected by its neighboring peer, the neighbor will notify its local
SPM, and then the SPM will disseminate the event to other SPMs,
finally each SPM will broadcast this notification to all peers under
the charge of it. We use the one hop lookups algorithm to construct
the RELOAD Topology Plugin because there is no SPM like peers and two
layered topology in RELOAD base.
In this draft, we first give a simple overview of one hop lookups
Peng, et al. Expires January 3, 2012 [Page 3]
Internet-Draft One Hop Lookups Plugin July 2011
algorithm with some definitions, then fill the overlay specific data
structures into the RELOAD frame and implements this algorithm with
topology messages defined in the RELOAD Topology Plugin; at last, the
replication and fault tolerance strategies are stated. All of the
definitions and implementations based on the requirements and methods
provided by RELOAD.
Peng, et al. Expires January 3, 2012 [Page 4]
Internet-Draft One Hop Lookups Plugin July 2011
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Peng, et al. Expires January 3, 2012 [Page 5]
Internet-Draft One Hop Lookups Plugin July 2011
3. Hash functions
In this one hop lookups algorithm topology plugin, the size of the
node ID and resource ID is 128 bits. The hash of an ID can be
computed by SHA-1 [RFC3174] and other hash functions.
Peng, et al. Expires January 3, 2012 [Page 6]
Internet-Draft One Hop Lookups Plugin July 2011
4. Peer data structure
A peer, or a node, is responsible for a particular Resource-ID which
is less than or equal to its Node-ID and is greater than the Node-ID
of the previous peer in the neighborhood. In order to have a high
routing accuracy and efficiency, each peer must keep and maintain a
series of data structure.
In the one hop lookups algorithm, we divide the 128-bit circular
identifier space into k equal contiguous intervals which are called
slices. Each slice has a slice leader, which is chosen as the node
that is the successor of the mid-point of the slice space (other
choosing methods are list in Section X). Similarly, each slice also
can be divided into u equal-sized intervals which are called units;
each unit has a unit leader, which is chosen as the successor of the
mid-point of the unit space. And of course every peer in the overlay
should know its unit header and slice header. The slice leader and
unit leader are both ordinary peers who are chosen dynamically, and
they also can be configured at first. The leader choosing strategies
will be stated in section 10.
In the SandStone, there is at least one SPM node in each region, so
when one peer detects the state change of overlay in its
neighborhood, it will notify its local SPM as soon as possible. The
local SPM then disseminate this notification to other SPMs, at last,
each SPM will broadcast this update to all peers under the charge of
it. The SPM is a special peer which can collect update information
and do some other management tasks. The SPM does not participate in
the routing procedures and it is pre-configured.
On the other hand, the SPM uses the broadcast to spread the
notifications. Only the bootstrap peer in RELOAD has the broadcast
function. Using the broadcast is a simple and fast way to inform
information, but it costs a lot of bandwidths and also may cause the
notifications redundancy. Using the event based notification
mechanism defined in one hop lookups can avoid redundancy in the
communications: one peer will get information only from its neighbor
that is one step closer to its unit leader. This implies that within
a unit, information is always flowing from the unit leader to the
ends of the unit. But its disadvantage is obvious: the notifications
are spread too slowly which may cause the delay of overlay
stabilization. In this draft, we choose the one hop lookups
algorithm to be an example.
From above, we can conclude the information peer should maintain as
follows.
Peng, et al. Expires January 3, 2012 [Page 7]
Internet-Draft One Hop Lookups Plugin July 2011
Routing table:
The routing table of each node keeps all of nodes' address, and
then the source node can find the destination by just one hop at
most if the items of routing table are right. So it is very
important to keep the correctness of each node's full routing
table. In order to achieve the goal, the notification of
membership change events must reach every node in the overlay
within a specified amount of time.
Predecessor and successor information:
The information especially Node-ID and address about peer's
predecessor and successor must be kept in order to maintain the
overlay circle. A peer will communicate with its predecessor and
successor with keep-alive message (defined in Update message)
every several seconds. Of course, this structure can be added as
mark-ups or identifiers in the routing table.
Unit leader information:
The information about the unit leader who is responsible for the
unit the peer belongs to. A peer can communicate with its unit
leader to receive the Update message. Of course, this structure
can be added as mark-ups or identifiers in the routing table.
Slice leader information:
The information about the slice leader who is responsible for the
slice the peer belongs to. A peer can send event notification
message (defined in Update message) to its slice leader to inform
events. Of course, this structure can be added as mark-ups or
identifiers in the routing table.
The four kinds of information are maintained in all of peers, but
there are still some kinds of other data structures should be kept in
some special peers like unit leader and slice leader.
Unit boundary identifier:
This identifier does not keep in all the peers, it is stored on
the peer who is the boundary of the unit in order to make sure
the Update message just being transferred within unit.
Unit boundary peer list:
The unit leader should know the information about the unit's
boundary peer in order to know its control area.
Slice leader list:
Every slice leader should know the information about all the
slice leaders in the overlay in order to dispatch the event
notification messages in time.
Peng, et al. Expires January 3, 2012 [Page 8]
Internet-Draft One Hop Lookups Plugin July 2011
Each peer with these data structures can compose a well running
overlay topology based on the one hop lookups algorithm.
Peng, et al. Expires January 3, 2012 [Page 9]
Internet-Draft One Hop Lookups Plugin July 2011
5. Routing
The routing table of the peer has full information of all the nodes
in the overlay. If the peer is not responsible for a resource whose
ID is r, it will query the routing table in order to find the first
peer whose id is greater than r, which means that the peer should be
responsible for this resource; and then sends a request message to
the destination node. If no such node is found, it finds the largest
Node-ID in the interval between the peer and Resource-ID r.
Peng, et al. Expires January 3, 2012 [Page 10]
Internet-Draft One Hop Lookups Plugin July 2011
6. Joining
The joining process for a joining peer (JP) with Node-ID n is as
follows.
(1) JP MUST connect to its chosen or preconfigured bootstrap node.
(2) JP SHOULD send an Attach request to its admitting peer (AP) for
Node-ID n. The "send_update" flag should be used to acquire
the routing table and other information for AP.
(3) JP MUST send a Join to AP, the AP sends the response to the
Join.
(4) AP MUST do a series of Store requests to JP to store the data
that JP will be responsible for.
(5) JP SHOULD send Attach request to its predecessor in order to
let it knows the arriving of new peer; the predecessor should
update its successor information at once.
(6) The predecessor detects a change in its routing table for
example it has a new successor, it MUST send an event
notification message which is a kind of Update messages to its
slice leader directly. Of course, the Update message can also
be sent by the AD.
(7) The slice leader MUST collect all notifications it receives
from the peers in his slice and sends Update messages to other
slice leaders every several seconds to inform these events.
(8) Other slice leaders MUST dispatch the Update messages they
received to all the unit leaders in their respective slices.
(9) A unit leader SHOULD piggyback this information on its Update
message to its successor and predecessor.
(10) Other nodes propagate this information from their successors;
they SHOULD send it to their successors and vice versa.
(11) Peers at unit boundaries SHOULD NOT send information to their
neighboring peers outside their unit.
The Update messages depending on the RELOAD message formats will be
defined in the next section. The peer can find its successor by
sending a Find request with the Resource-ID equals its Node-ID.
Peng, et al. Expires January 3, 2012 [Page 11]
Internet-Draft One Hop Lookups Plugin July 2011
7. Updates
7.1. Update messages definition
The update messages in one hop lookups algorithm is based on the
event notification. So the definition of update message can be
started with the type definition.
enum { eventUpdate (0), dataStructureUpdate (1), (255) }
UpdateType;
The UpdateType gives an enumeration of Update message which includes
eventUpdate and dataStrucruteUpdate. The eventUpdate message will
take a list of event notifications and the dataStructureUpdate
message will take a list of data structure information which may
include the routing table items. This structure allows for unknown
options types.
enum { keepAlive (0), ordinaryPeerChange (1),
unitLeaderChange(2), sliceLeaderChange (3), (255) } EventType;
The EventType gives an enumeration of event kinds, it is composed of
keepAlive, ordinaryPeerChange, unitLeaderChange and
sliceLeaderChange. The keepAlive event is like the Ping message in
Chord. The ordinaryPeerChange, unitLeaderChange and
sliceLeaderChange represent the state change of different peers in
the overlay. This structure allows for unknown options types.
enum { peerJoin (0), peerLeave (1), peerChange (2), (255) }
ChangeType;
The ChangeType gives an enumeration of peer state change type which
includes the PeerJoin, PeerLeave and peerChange. The meaning of join
and leave are easy to understand, and the peerChange means the unit
leader change event or slice leader change event, it also can be
constructed by peer leave and peer join. This structure allows for
unknown options types.
enum { ordinaryPeer (0), unitLeaderPeer (1),
sliceLeaderPeer (2), (255) } PeerType;
The PeerType gives an enumeration of peer type which is used to
cooperate with the ChangeType. This structure allows for unknown
options types.
Peng, et al. Expires January 3, 2012 [Page 12]
Internet-Draft One Hop Lookups Plugin July 2011
struct {
Node-ID peer_id
IpAddressPort peer_addr_port
} PeerContactItem;
The structure PeerContactItem is used to represent a basic item in
the routing table or other data structures that need peers'
information. It is composed by a Node-ID and IpAddressPort. The
IpAddressPort structure is defined in the RELOAD base which includes
one peer's IP address and listening port. If one peer get the
information of another peer's PeerContactItem, then it can contact
this peer with the IP address.
Peng, et al. Expires January 3, 2012 [Page 13]
Internet-Draft One Hop Lookups Plugin July 2011
struct {
uint32 uptime;
EventType event_type;
PeerContactItem detect_peer;
uint32 length;
select (event_type) {
case keepAlive:
;
case ordinaryPeerChange:
ChangeType ordinary_change_type;
select (ordinary_change_type) {
case peerJoin:
PeerContactItem joining_peer;
case peerLeave:
PeerContactItem leaving_peer;
}
case unitLeaderChange:
ChangeType unit_leader_change_type;
select (unit_leader_change_type) {
case peerJoin:
PeerContactItem new_unit_leader;
PeerContactItem old_unit_leader;
case peerLeave:
PeerContactItem leaving_unit_leade;
}
case sliceLeaderChange:
ChangeType slice_leader_change_type;
select (slice_leader_change_type) {
case peerJoin:
PeerContactItem new_slice_leader;
PeerContactItem old_slice_leader;
case peerLeave:
PeerContactItem leaving_slice_leader;
}
}
} EventNotification;
The structure EventNotification is very important because it is the
base of event notification. The length of the structure changes in
different event type. This structure has enough information to
describe some important events. The contact information of detect
Peng, et al. Expires January 3, 2012 [Page 14]
Internet-Draft One Hop Lookups Plugin July 2011
peer (detect_peer) is useless and can be ignored. In the event type,
the three different event types can still be divided by the
ChangeType. The notification must have enough information for all
the peers in the overlay to update their data structures like routing
table.
struct {
PeerType peer_type;
PeerContactItem predecessor_peer;
PeerContactItem successor_peer;
PeerContactItem routing_table_list <0...n>;
uint32 length;
select (peer_type) {
case ordinaryPeer:
PeerContactItem unit_leader;
PeerContactItem slice_leader;
case unitLeaderPeer:
PeerContactItem boundary_peers <2>;
case sliceLeaderPeer:
PeerContactItem slice_leaders_list <0...n>;
}
} DataStructureContent
The structure DataStructureContent mainly used in the transferring of
data structure during the peer joining procedure. The length of the
structure changes in different peer type because different peer has
different data structure. The ordinary peer only has predecessor,
successor, unit leader, slice leader and routing table information.
And the unit leader and slice leader should also know some other
information.
Peng, et al. Expires January 3, 2012 [Page 15]
Internet-Draft One Hop Lookups Plugin July 2011
typedef EventNotification<0...n> EventNotificationList;
typedef DataStructureContent<0...n> DataStructureContentList;
struct {
UpdateType update_type;
uint32 length;
select (update_type) {
case eventUpdate:
EventNotificationList event_notification_list;
case dataStructureUpdate:
DataStructureContentList data_structure_list;
}
} OneHopUpdateReq;
This structure is the Update message used in the one hop lookups.
The Update message is composed by two kinds of list. Using list can
take more information. One of them is the EventNotificationList
which includes a list of EventNotification structures, and the other
is the DataStructureContentList which includes a list of
DataStructureContent structures. All the peers in the overlay can
use this Update request to inform event notification or transfer
routing information.
struct {
uint32 update_response;
} OneHopUpdateAns;
The structure OneHopUpdateAns is used to be a response to the
OneHopUpdateReq message. It is only has a response number to
represent the receiver's attitude which may include success, fail and
error. This number also can be defined as a type.
The PeerContactItem structure contains parameters as follows.
peer_id
The peer's Node-ID.
peer_addr_port
The contact information of peer's especially the IP address.
The EventNotification structure contains parameters as follows.
Peng, et al. Expires January 3, 2012 [Page 16]
Internet-Draft One Hop Lookups Plugin July 2011
uptime
The time this peer has been up in seconds.
event_type
The type of event the message will take, this structure allows
for unknown event types.
length
The length of the remainder of this message.
ordinary_change_type, unit_leader_change_type,
slice_leader_change_type
The type of peer changing event, this structure allows for
unknown peer changing types
The DataStructureContent structure contains parameters as follows.
peer_type
The type of peer who wants to get these information, this
structure allows for unknown peer types.
routing_table_list
The peer's routing table which has all of the peers' information
in the overlay.
boundary_peers
The information of peers' who are the boundary of unit.
slice_leaders_list
All of the slice leaders' information will be stored in this
list, and it can be transferred from one slice leader to another
or from the old slice leader to the new one.
The OneHopUpdateAns message contains parameters as follows.
update_response
The response of the Update message, different number has
different meaning includes success, error and so on.
7.2. Primary event notification messages
There are four main messages for the event notification among the
one-hop lookups algorithm, the keep-alive messages, the event
notification to slice leader messages, the messages exchanged between
slice leaders and the messages from slice leaders to unit leaders.
The keep-alive message is like the Ping defined in the Link
Management of Reload, and the messages are used to form the base
Peng, et al. Expires January 3, 2012 [Page 17]
Internet-Draft One Hop Lookups Plugin July 2011
level communication between a peer and its predecessor and successor.
The event notification message to slice leaders is used to send event
notification to the peer's slice leader directly which can use the
template of Update message. The messages exchanged between slice
leaders are the most important. Each message sent from one slice
leader to another batches together events that occurred during
several seconds. We can still use the Update message to play this
role. The messages from slice leaders to unit leaders can dispatch
event notification. Messages received by a slice leader are batched
for one second and the forwarded to unit leaders. As above, this
message still can use Update message to express.
Above all, these important event notification messages can be
expressed by Update messages. In some scenarios like the peer
joining, the Update message should be extended to support the
transfer of the routing table information.
7.3. Actions after receiving update messages
When peers in the overlay receive update messages which based on the
event mechanism, the peer will check the type of event first.
(1) If the message received is a keep-alive message, then the peer
can make sure the relationship with the sender and return an
answer as soon as possible.
(2) If the message contains an ordinary peer join event, the actions
taken should depend on the peer type of receiver, the slice
leader will send node join notification to other slice leaders
and its unit leader; the unit leader will send this notification
to its peers; the peers will use the Update message to inform
their processor or successor.
(3) If the message contains an ordinary peer leave event, the
actions taken should depend on the peer type of receiver, the
slice leader will send node leave notification to other slice
leaders and its unit leader; the unit leader will send this
notification to its peers; the peers will use the Update message
to inform their processor or successor.
(4) If the message's event type is unit leader change, the actions
taken should depend on the peer type of receiver, the slice
leader will inform the sender to become the new unit leader, and
then the new unit leader will spread this information to its
unit members via the Update message. And the unit leader change
message may be constructed by unit leader join and unit leader
leave messages, but the procedure is same.
Peng, et al. Expires January 3, 2012 [Page 18]
Internet-Draft One Hop Lookups Plugin July 2011
(5) If the message's event type is slice leader change, the other
slice leaders who receive this will record the event caused
peer's information and return the acknowledge information, the
new slice leader will inform its unit leaders to tell them the
change of slice leader, and then the unit leaders will spread
this information to its unit members via the Update message.
Peng, et al. Expires January 3, 2012 [Page 19]
Internet-Draft One Hop Lookups Plugin July 2011
8. Stabilization
The stabilization of one hop lookups pay much more attention on the
refreshing of routing table because if one peer has a right routing
table, then it can contact any peers in the overlay directly. The
accuracy of routing table depends on the speed of Update message
spreading. So the notification message frequency is a very important
case which can have effects on the stabilization. The event
notification to slice leader, the messages exchanged between slice
leaders and the messages from slice leaders to unit leaders have
different sending frequency, and all of them are not fixed numbers.
The frequency is calculated by the acceptable fraction of queries
that fail in the first routing attempt, the peers in the overlay and
the expected rate of membership changes. The details of frequency
and time calculating are described in [One-Hop-Lookups]. Sometimes
we can use the broadcast to inform notification in order to save
time.
On the other hand, keeping the accuracy of neighborhood peers'
information is also very important. For example, in the one hop
lookups, the keep-alive messages are sent every second, and the keep-
alive messages construct the base level communications between a peer
and its predecessor and successor.
Peng, et al. Expires January 3, 2012 [Page 20]
Internet-Draft One Hop Lookups Plugin July 2011
9. Leaving
The leaving action is also a kind of event, so it can be expressed by
the Update message. When a peer leaves the overlay peacefully, it
needs to send an Update message as an event notification message, so
the Leaving message may not be defined and just using the Update
message. And the data transfer can also use the Update messages, but
it sometimes depends on the replication strategies (Section 11).
If a peer leaves the overlay without any notification, the neighbor
peer (predecessor or successor) must inform this event to the leader
by the Update messages, and then the backup peer of the leaving one
should transfer the backup data to the new one who is charging the
leaving peer's namespace. When an unit leader or slice leader leaves
this overlay, the overlay should choose a new one to replace the old
by the leader choosing strategies described in section 10.
Peng, et al. Expires January 3, 2012 [Page 21]
Internet-Draft One Hop Lookups Plugin July 2011
10. Leader choosing strategies
The default slice and unit leader strategy is dynamic choosing the
successor of the mid-point peer of the slice or unit space. It is
convenient to manage and choose, but it is so simple that does not
consider some other important cases like handling ability and so on.
On the other hand, the slice leader can be treated as a "Super Node",
and we can give some other super node choosing strategies as follows.
On the other hand, a SandStone like schema can be used to choose
leaders. In some scenarios, we can identify well connected and well
provisioned peers as "Super Node" which may have the same meaning
with SPM. The chosen super nodes can form a parallel ring. And of
course the super slice will act as a slice leader; it does not
participate in the routing procedure because its focus of working is
on the collecting of event notifications and spreading them in time.
The unit leader can be chosen by the default schema because the
working burden of unit leader is much more relaxed than the slice
leader, and if the chosen one has a poorly provisioned node with a
low bandwidth connection to the Internet, the predecessor or
successor of the chosen one can replace it.
Peng, et al. Expires January 3, 2012 [Page 22]
Internet-Draft One Hop Lookups Plugin July 2011
11. Replication
The replica placement policy of the one hop algorithm can use the way
defined in SandStone. It is designed to maximize data reliability
and availability, and to maximize network bandwidth utilization. In
the SandStone, the subscriber data can be replicated on multiple
peers, three on default. We can use a SandStone like replica
replacement policy with three replications. The first replica is
stored in the primary peer's successor; the second is saved in a peer
of different unit but in same slice; and the third is put in a peer
from different slice.
Above all, one of the most important issues is the choosing of the
replica peer. Maybe we can use a special function whose input is
peer's Node-ID to calculate the three replica peers, and of course
the root peer should know about the replication method in order to be
able to search the data.
Peng, et al. Expires January 3, 2012 [Page 23]
Internet-Draft One Hop Lookups Plugin July 2011
12. Fault tolerance
12.1. First hop fail
In the one hop algorithm, if a query fails on its first attempt it
does not return an error to an application. Instead, queries can be
rerouted with the RELOAD message RouteQueryAns. We can define the
RouteQueryAns as follows.
struct {
uint32 response_state;
PeerContactItem redirect_peer;
} RouteQueryAns;
The structure RouteQueryAns is composed by a state number and a
Node-ID. The state number tells the query state and the
PeerContactItem gives a peer's information that is closer to the
destination. And in most of cases, two hops are enough to locate one
peer or resource. If the two hops are still wrong, then we can know
the lookup procedure is fail.
response_state
The response mark of the RouteQueryReq message's; and it may
include success, failure and errors.
redirect_peer
If the query is failed, the application can send RouteQueryReq
message to this peer again for finding resources.
If a lookup query from peer A to peer B because the routing table is
outdate or the peer B has left, peer A can retry the query by sending
the RouteQueryReq to the peer C who is the successor of peer B. And
at that time joining a peer D to be peer B's new successor and peer
C's predecessor, then peer C can reply RouteQueryAns message with the
redirect_peer of peer D's, and peer A can contact it in this routing
step.
12.2. Leaders fail
The most important issues are how to keep the correct functioning of
unit leaders and slice leaders; we need to recover from their
failure. When a slice or unit leader fails, its successor soon
detects the failure and becomes the new leader; the successor of a
failed slice leader will communicate with its unit leaders and other
slice leaders to recover information about the missed events. And of
course, the leaser choosing strategies described in section 10 also
can be used to choose a new leader. All of these events can be
constructed by the Update message with the EventType (e.g. the
Peng, et al. Expires January 3, 2012 [Page 24]
Internet-Draft One Hop Lookups Plugin July 2011
UnitLeaderChange and SliceLeaderChange).
Peng, et al. Expires January 3, 2012 [Page 25]
Internet-Draft One Hop Lookups Plugin July 2011
13. Security Considerations
Peng, et al. Expires January 3, 2012 [Page 26]
Internet-Draft One Hop Lookups Plugin July 2011
14. IANA Considerations
There are no IANA considerations associated to this memo.
Peng, et al. Expires January 3, 2012 [Page 27]
Internet-Draft One Hop Lookups Plugin July 2011
15. Acknowledgements
Peng, et al. Expires January 3, 2012 [Page 28]
Internet-Draft One Hop Lookups Plugin July 2011
16. References
16.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[I-D.ietf-p2psip-base]
Jennings, C., Lowekamp, B., Rescorla, E., Baset, S., and
H. Schulzrinne, "REsource LOcation And Discovery (RELOAD)
Base Protocol", draft-ietf-p2psip-base-15 (work in
progress), May 2011.
16.2. Informative References
[RFC3174] Eastlake, D. and P. Jones, "US Secure Hash Algorithm 1
(SHA1)", RFC 3174, September 2001.
[One-Hop-Lookups]
Gupta, A., Liskov, B., and R. Rodrigues, "One Hop Lookups
for Peer-to-Peer Overlays", June 2003.
[SandStone]
Shi, G., Chen, J., Gong, H., Fan, L., Xue, H., Lu, Q., and
L. Liang, "SandStone: A DHT based Carrier Grade
Distributed Storage System", September 2009.
Peng, et al. Expires January 3, 2012 [Page 29]
Internet-Draft One Hop Lookups Plugin July 2011
Authors' Addresses
Jin Peng
China Mobile
Unit 2, 28 Xuanwumenxi Ave,
Xuanwu District
Beijing 100053
P.R.China
Email: Penjin@chinamobile.com
Kai Feng
Beijing University of Posts and Telecommunications
10 Xi Tu Cheng Rd.
Haidian District
Beijing 100876
P.R.China
Email: fengkai_sunny@139.com
Lifeng Le
China Mobile
Unit 2, 28 Xuanwumenxi Ave,
Xuanwu District
Beijing 100053
P.R.China
Email: Lelifeng@chinamobile.com
Peng, et al. Expires January 3, 2012 [Page 30]