Individual Submission C. Dannewitz
Internet-Draft University of Paderborn
Intended status: Informational T. Rautio
Expires: January 6, 2011 VTT Technical Research Centre of
Finland
O. Strandberg
Nokia Siemens Networks
July 5, 2010
Secure naming structure and p2p application interaction
draft-dannewitz-ppsp-secure-naming-00
Abstract
Many P2P applications use their own way to identify and address data
relying on host centric addressing, limiting the access to the same
data on potentially multiple locations for multiple P2P applications.
There are potential benefits in providing a generic way to identify
and address data so that multiple P2P systems can use the same data
regardless of data location. The proposed secure naming structure
provides a potential way to address these challenges with a common
naming structure for all data and different needs. The additional
feature of the proposal is securing the way data is addressed such
that the receiver has the possibility to verify that the correct data
is received. The secure naming structure should be beneficial as
potential design principle in defining the two protocols identified
as objectives in the PPSP charter. This document enumerates a number
of design considerations to impact the design and implementation of
the tracker-peer signaling and peer-peer streaming signaling
protocols.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Dannewitz, et al. Expires January 6, 2011 [Page 1]
Internet-Draft ppsp-secure-naming July 2010
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 6, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Dannewitz, et al. Expires January 6, 2011 [Page 2]
Internet-Draft ppsp-secure-naming July 2010
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Naming requirements . . . . . . . . . . . . . . . . . . . . . 4
3. Basic Concepts for an Application-independent P2P Naming
Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2. ID Structure . . . . . . . . . . . . . . . . . . . . . . . 7
3.3. Security Metadata Structure . . . . . . . . . . . . . . . 8
4. Application use of secure naming structure . . . . . . . . . . 9
5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
7. Security Considerations . . . . . . . . . . . . . . . . . . . 10
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
9. Informative References . . . . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 11
Dannewitz, et al. Expires January 6, 2011 [Page 3]
Internet-Draft ppsp-secure-naming July 2010
1. Introduction
Today's dominating naming schemes in the Internet, i.e., IP addresses
and URLs, are rather host-centric with respect to the fact that they
are bound to a location. This kind of naming scheme is not suitable
for P2P systems as they are based on an information-centric thinking,
i.e., putting the information at the focus whereas the source for
this information is constantly changing and might involve more than
one source at once.
Numerous P2P applications use their own data model and protocol for
keeping track of data and locations. This poses a challenge for use
of the same information for several applications. A common naming
scheme e.g. data model would be important to enable interconnectivity
between different P2P systems. To be able to build a common P2P
infrastructure that can serve a multitude of applications there is a
need for a common application independent naming scheme. With such a
naming scheme different applications can use and refer to the same
information/data objects.
It is possible to introduce false data into P2P systems, only
detectable when the content is played out in the user application.
The false data copies can be identified and sorted out if the P2P
system can verify the reference used in the tracker protocol towards
data received at the peer. One option to address this can be to
secure the naming structure i.e. make the data reference be dependent
on the data and related metadata.
For any type of caching solution (network based or P2P) and network
based storage, e.g. DECADE, a common application independent naming
scheme is essential to be able to identify cached copies of
information/data objects.
This document enumerates and explains the rationale for why a naming
structure for information/data objects should be part of a
specification for a protocol for PPSP. The main advantage is
probably in the definition of a protocol for signaling and control
between trackers and peers (the PPSP "tracker protocol") but also a
signaling and control protocol for communication among the peers (the
PPSP "peer protocol") might have benefits from a common and secure
naming scheme.
2. Naming requirements
In the following, we discuss the requirements that a common naming
scheme for P2P systems has to fulfill.
Dannewitz, et al. Expires January 6, 2011 [Page 4]
Internet-Draft ppsp-secure-naming July 2010
To enable efficient, large scale data dissemination that can make use
of any available data copy, identifiers (IDs) in P2P systems have to
be location-independent. Thereby, identical data can be identified
by the same ID independently of its storage location and improved
data dissemination can then benefit from all available copies. This
should be possible without compromising trust in data regardless of
its network source.
Security in a P2P system needs to be implemented differently than in
host-centric networks. In the latter, most security mechanisms are
based on host authentication and then trusting the data that the host
delivers. In a P2P system, host authentication cannot be relied
upon, or one of the main advantages of a P2P system, i.e., benefiting
from any available copy, is defeated. Host authentication of a
random, untrusted host that happens to have a copy does not establish
the needed trust. Instead, the security has to be directly attached
to the data which can be done via the scheme used to name the data.
Therefore, _self-certification_ is a main requirement for the naming
scheme. Self-certification ensures the integrity of data and
securely binds this data to its ID. More precisely, this property
means that any unauthorized change of data with a given ID is
detectable without requiring a third party for verification.
Beforehand, secure retrieval of IDs (e.g., via search, embedded in a
Web page as link, etc.) is required to ensure that the user has the
right ID in the first place. Secure ID retrieval can be achieved by
using recommendations, past experience, and specialized ID
authentication services and mechanisms that are out of the scope of
this discussion.
Another important requirement is _name persistence_, not only with
respect to storage location changes as discussed above, but also with
respect to changes of owner and/or owner's organizational structure,
and content changes producing a new version of the information.
Information should always be identifiable with the same ID as long as
it remains _essentially equivalent_. Spreading of persistent naming
schemes like the Digital Object Identifier (DOI) [Paskin2010] also
emphasizes the need for a persistent naming scheme. However, name
persistence and self-certification are partly contradictory and
achieving both simultaneously for _dynamic_ content is not trivial.
From a user's perspective, persistent IDs ensure that links and
bookmarks remain valid as long as the respective information exists
somewhere in the network, reducing today's problem of "404 - file not
found" errors triggered by renamed or moved content. From a content
provider's perspective, name persistence simplifies data management
as content can, e.g., be moved between folders and different servers
as desired. Name persistence with respect to content changes makes
Dannewitz, et al. Expires January 6, 2011 [Page 5]
Internet-Draft ppsp-secure-naming July 2010
it possible to identify different versions of the same information by
the same consistent ID. If it is important to differentiate between
multiple versions, a dedicated versioning mechanism is required, and
version numbers may be included as a special part of the ID.
The requirement of building trust in a P2P system combined with the
desire for anonymous publication as well as accountability (at least
for some content) can be translated into two related naming
requirements. The first is _owner authentication_, where the owner
is recognized as the same entity, which repeatedly acts as the object
owner, but may remain _anonymous_. The second is _owner
identification_, where the owner is also identified by a physically
verifiable identifier, such as a personal name. This separation is
important to allow for anonymous publication of content, e.g., to
support free speech, while at the same time building up trust in a
(potentially anonymous) owner.
In general, the naming scheme should be able to adapt to future
needs. Therefore, the naming scheme should be extensible, i.e., it
should be able to add new information (e.g., a chunk number for
BitTorrent-like protocols) to the naming scheme. The need for such
extensions is stressed by today's variety of naming schemes (e.g.,
DOI or PermaLink) added on top of the original Internet architecture
that fulfill specialized needs which cannot be met by the common
Internet naming schemes, i.e., IP addresses and URLs.
3. Basic Concepts for an Application-independent P2P Naming Scheme
In this section, we introduce an examplary naming scheme that
illustrates a possible way to fulfill the requirements posed upon an
application-independent naming scheme for P2P networks. The naming
scheme integrates security deeply into the system architecture.
Trust is based on the data's ID in combination with additional
_security metadata_. Section 3.1 gives an overview of the naming
scheme in general with details about the ID structure, and Section
3.2 describes the security metadata in more detail.
3.1. Overview
Building on an identifier/locator split, each data element, e.g.,
file, is given a unique ID with cryptographic properties. Together
with the additional security metadata, the ID can be used to verify
data integrity, owner authentication, and owner identification. The
security metadata contains information needed for the security
functions of the naming scheme, e.g., public keys, content hashes,
certificates, and a data signature authenticating the content. In
comparison with the security model in today's host-centric networks,
Dannewitz, et al. Expires January 6, 2011 [Page 6]
Internet-Draft ppsp-secure-naming July 2010
this approach minimizes the need for trust in the infrastructure,
especially in the host(s) providing the data.
In a P2P network, multiple copies of the same data element typically
exist at different locations. Thanks to the ID/locator split and the
application-independent naming scheme, those identical copies have
the same ID and, hence, each P2P application can benefit from all
available copies.
Data elements are manipulated (e.g., generated, modified, registered,
and retrieved) by physical entities such as nodes (clients or hosts),
persons, and companies. Physical entities able of generating, i.e.,
creating or modifying data elements are called _owners_ here.
Several security properties of this naming scheme are based on the
fact that each ID contains the hash of a public key that is part of a
public/secret key pair PK/SK. This PK/SK pair is conceptually bound
to the data element itself and not directly to the owner as in other
systems like DONA [Koponen]. If desired, the PK/SK pair can be bound
to the owner only _indirectly_, via a certificate chain. This is
important to note because it enables owner change while keeping
persistent IDs. The key pair bound to the _d_ata is thus denoted as
PK_D/SK_D.
Making the (hash of the) public key part of ID enables self-
certification of _dynamic_ content while keeping persistent IDs.
Self-certification of _static_ content can be achieved by simply
including the hash of content in the ID, but this would obviously
result in non-persistent IDs for dynamic content. For dynamic
content, the public key in the ID can be used to securely bind the
hash of content to the ID, by signing it with the corresponding
secret key, while not making it part of ID.
The owner's PK as part of the ID inherently provides _owner
authentication_. If the public key is bound to the owner's identity
(i.e., to its real-world name) via a trusted third party certificate,
this also allows _owner identification_. Without this additional
certificate, the owner can remain anonymous.
To support the potentially diverse requirements of certain groups of
P2P applications and adapt to future changes, the naming scheme can
enable flexibility and extensibility by supporting different name
structures, differentiated via a _Type field_ in the ID.
3.2. ID Structure
The naming scheme uses flat IDs to support self-certification and
name persistence. In addition, flat IDs are advantageous when it
comes to mobility and they can be allocated without an administrative
Dannewitz, et al. Expires January 6, 2011 [Page 7]
Internet-Draft ppsp-secure-naming July 2010
authority by relying on statistical uniqueness in a large namespace,
with the rare case of ID collisions being handled by the P2P system.
Although IDs are not hierarchical, they have a specified basic ID
structure. The ID structure given as ID = (Type field | A = hash(PK)
| L) is described subsequently.
The _Authenticator_ field A=Hash(PK_D) binds the ID to a public key
PK_D. The hash function _Hash_ is a cryptographic hash function,
which is required to be one-way and collision-resistant. The hash
function serves only to reduce the bit length of PK_D. PK_D is
generated in accordance with a chosen public-key cryptosystem. The
corresponding secret key SK_D should only be known to a legitimate
owner. In consequence, an owner of the data is defined as any entity
who (legitimately) knows SK_D.
The pair (A, L) has to be globally unique. Hence, the _Label_ field
L provides global uniqueness if PK_D is repeatedly used for different
data.
To build a flexible and extensible naming scheme, e.g., to adapt the
naming scheme to future changes, different types of IDs are supported
by the naming scheme and differentiated via a mandatory and globally
standardized _Type field_ in each ID. For example, the Type field
specifies the hash functions used to generate the ID. If a used hash
function becomes insecure, the Type field can be exploited by the P2P
system in order to automatically mark the IDs using this hash
function as invalid.
3.3. Security Metadata Structure
The security metadata is extensible and contains all information
required to perform the security functions embedded in the naming
scheme. The metadata (or selected parts of it) will be signed by
SK_D corresponding to PK_D. This securely binds the metadata to the
ID, i.e., to the Hash(PK_D) which is part of the ID. For example,
the security metadata may include:
o specification of the hash function _h_ and the algorithm _DSAlg_
used for the digital signature
o complete PK_D (not only Hash(PK_D))
o specification of the parts of data that are self-certified, i.e.,
authenticated via the signature
o hash of the self-certified data
Dannewitz, et al. Expires January 6, 2011 [Page 8]
Internet-Draft ppsp-secure-naming July 2010
o signature of the self-certified data signed by SK_D
o all data required for owner authentication and identification
A detailed description and security analysis of this naming scheme
and its security properties, especially self-certification, name
persistence, owner authentication, and owner identification can be
found in Dannewitz et al. [Dannewitz_10].
4. Application use of secure naming structure
From an application perspective the main advantage of a secure naming
structure for a P2P infrastructure is that multiple applications can
have common access to the same data elements. Another benefit of
application-independent naming is that locally available and cached
copies can easily be located. The secure naming also enables that
data can be verified even if it is received from an untrusted host.
For example, when an application like BitTorrent [WWWbittorrent] uses
self-certifying names, the user is guaranteed that the data received
is actually the data that has been requested, without having to trust
any servers in the network (e.g., the tracker) or the peers that
provide the data.
This means that BitTorrent's validation of the data integrity can be
improved significantly using the presented secure naming structure.
Currently, a standard BitTorrent system has no means to verify the
integrity of the torrent file and consequently of the data. The
torrent file contains the SHA1 hashes of the content pieces.
However, anyone can modify a torrent file to bind different content
to this file. If the torrent file gets modified, the user has no
means any more to verify the integrity of the data. If, in addition,
the tracker delivers forged data (consistent with the forged torrent
file), a user could effectively be tricked into downloading forged
content which would falsely be identified as being correct by the
BitTorrent client. I.e., in the current BitTorrent system, a user
has no guarantee that the downloaded content actually matches the
expected/correct content.
The secure naming structure presented in this draft can provide a
simple solution for this problem by securely binding the content of
the torrent file to the name/ID of the torrent file. This can be
done by extending the torrent file to include the above described
security metadata information. In practice, an object owner would
sign the hash values in the torrent file with the private key (SK_D)
and would store this signature, the public key (PK_D), and some
additional security metadata in the torrent file during torrent file
Dannewitz, et al. Expires January 6, 2011 [Page 9]
Internet-Draft ppsp-secure-naming July 2010
creation. The respective torrent file ID would be generated
according to the rules described in Section 3. Consequently,
whenever a user knows the ID of the content/torrent file and
retrieves the torrent file, she/he can now verify the integrity of
the torrent file, can download the data pieces, and can use the
included (and secured) hash(es) to verify the integrity of the
received data. As a result, the user can be sure that the correct
content was retrieved.
5. Conclusion
The secure naming structure is proposed for consideration as common
reference ID structure in PPSP WG. For any P2P streaming application
to have fair and multitude of data access, it is essential to have a
common naming structure that is suitable for many different needs.
The common naming is probably best displayed in the tracker protocol
case but potential benefit in the actual streaming protocol case has
to still be identified. The secure binding of reference ID to the
actual content is manifested in the end user peer possibility to
check correct data reception in regard to the used ID.
The naming structure has been implemented in the 4WARD project
prototypes and has been released as open source (www.netinf.org).
The naming structure is also available through a public NetInf
registration service at www.netinf.org. Three NetInf-enabled
applications have also been published, the InFox (Firefox plugin),
InBird (Thunderbird plugin), and a NetInf Information Object
Management Tool, all available at the www.netinf.org site.
6. IANA Considerations
This document has no requests to IANA.
7. Security Considerations
There are considerations about what private/public key and hash
algorithms to utilize when designing the naming structure in a secure
way.
8. Acknowledgements
We would like to thank especially Borje Ohlman for excellent
discussion and review of the draft. Thanks also goes to all persons
participating in the Network of Information work package in the EU
Dannewitz, et al. Expires January 6, 2011 [Page 10]
Internet-Draft ppsp-secure-naming July 2010
FP7 project 4WARD, the project SAIL and the Finnish ICT SHOK Future
Internet 2 project for contributions and feedback to this document.
9. Informative References
[Dannewitz_10]
Dannewitz, C., Golic, J., Ohlman, B., and B. Ahlgren,
"Secure Naming for a Network of Information", 13th IEEE
Global Internet Symposium , 2010.
[Koponen] Koponen, T., Chawla, M., Chun, B., Ermolinskiy, A., Kim,
K., Shenker, S., and I. Stoica, "A Data-Oriented (and
beyond) Network Architecture", Proc. ACM SIGCOMM , 2007.
[Paskin2010]
Paskin, N., "Digital Object Identifier ({DOI}(R)) System",
Encyclopedia of Library and Information Sciences , 2010.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[WWWbittorrent]
Cohen, B., "The BitTorrent Protocol Specification",
http://www.bittorrent.org/beps/bep_0003.html , 2008.
Authors' Addresses
Christian Dannewitz
University of Paderborn
Paderborn
Germany
Email: cdannewitz@upb.de
Teemu Rautio
VTT Technical Research Centre of Finland
Oulu
Finland
Email: teemu.rautio@vtt.fi
Dannewitz, et al. Expires January 6, 2011 [Page 11]
Internet-Draft ppsp-secure-naming July 2010
Ove Strandberg
Nokia Siemens Networks
Espoo
Finland
Email: ove.strandberg@nsn.com
Dannewitz, et al. Expires January 6, 2011 [Page 12]