Virtual World Region Agent Protocol                          M. Hamrick
Internet-Draft                                                D. Levine
Intended status: Informational     IBM Thomas J. Watson Research Center
Expires: November 19, 2010                                 May 18, 2010


VWRAP: Introduction and Goals
draft-hamrick-vwrap-intro-01

Abstract

The Virtual World Region Agent Protocol (VWRAP) defines interactions between hosts collaborating to create a shared, Internet-scale virtual world experience. This document introduces the protocol, its objectives, and the requirements it imposes on hosts that implement it. To the extent that it affects protocol interactions, this document also describes the model assumed by the protocol.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

This Internet-Draft will expire on November 19, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.



Table of Contents

1.  Introduction and Motivation
    1.1.  Requirements Language
2.  Introducing the Virtual World Region Agent Protocol
    2.1.  A Brief Introduction to Virtual Worlds
    2.2.  Architectural Objectives
    2.3.  Protocol Objectives
3.  Virtual World Region Agent Protocol Architecture
    3.1.  Deployment Patterns
    3.2.  Protocol Elements
        3.2.1.  Communicating Application State Using REST-Like Resource Accesses
        3.2.2.  Bi-Directional Messaging with the VWRAP Event Queue
        3.2.3.  Using Capabilities to Simplify Inter-Domain Access Control
        3.2.4.  Using LLSD to Avoid Version Skew
4.  Virtual World Region Agent Protocol Services
    4.1.  Authentication
    4.2.  Text and Voice Chat
    4.3.  Agent Presence
        4.3.1.  Moving Presence
    4.4.  Digital Asset Access
        4.4.1.  Manipulating Digital Assets
        4.4.2.  Establishing Presence for Digital Assets
5.  Security Considerations
    5.1.  Capabilities
    5.2.  User Authentication
    5.3.  Service to Service Authentication
    5.4.  Access Control for Digital Assets
6.  Accessibility Considerations
7.  IANA Considerations
8.  References
    8.1.  Normative References
    8.2.  Informative References
Appendix A.  Definitions of Important Terms
Appendix B.  Acknowledgements
Authors' Addresses





1.  Introduction and Motivation

Virtual Worlds are of increasing interest to the internet community. Innumerable examples of virtual world implementations exist, most using proprietary protocols. With roots in games and social interaction, virtual worlds are now finding use in business, education and information exchange. This document introduces the Virtual World Region Agent Protocol (VWRAP) suite. This protocol suite is intended to carry information about a virtual world: its shape, its residents and the objects existing within it. VWRAP's goal is to define an extensible set of messages for carrying state and state change information between hosts participating in the simulation of the virtual world.

Virtual worlds differ in their capabilities and architectural features. The VWRAP protocol suite is not appropriate for all massively multi-participant experiences. This document provides an introduction to virtual worlds and defines the characteristics of virtual worlds that VWRAP explicitly supports. It also describes the specific objectives of the protocol suite. An overview of the protocol follows, including an architectural description and a list of services defined in a typical virtual world deployment.




1.1.  Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [RFC2119].




2.  Introducing the Virtual World Region Agent Protocol




2.1.  A Brief Introduction to Virtual Worlds

At its most basic level, the virtual world mirrors the reified world. It is inhabited by people and contains objects. Objects and people have a distinct place in the world and respond to external forces. The social construction of the virtual world is also similar to the reified world. People meet and interact with others to complete work tasks or for simple enjoyment. People converse, share media, and even sing to each other. A virtual world may allow commerce or enable building of virtual assets. Nearly the complete range of human interaction can be replicated in the virtual world. Properly rendered, an experience in a virtual world can carry the same impact as one in consensus reality.

To be relevant to the participant's experience, virtual worlds must retain characteristics of the "real world." Objects in the virtual world are models of familiar shapes and textures, represented using a common data format so they may be easily processed by the human visual cortex. At the same time they must carry sufficient information to be consumed by participants with visual impairments or to be processed by automated systems. Though the virtual world's state is maintained by abstract collections of data, it is rendered as recognizable (though occasionally fantastical) physical objects.

But virtual worlds are not completely faithful mirrors of the world our physical bodies inhabit. Virtual worlds are not limited by distance. With appropriate network connectivity, users may interact even if they are on opposite sides of the earth. Virtual worlds also allow participants to "play" with physical constraints. They provide the subjective experience of things not possible in consensus reality: participants can fly, the effects of "death" are temporary, and users may call items into existence, examine the International Space Station with friends, examine DNA codon by codon with co-workers, or experience works of interactive art.

Virtual worlds inherit the intellectual property characteristics of other digital media. Implementers wishing to replicate the economy of scarcity present in the reified world must take special measures to avoid or limit the ramifications of unauthorized content duplication.

The VWRAP suite assumes that network hosts, likely operated by distinct organizations, will collaborate to simulate the virtual world. It also assumes that services originally defined for other environments (like the world wide web) will enhance the experience of the virtual world. The virtual world is expected to be simulated using software from multiple sources; interoperability standards are essential for delivering robust services and compelling user experiences. VWRAP provides these interoperability standards. It may be used with large arrays of hosts simulating a vast virtual world, or with small worlds operated for the benefit of a few persons. It defines a trust model so hosts from different organizations may limit access to sensitive information.

VWRAP presupposes a virtual world with the following characteristics:

1. The virtual world exists independent of the participating clients.

This is in contrast to systems that dynamically create virtual environments for specific social or task-oriented simulation. VWRAP assumes the state of the virtual world is "always on" and does not require a specific protocol to establish new virtual worlds.
2. Avatars have a single, unique presence in the virtual world.

Users are represented in the virtual world by a digital representation called an avatar. A user's avatar has an existence that mirrors the common physical world. Like people, avatars exist in only one place at one time. Avatars have a single identity that persists between user sessions. This identity may be used to render user-specific avatar shape or as the basis for access control.
3. The virtual world contains persistent objects.

Objects in the virtual world are governed by a "rational" life-cycle. They are created, persist and are (optionally) destroyed. Absent external forces, they tend to keep their current state.




2.2.  Architectural Objectives

The overall objective of the VWRAP suite is to enable a stable, extensible and secure virtual world. The protocols defined by this working group do not mandate a specific system or network architecture. However, previous implementations of large distributed systems can yield clues as to how VWRAP-compliant systems will be deployed. Documents produced by this working group reflect the consensus view of best common practice for producing large-scale, loosely-coupled distributed systems. VWRAP does not require a specific architecture, but its protocols are defined with a specific series of architectural patterns in mind.

These architectural patterns have the following objectives in mind:

1. Systems implementing virtual world services should be distributed.

Virtual worlds are inherently distributed systems. They are collaborative tools intended to reduce the effect of distance on meaningful social, educational and commercial interaction. Typical virtual worlds are envisioned as drawing services from network hosts operated by a wide array of organizational entities. Virtual worlds may also be large enough that a single system or cluster of systems is not capable of providing all services to each client.

Experience with previous virtual world technologies suggests that systems implementing the virtual world should assume the services they depend on are provided by distinct hosts in different administrative domains. The virtual world, its user base and the constellation of virtual objects they use are simply too vast and varied to assume a single system or a single provider.

This does not mean, however, that virtual worlds simulated by the VWRAP suite must exist on more than one system. Quite the contrary, all VWRAP components (including the user agent) could operate on a single system. But there is little reason to believe the protocol used to exchange information between virtual world services must differ between small and large systems.

Large virtual worlds with many users are certain to implement virtual world services differently than small ones. But the protocol used to communicate between systems implementing the virtual world can be the same. To be sure, optimizations may be possible for smaller deployments. Implementers of these "small" systems may nevertheless find it useful to utilize standard, unoptimized protocols to allow for the possibility of growth.

However large (or small) a virtual world deployment is, and however many distinct organizations contribute to its operation, software implementing virtual world services MUST assume resources required to perform its function are distributed amongst multiple hosts.
2. Services supporting collaboration are hosted on 'central' systems.

The VWRAP suite assumes a collection of authoritative hosts for critical data and services. This applies to digital assets, user accounts, agent presence information and sources for virtual world object updates. This is in contrast to 'peer to peer' systems in which authoritative data may travel between hosts, depending on which hosts are connected to the network at any given moment. It also contrasts with 'client authoritative' or 'co-simulation' architectures in which physics simulation and agent state change occurs on the client.

This does not imply there is a single collection of systems holding authoritative data for the entire connected Internet. On the contrary, VWRAP explicitly allows for the simultaneous existence of multiple virtual worlds. It defines the mechanisms by which systems cooperate to simulate a virtual world, but does not designate any particular instance of those protocols as being more authoritative than any other.

This objective implies that for any given data item, be it an avatar mesh, a region description or an object description, there is an authoritative URL. Subject to security policy, the data may be cached or transferred between domains, but authority over the data rests with the administrative owner of the host identified by its URL.

This decision was motivated by the realization that network load in naive peer-to-peer implementations increases proportionally to the factorial of the number of participants. "Client Authoritative" solutions, where clients are first class participants in physics simulation or object presence and differences are explicitly resolved between clients, were thought to introduce undesirable requirements for client connectivity and client system performance.

This is not to say that either peer-to-peer or co-simulation are incompatible with systems that might use VWRAP. But rather, it was believed that their specification should be deferred to a later time when solutions to perceived problems could be explored more fully.
3. Hosts cooperating to simulate a virtual world should be composed of loosely coupled systems.

This document uses the term "loosely coupled" to mean software and systems implementing the virtual world should exhibit strong separation of concerns and resilience in the presence of version skew. For the purposes of this document, "loosely coupled" implies system developers should expect multiple revisions of message formats to be encountered. Systems implementing these protocols should make a best effort to interpret messages received from remote hosts. Conversely, protocol developers and system implementers should carefully craft new revisions of messages to avoid confusion for systems expecting earlier message formats.

The motivation for emphasizing this pattern stems from the requirement that systems cooperating to simulate the virtual world may be owned by distinct organizations.
4. A persistent, ubiquitous identity may accompany requests between hosts.

Many network services are provided anonymously; the nature of the service does not require identity authentication prior to its use. But with the increasing deployment of customizable services delivered on the internet, identity is increasingly important. Even services that contain information that might not be considered "sensitive" require a representation of digital identity, if for no other reason than to match service requests with user preferences. For example, a web page presenting current weather information may be enhanced by remembering locations of interest to each user. Recent work with "web mash ups," where multiple personalized or sensitive resources are used in concert with one another, points to the utility of a "universal" identity. The representation of this universal identity enables independent services to cooperate to present the facade of a unified application to the service consumer. This allows service aggregators to more easily integrate "best of breed" services into a consistent solution.

Universal identity is critical to the virtual world. To achieve an internet scale virtual world, user services must be distributed amongst multiple hosts. To achieve a compelling experience, it must be easy for service providers to deliver their services in the virtual world. To facilitate a compelling social experience in the virtual world, all users must have the ability to evaluate identity information of other users. Domains responsible for virtual world simulation MUST use a consistent representation of identity across all their hosts; simulation would otherwise be uncoordinated. Service providers who deliver content into the virtual world MUST use a consistent representation of identity to maintain the persistence of the virtual manifestation of their service; virtual objects used in conjunction with these services might otherwise appear to change state without apparent cause. Users depend on the persistent, universal identity of other users; if an avatar's identity changed unexpectedly, the result would be a suboptimal virtual world experience.




2.3.  Protocol Objectives

The primary objective of the Virtual World Region Agent Protocol is to provide a stable, extensible, secure system for virtual world information interchange with the following characteristics:

1. Flexible presentation of protocol data

While the primary purpose of the virtual world is to simulate a physical or social space, the tools used to access objects in the virtual world may be varied. Using a "3d viewer" is the primary mode of interaction with the virtual world, but other tools may be better suited for some tasks. For instance, it may be easier for a user to use a web browser to review avatar profile information, or to change details of virtual objects. Further, virtual world "mash ups" may prove to be important to some communities. To support the web (where XML and JSON are the lingua franca of information exchange) while also supporting tools where binary encodings are more appropriate, VWRAP was designed to be "presentation neutral."
VWRAP protocol exchanges are described in terms of an abstract type system using an interface description language. Implementations may choose to instantiate actual protocol data units using the most appropriate presentation format. Web-based applications may choose to use JSON or XML. Server-to-server interactions may use the VWRAP specific binary serialization scheme if implementers and deployers view binary encoding to be advantageous. The decision of which serialization scheme to use is ultimately that of the system implementer. VWRAP has been designed to provide this flexibility to system implementers and those tasked with deploying VWRAP compatible systems.
2. Flexible decomposition of concerns and ease of extension

VWRAP has been designed to allow meaningful separation of concerns. In other words, changes in one part of the protocol should not appreciably affect other parts.
For example, the authentication portion of the protocol is independent of the part of the protocol that deals with instant messaging or instantiating objects in the virtual world. In addition to defining messages for communicating application state, the specification also defines pre- and post-conditions. Should one particular authentication scheme be found to be lacking, it can be modified or replaced without affecting other systems.
This type of separation of concerns in the protocol specification also makes it easy to deploy "related solutions." While VWRAP was designed primarily to communicate the state of the virtual world between servers and client applications, a number of related applications also exist. E-Commerce web sites related to the virtual world and mobile chat clients allowing instant messaging between mobile networks and virtual world participants are just two examples of such applications. Proper separation of concerns allows new services to be specified and deployed without the need to redefine existing protocol.
3. Resilience in the face of version skew

Core to the VWRAP protocol is the idea that different components and services may be operated by different administrative entities; identity management services might be operated by one business while simulation services are operated by another. In environments where many different organizations participate, version skew can be an important concern. VWRAP was designed to "degrade gracefully" when two systems running different versions of the protocol attempt to communicate.
VWRAP uses the LLSD abstract type system and the LLIDL interface description language to describe the structure and type semantics of elements in messages sent between systems. Because LLSD makes extensive use of variable width, clearly delineated data fields, services which consume protocol messages may identify and extract only those message elements they know how to handle. While this is not a guarantee that message semantics may be preserved in all version skew situations, it does eliminate one important cause of interoperability failures.
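
As a rough illustration of the "extract only what you know" behavior described in the third objective, the following sketch shows a consumer that reads a JSON-serialized message and keeps only the elements it was written to handle. The field names (agent_id, region_url, hover_height) and the example URL are hypothetical and are not taken from any VWRAP specification.

```python
import json

# Fields this (hypothetical) consumer was written to understand.
KNOWN_FIELDS = ("agent_id", "region_url")


def extract_known(serialized: str) -> dict:
    """Return only the message elements this consumer knows how to handle."""
    message = json.loads(serialized)
    return {name: message[name] for name in KNOWN_FIELDS if name in message}


# A message from a newer sender that carries an extra field (hover_height)
# this consumer has never heard of.
newer_message = json.dumps(
    {
        "agent_id": "a1",
        "region_url": "http://sim.example.org/r/1",
        "hover_height": 0.5,
    }
)
known = extract_known(newer_message)
```

Because fields are located by name rather than by position, the extra element is simply skipped. Whether the remaining fields still carry enough meaning is a separate question that the consumer must answer from their contents.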



3.  Virtual World Region Agent Protocol Architecture




3.1.  Deployment Patterns

The VWRAP suite assumes that the services that simulate the virtual world will be hosted on multiple network hosts. Client applications that render the virtual world for end users are assumed to be on distinct hosts as well.


                          +---------------------+
                          | organization 1      |
                          |                     |
                          | +----------------+  |     +-------------+
                    +------>| service host 1 |<------>|             |
                    |     | +----------------+  |     |   client    |
                    |     |                     | +-->| application |
+-------------+     |     |                     | |   |             |
|             |<----+     | +----------------+  | |   +-------------+
|   client    |<----------->| service host 2 |<---+     ^
| application |           | +----------------+  |       |
|             |<----+     +---------------------+       |
+-------------+     |                                   |
                    |     +---------------------+       |
                    |     | organization 2      |       |
                    |     |                     |       |
                    |     | +----------------+  |       |
                    +------>| service host 3 |<---------+
                          | +----------------+  |
                          +---------------------+

Figure 1: Protocol Flows in VWRAP

Figure 1 above demonstrates a typical virtual world deployment. In it we see multiple client applications connecting to several servers operated by multiple organizations. Each "service host" in this diagram exposes an interface to a collection of resources representing the state of objects in the virtual world. It is traditional to discuss related resources as exposing "an interface" while related interfaces comprise "a service." Typical services are listed in a section below.

There is no restriction on how services map to hosts. A deployment where all resource interfaces are hosted on a single host is just as valid as one where resources are spread across a wide array of systems. Client applications should support both. Resource interfaces are addressed using URLs. Clients should treat the URLs for resources as opaque resource locations.

Clients use the "service establishment" process to retrieve URLs to various resources.




3.2.  Protocol Elements

VWRAP utilizes a number of "architectural motifs" or recurring design patterns. Most notably they include:

  • exposing application state via RESTful resources
  • using URIs to represent the address of application resources
  • using HTTP to "carry" message oriented protocol data
  • defining application state transitions and accesses with an interface description language
  • using an abstract type system to define access semantics of fields in protocol messages
  • using multiple "serializations" of the abstract type system to support different categories of consumers; defined serializations include XML, JSON and Binary.




3.2.1.  Communicating Application State Using REST-Like Resource Accesses

Contrary to popular opinion, not ALL virtual world interactions must be real-time exchanges. Many common activities like user authentication and texture and object transfer do not require "real time" semantics in the same way that applications like video-conferencing and Voice Over IP (VOIP) do. While it is generally a better experience if textures download quickly, a delayed texture does not have the same ramifications as a delayed voice packet in a VOIP system. Additionally, some interactions with the virtual world are strongly reminiscent of the request / response semantics used by popular protocols such as HTTP and POP3.

Because many protocol exchanges in the virtual world may be represented as non-real-time request / response interactions, VWRAP "reuses" the messaging semantics of HTTP. The justification for this is simple. Were VWRAP to not use HTTP, many of the features of HTTP would need to be re-invented or at least re-specified. These features include the use of MIME types to identify payload structure, the use of message headers to modify the request or response, and the use of URIs to address and identify resources. HTTP also has the benefit of being well supported by tools vendors and well understood by manufacturers of networking equipment.

Protocol exchanges in VWRAP that utilize request / response semantics are described using the LLSD / LLIDL abstract type system [I‑D.hamrick‑vwrap‑type‑system] (Brashears, A., Hamrick, M., and M. Lentczner, “VWRAP : Abstract Type System for the Transmission of Dynamic Structured Data,” February 2010.). LLSD defines type semantics for elements in a protocol data unit as well as rules for converting the data into a serialized form suitable for transmission across the network. VWRAP defines HTTP (and HTTPS) as the transports for serialized messages.

Addressable protocol endpoints in VWRAP are represented as URIs [RFC3986] (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.). Protocol endpoints generally address RESTful resources. The VWRAP protocol uses HTTP verbs to provide read and write access to resources which represent the application state of the remote peer.

To recap, the objective of VWRAP is to communicate application state about the virtual world to all participants. VWRAP messages that communicate request / response style messages flow between clients and servers, using HTTP(S) as a message transport. Application objects representing the application state expose a RESTful interface and are addressed unambiguously using URIs. The VWRAP message formats are described using LLIDL, the interface description language defined as part of the LLSD abstract type system. LLIDL defines RESTful resource accesses in terms of the LLSD abstract type system, which may be serialized using one of three well defined serialization mechanisms: XML, JSON and Binary. Protocol participants decide before interacting which serialization mechanism is most appropriate or use the content negotiation mechanisms defined in HTTP.
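
The idea of one abstract value with several serializations, selected through HTTP content negotiation, can be sketched as follows. This is only an illustration under stated assumptions: the media type names, the XML element layout, and the helper names are hypothetical and are not the normative LLSD serializations. The negotiation is deliberately simplistic and takes the first listed media type it can produce, ignoring quality values.

```python
import json
import xml.etree.ElementTree as ET


def to_json(data: dict) -> str:
    """Serialize a flat map as JSON with a stable key order."""
    return json.dumps(data, sort_keys=True)


def to_xml(data: dict) -> str:
    """Serialize a flat map of strings and integers as XML.

    The element names are an illustrative layout only.
    """
    root = ET.Element("llsd")
    map_el = ET.SubElement(root, "map")
    for key in sorted(data):
        ET.SubElement(map_el, "key").text = key
        value = data[key]
        tag = "integer" if isinstance(value, int) else "string"
        ET.SubElement(map_el, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed media type names, chosen for the example.
SERIALIZERS = {
    "application/json": to_json,
    "application/xml": to_xml,
}


def serialize_for(accept: str, data: dict) -> tuple:
    """Pick the first media type in an Accept header value that we can produce."""
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in SERIALIZERS:
            return media_type, SERIALIZERS[media_type](data)
    raise ValueError("no acceptable serialization")


# The same abstract application state, independent of its presentation.
state = {"name": "lobby", "occupants": 3}
```

The same state map yields a JSON body for a web client and an XML body for a peer that prefers XML; neither consumer needs to know which serialization the other is using.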




3.2.2.  Bi-Directional Messaging with the VWRAP Event Queue

Not all protocol interactions are easily represented by HTTP's request / response semantics. When the server has a message for the client, there is no widely deployed technique for the server to initiate an HTTP request to the client. It is interesting to note that this is the same problem developers of "rich web applications" see when deploying their applications. Though VWRAP is not targeted for implementation exclusively in web browsers, we can utilize some of the techniques common in COMET applications.

Work is ongoing to define a general solution for "reverse HTTP," but many of these solutions require the definition of new protocol and deploying new code to web servers. The current best practice for COMET-style interaction is the use of the "long poll."

To avoid "technology lock in," VWRAP defines an Event Queue abstraction that represents the flow of messages from the server to the client. The Event Queue is expected to be implemented using the long poll technique. When additional options such as Reverse HTTP or web sockets are specified and in general deployment, the Event Queue may be re-implemented using these techniques. However, the interface defined by the Event Queue in the VWRAP Foundation document (Lentczner, M., “Virtual World Region Agent Protocol: Foundation,” February 2010.) [I‑D.lentczner‑vwrap‑foundation] should not change.
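
The long poll behavior behind the Event Queue abstraction can be sketched in-process as shown below. This is not the interface defined in the VWRAP Foundation document; the class and method names are hypothetical, and a real deployment would hold an HTTP request open where this sketch blocks on a local queue.

```python
import queue


class EventQueue:
    """In-process stand-in for a server-to-client event queue."""

    def __init__(self) -> None:
        self._pending = queue.Queue()

    def post(self, event: dict) -> None:
        """Server side: enqueue an event destined for the client."""
        self._pending.put(event)

    def long_poll(self, timeout: float) -> list:
        """Client side: wait until at least one event arrives or the timeout
        elapses, then return every event currently queued."""
        try:
            events = [self._pending.get(timeout=timeout)]
        except queue.Empty:
            return []  # timed out; the client would simply poll again
        while True:
            try:
                events.append(self._pending.get_nowait())
            except queue.Empty:
                return events
```

A client loop would call long_poll repeatedly, handling whatever batch comes back. Because callers depend only on post and long_poll, the blocking mechanism underneath could later be replaced without changing them, which is the point of keeping the Event Queue an abstraction.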




3.2.3.  Using Capabilities to Simplify Inter-Domain Access Control

Simulated objects and services delivered by VWRAP compliant systems will require some level of access control. Unfortunately, distributed access control is a notoriously difficult problem. VWRAP seeks to minimize the drawbacks of distributed access control by use of capabilities. In this context, a capability is an opaque URL, some portion of which contains a securely generated, cryptographically unguessable sequence of digits. Capabilities are used to define service endpoints and are intended to only be in the possession of trusted parties.

For example, a system may export the capability:

http://www.example.org/s/B2A2A445-D234-463A-BE6D-6C54E5854FE4/

This URL defines the protocol endpoint used to communicate application state changes (or query application state) for a specific application object by a specific user (or delegate.)

Capabilities are required to be effectively unguessable as they represent the right to perform a set of operations on a particular resource. Additionally, they must be kept "secret." While the task of maintaining the confidentiality of a number of web resource addresses may be a burden, it does have the advantage of simplifying access delegation. If a subject wishes to delegate access to a third party, they simply communicate the capability.

To reduce the likelihood of successful guessing attacks, inadvertent disclosure of a capability and "time of check, time of use" attacks, capabilities in VWRAP have a fixed lifetime, after which they expire. Systems SHOULD pick capability lifetimes commensurate with their security requirements and MUST NOT respond to protocol requests directed at a capability's URL after it has expired. Additionally, VWRAP capabilities may be "single use" or "one shot," meaning that they may only be used once before expiring.

Because capabilities are randomly generated with a short lifetime, VWRAP defines a mechanism for securely communicating capabilities and re-requesting expired capabilities.
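
A minimal sketch of how a host might issue and check capabilities with a fixed lifetime and optional single use follows. The class name, method names, and token length are assumptions made for illustration; VWRAP does not prescribe them. The clock is passed in so that expiry can be exercised without waiting.

```python
import secrets


class CapabilityTable:
    """Maps unguessable capability URLs to the resources they grant access to."""

    def __init__(self, base_url: str, clock) -> None:
        self._base_url = base_url.rstrip("/")
        self._clock = clock   # callable returning the current time in seconds
        self._grants = {}     # url -> (resource, expiry time, single_use)

    def grant(self, resource: str, lifetime: float, single_use: bool = False) -> str:
        """Create a capability URL for a resource, valid for `lifetime` seconds."""
        token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
        url = f"{self._base_url}/{token}/"
        self._grants[url] = (resource, self._clock() + lifetime, single_use)
        return url

    def resolve(self, url: str):
        """Return the resource for a live capability, or None if the URL is
        unknown, expired, or already consumed."""
        entry = self._grants.get(url)
        if entry is None:
            return None
        resource, expires_at, single_use = entry
        if self._clock() >= expires_at:
            del self._grants[url]  # expired capabilities are never honored
            return None
        if single_use:
            del self._grants[url]  # "one shot": consumed by its first use
        return resource
```

Delegation then amounts to handing the returned URL to a third party; whoever presents it before expiry is treated as authorized, which is why the URL itself must stay confidential.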

It is important to note that capabilities do not completely replace traditional access control models. Systems may still use traditional Subject-Permission-Object schemes to represent access control for objects under their control. Capabilities provide a mechanism for communicating access control decisions among loosely coupled trusted systems.




3.2.4.  Using LLSD to Avoid Version Skew

It is a common practice in large, complicated software systems to divide the system into smaller, more manageable pieces. The precise nature of this partitioning is beyond the scope of this protocol. However, practical experience has demonstrated that services distributed across multiple co-operating hosts MUST contend with the issue of version skew. Simply stated, version skew is the condition where multiple versions of a service are interoperating simultaneously.

There are many reasons why version skew may be introduced. In VWRAP, agent domain hosts and region domain hosts may be operated by different organizations with different deployment schedules. Or perhaps a domain operator is required to support an obsolete version of a particular service endpoint for a small number of customers. Whatever the cause of version skew, it has in the past introduced difficulties in deploying distributed services.

VWRAP does not seek to eliminate version skew, but it does attempt to reduce its impact. VWRAP services are defined using the LLIDL interface description language. LLIDL defines the type semantics of fields inside a protocol message using the LLSD abstract type system. Each of the abstract types defined in LLSD has a default value, and common conversions between conformable types are defined. LLSD specifies three standard techniques for serializing a protocol message prior to transmission across the network. Each of the three serialization techniques renders protocol messages into a collection of variable length fields. Protocol content is identified by JSON syntax, binary tags or XML element semantics, not by its position in the message. LLIDL does not support the concept of a "required field." If a field defined in a protocol interaction is not present in the serialized message, it is semantically equivalent to the field being present and containing the default value for the field's type.

Careful construction of service endpoints allows them to consume messages described using LLIDL without fear that version skew induced format differences may cause the semantics of the message to be unclear. If a message arrives at a service endpoint with extra fields (fields defined in a later revision of the protocol exchange), the consumer can still extract those fields it understands. If a message arrives lacking a field described in the protocol exchange, the service endpoint SHOULD interpret it as if the field were present and contained the default value for its type. This implies the message consumer cannot depend on the format of the message to determine validity, but must examine the contents of the message, converting missing fields to present fields with default values, and then determine if sufficient information is present to imply semantics about the protocol exchange.
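The interpretation rule above can be sketched in a few lines. This is an illustration under stated assumptions: the field names in `PLACE_AVATAR_FIELDS` are hypothetical, the message is shown as an already-decoded map, and only four abstract types are covered, with zero-like defaults assumed for the sketch.

```python
# Zero-like defaults for a few abstract types (assumed for this sketch).
DEFAULTS = {"boolean": False, "integer": 0, "real": 0.0, "string": ""}

# Hypothetical description of the fields one endpoint understands.
PLACE_AVATAR_FIELDS = {"region_url": "string", "position_x": "real", "fly": "boolean"}


def normalize(message, fields):
    """Keep known fields, drop unknown ones, and fill in missing ones with defaults."""
    return {name: message.get(name, DEFAULTS[kind]) for name, kind in fields.items()}


def is_usable(request):
    # Validity is judged on content, after defaults have been applied.
    return request["region_url"] != ""
```

A message from a newer peer with an extra field and a message from an older peer missing `position_x` both normalize to the same shape; only the content check decides whether the request can be acted on.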

This technique will not eliminate all ramifications of version skew, but carefully constructed service descriptions should be able to avoid the most common problems found when services interoperate with minor revision differences. While the Virtual World Region Agent Protocol itself does not mandate this style of message interpretation, it does require that messages be constructed so that service endpoints may do so.




4.  Virtual World Region Agent Protocol Services

Simulation and persistence of virtual objects in VWRAP is managed on "central" servers. Servers maintain interfaces to shared resources which represent the state of the virtual world. Interfaces are grouped by function into "services." Services are available to support the access to and distribution of data representing multiple facets of the virtual world. This section describes some of the services defined by VWRAP.

Avatar Presence
When a user authenticates themselves and wishes to interact with the virtual world, the user's avatar presence must be established. Simulating a user's avatar is a task shared by the agent domain, the region domain and the client application. The agent domain is responsible for establishing the user's presence in the virtual world simulated by the region domain, and for providing information about the avatar so it may be properly rendered.
Object Presence
The state of objects in the virtual world (landscapes, avatars, virtual "things") must be communicated to all participants. It is the responsibility of the region domain to keep track of each object's state. The landscape can change; clouds in the virtual sky may move and change shape; virtual people may move around and virtual "things" can undergo any number of state changes. The region domain is responsible for receiving input from virtual world inhabitants, evaluating how that input changes the state of objects in the virtual world, and then communicating those state changes to other observers.
Physics Simulation
Simulating the physical behavior of objects in the virtual world is a core feature of any virtual world. Regions may choose "earth like" conditions, or may modify gravity and atmospheric settings to create the experience of being on a different planet. Still other options include simulating quantum effects seen at very small scales or the large scale relativistic effects seen on galactic scales. Whatever the "physics" involved, it is the individual hosts in the region domain tasked with the simulation.
Effects of Programmatic Changes
(aka "scripting.") Some virtual worlds allow users to modify the state of objects using simple programming languages. The region domain is generally responsible for managing and executing scripts that modify the state of objects.
Region Specific Asset Storage
Though most "assets at rest" are associated with an avatar, it is conceptually appropriate for some items to be associated with a region; textures associated with landscapes, for instance. Or in some situations, the operator of a region may wish to bind "sensitive" resources to the location to ensure they do not follow region visitors to regions outside the originating region's administrative authority.



4.1.  Authentication

User Authentication in the Virtual World Region Agent Protocol is intended to verify the user's authorization to control their avatar in the virtual world and associated services. VWRAP currently defines three methods for authenticating a user, as well as recommendations for integrating some third party authentication schemes. The inputs to authentication are an avatar or account identifier and a related authentication token. Assuming the token is successfully authenticated, the output of authentication is a seed capability or "seed cap."

Like most VWRAP protocol exchanges, authentication protocol data is represented as LLSD serialized data carried over HTTPS. The use of TLS with VWRAP authentication is recommended for all deployers who do not employ some other network security scheme (IPsec, link encryption, etc.). Implementers are advised that in addition to the user's password (or other credential), the seed capability returned after successful authentication is also considered "sensitive" and should be protected with appropriate network security measures.

The three authentication schemes defined in the VWRAP Trust Model and User Authentication [I-D.hamrick-vwrap-authentication] specification use cryptographic hashes to demonstrate the user is in possession of the shared secret associated with their account. Recommendations also exist for using transport authentication mechanisms (such as TLS client certificates) in place of shared secrets. Also, work is currently underway to define protocol messages for use with Secure Remote Password (SRP).
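The general idea of proving possession of a shared secret with a hash, without sending the secret itself, can be illustrated with a generic challenge-response exchange. This sketch is not one of the three schemes from the authentication specification; the function names and the choice of HMAC-SHA-256 are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    # A fresh random challenge prevents replay of an earlier proof.
    return secrets.token_bytes(16)


def prove(shared_secret: bytes, challenge: bytes) -> str:
    # The client proves possession of the secret without sending it.
    return hmac.new(shared_secret, challenge, hashlib.sha256).hexdigest()


def verify(shared_secret: bytes, challenge: bytes, proof: str) -> bool:
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, proof)
```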

The authentication mechanisms described above are believed to be sufficient at the time of this writing. It is an unfortunate truth, however, that cryptographic primitives are occasionally shown to be less secure than originally believed. For this reason, VWRAP Authentication was designed to be extensible, allowing future users to define new authentication schemes without invalidating other authentication components. A further benefit of flexibility is the ability to integrate other authentication schemes into a VWRAP context. OpenID and SAML, for instance, are popular identity and user authentication technologies that are defined outside the IETF. VWRAP's flexible authentication system allows organizations responsible for these standards to define their use with VWRAP without having to change the text of the VWRAP Authentication standard.

A typical flow of events for user authentication follows. This is a simplified version; readers with an interest in authentication are referred to the VWRAP Trust Model and User Authentication [I-D.hamrick-vwrap-authentication] specification.

  1. The end user presents their account identifier (either avatar name or account name) and an authenticator to the authentication services of the agent domain. Endpoints for user authentication protocol messages are typically well defined, public URLs.
  2. The authentication service verifies the authenticator. If the credentials cannot be authenticated, an error condition is returned.
  3. The authentication service generates a seed capability and returns it to the user.
  4. The user queries the "seed cap," requesting capabilities for other services the user is authorized to use.

It is important to note that in the last step listed above, the client is free to request a subset of services offered by the agent domain. This allows the same authentication service to be used by restricted clients (for instance, a group-chat only client) as well as traditional 3d viewers.
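The four steps, including a client asking for only a subset of services, can be sketched as an in-memory model. Everything named here is hypothetical: the `AgentDomain` class, the `agent.example` URLs and the plain comparison of authenticators, which stands in for whatever scheme a real deployment uses.

```python
import secrets


class AgentDomain:
    """Hypothetical stand-in for an agent domain's authentication service."""

    def __init__(self, accounts, services):
        self._accounts = accounts      # account identifier -> authenticator
        self._services = services      # service name -> base URL
        self._seed_caps = {}           # seed capability URL -> account identifier

    def login(self, identifier, authenticator):
        # Steps 1-3: check the credentials, then mint a seed capability.
        expected = self._accounts.get(identifier)
        if expected is None or expected != authenticator:
            raise PermissionError("authentication failed")
        seed_cap = "https://agent.example/cap/" + secrets.token_urlsafe(32)
        self._seed_caps[seed_cap] = identifier
        return seed_cap

    def request_capabilities(self, seed_cap, wanted):
        # Step 4: grant only the services the client asked for and that exist.
        if seed_cap not in self._seed_caps:
            raise PermissionError("unknown seed capability")
        return {
            name: self._services[name] + "/" + secrets.token_urlsafe(32)
            for name in wanted
            if name in self._services
        }
```

A group-chat only client would call `request_capabilities(seed_cap, ["group_chat"])` and never hold capabilities for services it does not use.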




4.2.  Text and Voice Chat

Virtual worlds are social spaces; efficient communication between users is a critical component of the virtual world experience. VWRAP virtual worlds support four types of text communication:

1. User to User
Akin to traditional "instant message" systems, User-to-User Text Chat includes two participants, the sender and the receiver.
2. Group
Group Text Chat is the process of delivering text messages to members of a pre-defined group. Group Text Chat groups are frequently defined by an external authority and have a well known name, discoverable by a search function.
3. Ad Hoc Group
Ad Hoc Group Text Chat involves delivering text messages to a group created in an ad-hoc fashion. Membership in an Ad Hoc Text Chat group is defined by the message sender. The group typically does not have an externally locatable name.
4. Proximal Chat
When a user "virtually" chats to an avatar near their own in the virtual world, they are making use of Proximal Chat. The endpoint for Proximal Chat messages is a location in the virtual world rather than a particular user or group. Users whose avatars are within the proximity of the chatting user's avatar will receive text chat from that user.
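Proximal Chat differs from the other three types in that recipients are selected by location. A minimal sketch of that selection follows; the coordinate layout, the `radius` parameter and the function name are assumptions for illustration, since the text above does not define how proximity is measured.

```python
import math


def proximal_recipients(sender, positions, radius):
    """Return users whose avatars are within `radius` of the sender's avatar."""
    origin = positions[sender]
    return sorted(
        user
        for user, position in positions.items()
        if user != sender and math.dist(origin, position) <= radius
    )
```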

Significant effort has been invested in developing protocols capable of efficiently delivering textual information between users and groups. Protocols such as XMPP, SIMPLE and IRC may be used to carry text chat messages while VWRAP defined messages are used to establish and destroy text chat sessions.

Though the protocols used to implement voice communication in the virtual world are no doubt different from those that carry text messages, the scenarios for voice communication mirror those for text. Four types of voice chat exist: User-to-User, Group, Ad Hoc Group and Proximal.




4.3.  Agent Presence

Once authenticated, the client application has established "agent presence." Once in possession of a valid seed capability, the client application may request a set of capabilities representing services offered by the agent domain: digital asset management, instant message and voice chat support as well as placing the user's avatar into the virtual world.

Placing an avatar in the virtual world begins with the client exercising the "place my avatar in a region" capability. As part of this transaction, the client provides the URL representing a region. Upon receipt of this request, the agent domain determines the validity of the URL provided and, if the URL resolves to a trusted region domain, begins the protocol between the agent domain and the region domain to place the user's avatar in the region.

The precise exchange of messages between each party is beyond the scope of this document, but is described in the VWRAP Teleport specification. A few important points should be noted:

  • The protocol endpoint used by the client application to place the user's avatar in a region is managed by the agent presence service and provided by the authentication service following successful authentication. It is not a publicly defined, fixed URL.
  • Regions in VWRAP are represented with a URL. After authentication, the client uses this URL to tell the agent presence service where to place the user's avatar.
  • The agent presence service MAY apply a local policy to the URL and reject the request before attempting to connect with a region. For instance, a "behind the firewall" agent presence service may limit clients to regions known to be hosted by systems inside the local intranet.
  • The agent presence service MAY apply a local policy and reject the request after it makes an initial communication request with the remote region. For example, if the region domain is operating servers with expired TLS certificates, or if those certificates are issued by a certifying authority the agent domain does not trust, it may reject the request.
  • The process of placing the avatar in the region results in capabilities from the region being communicated back to the agent presence service and the client application. These capabilities are used by the client to control the user's avatar and to access region-specific services.
  • The process also results in the agent presence service issuing capabilities to the region, allowing it limited access to information about the avatar such as the avatar's shape and appearance.
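The "behind the firewall" policy check mentioned in the list above can be sketched as a test applied to the region URL before any connection is attempted. The host name, the allow-list and the HTTPS-only rule are assumptions chosen for the example; a real deployment would substitute its own policy.

```python
from urllib.parse import urlsplit

# Hypothetical local policy: hosts this agent presence service will talk to.
ALLOWED_REGION_HOSTS = {"regions.intranet.example"}


def region_url_allowed(region_url, allowed_hosts=ALLOWED_REGION_HOSTS):
    """Apply local policy to a region URL before contacting the region."""
    parts = urlsplit(region_url)
    if parts.scheme != "https":
        return False  # assumed rule for this sketch: protected transport only
    return parts.hostname in allowed_hosts
```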

After an avatar is "placed" or "rezzed" in a region, the agent presence service is responsible for maintaining its presence in a single location. That is to say, after the avatar has been successfully placed in a region, the agent presence service MUST refuse to allow a second region to "take" the avatar's presence without removing the avatar from its current region.




4.3.1.  Moving Presence

When an avatar moves between regions, special care must be taken that the agent presence service and both the source and destination regions agree on where the avatar is located.

Moving between regions is typically initiated by the client. The process is largely the same as the initial avatar placement, but with the important added step of removing the avatar from its source location before rezzing it in its destination. (In fact, the initial placement of an avatar can be thought of as a transfer from "nowhere.")

The process of moving between regions is described in the VWRAP Teleport specification, though implementers should keep the following important considerations in mind:

  • The client signals to the agent presence service its desire to move from one region to another by accessing the same capability as is used for initial placement of the avatar.
  • The agent presence service must again check that local policy allows movement to the new destination, and MUST receive a capability for placing the client into the new region before it removes the avatar from its current location.
  • The agent presence service MUST also remove the avatar from its current location before placing the avatar in the destination location. Capabilities granted to the current region MUST be revoked as part of this process.
  • The location of the avatar MUST be unambiguous and the agent presence service MUST NOT represent the avatar's location as being in two places at once. If required, for the short period between removing the avatar from one region and placing it in another, the avatar's location may be "in transit."
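The ordering constraints in the list above can be sketched as a small state holder. This is an illustration only: the `AgentPresence` class, its callbacks and the string region names are hypothetical, and the actual message exchange belongs to the VWRAP Teleport specification.

```python
IN_TRANSIT = "in transit"


class AgentPresence:
    """Hypothetical record of where one avatar is."""

    def __init__(self):
        self.location = None          # None models "nowhere"
        self.granted_to = set()       # regions currently holding capabilities

    def move(self, destination, obtain_placement_cap, policy_allows):
        if not policy_allows(destination):
            raise PermissionError("local policy forbids " + destination)
        # A placement capability must be in hand before leaving the source.
        placement_cap = obtain_placement_cap(destination)
        if placement_cap is None:
            raise RuntimeError("destination refused placement")
        # Remove from the source and revoke what it was granted.
        self.granted_to.discard(self.location)
        self.location = IN_TRANSIT
        # Place in the destination; the avatar is never in two regions.
        self.granted_to.add(destination)
        self.location = destination
        return placement_cap
```

If the destination refuses, the method raises before the source region is touched, so the avatar stays where it was.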




4.4.  Digital Asset Access

The virtual world is filled with virtual objects: conference tables, houses, airships, dragons, etc. These objects may or may not be instantiated in the virtual world. Digital descriptions of objects to be placed in the virtual world should be made available to clients by way of the "Asset Access Protocol." Services that wish to display or manipulate digital assets query servers using this protocol.

The asset access protocol defined in VWRAP differs significantly from legacy protocols defined by previous virtual world systems. It has been designed to more easily integrate existing collections of digital assets using standard protocols such as HTTP. All assets used by VWRAP, including avatar meshes, scene graphs and virtual object descriptions are accessed using this protocol.




4.4.1.  Manipulating Digital Assets

A number of useful manipulations of digital assets "at rest" are defined by VWRAP. Where appropriate, asset metadata may be altered by directly communicating with the network host with authority for that asset. This host may be part of the user's agent domain, or in the case of region-specific assets, it could be associated with a region domain. It is important to note, however, that not all metadata is modifiable by all users, even the asset's owner. Specifically, the semantics of the creator metadata do not allow the owner to change the creator's identity. Group membership may carry some rights, like the ability to manipulate the size, shape and texture of an asset, but not the right to change an asset's owner.
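The restriction that some metadata cannot be changed, even by the owner, can be sketched as a per-role list of editable fields. The role names, field names and the `EDITABLE_BY` table are hypothetical and only mirror the examples in the paragraph above.

```python
# Hypothetical rule set: which metadata fields each role may change.
EDITABLE_BY = {
    "owner": {"name", "description", "size", "shape", "texture"},
    "group": {"size", "shape", "texture"},
}


def apply_metadata_update(metadata, changes, role):
    """Return updated metadata, refusing fields the role may not change."""
    allowed = EDITABLE_BY.get(role, set())
    forbidden = sorted(set(changes) - allowed)
    if forbidden:
        raise PermissionError("not modifiable by " + role + ": " + ", ".join(forbidden))
    return {**metadata, **changes}
```

Because neither role lists `creator`, an owner's attempt to rewrite the creator is refused, and because `group` does not list `owner`, group members cannot reassign ownership.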

The ability to access or manipulate digital assets is based on the accessor's identity. Accessing and manipulating digital assets is performed via capabilities which expose the state of the asset to an authorized client. This requires positive identification of the accessor prior to access. In the case where an asset service is owned by the same authority as the authentication service, this access may be as simple as providing the proper capability after user authentication. In cases where the asset service is owned by a different authority, systems for deferred authentication may be necessary. Work is currently underway to integrate OAuth and SAML into VWRAP for this purpose.

At a gross level, the types of resources exposed by a digital asset server would include:

  • a resource for searching an agent's inventory
  • a resource for iterating across an agent's inventory
  • a resource for accessing or manipulating a digital asset's metadata
  • a resource for uploading new digital assets, or changes to an existing asset
  • a resource for removing a digital asset from the authority of the asset service
  • a resource for transferring the asset or a copy of the asset to a remote asset service
  • a resource for instantiating (or "rezzing") an object "in world"




4.4.2.  Establishing Presence for Digital Assets

Digital assets are intended to be used "in world," meaning there must be a way for a user to direct a simulation host to take an asset from an asset store and imbue it with presence in the virtual world. The separation between agent based services and region based services is fundamental to VWRAP and implies the authority for the system maintaining the asset "at rest" may be distinct from that which simulates the asset "in world." In practical terms, a region simulator may need to communicate with an asset server owned by a different person or company. In situations like this, trust is paramount. Because an asset's metadata may limit the owner's right to make copies of an asset, the agent domain MUST be able to trust the region domain will honor that metadata.

There are two levels of trust defined when working with digital assets: host-based trust and user-based trust. The former represents one system's expectation that the other will honor the metadata regarding ownership, creatorship and rights and restrictions implied by these concepts. Host-based trust is carried by X.509 / PKIX certificates and implies a managed PKI. User-based trust represents the expectation the asset server will expose sensitive resources only to users with the right to access such resources.

Provided trust is established between the asset server and a simulation host, and the simulation host can demonstrate it is acting on behalf of a user with rights to access a particular resource, VWRAP defines a protocol for transferring a representation of the digital asset for simulation. As part of this protocol, access to a digital asset may be restricted while the object exists "in world." This is the case for objects whose creators or owners specify that only one copy of the asset may exist at a time.
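The "only one copy may exist at a time" restriction can be sketched as bookkeeping on the asset service. The class and method names below are hypothetical (including `derez` for removing an object from the world), and the sketch ignores the trust checks that would precede any transfer.

```python
class AssetStore:
    """Hypothetical asset service enforcing a one-copy-at-a-time restriction."""

    def __init__(self):
        self._single_copy = set()   # asset ids limited to one live copy
        self._in_world = set()      # asset ids currently instantiated

    def add(self, asset_id, single_copy=False):
        if single_copy:
            self._single_copy.add(asset_id)

    def rez(self, asset_id):
        # Access is restricted while a single-copy asset exists in world.
        if asset_id in self._single_copy and asset_id in self._in_world:
            raise PermissionError(asset_id + " already exists in world")
        self._in_world.add(asset_id)

    def derez(self, asset_id):
        self._in_world.discard(asset_id)
```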




5.  Security Considerations

As mentioned previously, the concept of a persistent, ubiquitous identity in the virtual world is core to the user experience. Keeping agent based services distinct from region or object based services has advantages for scalability and flexibility. However, it does have ramifications for the security of the virtual world as a whole.

Most notably, this structure puts the authentication service in the role of a trust broker. As the authentication service must collect capabilities for the benefit of the client, it is trusted by both client applications and peer services. A transitive trust relationship exists between the peer services and end users by way of the authentication service. The administrators of peer services trust the authentication service to properly identify end users, and potentially to ensure they are members of a particular class. The end users trust the authentication service to properly identify peer services and to limit the transfer of digital assets to only those entities that explicitly agree to honor asset permissions meta-data.

VWRAP does not REQUIRE services to adhere to any preset policy, however. It instead provides a mechanism for communicating identity information so that such a policy MAY be enforced.




5.1.  Capabilities

VWRAP makes extensive use of RESTful resources accessed via HTTP. Application state is communicated and changed by accessing web based resources. One characteristic of such resources is they have a well defined URL, many of which are formatted as URL-based capabilities [I-D.lentczner-vwrap-foundation]. Capabilities have the characteristic that possession of the URL implies the right to access the resource it identifies. It is important that capability URLs are shared only with trusted participants. The VWRAP Base document defines the characteristics of URL-based capabilities, including the requirement that they include an unpredictable random component in the URL. Implementers must also ensure that these URLs are protected using suitable mechanisms (such as TLS, IPsec or link encryption).
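One way to give a URL an unpredictable random component is to append a token drawn from a cryptographically secure generator. The sketch below is an assumption-laden example, not the format required by the VWRAP Base document: the function name, the 32-byte token length and the HTTPS-only check are choices made for illustration.

```python
import secrets
from urllib.parse import urlsplit


def new_capability_url(base_url):
    """Append an unpredictable component (256 random bits) to a base URL."""
    if urlsplit(base_url).scheme != "https":
        raise ValueError("capability URLs should travel over a protected transport")
    return base_url.rstrip("/") + "/" + secrets.token_urlsafe(32)
```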




5.2.  User Authentication

Prior to granting an end user access to sensitive resources, the client MUST be authenticated. The VWRAP Authentication specification defines three techniques for using shared secrets to authenticate end users. The agent_login resource used for end user authentication provides an extensible mechanism, allowing the development and use of additional authentication techniques (SRP, TLS Client Certificates, SASL, etc.).

Again, it should be noted that VWRAP as currently defined does not REQUIRE an authentication service to support a particular authentication scheme (shared secret, public key, secure remote password, etc.), but it does define the mechanism for three shared secret options.

Once a user is successfully authenticated, their client application is passed a seed capability (as described in the VWRAP Base specification). This seed capability is used by the client application to request access to resources and services managed by the agent domain (including services like "place my avatar in the virtual world").




5.3.  Service to Service Authentication

Authenticating one service to another uses an X.509 PKI. Prior to communicating, the originating service generates a key pair for each host under its control and requests a certificate from each peer service with which it wishes to interact. The peer service returns a signed certificate that the originating service uses in subsequent communication with the region.
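On the receiving side, certificate-based service authentication usually means a TLS endpoint that refuses peers without a valid certificate. The following sketch uses the Python standard library `ssl` module; the function name and the file path parameters are hypothetical, and certificate issuance itself is outside what the sketch shows.

```python
import ssl


def peer_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a TLS context that requires the connecting service to present a certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.verify_mode = ssl.CERT_REQUIRED   # reject peers without a valid certificate
    if cert_file:
        # This host's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # The authority whose signatures on peer certificates are accepted.
        context.load_verify_locations(cafile=ca_file)
    return context
```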




5.4.  Access Control for Digital Assets

In addition to security characteristics addressing traditional network and user security issues, the raison d'être of VWRAP is to communicate state concerning items inhabiting a virtual world. Some of these items may have access control restrictions within the scope of the applications used to simulate and render the virtual world. VWRAP defines an extensible permissions model which allows permissions meta-data to be associated with virtual items.




6.  Accessibility Considerations

This document makes extensive reference to the visual properties of objects. It is important to note that semantic cues may be more important to persons with visual deficits. The VWRAP suite includes the ability to annotate objects with situational or semantic meta-data to provide a meaningful experience to non-sighted users. Implementers are encouraged to consider that client applications favored by non-sighted users may query the state of the virtual world differently; the VWRAP suite SHOULD be sufficiently expressive to support multiple modes of interaction with the virtual world.

Implementers are also encouraged to consider the wide range of human capability and to design interaction experiences applicable to all.




7.  IANA Considerations

This memo includes no request to IANA.




8.  References




8.1. Normative References

[I-D.hamrick-vwrap-authentication] Chu, T., Hamrick, M., and M. Lentczner, “VWRAP Trust Model and User Authentication,” draft-hamrick-vwrap-authentication-00 (work in progress), February 2010.
[I-D.hamrick-vwrap-type-system] Brashears, A., Hamrick, M., and M. Lentczner, “VWRAP: Abstract Type System for the Transmission of Dynamic Structured Data,” draft-hamrick-vwrap-type-system-00 (work in progress), February 2010.
[I-D.lentczner-vwrap-foundation] Lentczner, M., “Virtual World Region Agent Protocol: Foundation,” draft-lentczner-vwrap-foundation-00 (work in progress), February 2010.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.



8.2. Informative References

[RFC1822] Lowe, J., “A Grant of Rights to Use a Specific IBM patent with Photuris,” RFC 1822, August 1995.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999.
[RFC2817] Khare, R. and S. Lawrence, “Upgrading to TLS Within HTTP/1.1,” RFC 2817, May 2000.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” STD 66, RFC 3986, January 2005.



Appendix A.  Definitions of Important Terms

agent domain
The agent domain is the administrative authority responsible for managing services related to avatars and users. Identity management, group membership, avatar appearance, profile information, user authentication and group messaging are examples of services and information maintained by the agent domain.
agent host
A network host maintained by the agent domain is called an "agent host."
avatar
The avatar is the representation of a user in the virtual world. The avatar's shape and appearance are used by other users to render a graphical representation of the inhabited virtual world. The user's view of the virtual world is rendered from the perspective of their avatar.
client application
A client application is any application that is operated for the benefit of the user. Common client applications might include a "viewer" that renders the virtual world on the user's workstation or a web application used to manipulate the user's digital assets. VWRAP does not provide a canonical list of client application categories, but if an application is not a part of an agent domain or a region domain and it is manipulating user data or an avatar on behalf of a user, with the user's permission, it is a client application.
region domain
The region domain is the administrative authority responsible for managing services related to presence in the virtual world and its simulation. Typical services exposed by a region domain would include physics simulation, avatar presence and virtual object presence lifecycle management (i.e., the creation, manipulation and destruction of objects in the virtual world).
region host
A network host maintained by the region domain is called a "region host", though the historical term "simulator" is still very common.
user
The entity controlling an avatar in world is the "user".




Appendix B.  Acknowledgements

The author gratefully acknowledges the contributions of: Mark Lentczner, David Levine, David Crocker, Larry Mastiner, Joshua Bell, Barry Leiba, Kevin Flynn, Morgaine Dinova, Joe Hildebrand, Lora Baines, Alan Bradley, Chris Newman, Katherine Mancuso and Jon Peterson.




Authors' Addresses

  Meadhbh Siobhan Hamrick
  P.O. Box 783
  Boulder Creek, CA 95006
  US
Phone:  +1 650 283 0344
Email:  OhMeadhbh@gmail.com
  
  David W. Levine
  IBM Thomas J. Watson Research Center
  19 Skyline Drive
  Hawthorn, NY 10532
  US
Phone:  +1 914 784 7427
Email:  dwl@us.ibm.com