ICNRG Working Group                                          C. Westphal
Internet-Draft                                                    Huawei
Intended status: Informational                             July 14, 2018
Expires: January 15, 2019

                             AR/VR and ICN


   This document describes the challenges of AR/VR in ICN.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 15, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Westphal                Expires January 15, 2019                [Page 1]

Internet-Draft                  ICN-ARVR                       July 2018

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Definitions . . . . . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Use Cases . . . . . . . . . . . . . . . . . . . . . . . .   4
       2.1.1.  Office productivity, personal movie theater . . . . .   4
       2.1.2.  Retail, Museum, Real Estate, Education  . . . . . . .   4
       2.1.3.  Sports  . . . . . . . . . . . . . . . . . . . . . . .   4
       2.1.4.  Gaming  . . . . . . . . . . . . . . . . . . . . . . .   5
       2.1.5.  Maintenance, Medical, Therapeutic . . . . . . . . . .   5
       2.1.6.  Augmented maps and directions, facial recognition,
               teleportation . . . . . . . . . . . . . . . . . . . .   5
   3.  Information-Centric Network Architecture  . . . . . . . . . .   6
     3.1.  Native Multicast Support  . . . . . . . . . . . . . . . .   6
     3.2.  Caching . . . . . . . . . . . . . . . . . . . . . . . . .   7
     3.3.  Naming  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     3.4.  Privacy . . . . . . . . . . . . . . . . . . . . . . . . .   7
     3.5.  Other benefits? . . . . . . . . . . . . . . . . . . . . .   7
     3.6.  Security Considerations . . . . . . . . . . . . . . . . .   7
   4.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     4.1.  Normative References  . . . . . . . . . . . . . . . . . .   8
     4.2.  Informative References  . . . . . . . . . . . . . . . . .   8
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Introduction

   Augmented Reality and Virtual Reality are becoming commonplace.
   Facebook and YouTube have deployed support for some immersive videos,
   including 360 videos.  Many companies, including the aforementioned
   Facebook and Google, but also Microsoft and others, are offering
   devices to view virtual reality, ranging from simple mechanical
   additions to a smartphone, such as Google Cardboard, to full-fledged
   dedicated devices, such as the Oculus Rift.

   Current networks, however, are still struggling to deliver high-
   quality video streams.  5G networks will have to address the
   challenges introduced by the new applications delivering augmented
   reality and virtual reality services.  However, it is unclear whether
   it will be possible to deploy such applications without architectural
   support.

   Most surveys of augmented reality systems (say, [van2010survey])
   ignore the potential underlying network issues.  We attempt to
   present some of these issues in this document.  We also intend to
   explain how an Information-Centric Network architecture is beneficial
   for AR/VR.  Information-Centric Networking has been considered for
   enhancing content delivery by adding features that are lacking in an
   IP network, such as caching, or the requesting and routing of content
   at the network layer by its name rather than a host's address.

2.  Definitions

   We provide definitions of virtual and augmented reality (see for
   instance [van2010survey]):

   Augmented Reality: an AR system inserts a virtual layer over the
   user's perception of real objects, combining real and virtual objects
   in such a way that they function in relation to each other, with
   synchronicity and the proper depth of perception in three dimensions.

   Virtual Reality: a VR system places the user in a synthetic, virtual
   environment with a coherent set of rules and interactions with this
   environment and the other participants in this environment.

   Virtual reality is immersive and potentially isolating from the real
   world, while augmented reality inserts extra information onto the
   real world.

   For the purpose of this article, we restrict ourselves to the audio-
   visual perception of the environment (even though haptic systems may
   be used) as a first step.  Many of the applications of augmented and
   virtual reality similarly start with eyesight and sounds only.

   Most of the AR/VR we consider here focuses on head-mounted displays,
   such as Oculus Rift or Google Cardboard.

   Some observations follow from these descriptions of virtual and
   augmented reality.  One is that virtual reality only really needs a
   consistent set of rules for the user to be immersed in it.  It could
   theoretically work on a different time scale, say where the reaction
   to motion is slower than in the real world.  Further, VR only needs
   to be self-consistent, and does not require synchronization with the
   real world.

   As such, there are several levels of complexity along a reality-
   virtuality continuum.  For the purpose of the networking
   infrastructure, we will roughly label them as 360/immersive video,
   where the user is streaming a video with a specific viewing angle
   and direction; virtual reality environment, where the user is
   immersed in a virtual world and has agency (say, deciding the
   direction of motion, in addition to the direction of her viewing
   angle); and augmented reality, where virtual content is overlaid on
   top of the user's actual view of the real world.

   The last application requires identifying the environment, generating
   and fetching the virtual artifacts, layering these on top of the
   reality in the vision of the user, in real time and in
   synchronization with the space dimensionality and the perception of
   the user, and with the motion of the user's field of vision.  Such
   processing is very computationally heavy and would require a
   dedicated infrastructure to be placed within the network provider's

2.1.  Use Cases

   For AR/VR specifically, there is a range of scenarios with specific
   requirements.  We denote a few below, but make no claim of
   exhaustivity: there are plenty of other applications.

2.1.1.  Office productivity, personal movie theater

   This is a very simple, canonical use case, where the head-mounted
   device is only a display for the workstation of the user.  This has
   few networking requirements, as everything is collocated and could
   even be wired.  For this reason, it is one of the low-hanging fruits
   in this space.  The main issue is display quality: as the user spends
   long hours looking at a screen, the resolution, depth of perception,
   and reactivity of the head-mounted display should be comfortable for
   the user.

2.1.2.  Retail, Museum, Real Estate, Education

   The application recreates the experience of being in a specific area,
   such as a home for sale, a classroom or a specific room in a museum.
   This is an application where the files may be stored locally, as the
   point is to replicate an existing point of reference, and this can be
   processed ahead of time.

   The issue then becomes how to move the virtual environment onto the
   display.  Can it be prefetched ahead of time?  Can it be distributed
   and cached locally near the device?  Can it be rendered in the
   device?

2.1.3.  Sports

   This attempts to put the user in the middle of a different real
   environment, as in the previous case, but adds several dimensions to
   it: that of real time, as the experience must be synchronized with a
   live event; and that of scale, as many users may be attempting to
   participate in the experience simultaneously.

   These new dimensions add corresponding requirements, namely how to
   distribute live content in a timely manner that still corresponds to
   the potentially unique viewpoint of each user, and how to scale this
   distribution to a large number of concurrent experiences.  The
   viewpoint in this context may also impose different requirements,
   depending on whether it is that of a player in a basketball game or
   that of a spectator in the arena.  For instance, in the former case,
   the position of the viewpoint is well defined by that of the player,
   while in the latter, it may vary widely.

2.1.4.  Gaming

   Many games place the user into a virtual environment, from Minecraft
   to multi-user shooter games.  Platforms such as Unity 3D allow the
   creation of virtual worlds.  Unlike the previous use case, there are
   now interactions between the different participants in the virtual
   environment.  This requires communicating these interactions between
   peers, and not just from a server to the device.  This raises issues
   of consistency and synchronization across users.

2.1.5.  Maintenance, Medical, Therapeutic

   There exist a few commercial products where AR is used to overlay
   instructions on top of some equipment so as to assist the agent in
   performing maintenance.  Surgical assistance may fall into this
   category as well.

   The advantage of a specific task is that it narrows down, and thereby
   facilitates, the pattern recognition and the back-end processing.
   However, the requirement to overlay the augmented layer on top of the
   existing reality imposes stringent synchronization and round-trip
   time requirements, both on the display and on the sensors capturing
   the motion and position.

2.1.6.  Augmented maps and directions, facial recognition, teleportation

   The more general scenario of augmented reality does not focus on a
   specific, well-defined application, but absorbs the environment as
   observed by the user (or the user's car or the pilot's plane, if the
   display is overlaid on a windshield) and annotates this environment,
   for instance to specify directions.  This includes recognizing
   patterns and potentially people with the help of little context
   beyond the position of the user.  Another main target of AR is
   telepresence, where a person in a remote location could be made
   present, as if in another location, say with others in the same
   conference room.  Teleportation plus the display of the workstation
   of a user (as in the first scenario above) may allow remote
   collaboration on enterprise tasks.

3.  Information-Centric Network Architecture

   We now turn our attention to the potential benefits that Information-
   Centric Networks can bring to the realization of AR/VR.

   The abstractions offered by an ICN architecture are promising for
   video delivery.  RFC7933 [RFC7933] for instance highlights the
   challenges and potential of ICN for adaptive rate streaming.  As VR
   in particular may encompass a video component, it is natural to
   consider ICN for AR/VR.

   There is a lot of existing work on ICN (say, caching or traffic
   engineering [su2013benefit]) which could be applied to satisfy the
   QoS requirements of the AR/VR applications, when possible.

3.1.  Native Multicast Support

   One of the key benefits from ICN is the native support for multicast.
   For instance, [macedonia1995exploiting] quotes: "if the systems are
   to be geographically dispersed, then highspeed, multicast
   communication is required."  Similarly, [frecon1998dive] states that:
   "Scalability is achieved by making extensive use of multicast
   techniques and by partitioning the virtual universe into smaller

   In the sports use case, many users will be participating in the same
   scene.  They will have potentially distinct points of view, as each
   may look in one specific direction.  However, each of these views
   may share some overlap with the others, as there is a natural focus
   point within the event (say, the ball in a basketball game).

   This means that many of the users will request some common data.
   Native multicast significantly reduces the bandwidth needed to
   deliver this data and, in the case of ICN, does so without extra
   signaling.
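
   As a rough illustration of why shared requests need no extra
   signaling, the sketch below models interest aggregation at a single
   forwarder, in the style of the pending-interest tables used by
   CCN/NDN-like designs.  Only the first request for a name is forwarded
   upstream, and the returning data is replicated to every face that
   asked for it.  The class, its methods, the face labels, and the
   content name are hypothetical and are not taken from any particular
   ICN implementation.

```python
from collections import defaultdict


class PendingInterestTable:
    """Toy model of interest aggregation at one ICN forwarder."""

    def __init__(self):
        # name -> set of downstream faces waiting for that name
        self._pending = defaultdict(set)
        self.upstream_requests = 0

    def on_interest(self, name, face):
        """Record a request; return True if it must go upstream."""
        first = name not in self._pending
        self._pending[name].add(face)
        if first:
            self.upstream_requests += 1
        return first

    def on_data(self, name):
        """Return the faces that receive a copy, then clear the entry."""
        return sorted(self._pending.pop(name, set()))


pit = PendingInterestTable()
name = "/arena/game42/seg/100/tile/1/0"  # hypothetical content name
forwarded = [pit.on_interest(name, face)
             for face in ("user-a", "user-b", "user-c")]
receivers = pit.on_data(name)
```

   In this toy run, three users requesting the same tile cause a single
   upstream request, and the returning data is fanned out to all three.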

   Further, the multicast tree should be ad hoc and dynamic to
   efficiently support AR/VR.  Back in 1995, [funkhouser1995ring]
   attempted to identify the visual interactions between entities
   representing users in a VE so as to "reduce the number of messages
   required to maintain consistent state among many workstations
   distributed across a wide-area network.  When an entity changes
   state, update messages are sent only to workstations with entities
   that can potentially perceive the change i.e., ones to which the
   update is visible."  [funkhouser1995ring] was able to reduce the
   number of messages processed by client workstations by a factor of

   It is unclear whether ICN can assist in identifying which
   workstations (or nowadays, which users) may perceive the status
   update of another user (but naming the data at the network layer may
   help).  Nonetheless, the multicast tree to reach the set of clients
   that would require an update is dynamically modified, and the support
   for multicast in ICN definitely accommodates this dynamic behavior.
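
   One hypothetical way in which naming could help is sketched below.
   The virtual environment is split into grid cells, each user
   subscribes to the cells around its position, and an update is
   delivered only to the subscribers of the cell where it occurs.  Grid
   proximity is used here as a crude stand-in for visibility; this is an
   assumption of the sketch and not the method of [funkhouser1995ring].
   All function names, the cell size, and the user labels are
   illustrative.

```python
from collections import defaultdict


def cell_of(position, cell_size=10.0):
    """Map an (x, y) position to the grid cell that contains it."""
    x, y = position
    return (int(x // cell_size), int(y // cell_size))


def nearby_cells(position, reach=1, cell_size=10.0):
    """Cells within `reach` cells of the position (visibility proxy)."""
    cx, cy = cell_of(position, cell_size)
    return {(cx + dx, cy + dy)
            for dx in range(-reach, reach + 1)
            for dy in range(-reach, reach + 1)}


class UpdateRouter:
    """Delivers an update only to subscribers of the cell it occurs in."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # cell -> set of user ids

    def move(self, user, position):
        """Re-subscribe `user` to the cells around its new position."""
        for users in self._subscribers.values():
            users.discard(user)
        for cell in nearby_cells(position):
            self._subscribers[cell].add(user)

    def recipients(self, sender, position):
        """Users that should receive an update emitted at `position`."""
        interested = self._subscribers.get(cell_of(position), set())
        return sorted(interested - {sender})


router = UpdateRouter()
router.move("alice", (5.0, 5.0))
router.move("bob", (12.0, 5.0))
router.move("carol", (95.0, 95.0))
before = router.recipients("alice", (5.0, 5.0))
router.move("carol", (15.0, 15.0))
after = router.recipients("alice", (5.0, 5.0))
```

   The set of recipients changes as soon as a user moves, which is the
   kind of dynamic membership the multicast tree has to follow.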

3.2.  Caching

   The caching feature of ICN allows prefetching of data near the edge
   for some of the more static use cases; further, in the case of
   multiple users sharing a VE, caching allows the content placement
   phase for some users to be performed at the same time as the content
   distribution phase for others, thereby reducing bandwidth
   consumption.

   Caching is a prominent feature in an AR system: the data must be
   nearby to reduce the round-trip time to access the data.  Further, AR
   data has a strong local component, and caching therefore keeps the
   information within the domain where it will be accessed.

   ICN naturally supports caching, and provides content-based security
   to allow any edge cache to hold and deliver the data.
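
   The effect of prefetching a static scene into an edge cache can be
   sketched as follows.  This is a minimal model under stated
   assumptions: the cache uses least-recently-used eviction, the origin
   is simulated by a local function, and the names under /museum/room7
   are hypothetical.  It does not model the content-based security
   checks mentioned above.

```python
from collections import OrderedDict


class EdgeCache:
    """Small LRU content store keyed by content name."""

    def __init__(self, capacity, fetch_from_origin):
        self._store = OrderedDict()
        self._capacity = capacity
        self._fetch = fetch_from_origin
        self.hits = 0
        self.misses = 0

    def _insert(self, name, data):
        self._store[name] = data
        self._store.move_to_end(name)
        while len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict least recently used

    def prefetch(self, names):
        """Pull named objects to the edge before any user asks."""
        for name in names:
            if name not in self._store:
                self._insert(name, self._fetch(name))

    def get(self, name):
        """Serve from the edge if possible, else fetch and cache."""
        if name in self._store:
            self.hits += 1
            self._store.move_to_end(name)
            return self._store[name]
        self.misses += 1
        data = self._fetch(name)
        self._insert(name, data)
        return data


origin_calls = []


def fetch_from_origin(name):
    """Stand-in for a request that crosses the network to the origin."""
    origin_calls.append(name)
    return f"payload-for-{name}".encode()


cache = EdgeCache(capacity=4, fetch_from_origin=fetch_from_origin)
room = [f"/museum/room7/tile/{i}" for i in range(3)]
cache.prefetch(room)
for _user in ("user-a", "user-b"):
    for name in room:
        cache.get(name)
```

   Here the origin is contacted three times during prefetching, and the
   six subsequent user requests are all served from the edge.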

3.3.  Naming

   Since only a partial Field of View is accessed from the whole
   spherical view at any point in time, tiling the spherical view into
   smaller areas and requesting the tiles that are viewed would reduce
   the bandwidth consumption of AR/VR systems.  This raises the obvious
   question of naming semantics for tiles.  New naming schemes that
   allow for tiling should be devised.
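
   One possible shape for such a scheme is sketched below, purely as an
   illustration.  It assumes an equirectangular grid of 8 by 4 tiles,
   approximates the Field of View as a rectangle in yaw/pitch space
   (ignoring projection distortion near the poles), and uses a
   hypothetical name layout of the form <prefix>/seg/<n>/tile/<row>/
   <col>.  None of these choices is specified by this document.

```python
import math


def tiles_in_view(yaw_deg, pitch_deg, hfov_deg=90.0, vfov_deg=90.0,
                  cols=8, rows=4):
    """Return the (row, col) tiles that overlap the field of view.

    Assumes an equirectangular grid: row 0 is at the top (pitch +90),
    column 0 starts at yaw 0, and yaw wraps around at 360 degrees.
    """
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows

    # Vertical extent, clamped at the poles.
    top = min(90.0, pitch_deg + vfov_deg / 2)
    bottom = max(-90.0, pitch_deg - vfov_deg / 2)
    row_lo = int((90.0 - top) // tile_h)
    row_hi = min(rows - 1, math.ceil((90.0 - bottom) / tile_h) - 1)

    # Horizontal extent; columns wrap around modulo `cols`.
    left = yaw_deg - hfov_deg / 2
    col_lo = math.floor(left / tile_w)
    col_hi = math.ceil((left + hfov_deg) / tile_w) - 1

    return sorted({(r, c % cols)
                   for r in range(row_lo, row_hi + 1)
                   for c in range(col_lo, col_hi + 1)})


def tile_names(prefix, segment, tiles):
    """Build one hypothetical hierarchical name per tile."""
    return [f"{prefix}/seg/{segment}/tile/{r}/{c}" for r, c in tiles]


tiles = tiles_in_view(yaw_deg=0.0, pitch_deg=0.0)
names = tile_names("/arena/game42", 100, tiles)
```

   With these assumed parameters, a viewer looking straight ahead
   requests 4 of the 32 tiles of a segment.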

3.4.  Privacy

   By enabling caching at the edge, ICN enhances the privacy of the
   users.  The user may access data locally, and thereby will not reveal
   information beyond the network edge.

3.5.  Other benefits?

   TBD: any other aspects to consider.

3.6.  Security Considerations


4.  References

4.1.  Normative References

   [RFC7933]  "Adaptive Video Streaming over Information-Centric
              Networking (ICN)", RFC 7933, August 2016.

4.2.  Informative References

   [frecon1998dive]
              "DIVE: A scaleable network architecture for
              distributed virtual environments", Distributed Systems
              Engineering vol 5, number 3, 1998.

   [funkhouser1995ring]
              "RING: a client-server system for multi-user virtual
              environments", ACM symposium on Interactive 3D graphics,
              1995.

   [macedonia1995exploiting]
              "Exploiting reality with multicast groups: a network
              architecture for large-scale virtual environments",
              Virtual Reality Annual International Symposium, 1995.

   [su2013benefit]
              "On the Benefit of Information Centric Networks for
              Traffic Engineering", IEEE ICC, 2014.

   [van2010survey]
              "A survey of augmented reality technologies,
              applications and limitations", International Journal of
              Virtual Reality, 2010.

Author's Address

   Cedric Westphal

   Email: Cedric.Westphal@huawei.com
