Skip to main content

Directions for Computing in the Network
draft-kutscher-coinrg-dir-00

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft whose latest revision state is "Replaced".
Authors Dirk Kutscher , Teemu Karkkainen , Joerg Ott
Last updated 2019-07-08
Replaced by draft-irtf-coinrg-dir
RFC stream (None)
Formats
Stream Stream state (No stream defined)
Associated None milestones
Dec 2019
Capture the SoTA of the COIN landscape
Dec 2019
Articulate COIN challenges
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state I-D Exists
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-kutscher-coinrg-dir-00
COINRG                                                       D. Kutscher
Internet-Draft                 University of Applied Sciences Emden/Leer
Intended status: Experimental                            T. Kaerkkaeinen
Expires: January 9, 2020                                          J. Ott
                                           Technical University Muenchen
                                                           July 08, 2019

                Directions for Computing in the Network
                      draft-kutscher-coinrg-dir-00

Abstract

   In-network computing can be conceived in many different ways - from
   active networking, data plane programmability, running virtualized
   functions, service chaining, to distributed computing.

   This memo proposes a particular direction for Computing in the
   Networking (COIN) research and lists suggested research challenges.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must

Kutscher, et al.         Expires January 9, 2020                [Page 1]
Internet-Draft   Directions for Computing in the Network       July 2019

   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Computing in the Network vs Networked Computing vs Packet
       Processing  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     3.1.  Networked Computing . . . . . . . . . . . . . . . . . . .   4
     3.2.  Packet Processing . . . . . . . . . . . . . . . . . . . .   5
     3.3.  Computing in the Network  . . . . . . . . . . . . . . . .   6
     3.4.  Elements for Computing in the Network . . . . . . . . . .   8
   4.  Research Challenges . . . . . . . . . . . . . . . . . . . . .  10
     4.1.  Categorization of Different Use Cases for Computing in
           the Network . . . . . . . . . . . . . . . . . . . . . . .  10
     4.2.  Networking and Remote-Method-Invocation Abstractions  . .  10
     4.3.  Transport Abstractions  . . . . . . . . . . . . . . . . .  12
     4.4.  Programming Abstractions  . . . . . . . . . . . . . . . .  13
     4.5.  Security, Privacy, Trust Model  . . . . . . . . . . . . .  14
     4.6.  Failure Handling, Debugging, Management . . . . . . . . .  15
   5.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  15
   6.  Informative References  . . . . . . . . . . . . . . . . . . .  15
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  16

1.  Introduction

   Recent advances in platform virtualization, link layer technologies
   and data plane programmability have led to a growing set of use cases
   where computation near users or data consuming applications is needed
   - for example for addressing minimal latency requirements for
   compute-intensive interactive applications (networked Augmented
   Reality, AR), for addressing privacy sensitivity (avoiding raw data
   copies outside a perimeter by processing data locally), and for
   speeding up distributed computation by putting computation at
   convenient places in a network topology.

   In-network computing has mainly been perceived in five variants so
   far: 1) Active Networking [ACTIVE], adapting the per-hop-behavior of
   network elements with respect to packets in flows, 2) Edge Computing
   as an extension of virtual-machine (VM) based platform-as-a-service,
   3) programming the data plane of SDN switches (through powerful
   programmable CPUs and programming abstractions, such as P4 [SAPIO]),
   4) application-layer data processing frameworks, and 5) Service
   Function Chaining (SFC).

Kutscher, et al.         Expires January 9, 2020                [Page 2]
Internet-Draft   Directions for Computing in the Network       July 2019

   Active Networking has not found much deployment in the past due to
   its problematic security properties and complexity.

   Programmable data planes can be used in data centers with uniform
   infrastructure, good control over the infrastructure, and the
   feasibility of centralized control over function placement and
   scheduling.  Due to the still limited, packet-based programmability
   model, most applications today are point solutions that can
   demonstrate benefits for particular optimizations, however often
   without addressing transport protocol services or data security that
   would be required for most applications running in shared
   infrastructure today.

   Edge Computing (as traditional cloud computing) has a fairly coarse-
   grained (VM-based) computation-model and is hence typically deploying
   centralized positioning/scheduling though virtual infrastructure
   management (VIM) systems.

   Microservices can be seen as a (light-weight) extension of the cloud
   computing model (application logic in containers and orchestrators
   for resource allocation and other management functions), leveraging
   more light-weight platforms and fine-grained functions.  Compared to
   traditional VM-based systems, microservice platforms typically employ
   a "stateless" approach, where the service/application state is not
   tied to the compute platform, thus achieving fault tolerance with
   respect to compute platform/process failures.

   Application-layer data processing such as Apache Flink [FLINK]
   provide attractive dataflow programming models for event-based stream
   processing and light-weight fault-tolerance mechanisms - however
   systems such as Flink are not designed for dynamic scheduling of
   compute functions.

   Modern distributed applications frameworks such as Ray [RAY], Sparrow
   [SPARROW] or Canary [CANARY] are more flexible in this regard - but
   since they are conceived as application-layer frameworks, their
   scheduling logic can only operate with coarse-granular cost
   information.  For example, application-layer frameworks in general,
   can only infer network performance, anomalies, optimization potential
   indirectly (through observed performance or failure), so most
   scheduling decisions are based on metrics such as platform load.

   Service Function Chaining (SFC, [RFC7665]) is about establishing IP
   tunnels between processing functions that are expected to work on
   packets or flows - for applications such as inspection and
   classification - not for general Computing in the Network purposes.

Kutscher, et al.         Expires January 9, 2020                [Page 3]
Internet-Draft   Directions for Computing in the Network       July 2019

2.  Terminology

   We are using the following terms in this memo:

   Program:  a set of computations requested by a user

   Program Instance:  one currently executing instance of a program

   Function:  a specific computation that can be invoked as part of a
      program

   Execution Platform:  a specific host platform that can run function
      code

   Execution Environment:  a class of target environments (execution
      platforms) for function execution, for example, a JVM-based
      execution environment that can run functions represented in JVM
      byte code

3.  Computing in the Network vs Networked Computing vs Packet Processing

   Many applications that might intuitively be characterized as
   "computing in the network" are actually either about connecting
   compute nodes/processes or about IP packet processing in fairly
   traditional ways.

   Here, we try to contrast these existing and wildly successful systems
   (that probably do not require new research) with a more novel
   "computing in the network (COIN)" approach that revisits the function
   split between computing and networking.

3.1.  Networked Computing

   Networked Computing exists in various facets today (as described in
   the Introduction).  Fundamentally, these systems make use of
   networking to connect compute instances - be it VMs, containers,
   processes or other forms of distributed computing instances.

   There are established frameworks for connecting these instances, from
   general purpose Remote Method/Procedure Invocation to system-specific
   application-layer protocols.  With that, these systems are not
   actually realizing "computing in the network" - they are just using
   the network (and taking connectivity as granted).

   Most of the challenges here are related to compute resource
   allocation, i.e., orchestration methods for instantiating the right
   compute instance on a corresponding platform - for achieving fault
   tolerance, performance optimization and cost reduction.

Kutscher, et al.         Expires January 9, 2020                [Page 4]
Internet-Draft   Directions for Computing in the Network       July 2019

   Examples of successful applications of networked computing are
   typical overlay systems such as CDNs.  As overlays they do not need
   to be "in the network" - they are effectively applications.  (Note:
   we sometimes refer to CDN as an "in-network" service because of the
   mental model of HTTP requests that are being directed and potentially
   forwarded by CDN systems.  However, none of this happens "in the
   network" - it is just a successful application of HTTP and underlying
   transport protocols.)

3.2.  Packet Processing

   Packet processing is a function "in the network" - in a sense that
   middleboxes reside in the network as transparent functions that apply
   processing functions (inspection, classification, filtering, load
   management etc.) - mostly _transparent_ to endpoints.  Some middlebox
   functions (TCP split proxies, video optimizers) are more invasive in
   a sense that they do not only operate on IP flows but also try to
   impersonate transport endpoints (or interfere with their behavior).

   Since these systems can have severe impacts on service availability,
   security/privacy, and performance they are typically not very
   _programmable_.

   Active Networking can be characterized as an attempt to offer
   abstractions for programmable packet processing from an "endpoint
   perspective", i.e., by using data packets to specify intended
   behavior in the network with the mentioned security problems.

   Programmable Data Plane approach such as P4 are providing
   abstractions of different types of network switch hardware (NPUs,
   CPUs, FPGA, PISA) from a switch/network programming perspective.
   Corresponding programs are constrained by the capabilities
   (instruction set, memory) of the target platform and typically
   operate on packets/flow abstractions (for example match-action-style
   processing).

   Network Functions Virtualization (NFV) is essentially a "Networked
   Computing" approach (after all, Network Functions are just
   virtualized compute functions that get instantiated on compute
   platforms by an orchestrator).  However, some VNFs happen do process/
   forward packets (e.g., gateways in provider networks, NATs or
   firewalls).  Still that does not affect their fundamental properties
   as virtualized computing functions.

Kutscher, et al.         Expires January 9, 2020                [Page 5]
Internet-Draft   Directions for Computing in the Network       July 2019

3.3.  Computing in the Network

   In some deployments, networked computing and packet processing go
   well together, for example when network virtualization (multiplexing
   physical infrastructure for multiple isolated subnetworks) is
   achieved through data-plane programming (SDN-style) to provide
   connectivity for VMs of a tenant system.

   While such deployments are including both computing and networking,
   they are not really doing computing _in the network_. VM/containers
   are virtualized hosts/processes using the existing network, and
   packet processing/programmable networks is about packet-level
   manipulation.  While it is possible to implement certain
   optimizations (for example, processing logic for data aggregation) -
   the applicability is limited especially for applications where
   application-data units do not map to packets and where additional
   transport protocols and security requirements have to be considered.

   Distributed Computing (stream processing, edge computing) on the
   other side is an area where many application-layer frameworks exist
   that actually _could_ benefit from a better integration of computing
   and networking, i.e., from a new "computing in the network" approach.

   For example, when running a distributed application that requires
   dynamic function/process instantiation, traditional frameworks
   typically deploy an orchestrator that keeps track of available host
   platforms and assigned functions/processes.  The orchestrator
   typically has good visibility of the availability of and current load
   on host platforms, so it can pick suitable candidates for
   instantiating a new function.

   However, it is typically agnostic of the network itself - as
   application layer overlays the function instances and orchestrators
   take the network as a given, assuming full connectivity between all
   hosts and functions.  While some optimizations may still be feasible
   (for example co-locating interacting functions/processes on a single
   host platform), these systems cannot easily reason about

   o  shortest paths between function instances;

   o  function off-loading opportunities on topologically convenient
      next-hops; and

   o  availability of new, not yet utilized resources in the network.

   While it is possible to perform optimizations like these in
   application layers overlays, it involves significant monitoring
   effort and would often duplicate information (topology, latency) that

Kutscher, et al.         Expires January 9, 2020                [Page 6]
Internet-Draft   Directions for Computing in the Network       July 2019

   is readily available inside the network.  In addition to the
   associated overhead, such systems also operate at different time
   scales so that direct reaction in fine-grained computing environments
   is difficult to achieve.

   When asking the question of how the network can support distributed
   computing better, it may be helpful to characterize this problem as a
   resource allocation optimization problem: Can we integrate computing
   and networking in a way that enables a joint optimization of
   computing and networking resource usage?  Can we apply this approach
   to achieve certain optimization goals such as:

   o  low latency for certain function calls or compute threads;

   o  high throughput for a pipeline of data processing functions;

   o  high availability for an overall application/service;

   o  load management (balancing, concentration) according to
      performance/cost constraints; and

   o  consideration of security/privacy constraints with respect to
      platform selection and function execution?

   o  Also: can we do this at the speed of network dynamics, which may
      be substantially higher than the rate at which distributed
      computing applications change?

   Considering computing and networking resource holistically could be
   the key for achieving these optimization goals (without considerable
   overhead through telemetry, management and orchestration systems).
   If we are able to dissolve the layer boundaries between the
   networking domain (that is typically concerned with routing,
   forwarding, packet/flow-level load balancing) and the distributed
   computing domain (that is typically concerned with 'processor'
   allocation, scaling, reaction to failure for functions/processes), we
   might get a handle to achieve a joint resource optimization and
   enable the distributed computing layer to leverage network-provided
   mechanisms directly.

   For example, if distributing information about available/suitable
   compute platform could be a routing function, we might be able to
   obtain and utilize this information in a distributed fashion.  If
   instantiating a new function (or offloading some piece of
   computation) could consider live performance data obtained from a in-
   network forwarding/offloading service (similar to IP packet
   forwarding in traditional IP networks), the "next-hop" decision could
   be based both on network performance and node load/availability).

Kutscher, et al.         Expires January 9, 2020                [Page 7]
Internet-Draft   Directions for Computing in the Network       July 2019

   Integrating computing and networking in this manner would not rule
   out highly optimized systems leveraging sophisticated orchestrators.
   Instead, it would provide a (possibly somewhat uniform) framework
   that could allow several operating and optimization modes, including
   totally distributed modes, centralized orchestration, or hybrid
   forms, where policies or intents are injected into the distributed
   decision-making layer, i.e., as parameters for resource allocation
   and forwarding decisions.

3.4.  Elements for Computing in the Network

   In-network computing requires computing resources (CPU, possibly
   GPUs, memory, ...), physical or virtualized to some extent by a
   suitable platform.  These computing resources may be available in a
   number of places, as partly already discussed above, including:

   o  They may be found on dedicated machines co-locating with the
      routing infrastructure, e.g., having a set of servers next to each
      router as one may find in access network concentrators.  This
      would come closest to today's principles of edge computing.

   o  They may be integrated with routers or other network operations
      infrastructure and thus be tightly integrated within the same
      physical device.

   o  They may be integrated within switches, similar to the (limited)
      P4 compute capabilities offered today.

   o  They may be located on NICs (in hosts) or line cards (routers) and
      be able to proactively perform some application functions, in the
      sense of a generalized variant of "offloading" that protocol
      stacks perform to reduce main CPU load.

   o  They might add novel types of dedicated hardware to execute
      certain functions more efficiently, e.g., GPU nodes for
      (distributed) analytics.

   o  They may also encompass additional resources at the edge of the
      network, such as sensor nodes.  Associated sensors could be
      physical (as in IoT) or logical (as in MIB data about a network
      device).

   o  Even user devices along the lines of crowd computing \cite{crowd-
      computing} or mist computing \cite{mist-computing} may contribute
      compute resources and dynamically become part of the network.

   Depending on the type of execution platform, as already alluded to
   above, a suitable execution framework must be put in place: from

Kutscher, et al.         Expires January 9, 2020                [Page 8]
Internet-Draft   Directions for Computing in the Network       July 2019

   lambda functions to threads to processes or process VMs to unikernels
   to containers to full-blown VMs.  This should support mutual
   isolation and, depending on the service in question, a set of
   security features (e.g., authentication, trustworthy execution,
   accountability).  Further, it may be desirable to be able to compose
   the executable units, e.g., by chaining lambda functions or allowing
   unikernels to provide services to each other - both within a local
   execution platform and between remote platform instances across the
   network.

   The code to be executed may be pre-installed (as firmware, as
   microcode, as operating system functions, as libraries, as *aaS
   offering, among others) or may be dynamically supplied.  While the
   former is governed by the entity operating the execution device or
   supplying it (the vendor), the code to be executed may have different
   origins.  Fundamentally, we can distinguish between two cases:

   1.  The code may be "centrally" provisioned, originating from an
       application or other service provider inside the network.  This
       is analogous to CDNs, in which an application provider contracts
       a CDN provider to host content and service logic on its behalf.
       The deployment is usually long-term, even if instantiations of
       the code may vary.  The code thus originates from rather few -
       known - sources.  In this setting, applications only invoke this
       code and pass on their parameters, context, data, etc.

   2.  The code may be "decentrally" provided from a user device or
       other service that requires a certain function or service to be
       carried out.  At the coarse granularity of entire application
       images, this has been explored as "code offloading"; recent
       approaches have moved towards finer granularities of offloading
       (sets of) functions, for which also some frameworks for
       smartphones were developed, leading to finer granularities down
       to individual functions.  In this setting, application transfer
       mobile code - along with suitable parameters, etc. - into the
       network that is executed by suitable execution platforms.  This
       code is naturally expected to be less trusted as it may come from
       an arbitrary source.

   Obviously, 1. and 2. may be combined as mobile code may make use of
   other in-network functions and services, allowing for flexible
   application decomposition.  Essentially, in-network computing may
   support everything from full application offloading to decomposing an
   application into small snippets of code (e.g., at class, objects, or
   function granularity) that are fully distributed inside the network
   and executed in a distributed fashion according to the control flow
   of the application.  This may lead to iterative or recursive calling

Kutscher, et al.         Expires January 9, 2020                [Page 9]
Internet-Draft   Directions for Computing in the Network       July 2019

   from application code on the initiating host to mobile code to pre-
   provisioned code.

   Another dimension beyond where the code comes from is how tightly the
   code and the data are coupled.  At one extreme approaches like Active
   Messages combine the data and the code that operates (only) on that
   data into transmission units, while at the other extreme approaches
   like Network Function Virtualization are only concerned with the
   instantiation of the code in the network.  The underlying
   architectural question is whether the goal is to enable the network
   to perform computations on the data passing through it, or whether
   the goal is to enable distributed computational processes to be built
   in the network.

   With these different existing and possibly emerging platforms and
   execution environments and different ways to provision functions in
   the network, it does not seem useful to assume any particular
   platform and any particular "mobile code" representation as _the_
   "computing in the network" environment.  Instead, it seems more
   promising to reason about properties that are relevant with respect
   to distributed program semantics and protocols/interfaces that would
   be used to integrate functions on heterogeneous platforms into one
   application context.  We discuss these ideas and associated
   challenges in the following section.

4.  Research Challenges

   Conceiving computing in the network as a joint resource optimization
   problem as described above incurs a set of interesting, novel
   research challenges that are particularly relevant from an Internet
   Research perspective.

4.1.  Categorization of Different Use Cases for Computing in the Network

   There are different applications but also different configuration
   classes of Computing in the Network systems.  For example, a data
   processing pipeline might be different from a distributed application
   employing some stateful actor components.  It is worthwhile analyzing
   different typical use cases and identify commonalities (for example,
   fundamental protocol elements etc.) and differences.

4.2.  Networking and Remote-Method-Invocation Abstractions

   In distributed systems, there are different classes of functions that
   can be distinguished, for example:

   1.  Strictly stateless functions that do not keep any context state
       beyond their activation time

Kutscher, et al.         Expires January 9, 2020               [Page 10]
Internet-Draft   Directions for Computing in the Network       July 2019

   2.  Stateful functions/modules/programs that can be instantiated,
       invoked and eventually destroyed that do keep state over a series
       of function invocations

   Modern frameworks such as Ray are offering a clear separation of
   stateless functions and stateful actors and offer corresponding
   abstractions in their programming environment.  The aforementioned
   analysis of use cases should provide a diverse set of use cases for
   deriving a minimal yet sufficient set of function classes.

   Beyond this fundamental categorization of functions/actors, there is
   the question of interfaces and protocols mechanisms - as building
   blocks to utilize functions in programs.  For example, stateful
   functions are typically invoked through some Remote Method Invocation
   (RMI) protocol that identifies functions, allows for specifying/
   transferring parameters and function results etc.  Stateful actors
   could provide class-like interfaces that offer a set of functions
   (some of which might manipulate actor state).

   Another aspect is about identity (and naming) of functions and
   actors.  For actors that are typically used to achieve real-world
   effects or to enable multiple invocations of functions manipulating
   actor state over time, it is obvious that there needs to be a concept
   of specific instances.  Invoking an actor function would then require
   specifying some actor instance identifier.

   Stateless functions may be different: an invoking instance may be
   oblivious function identify and locus (on an execution platform) and
   might just want to leave it to the network to find the "best"
   instance or locus for a new instantiation.  Some fine-granular
   functions might just be instantiated for one invocation.  On the
   other hand, a function might be tied to a particular execution
   platform, for example an GPU-supported host system.  The naming and
   identity framework must allow for specifying such a function (or at
   least equivalence classes) accordingly.

   Stateful functions may share state within the same program context,
   i.e., across multiple invocations by the same application (as, e.g.,
   holds for web services that preserve context - locally or on the
   client side).  But stateful functions may also hold state across
   applications and possibly across different instantiations of a
   function on different compute nodes.  Such will require data
   synchronization mechanisms and the implementation of suitable data
   structure to achieve a certain degree of consistency.  The targeted
   degree of consistency may vary depending on the function and so may
   the mechanisms used to achieve the desired consistency.

Kutscher, et al.         Expires January 9, 2020               [Page 11]
Internet-Draft   Directions for Computing in the Network       July 2019

   Finally, execution platforms will require efficient resource
   management techniques to operate with different types of stateless
   and stateful functions and their associated resources, as well as for
   dynamically instantiated mobile code.  Besides the aforementioned
   location of suitable compute platforms and scheduling (possibly
   queuing) functions and function invocations, this also includes
   resource recovery ("garbage collection").

4.3.  Transport Abstractions

   When implementing Computing in the Network and building blocks such
   as function invocation it seems that IP packet processing is not the
   right abstraction.  First of all, carrying the context for some
   function invocation might require many IP packets - possibly
   something like Application Data Units (ADUs).  But even if such ADUs
   could be fit into network layer packets, other problems still need to
   be addressed, for example message formats, reliability mechanisms,
   flow and congestion control etc.

   It could be argued that today's distributed computing overlays solve
   that by using TCP and corresponding application layer formats (such
   as HTTP) - however this bears the question whether a fine-granular
   distributed computing system, aiming to leverage the network for
   certain tasks, is best served by a TCP/IP-based approach that entails
   issues such as

   o  need for additional resolution/mapping system to find IP addresses
      for functions;

   o  possible overhead for establishing TCP connections for fine-
      granular function invocation; and

   o  mismatch between TCP end-to-end semantics and the intention to
      defer next-hop selection etc. to the network.

   Moreover, some Computing in the Network applications such as Big Data
   processing (Hadoop-style etc.) can benefit significantly from data-
   oriented concepts such as

   o  in-network caching (of data objects that represent function
      parameters or results);

   o  reasoning about the tradeoffs between moving data to function vs.
      moving code to data assets; and

   o  sharing data (e.g., function results) between sets of consuming
      entities.

Kutscher, et al.         Expires January 9, 2020               [Page 12]
Internet-Draft   Directions for Computing in the Network       July 2019

   RMI systems such as RICE [RICE] [I-D.kutscher-icnrg-rice] enable
   Remote Method Invocation of ICN (data-oriented network/transport).
   Research questions include investigating how such approaches can be
   used to design general-purpose distributed computing systems.  More
   specifically, this would involve questions such as:

   o  What is the role of network elements in forwarding RMI requests?

   o  What visibility into load, performance and other properties should
      endpoints and the network have to make forwarding/offloading
      decisions?

   o  What is the notion of transport services in this concept and how
      intertwined is traditional transport with RMI invocation?

   o  What kind of feedback mechanisms would be desirable for supporting
      corresponding transport services?

4.4.  Programming Abstractions

   When creating SDKs and programming environments (as opposed to
   individual point solutions) questions arise such as:

   o  How to use concepts such as stateless functions, actor models and
      RMI in actual programs, i.e., what are minimal/ideal bindings or
      extensions to programming languages so that programmers can take
      advantage of Computing in the Network?

   o  Are there additional, potentially higher-layer, abstractions that
      are needed/useful, for example data set synchronization, data
      types for distributed computing such as CRDTs?

   In addition to programming languages, bindings, and data types, there
   is the question of execution environments and mobile code
   representation.  With the vast amount of different platforms (CPUs,
   GPUs, FPGAs etc.) it does not seem useful to assume exactly one
   environment.  Instead, interesting applications might actually
   benefit from running one particular function on a highly optimized
   platform but are agnostic with respect to platforms for other, less
   performance-critical functions.  Being able to support a
   heterogenous, evolving set of execution environments brings about
   questions such as:

   o  How to discover available platforms (and understand their
      properties)?

   o  How to specify application needs and map them to available
      platforms?

Kutscher, et al.         Expires January 9, 2020               [Page 13]
Internet-Draft   Directions for Computing in the Network       July 2019

   o  Can a certain function/application service be provided with
      different fidelity levels, e.g., can an application leverage a GPU
      platform if available and fall back to a reduced feature set in
      case such a platform is not available?

   In this context, updates and versioning could entail another
   dimension of variability for Computing in the Network:

   o  How to manage coexistence of multiple versions of functions and
      services, also for service routing and request forwarding?

   o  Is there potential for fallback and version negotiation if needed
      (considering the risk of "bidding downs" attacks?)

   o  How to retire old versions?

   o  How to securely and reliably deal with function updates and
      corresponding maintenance tasks?

4.5.  Security, Privacy, Trust Model

   Computing in the Network has interesting security-related challenges,
   including:

   o  How can a caller trust that a remove function works as expected?
      This entails several questions such as

      *  How to securely bind "function names" to actual function code?

      *  How to trust the execution platform (in its entirety)?

      *  How to trust the network that is forwards requests (and result
         messages) reliably and securely?

   o  What levels of authentication are needed for callers (assuming
      that not everybody can invoke any function)?

   o  How to authenticate and achieve confidentiality for requests,
      their parameters and result data (especially when considering
      sharing of results)?

   Many of these questions are related to other design decisions such as

   o  What kind of session concept do we assume, i.e., is there a
      concept of distributed application session that represents a trust
      domain for its members?

Kutscher, et al.         Expires January 9, 2020               [Page 14]
Internet-Draft   Directions for Computing in the Network       July 2019

   o  Where is trust anchored?  Can the system enable decentralized
      operation?

   All of these questions are not new, but conceiving networking and
   computing holistically seems to revisit distributed systems and
   network security - because some established concepts and technologies
   may not be directly applicable (such as transport layer security and
   corresponding web PKI).

4.6.  Failure Handling, Debugging, Management

   Distributed computing naturally provides different types of failures
   and exceptions.  In fine-granular distributed computing, some
   failures may by more tolerable (think microservices), i.e., platform
   crash or function abort due to isolated problems could be handled by
   just re-starting/re-running a particular function.  Similarly,
   "message loss" or incorrect routing information may be repairable by
   the system itself (after time).

   When failure cannot be repaired (or just tolerated) by the
   distributed computing framework, this raises questions such as:

   o  What are strategies for retrying vs aborting function invocation?

   o  How to signal exceptions and enable robust response to failures?

   Failure handling and debugging also has a management aspect that
   leads to questions such as:

   o  What monitoring and instrumentation interfaces are needed?

   o  How can we represent, visualize, and understand the (dynamically
      changing) properties of Computing in the Network infrastructure as
      well as of the currently running/instantiated entities?

5.  Acknowledgements

   The authors would like to thank Dave Oran, Michal Krol, Spyridon
   Mastorakis, Yiannis Psaras, and Eve Schooler for previous fruitful
   discussions on Computing in the Network topics.

6.  Informative References

   [ACTIVE]   Tennenhouse, D. and D. Wetherall, "Towards an active
              network architecture", ACM SIGCOMM Computer Communication
              Review Vol. 26, pp. 5-17, DOI 10.1145/231699.231701, April
              1996.

Kutscher, et al.         Expires January 9, 2020               [Page 15]
Internet-Draft   Directions for Computing in the Network       July 2019

   [CANARY]   Qu et al, H., "Canary -- A scheduling architecture for
              high performance cloud computing", 2016,
              <https://arxiv.org/abs/1602.01412>.

   [FLINK]    Katsifodimos, A. and S. Schelter, "Apache Flink: Stream
              Analytics at Scale", 2016 IEEE International Conference on
              Cloud Engineering Workshop (IC2EW),
              DOI 10.1109/ic2ew.2016.56, April 2016.

   [I-D.kutscher-icnrg-rice]
              Krol, M., Habak, K., Oran, D., Kutscher, D., and I.
              Psaras, "Remote Method Invocation in ICN", draft-kutscher-
              icnrg-rice-00 (work in progress), October 2018.

   [RAY]      Moritz et al, P., "Ray -- A Distributed Framework for
              Emerging AI Applications", 2018,
              <http://dl.acm.org/citation.cfm?id=3291168.3291210>.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015,
              <https://www.rfc-editor.org/info/rfc7665>.

   [RICE]     KrA^3l, M., Habak, K., Oran, D., Kutscher, D., and I.
              Psaras, "RICE", Proceedings of the 5th ACM Conference on
              Information-Centric Networking - ICN '18,
              DOI 10.1145/3267955.3267956, 2018.

   [SAPIO]    Sapio, A., Abdelaziz, I., Aldilaijan, A., Canini, M., and
              P. Kalnis, "In-Network Computation is a Dumb Idea Whose
              Time Has Come", Proceedings of the 16th ACM Workshop on
              Hot Topics in Networks - HotNets-XVI,
              DOI 10.1145/3152434.3152461, 2017.

   [SPARROW]  Ousterhout, K., Wendell, P., Zaharia, M., and I. Stoica,
              "Sparrow", Proceedings of the Twenty-Fourth ACM Symposium
              on Operating Systems Principles - SOSP '13,
              DOI 10.1145/2517349.2522716, 2013.

Authors' Addresses

   Dirk Kutscher
   University of Applied Sciences Emden/Leer
   Constantiaplatz 4
   Emden  D-26723
   Germany

   Email: ietf@dkutscher.net

Kutscher, et al.         Expires January 9, 2020               [Page 16]
Internet-Draft   Directions for Computing in the Network       July 2019

   Teemu Kaerkkaeinen
   Technical University Muenchen
   Boltzmannstrasse 3
   Munich
   Germany

   Email: kaerkkae@in.tum.de

   Joerg Ott
   Technical University Muenchen
   Boltzmannstrasse 3
   Munich
   Germany

   Email: jo@in.tum.de

Kutscher, et al.         Expires January 9, 2020               [Page 17]