Minutes IETF117: coinrg
minutes-117-coinrg-01

Meeting Minutes Computing in the Network Research Group (coinrg) RG
Date and time 2023-07-25 16:30
Title Minutes IETF117: coinrg
State Active
Last updated 2023-08-19

Date: Tuesday, 25 July 2023 -- Session I
Time: 09:30 - 11:30 PT (San Francisco) -- 120 mins

Chairs: J/E/M
Jianfei (Jeffrey) He jefhe@foxmail.com
Eve Schooler eve.schooler@gmail.com
Marie-Jose Montpetit marie@mjmontpetit.com

Join via Meetecho:
https://meetings.conf.meetecho.com/ietf117/?session=30628
Materials:
https://datatracker.ietf.org/meeting/117/session/coinrg
Shared Notetaking: https://notes.ietf.org/notes-ietf-117-coinrg
Chat room: https://zulip.ietf.org/#narrow/stream/coinrg

Available post session:
Recording: http://www.meetecho.com/ietf116/recordings#COINRG

Notetakers: Ryo Yanagida

1. Chair Update (J/E/M) - 10 mins

  • Note Well
  • Call for Notetakers
  • IRTF Policy reminder
  • Agenda

2. Revisit of COIN with the advances of the last 4 years (30 mins)

(Marie-José Montpetit)

This talk is given from the perspective of a co-chair as well as someone
active in this area.
It tries to address the question: what really is compute in the network?

The past 4 years saw an uptake in programmable devices, more edge
compute, more AI, etc.
Acknowledgement: David Clark (MIT), Francois-Xavier Devailluy (MILA),
and Noa Zilberman (Oxford)

5G brought connectivity to 'everything' 'everywhere', and with it more
compute that uses this connectivity.
As the 'next G' arrives, we expect more autonomous networks, with compute
in the network serving that particular network itself.

Rise of IoT, with containers and orchestration used locally.
We are seeing the rise of the edge/cloud continuum, i.e. compute moves
around, and edge and cloud co-exist.

The primary goals: lower latency, higher reliability, and
sustainability.

Data is becoming more valuable than ever, but standardisation and
storage remain an issue because data systems are so heterogeneous.
The lack of a common format or standard remains a challenge.

Rise of AI for the Network:
While AI is helpful for the network, the network also helps AI.

Challenges:

  • Efficiency — is it more efficient to have AI on the edge? etc.
  • Data vs. physical models
  • Functional distribution

Digital Twins in Networking:
Twins of networked systems and nodes can enable testing without the
presence of the whole network: the twins are sent out and the DUT is
placed in an environment with the twin.

Movement towards 'de-cloudization', which comes from the IoT/edge area.
This is effectively a generalised container and enables 'a network of
neural networks'.

Opportunities for COIN/IRTF research:
AI/ML, AR and federated learning

Conclusion:
We've gone beyond just programmable networking devices, towards a whole
programmable framework for advanced IoT and data-driven network
automation. We want edge-cloud support for advanced network services and
applications, with AI/ML as an enabler.

Q&A (and comments)

David Oran: see zulip chat session.

Dean Bogdanovic: My background is in the embedded world, where a single
system handles everything rather than sending things back and forth. In
embedded systems, not everything has to be processed remotely or in a
central location; it can be processed just with neighbours. We currently
don't have a very good way of doing this. There is definitely interesting
work in distributed compute and connectivity issues.
Marie: I understand; there are also issues with sensors, format
incompatibility, etc. And yes, there is a lot to do. These are all
related to COIN, so this is the right place to bring this up.

Colin Perkins: Good review of the area; I think we need a good focus. It
would be good to check the vision is correct, and is achievable.
Marie: What's your suggestion for the future?
Colin: I'd encourage the chairs and the group to focus not only on the
drafts and things that are ready to be finished, but also on the vision
and directions.
Chairs: Agreed, we'll take the discussion offline.

3. Research topics (20 mins each)

3.1 Oakestra: A Lightweight Hierarchical Orchestration Framework for Edge Computing

(Nitinder Mohan, Technical University Munich (TUM))
https://www.usenix.org/conference/atc23/presentation/bartolomeo

  • COIN could mean many things; some think of programmable switches,
    others think of little computers spread around the network.

    • This focuses on distributed compute and edge compute, smart
      infrastructure building etc.
  • Edge computing could mean a lot, and edge devices are often highly
    heterogeneous:

    • They come in varying sizes
    • with varying processing capabilities
    • and varying connectivity
    • They are owned by various operators in various environments
    • and their virtualisation/runtime mechanisms are wildly different
  • Applications for edge compute are often cases where work needs to
    be offloaded, but not too 'far'

    • e.g. image recognition on vehicle
    • These require coordination of various resources like
      microservices and pipelines of these microservices
  • The microservices that need to be coordinated can be vastly
    complex.

  • On top of this, they need to exist somewhere to actually run. This
    poses challenges in orchestrating them.

    • Kubernetes family exists in cloud
  • Now, in the edge, everything is heterogeneous.

    • This presents challenges in management
    • Unlike in a data centre, nodes are everywhere, at varying distances
    • They are under the management of different operators
    • The resource limits are much tighter
    • Many Kubernetes-like mechanisms exist to manage these edge
      compute specific issues, but they are not quite 'there'
  • Oakestra aims to address, in orchestration, the edge compute
    specific requirements mentioned above

    • Oakestra consists of a root orchestrator and localised individual
      clusters
    • These clusters contain worker nodes

      • A worker node consists of a node engine, a network manager, and
        an execution runtime
      • The node engine manages the node; it keeps track of what is
        running, the condition of the node, etc.
      • The network manager looks after the comms; it creates and
        manages tunnels, namespaces, etc.
    • A Cluster can be composed of multiple workers

    • The cluster orchestrator for a particular cluster manages the
      workers in the cluster by:

      • aggregating the information
      • keeping track of tasks etc.
    • Individual clusters can be administered by different entities

      • These clusters provide aggregate reports to the root
        orchestrator, but they don't have to be aware of the other
        clusters
      • Root orchestrator can see what's going on just enough to see
        how to distribute workload, and manage the load of
        individual clusters (abstraction)
  • These components are described in JSON format

    • These specifications can be pushed to the root orchestrator

      • they are then issued to the clusters
    • The root orchestrator delegates tasks to a cluster based on its
      knowledge of the aggregate cluster information at the time

    • The respective cluster orchestrator then distributes tasks to an
      appropriate worker according to what that individual cluster
      knows about its workers
    • If that is not possible, the task can be re-allocated
  • Implementation:

    • Open source
    • Python and GoLang
    • Modular design (possible extensions include custom scheduling)
    • Lightweight
  • Performance:

    • Oakestra requires much smaller overhead
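
The two-level delegation described above (the root orchestrator chooses a
cluster from aggregate information, the cluster orchestrator chooses a
worker, and the task is re-allocated if placement fails) can be sketched
roughly as follows. This is only an illustration under assumptions: the
class names, the single "free CPU" metric, and the greedy policy are all
hypothetical and are not taken from Oakestra's actual API or scheduler.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Worker:
    name: str
    free_cpu: float  # hypothetical capacity metric (free CPU cores)


@dataclass
class Cluster:
    name: str
    workers: List[Worker] = field(default_factory=list)

    def aggregate(self) -> float:
        """Summary reported to the root: total free CPU, no per-worker detail."""
        return sum(w.free_cpu for w in self.workers)

    def place(self, cpu: float) -> Optional[str]:
        """Cluster-level step: pick the fitting worker with the most free CPU."""
        fitting = [w for w in self.workers if w.free_cpu >= cpu]
        if not fitting:
            return None  # this cluster cannot host the task
        chosen = max(fitting, key=lambda w: w.free_cpu)
        chosen.free_cpu -= cpu
        return chosen.name


def schedule(clusters: List[Cluster], cpu: float) -> Optional[Tuple[str, str]]:
    """Root-level step: try clusters by aggregate free CPU, re-allocating on failure."""
    for cluster in sorted(clusters, key=lambda c: c.aggregate(), reverse=True):
        worker = cluster.place(cpu)
        if worker is not None:
            return cluster.name, worker
    return None  # no cluster could place the task


if __name__ == "__main__":
    clusters = [
        Cluster("A", [Worker("a1", 1.0), Worker("a2", 1.0), Worker("a3", 1.0)]),
        Cluster("B", [Worker("b1", 2.5)]),
    ]
    # Cluster A reports the larger aggregate, but no single worker in A fits
    # a 2-core task, so the root falls back to cluster B.
    print(schedule(clusters, 2.0))
```

The example shows why re-allocation is needed at all: the root only sees
aggregates, so a cluster that looks best from above may still be unable to
place a task on any one worker.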

Q&A and Comments

Eve: How is MQTT used for inter/intra cluster communication?
Nitinder: Actually, inter-cluster communication is RESTful, as it is
low-rate
Diego Lopez: You showed the heterogeneous domain. Do you expect the
cluster to belong to one and only one root orchestrator?
Nitinder: It is designed so that a cluster can work for multiple root
orchestrators, as the cluster aggregates and provides the 'overview' of
itself and manages its own worker nodes
Diego: how much overhead is used for the overlay networking?
Nitinder: It is minimal; we use lightweight tunnelling
Eve: We should take the rest to the list

3.2. Attached Share-Nothing EdgeAI using SDN Pipelines: Explained by mobility geolocation network functions

(Sharon Barkai)

  • Cloud uses the network purely as a 'bus'
  • Network cloud purely thinks about computing in network elements
  • Things like CDN, Service mesh, MEC, and NFC are somewhere in between
    (CoIN Synergy)

    • Generative EdgeAI pipelines are an addition to this
  • Shared cloud is a big computer; it assumes a big shared data plane,
    co-located capacity, and statefulness

  • Share-nothing cloud is a network

  • NFV-SFC e.g.

    • These are a set of network functions that are implemented as
      processes
    • Dynamic pipeline
  • LISP's network-based mobility mechanism masks the actual
    geo-location IP, as it uses a logical identifier IP

  • GenAI:

    • For a given input language, an output is provided via pipelines
      of processing
    • No one model does everything; this leads to pipeline of models
    • Issue 1: it's slow — at least 100ms, not fast enough for any
      realtime vision
    • Issue 2: it is computationally heavyweight
  • Example: Semantic vCam

    • 'Camera over the city': a virtual camera that is actually fed by
      the vehicles moving around on the ground
    • Images are processed via pipelined AI models for segmentation,
      labelling, and localisation
    • A vehicle ahead of another can provide foresight to the following
      vehicle (this gives time for AI pipelines to process before the
      following vehicle arrives?)
    • The model itself is stateless
    • This could then be expanded to heterogeneous cameras and to
      vehicles in non-urban environments
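
The idea of a pipeline of stateless models (segmentation, labelling,
localisation) can be sketched as a chain of pure functions. This is a
minimal illustration only: the stage functions below are hypothetical
placeholders rather than real models, and the frame fields are invented for
the example. The last helper follows the notetaker's tentative reading of
the foresight point: the gap between two vehicles divided by their speed is
the time available for processing.

```python
from typing import Callable, Dict, List

Frame = Dict[str, object]
Stage = Callable[[Frame], Frame]


def segment(frame: Frame) -> Frame:
    # Placeholder for a segmentation model: returns a new frame, keeps no state.
    return {**frame, "segments": ["road", "vehicle"]}


def label(frame: Frame) -> Frame:
    # Placeholder for a labelling model working on the previous stage's output.
    return {**frame, "labels": [s.upper() for s in frame["segments"]]}


def localise(frame: Frame) -> Frame:
    # Placeholder for localisation: attaches the reporting vehicle's position.
    return {**frame, "position": frame["gps"]}


def run_pipeline(stages: List[Stage], frame: Frame) -> Frame:
    """Apply each stateless stage in order; the input frame is never mutated."""
    for stage in stages:
        frame = stage(frame)
    return frame


def foresight_budget_s(gap_m: float, speed_mps: float) -> float:
    """Time until the following vehicle reaches the point seen by the lead one."""
    return gap_m / speed_mps


if __name__ == "__main__":
    result = run_pipeline([segment, label, localise], {"gps": (1.0, 2.0)})
    print(result["labels"], result["position"])
    # A 50 m gap at 10 m/s leaves 5 s, well above the ~100 ms pipeline latency.
    print(foresight_budget_s(50.0, 10.0))
```

Because every stage returns a new frame and holds no state, stages can in
principle run as separate processes on different nodes.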

Q&A and Comments

Eve: You mentioned multiple PoCs; what toolkits are you using/reusing?
Sharon:
Eve: Do take a look at zulip chat session for a relevant conversation

4. RG drafts (10 mins)

(Jungha Hong, ETRI)

4.1 Use case analysis

draft-irtf-coinrg-use-case-analysis
https://datatracker.ietf.org/doc/html/draft-irtf-coinrg-use-case-analysis-01

  • Took over from Ike
  • Separating the analysis from the descriptions of use cases is
    valuable

    • This actually presents more general research direction for COIN
    • Proposes to change the title to ‘Research challenges of
      computing in the network’ or ‘Opportunities and Challenges of
      computing in the network’
    • Asks to suggest a better title if anyone has any
  • Current status:

    • The research questions are provided through a layered
      categorisation; the draft does not have requirements and
      opportunities
  • Plan:

    • Change the title
    • Widen the scope of the draft to cover more general research
      questions
    • Requirements may need to be removed from the document
    • Calls for more contributors; at least two more

4.2 Terminology

draft-irtf-coinrg-coin-terminology
https://datatracker.ietf.org/doc/html/draft-irtf-coinrg-coin-terminology-01

  • This was part of the use-case draft, as there were many
    COIN-specific terms

    • This adds a new term 'COIN Function'
  • The draft aims to:

  • Future plans:

    • Q: Should this be a living collection of terminology? (i.e.
      maintain the current version instead of publishing) Or should
      this be published now?

Q&A

Dean: It's good to have one definitive set, but terminology is a living
thing... The RG needs to be ready to produce updates without completely
writing a new one. I prefer to keep the use-case draft, and to keep the
research/opportunities draft separate from the use cases
Eve: There is a Usecase draft and usecase-analysis
Dean: Understood, I missed that
Colin: Having a common terminology document is useful. It makes more
sense if the group is trying to build a system. It can help in the
research if the group as a whole is trying to adopt the terms. Without
either of those happening, the usefulness can diminish. Is it getting
traction?
Jungha: This is to gain a common understanding within the group as we
discuss together
Eve: Agreed. The document should perhaps remain a draft, i.e. a living
thing, as discussion matures
Colin: Perhaps the RG should see whether usage of these terms gains
traction

5. Individual Drafts (10 mins)

5.1 An Evolution of Cooperating Layered Architecture for SDN (CLAS) for Compute and Data Awareness

(Luis M. Contreras, Telefónica)
https://datatracker.ietf.org/doc/draft-contreras-coinrg-clas-evolution/

  • Proposes a layered control architecture between transport and
    functions
  • The original design consisted of service and transport strata
  • Adds a connectivity stratum and a compute stratum in place of the
    transport stratum
  • Changes:

    • Renames the plane from learning plane to telemetry plane
    • Improved the content in the telemetry plane in terms of research
      discussions
    • Adds cloud-edge continuum and network-application integration
  • Next steps:

    • We need a better scope for the draft
    • We need feedback from the RG
    • Will continue to work towards 118 meeting

Q&A

Marie-Jose: Interesting work, but I wonder why this approach is any
different from another layered architecture that has similar elements?
Luis: The interplay between the different elements is the
distinguishing factor; this would be a good area to clarify.

Dean: Telemetry may not be the best term, as it's part of both the data
and control planes
Luis: It's been difficult to name that plane; the name describes a set
of data used to control and manage the system
Dean: Telemetry can mean different things, [so probably best to
clarify?]

Diego: Comment: We are working on a draft for QIRG, bringing the current
experiences in QKD to a generalized approach to quantum networks, and we
are planning to use this evolved CLAS concept

5.2 COIN Security

(Pascal Urien, Telecom Paris)
https://datatracker.ietf.org/doc/draft-urien-coin-sec-01
Note: Updated Identity section re single/multi tenants issue.

  • COIN naturally involves compute devices on the network
  • This raises a need for security

    • This includes both infrastructural security
    • And program security of the COIN System
  • Intrinsic COIN security

    • COIN comms should be fully encrypted
    • COIN might need a built-in billing mechanism
    • We'll need a key management system
      • if we need a centralised architecture, we'll need an
        Authentication Center (AC) that provides the KMS
  • COIN program security

    • We need the program itself to be secure;
      • programs could have varying security requirements
      • How do we measure the usage to calculate fees?
  • Identity

  • single/multiple tenant distributed asymmetric architecture.
  • single/multiple tenant centralised asymmetric architecture.
    • The AC provides credentials to PNDs; the user uses a ticket/token
      to use the PND
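
As a rough illustration of the centralised flow above (the AC provisions a
PND with a credential, and a user presents a ticket/token to that PND), the
sketch below uses an HMAC-signed token. This is an assumption made for the
example only: the minutes do not say which token format or key scheme the
draft uses, and the function names, the field layout, and the symmetric key
shared between AC and PND are all hypothetical.

```python
import hashlib
import hmac


def issue_token(key: bytes, user: str, pnd: str, expires_at: int) -> str:
    """AC side: sign (user, PND, expiry) with the key shared with that PND."""
    payload = f"{user}|{pnd}|{expires_at}"
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"


def verify_token(key: bytes, token: str, pnd: str, now: int) -> bool:
    """PND side: check the signature, the target PND, and the expiry time."""
    try:
        user, token_pnd, expires_at, tag = token.split("|")
        expiry = int(expires_at)
    except ValueError:
        return False  # malformed token
    payload = f"{user}|{token_pnd}|{expires_at}"
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    signature_ok = hmac.compare_digest(tag.encode(), expected.encode())
    return signature_ok and token_pnd == pnd and now < expiry


if __name__ == "__main__":
    shared_key = b"example-key-provisioned-by-the-AC"  # placeholder value
    token = issue_token(shared_key, "alice", "pnd-1", expires_at=1000)
    print(verify_token(shared_key, token, "pnd-1", now=999))   # valid
    print(verify_token(shared_key, token, "pnd-2", now=999))   # wrong PND
```

In this sketch the user never holds the key, only the token, which matches
the operator-centric view raised in the discussion that follows.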

Q&A

Marie-Jose: Your system doesn't seem to have anything coming from the
user other than the tokens. It seems like a very operator-centric view
of the system.
Pascal: This is for a multi-tenant system, and 'user' is a generic term
Marie-Jose: The draft probably needs to be updated with clearer
definitions of terms like 'user', and this should be discussed further on
the list.

6. RG topics (10 mins each)

Interim meeting?

IETF 118, Prague, Nov 4-10
Schedule the session so that it does not clash with the NFV-SDN
conference (Nov 7-9, 2023, Dresden, Germany)

Closing:

Marie-Jose: We will take Colin's feedback and plan something towards the
end of Sept. We'll try to avoid a clash with the NFV-SDN conference in
Dresden
Eve: Thanks everyone for joining from all the different timezones.

Total: 120 minutes