IRTF Open Meeting

Monday, 25 July 2022, at 15:00 - 17:00 US/Eastern
Room: Liberty D

Chair: Colin Perkins
Minutes: Mat Ford

The main focus of the IRTF Open Meeting will be the Applied Networking
Research Prize (ANRP) award talks given by Tushar Swamy and Sam Kumar.

The ANRP is awarded to recognise the best recent results in applied
networking, interesting new research ideas of potential relevance to the
Internet standards community, and upcoming people that are likely to
have an impact on Internet standards and technologies, with a particular
focus on cases where these people or ideas would not otherwise get much
exposure or be able to participate in the discussion.

Introduction and Status Update

IRTF Chair
Slides: https://datatracker.ietf.org/meeting/114/materials/slides-114-irtfopen-irtf-open-meeting-agenda-for-ietf-114-02

Taurus: A Data Plane Architecture for Per-Packet ML

Tushar Swamy
Paper: https://dl.acm.org/doi/10.1145/3503222.3507726
Slides: https://datatracker.ietf.org/meeting/114/materials/slides-114-irtfopen-taurus-a-data-plane-architecture-for-per-packet-ml-00

Q&A
Barry Leiba (channeling Dave Oran): I assume the class of anomalies you
can detect are those that can be detected by the header fields within
the width of the ALU in the switch. Things in the packet fields beyond
the headers won't be seen, is that correct?

TS: In the case of anomaly detection we used the NSL-KDD dataset, which
has a record of different attacks computed from header fields, or
aggregate fields across headers. So you can create a histogram using the
match-action tables across packets. Packet headers are limited by the
packet header vector size that moves between stages in the switch
pipeline. But you don't need to be limited to fields in the header,
because the control plane can install different types of metadata in the
match-action tables, and you can do your own processing in the
match-action tables over time. So the headers are just the starting
point for the features here.
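The idea of combining per-packet header fields with control-plane-maintained aggregate state can be sketched roughly as follows; the flow key, field names, and histogram buckets here are illustrative assumptions, not details from the talk or paper:

```python
# Sketch: per-flow aggregate state (a histogram of packet sizes)
# augments raw header fields to form the feature vector for a packet.
# Field names and bucket boundaries are hypothetical examples.
from collections import defaultdict

flow_hist = defaultdict(lambda: [0] * 4)  # 4 size buckets per flow

def bucket(size):
    # 0-511, 512-1023, 1024-1535, >=1536 bytes
    return min(size // 512, 3)

def features(pkt):
    key = (pkt["src"], pkt["dst"])
    flow_hist[key][bucket(pkt["len"])] += 1
    # per-packet header fields plus the flow's running histogram
    return [pkt["len"], pkt["ttl"]] + flow_hist[key]

f = features({"src": "10.0.0.1", "dst": "10.0.0.2", "len": 600, "ttl": 64})
```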

George Michaelson: First comment - I suspect the paper is very important
for interpreting that last table. It's hard to understand the meaning of
the columns and their impact on a comparison to the baseline; there's a
lot of implicit knowledge in your table structure. I'm sure the paper
explains it.

Second point - at the start of your talk, you made the case that the
delay between taking a packet sample, constructing table match rules in
the controller, injecting those rules down to the functional plane, and
applying them creates a huge packet loss and mismatch interval. But it
seems to me the delay to perform the ML operation, tune your ML, get a
model that is representative, and then install it has a similar cost.
That's not to say there's no benefit to ML - I think it's huge - but the
component that's about the delay cost of instantiating rules isn't, I
think, a basis for doing it. I think you're on stronger ground arguing
that it's about the ability to do complex matching at line rate rather
than the static cost of rule installation.

TS: You're right about installing the model itself. The idea is that you
could be sampling packets from your network and be sending different
kinds of metadata to the control plane. Essentially doing training
offline - install model weights or replace model weights as needed.
Whatever is operating in the data plane has nothing to do with the model
installation itself.

GM: I thought the idea that you could do the model training
asynchronously is very beneficial. But if you consider a new class of
attack - you have to understand it and do some form of Bayesian analysis
and classification which is completely unmodelled - exactly how you do
that training, how long it takes isn't about the speed of the chipset,
it's about your ability to do the good/bad classification a priori to
inform the model and then download it, that's quite a high cost in time.

TS: Yes, so this is always the trouble with security - if you want to do
an on-the-fly analysis of a brand new attack, that's not really what
we're targeting.

GM: In engineering terms your case that this is extremely fast at line
rate was very well made, and I enjoyed listening to it a lot, thank you.

Keshamu?: Excellent work. Doing machine learning in the data plane will
consume more energy, but we are trying to reduce the energy consumption
of routers and switches - have you looked at this issue?

TS: I think energy consumption needs to be looked at more holistically.
While you are increasing by some small percentage the energy that you'd
be consuming in the switch itself you can consider that (if you're doing
anomaly detection) you're removing the cost of running an anomaly
detection application in software on a server somewhere else. You're
consuming less power in the switch than you would running it in software
elsewhere, so on the whole you're reducing power costs, but for the
switch itself you'd be increasing it minimally.

CP: This is an IRTF meeting which is colocated with the IETF and
obviously the question is then to what extent have you given any thought
to how this might change or affect the type of work the IETF does. Are
there any implications for these types of systems for the way we design
standards or other types of protocols that the IETF designs or is this
just an optimisation that fits with the existing architecture?

TS: One of the things that Keshamu, who asked a question earlier,
brought to my attention was what kind of standardisation is needed for
packet headers if we're going to be using them as features or carrying
model weights, doing these kinds of ML-assist operations.
making a cleaner definition of what has to happen at the packet
standardisation level to support this machine learning, to make it
easier for different types of ML systems to interoperate.

CP: That makes a lot of sense. Presumably there's also something in
terms of the control plane and standardised programming model for that
to specify the model, is that right?

TS: As a complement to P4 we went with map-reduce, though we're not
necessarily married to the idea of using a map-reduce block; the bigger
idea is doing inference in the data plane. Standardisation of P4 would
help, but for the map-reduce element you could even consider an extra
control block in P4 as map-reduce, and we have another paper in
submission on what language-level constructs would look like. So it's
definitely an area for standardisation as well.
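The map-reduce abstraction discussed here can be illustrated with a toy example: neural-network inference decomposes into a map stage (per-input multiplies) and a reduce stage (summation) per neuron. The weights and layer sizes below are arbitrary, made up purely for illustration:

```python
# Toy illustration of expressing inference as map/reduce stages,
# the abstraction discussed for data-plane ML. Weights are arbitrary.

def neuron(inputs, weights):
    products = map(lambda xw: xw[0] * xw[1], zip(inputs, weights))  # map stage
    total = sum(products)                                           # reduce stage
    return max(total, 0)  # ReLU activation

def layer(inputs, weight_matrix):
    # one neuron per row of the weight matrix
    return [neuron(inputs, w) for w in weight_matrix]

out = layer([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]])
```

In a real pipeline each map and reduce stage would be mapped onto hardware compute units rather than Python functions; the point is only the shape of the computation.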

Performant TCP for Low-Power Wireless Networks

Sam Kumar
Paper: https://www.usenix.org/conference/nsdi20/presentation/kumar
Slides: https://datatracker.ietf.org/meeting/114/materials/slides-114-irtfopen-performant-tcp-for-low-power-wireless-networks-01

Q&A

Matthias Wählisch: One of the co-founders of RIOT. Great work, thank
you. You argued that supporting TCP is important because it's popular.
Now QUIC is becoming popular - did you do any comparison?

SK: We didn't do a comparison against QUIC. But it's a good point that
other transports are becoming popular. Many of the issues that we
addressed aren't specific to TCP; they apply broadly to TCP and other
protocols needed for bulk transfer. For example, dealing with hidden
terminals and playing well with link-layer scheduling apply to any
protocol that's transmitting a lot of data and wants a lot of bandwidth.
So many of our conclusions apply equally well to QUIC as they do to TCP.

MW: In your paper you note that you also have an implementation for
RIOT - do you also plan to submit a PR to upstream the implementation?

SK: We had plans at some point, but RIOT OS adopted a different TCP
stack so contributing another one seemed redundant. Recently we have
contributed our code to OpenThread, which now uses it as its default TCP
stack.

MW: I highly encourage you to submit a PR.

MW: You said that a packet is lost when a fragment is lost. That depends
on the fragmentation scheme, right? With fragment recovery it doesn't
matter too much for the whole packet if a fragment is lost.

SK: My understanding of the way 6lowpan was implemented in the operating
systems we looked at was indeed that if a fragment is lost you lose the
whole packet. But I agree there are protocols you can use to recover a
lost fragment without losing the entire packet. That could also help
with making the entire packet bigger and amortising the TCP/IP headers
even better.

Tommy Pauly: Thanks for the talk. I'm happy to see the use of TCP here.
You were talking about memory savings and the ability to have a flat
buffer. Are you able to guarantee that you'll never need to allocate
memory, or is it just most of the time, with a failover case where you
do have to allocate?

SK: We ensure that you never have to dynamically allocate memory. We
have a bitmap that tracks which bytes contain out-of-order data. The
bitmap can be allocated statically because its size depends on the array
size, which is also static.
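The static-allocation scheme described here can be sketched as follows; the buffer size and byte granularity are illustrative assumptions, not values from the paper:

```python
# Sketch: a statically sized out-of-order receive buffer plus a bitmap
# marking which bytes hold valid data. Both are fixed at init time, so
# nothing is allocated per-packet. Sizes are arbitrary examples.
BUF_SIZE = 64
buf = bytearray(BUF_SIZE)          # fixed flat buffer
bitmap = bytearray(BUF_SIZE // 8)  # one bit per buffer byte, also fixed

def store(offset, data):
    # copy an out-of-order segment into place and mark its bytes
    for i, b in enumerate(data):
        pos = offset + i
        buf[pos] = b
        bitmap[pos // 8] |= 1 << (pos % 8)

def have(pos):
    # has this byte position been filled yet?
    return bool(bitmap[pos // 8] & (1 << (pos % 8)))

store(8, b"hello")  # segment landing ahead of the in-order point
```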

TP: Using this to get to Internet hosts - in your tests you were testing
against end-to-end Internet connections - do you need to modify the
Internet servers? Is there something that needs tuning on the Internet
hosts to ensure that they are friendly to the 6lowpan devices, or can
you use completely unmodified Internet hosts?

SK: Hosts on the Linux side were completely unmodified. The timing that
we adjusted for the randomised delay was not at the TCP level but at the
link layer, so the other side doesn't see any of that. That's also why
we used the full-scale TCP stack from FreeBSD - it's battle-tested in
the real world and interoperable with other TCP stacks on the Internet.
There are interoperability problems in the embedded space - many
embedded TCP stacks have subtle interoperability problems - and we
sidestepped this issue by using the FreeBSD TCP implementation as the
basis of our study.

Thomas Schmidt: From the RIOT community, thank you for using RIOT -
another encouragement to use GNRC: you have a generic packet buffer
there that you could reuse to reduce your memory overhead even further -
that's just a comment. Question about your multi-hop experiments -
avoiding the hidden terminal problem. Was that in a clean environment
without cross traffic, with only a single TCP connection?

SK: The hidden terminal problem affects even a single TCP connection in
isolation; we verified that our randomised backoff fixes the problem in
that case.

TS: Yes, but only in this case right? You normally have background
traffic right?

SK: If there is background traffic, that's why we have randomised delays
and not fixed delays. It doesn't matter if interference is coming from
the same stream or a different stream - you'll back off a randomised
amount and hopefully transmit again without colliding. There are
protocols we could have used that look at TCP state in some way, but
having it just be a randomised delay at the link layer gives some
confidence that it would work across streams regardless of the source of
the traffic, whether it's TCP, different TCP streams, or something else.
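The randomised link-layer backoff being described could look roughly like this; the delay bounds and retry count are illustrative assumptions, not the values used in the paper:

```python
# Sketch: after a failed link-layer transmission, wait a uniformly
# random delay before retrying, so two hidden terminals are unlikely
# to collide repeatedly. The bound is an arbitrary example value.
import random

def backoff_delays(attempts, max_delay_ms=50.0, seed=None):
    # one independently drawn delay per retransmission attempt
    rng = random.Random(seed)
    return [rng.uniform(0.0, max_delay_ms) for _ in range(attempts)]

delays = backoff_delays(3, seed=1)
```

Because each delay is drawn independently, interference from any source (the same TCP stream, another stream, or non-TCP traffic) is handled the same way, which matches the argument above.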

TS: Did you consider experimenting with more flexible MAC layers than
CSMA-CA for instance the DSME MAC layer that is also supported by RIOT?

SK: No, we didn't experiment with that - we looked at CSMA because it
was most commonly supported across all the operating systems and
networking protocols that we tried, across TinyOS, RIOT, and OpenThread,
so it seemed most natural to focus on that.

Dave Oran: In the multi-hop environment, the forwarding devices are also
very low-power devices. Did you see that TCP traffic would put more
stress on the buffers of the forwarding multi-hop wireless nodes?

SK: The buffers used at intermediate routers aren't TCP buffers, just
general packet buffers used for forwarding - the TCP connection is
end-to-end, so there's no TCP state in intermediate nodes.

DO: Sure, but TCP may put a different aggregate load on those buffers
than say CoAP traffic or something that's more simple request/response
related.

SK: Of course it's the case that when you're transmitting at higher
bandwidth you'll place more stress on the buffers at the intermediate
routers. We do a couple of things in our study to help mitigate that. We
added AQM to intermediate routers, where we mark packets as congested
using ECN to prevent TCP from filling up the entire buffer and to keep
queues short. The primary reason we did this was to improve fairness for
TCP flows competing for buffer space at intermediate routers and to
reduce the latency of traffic. It has the side effect of limiting the
amount of buffer space a single TCP flow can take up.
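A minimal sketch of the AQM behaviour described: when the forwarding queue exceeds a threshold, ECN-capable packets are marked rather than letting the queue fill. The queue limit and marking threshold below are arbitrary illustrative values:

```python
# Sketch: threshold-based ECN marking at a forwarding queue.
# QUEUE_LIMIT and MARK_THRESHOLD are made-up example values.
from collections import deque

QUEUE_LIMIT = 8
MARK_THRESHOLD = 4
queue = deque()

def enqueue(pkt):
    if len(queue) >= QUEUE_LIMIT:
        return "drop"                           # hard limit: tail drop
    if len(queue) >= MARK_THRESHOLD and pkt.get("ecn_capable"):
        pkt["ce"] = True                        # mark Congestion Experienced
    queue.append(pkt)
    return "queued"

results = [enqueue({"seq": i, "ecn_capable": True}) for i in range(10)]
```

Marking well before the hard limit is what keeps standing queues short and stops one flow from monopolising the buffer.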

DO: Thanks I was looking for the AQM angle.

CP: I like the idea of headers spanning multiple link-layer frames. Does
this put any constraints on the link layer or does the 6lowpan layer
handle all of that?

SK: Some of these can be handled at the 6lowpan layer; others have to do
with the link layer directly. For example, the randomised delays added
to avoid hidden terminals operate at the link layer. We don't naturally
have visibility into link-layer ACKs at the 6lowpan layer.

CP: Are there requirements that the link layer deliver frames in order
to avoid damaging the headers, or does 6lowpan handle that?

SK: Reordering and reassembly are handled by 6lowpan - there is no
strict requirement to transmit frames consecutively. One thing I skipped
was a technique we have to manage concurrent frames - how to schedule
frames when some are going to wall-powered devices and some to
battery-powered devices. We prioritise sending frames to battery-powered
devices to reduce their duty cycle and allow them to sleep as soon as
possible. This is one case where we might interrupt another transmission
and not send its frames consecutively.
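The scheduling preference described here can be sketched as a simple selection rule; the frame representation and the power-classification field are hypothetical, invented for illustration:

```python
# Sketch: when choosing the next frame to transmit, prefer frames
# destined for battery-powered neighbours so sleepy devices can
# return to sleep sooner. The dict fields are hypothetical.

def next_frame(pending):
    # battery-powered destinations first; otherwise FIFO order
    battery = [f for f in pending if f["dst_battery_powered"]]
    return (battery or pending)[0] if pending else None

pending = [
    {"id": 1, "dst_battery_powered": False},
    {"id": 2, "dst_battery_powered": True},
]
frame = next_frame(pending)
```

This is also the point where one in-progress transmission may be interrupted, as noted above, so frames of a single packet are not guaranteed to go out consecutively.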

Gabriel Montenegro: Comment on the comparison with CoAP - I think the
specification for CoAP was not entirely based on not using TCP, but more
on not using HTTP. The justification was using a RESTful interface at
the application layer - not every application in IoT wishes to do that,
but there's a lot of incentive. When the RESTful folks started to become
interested in IoT, the only alternative was HTTP/1.1, which I agree is
terrible - text-based, can't be compressed, verbose. HTTP/2 came along
and we had an ANRW paper about using that over 6lowpan. Now we have
HTTP/3 and QUIC and it's all binary, so I'd encourage looking at those
layers, as they would address a significant portion of the
application-layer incentives for IoT.

SK: I do acknowledge that CoAP has evolved quite a bit - some of that
evolution took place after we published this work. Indeed, I think that
CoAP is useful and has its uses. I have noticed that CoAP is evolving
towards the same kind of abstractions that TCP provides. An application
built on CoAP with all the latest features would also be wise to
consider using TCP directly, given that TCP is a viable option.

Tushmin Selzanek?: You talked mainly about applications for this in LANs
- do you see applications for longer range networks like mobile adhoc
networks or things of that sort?

SK: All of our experimentation was done using IEEE 802.15.4, a personal
area network protocol, motivated by recent interest in adopting that
technology in the smart home and IoT space. Some of these lessons might
carry over to mobile and ad hoc networks - I'm not sure I'll be able to
say much, as I don't have experience with those networks, but my first
gut reaction would be that there's probably a way to make TCP work well,
given that it's been adapted to work with so many different kinds of
networks in so many different environments. Other than that, I'm not
sure which of the specific techniques would directly carry over.

CP: Thank you very much. Thanks to both speakers - two really great
talks. Both Sam and Tushar will be around all week and happy to talk
about their work - please make them welcome to the IETF and the IRTF.
Congratulations to both on the ANRP award. Look out for more ANRP award
talks at IETF 115 in London in November. Nominations for the 2023 ANRP
awards will open in September, so if you know of good work please
nominate it. Also look out for ANRW, which is taking place tomorrow.
Thanks again everyone.

Close

Recordings of the talks, and links to the papers, will be made available
from https://irtf.org/anrp/