Thursday, 18 March 2024, at 17:30-18:30 Australian Eastern Standard Time
Room: P2
Chair: Colin Perkins
Minutes: Colin Perkins
Recording: YouTube
Speaker: Colin Perkins, IRTF Chair
Colin Perkins introduced the meeting. He gave a reminder that the IRTF
is in the process of developing a code of conduct and solicited feedback on this from the group.
He also highlighted the open call for papers for the ACM/IRTF Applied
Networking Research Workshop (ANRW'24) with a deadline of 22 April 2024.
Speaker: Dongqi Han
Discussion:
Dongqi: The labelling is a key step between shift explanation and shift
adaptation. Important samples induce the shift and it's important to
filter the anomalies because the model is only trained with normal data.
In the experiments described, around 5000 flows needed to be labelled in
half a year. Log anomaly detection systems have more overhead and need
more labelling, but labelling each sample is simpler. SCADA systems
require significant effort.
Brian Trammell confirmed that the training data is zero positive, and
asked where the normal data was obtained from (i.e., how to ensure there
is no anomalous data in the training data).
Dongqi: the assumption is that the training data is anomaly free and
so the data will need to be checked.
Alexander Railean noted that one of the last slides showed an anomaly
being detected, where an FTP server update went wrong, and asked whether
the explanation was determined by the system or by a human.
Dongqi: the system identifies representative samples of the problematic
traffic.
Colin Perkins also asked about the robustness of the system to
inaccuracies in the labelling of the training data.
Dongqi: it's in the paper; the system is okay with small noise in the
data.
Hannes Tschofenig: can you say more about the experience with the SCADA
system, the similarity of the training traffic, and the amount of training
needed for this system?
Dongqi: This is a common problem for anomaly detection systems and the
algorithms can only work when normality is stable and anomalies are well
defined.
Colin Perkins noted that this meeting is co-located with the IETF, and
asked whether there is anything the IETF, IRTF, or operator community
could do to help facilitate this type of research, for example by making
test data available.
Colin Perkins thanked the speakers and reminded the group about the call
for papers for ANRW 2024.