TY - JOUR
T1 - MARINA: Realizing ML-driven Real-time Network Traffic Monitoring at Terabit Scale
AU - Seufert, Michael
AU - Dietz, Katharina
AU - Wehner, Nikolas
AU - Geißler, Stefan
AU - Schüler, Joshua
AU - Wolz, Manuel
AU - Hotho, Andreas
AU - Casas-Hernandez, Pedro
AU - Hoßfeld, Tobias
AU - Feldmann, Anja
PY - 2024/3/27
Y1 - 2024/3/27
N2 - Network operators require real-time traffic monitoring insights to provide high performance and security to their customers. It has been shown that artificial intelligence and machine learning (ML) can improve the visibility of telemetry systems, especially with encrypted traffic. However, current solutions cannot cope with high traffic rates and volumes in large-scale networks. To realize the ML-driven network intelligence paradigm at terabit scale, we design Marina, a system that spreads monitoring over a highly efficient data plane, which can extract traffic statistics at line rate, and a powerful ML server, which can run monitoring inference using complex ML models. We apply temporal micro-aggregation into sub-second time slots and extract moment-based statistics. These make it possible to flexibly obtain accurate ML-based monitoring decisions during the next time slot. To demonstrate the scalability of our design, we implement and evaluate a Marina data plane prototype on a Barefoot Wedge 100BF-65X P4 switch, which can monitor more than 520,000 concurrent flows at the full switching capacity of 6.4 Tbps. We validate the analytics capabilities enabled by our Marina implementation for four ML-driven real-time monitoring tasks with a broad set of standard ML models, achieving results comparable to or better than the state of the art.
AB - Network operators require real-time traffic monitoring insights to provide high performance and security to their customers. It has been shown that artificial intelligence and machine learning (ML) can improve the visibility of telemetry systems, especially with encrypted traffic. However, current solutions cannot cope with high traffic rates and volumes in large-scale networks. To realize the ML-driven network intelligence paradigm at terabit scale, we design Marina, a system that spreads monitoring over a highly efficient data plane, which can extract traffic statistics at line rate, and a powerful ML server, which can run monitoring inference using complex ML models. We apply temporal micro-aggregation into sub-second time slots and extract moment-based statistics. These make it possible to flexibly obtain accurate ML-based monitoring decisions during the next time slot. To demonstrate the scalability of our design, we implement and evaluate a Marina data plane prototype on a Barefoot Wedge 100BF-65X P4 switch, which can monitor more than 520,000 concurrent flows at the full switching capacity of 6.4 Tbps. We validate the analytics capabilities enabled by our Marina implementation for four ML-driven real-time monitoring tasks with a broad set of standard ML models, achieving results comparable to or better than the state of the art.
KW - Network Traffic Monitoring
KW - Programmable Data Planes
KW - in-Network Machine Learning
U2 - 10.1109/TNSM.2024.3382393
DO - 10.1109/TNSM.2024.3382393
M3 - Article
SN - 1932-4537
VL - 21
SP - 2773
EP - 2790
JO - IEEE Transactions on Network and Service Management
JF - IEEE Transactions on Network and Service Management
IS - 3
ER -