Traffic Refinery: Cost-Aware Data Representation for Machine Learning on Network Traffic

Cited by: 9
Authors
Bronzino, Francesco [1]
Schmitt, Paul [2]
Ayoubi, Sara [3]
Kim, Hyojoon [4]
Teixeira, Renata [5]
Feamster, Nick [6]
Affiliations
[1] Univ Savoie Mt Blanc, LISTIC, Annecy Le Vieux, France
[2] USC Informat Sci Inst, Los Angeles, CA USA
[3] Nokia Bell Labs, Paris Saclay, France
[4] Princeton Univ, Princeton, NJ 08544 USA
[5] Inria, Paris, France
[6] Univ Chicago, Chicago, IL 60637 USA
Keywords
network systems; network traffic; QoS inference; malware detection
DOI
10.1145/3491052
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Network management often relies on machine learning to make predictions about performance and security from network traffic. Often, the representation of the traffic is as important as the choice of the model. The features that the model relies on, and the representation of those features, ultimately determine model accuracy, as well as where and whether the model can be deployed in practice. Thus, the design and evaluation of these models ultimately requires understanding not only model accuracy but also the systems costs associated with deploying the model in an operational network. Towards this goal, this paper develops a new framework and system that enables a joint evaluation of both the conventional notions of machine learning performance (e.g., model accuracy) and the systems-level costs of different representations of network traffic. We highlight these two dimensions for two practical network management tasks, video streaming quality inference and malware detection, to demonstrate the importance of exploring different representations to find the appropriate operating point. We demonstrate the benefit of exploring a range of representations of network traffic and present Traffic Refinery, a proof-of-concept implementation that both monitors network traffic at 10 Gbps and transforms traffic in real time to produce a variety of feature representations for machine learning. Traffic Refinery both highlights this design space and makes it possible to explore different representations for learning, balancing systems costs related to feature extraction and model training against model accuracy.
Pages: 24
Related papers
50 in total
  • [21] A cost-aware parallel workload allocation approach based on machine learning techniques
    Department of Computer Science, Jinan University, Guangzhou 510632, China
    Unknown
    Unknown
    Lect. Notes Comput. Sci., (506-515):
  • [22] Supervised Representation Learning for Network Traffic With Cluster Compression
    Wang, Xiaojuan
    Zhang, Yu
    He, Mingshu
    Guo, Shize
    Yang, Liu
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2024, 9 (01): : 1 - 13
  • [23] Learning Invariant Representation for Malicious Network Traffic Detection
    Bartos, Karel
    Sofka, Michal
    Franc, Vojtech
    ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 1132 - 1139
  • [24] The Learning and Prediction of Network Traffic: A Revisiting to Sparse Representation
    Wang, Yitu
    Nakachi, Takayuki
    ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020,
  • [25] Applying of Machine Learning for Analyzing Network Traffic in the Conditions of an Unbalanced Data Sample
    Rzayev, Babyr
    Lebedev, Ilya
    INTELLIGENT DISTRIBUTED COMPUTING XIV, 2022, 1026 : 69 - 78
  • [26] Traffic Aware Virtual Machine Packing in Cloud Data Centers
    Liu, Jun
    Guo, Jinhua
    Ma, Di
    2016 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY), IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC), AND IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2016, : 256 - 261
  • [27] Adaptive learning on mobile network traffic data
    Liu, Zhen
    Japkowicz, Nathalie
    Wang, Ruoyu
    Tang, Deyu
    CONNECTION SCIENCE, 2019, 31 (02) : 185 - 214
  • [28] Investigation of Machine Learning Based Network Traffic Classification
    Fan, Zhong
    Liu, Ran
    2017 INTERNATIONAL SYMPOSIUM ON WIRELESS COMMUNICATION SYSTEMS (ISWCS), 2017, : 1 - 6
  • [29] Using Machine Learning to Analyze Network Traffic Anomalies
    Khudoyarova, Anastasia
    Burlakov, Mikhail
    Kupriyashin, Mikhail
    PROCEEDINGS OF THE 2021 IEEE CONFERENCE OF RUSSIAN YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING (ELCONRUS), 2021, : 2344 - 2348
  • [30] Machine learning based network traffic classification: a survey
    Shen, Y. (shenyi_1979@njau.edu.cn), 2012, Binary Information Press, Flat F 8th Floor, Block 3, Tanner Garden, 18 Tanner Road, Hong Kong (09):