Traffic Refinery: Cost-Aware Data Representation for Machine Learning on Network Traffic

Network management often relies on machine learning to make predictions about performance and security from network traffic. Often, the representation of the traffic is as important as the choice of the model. The features that the model relies on, and the representation of those features, ultimately determine model accuracy, as well as where and whether the model can be deployed in practice. Thus, the design and evaluation of these models ultimately requires understanding not only model accuracy but also the systems costs associated with deploying the model in an operational network. Towards this goal, this paper develops a new framework and system that enables a joint evaluation of both the conventional notions of machine learning performance (e.g., model accuracy) and the systems-level costs of different representations of network traffic. We highlight these two dimensions for two practical network management tasks, video streaming quality inference and malware detection, to demonstrate the importance of exploring different representations to find the appropriate operating point. We demonstrate the benefit of exploring a range of representations of network traffic and present Traffic Refinery, a proof-of-concept implementation that both monitors network traffic at 10 Gbps and transforms traffic in real time to produce a variety of feature representations for machine learning. Traffic Refinery both highlights this design space and makes it possible to explore different representations for learning, balancing systems costs related to feature extraction and model training against model accuracy.
