Support for Machine Learning (ML) applications in networks has improved significantly over the last decade. The availability of public datasets and programmable switching fabrics (including low-level languages to program them) presents a full stack to the programmer for deploying in-network ML. However, the diversity of tools involved, coupled with the complex optimization tasks of ML model design and hyperparameter tuning while complying with network constraints (such as throughput and latency), puts the onus on the network operator to be an expert in ML, network design, and programmable hardware. This demand for multi-faceted tooling and expertise across ML and hardware is a roadblock to ML becoming mainstream in networks today. We present Homunculus, a high-level framework that enables network operators to specify their ML requirements in a declarative, rather than imperative, way. Homunculus takes as input the training data and accompanying network constraints, and automatically generates and installs a suitable model onto the underlying switching hardware. It performs model design-space exploration, training, and platform code generation as compiler stages, leaving network operators to focus on acquiring high-quality network data. Our evaluations on real-world ML applications show that Homunculus's generated models achieve up to a 12% higher F1 score than hand-tuned alternatives, while requiring only 30 lines of single-script code on average. We further demonstrate the performance of the generated models on emerging per-packet ML platforms to showcase Homunculus's timely and practical significance.
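To illustrate the declarative workflow described above, the following is a minimal, hypothetical sketch of what a single-script Homunculus specification might look like. The `homunculus` module, its class names, and its methods are illustrative assumptions made here for exposition, not the framework's actual API.

```python
# Hypothetical sketch of a declarative in-network ML specification.
# The `homunculus` package and every call below are assumed for
# illustration only; the real interface may differ.
import homunculus as hm

# Point the framework at labeled network data (e.g., per-flow features).
task = hm.Task(training_data="flows.csv", label="is_malicious")

# Declare target-platform constraints instead of a concrete model:
# the compiler explores the model design space subject to these bounds.
task.constraint(max_latency_us=1.0)       # per-packet latency budget
task.constraint(min_throughput_gbps=100)  # line-rate requirement
task.objective(metric="f1", goal="maximize")

# Design-space exploration, training, and platform code generation run
# as compiler stages inside compile(); the artifact targets the switch.
artifact = task.compile(target="programmable_switch")
artifact.deploy()
```

The point of the sketch is the division of labor the abstract claims: the operator states data, constraints, and an objective, and the compiler stages handle model selection, tuning, and hardware code generation.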