The rat race between user-generated data and data-processing systems is currently won by data. The increased use of machine learning leads to a further increase in processing requirements, while data volume keeps growing. To win the race, machine learning needs to be applied to the data as it traverses the network. In-network classification of data can reduce the load on servers, reduce response time, and increase scalability. In this paper, we introduce IIsy, which implements machine learning classification models in a hybrid fashion using off-the-shelf network devices. IIsy targets three main challenges of in-network classification: (i) mapping classification models to network devices, (ii) extracting the required features, and (iii) addressing resource and functionality constraints. IIsy supports a range of traditional and ensemble machine learning models, scaling independently of the number of stages in a switch pipeline. Moreover, we demonstrate the use of IIsy for hybrid classification, where a small model is implemented on a switch and a large model at the backend, achieving near-optimal classification results while significantly reducing latency and load on the servers.