This paper proposes a neural architecture search (NAS) method for split computing. Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems. In split computing, a neural network model is partitioned and cooperatively processed by an IoT device and an edge server over a network. The model architecture therefore significantly impacts the communication payload size, model accuracy, and computational load. In this paper, we address the challenge of optimizing the neural network architecture for split computing. To this end, we propose NASC, which jointly explores the optimal model architecture and split point to achieve higher accuracy while meeting a latency requirement, i.e., keeping the total latency of computation and communication below a given threshold. NASC employs a one-shot NAS, which avoids repeated model training and thus enables a computationally efficient architecture search. Our performance evaluation on the HW-NAS-Bench benchmark demonstrates that the proposed NASC can improve the trade-off between communication latency and model accuracy, i.e., reduce the latency by approximately 40-60% from the baseline, with only slight accuracy degradation.
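To make the latency constraint concrete, the following is a minimal sketch of the latency model implied above: the total latency of a split is the on-device computation for the layers before the split point, the time to transmit the intermediate payload, and the server-side computation for the remaining layers. All numbers, field names (`device_ms`, `out_kb`, etc.), and helper functions here are illustrative assumptions, not the paper's actual formulation, which additionally searches over the model architecture itself.

```python
def total_latency(layers, split, bandwidth_kb_per_ms):
    """Latency (ms) of running layers[:split] on-device and layers[split:]
    on the edge server, plus transmitting the intermediate payload."""
    device = sum(l["device_ms"] for l in layers[:split])
    server = sum(l["server_ms"] for l in layers[split:])
    # Payload is the output of the last on-device layer
    # (the raw input if everything runs on the server, i.e. split == 0).
    payload_kb = layers[split - 1]["out_kb"] if split > 0 else layers[0]["in_kb"]
    comm = payload_kb / bandwidth_kb_per_ms
    return device + comm + server

def best_split(layers, bandwidth_kb_per_ms, budget_ms):
    """Return the split point with the smallest total latency that meets
    the latency budget, or None if no split is feasible."""
    candidates = [(total_latency(layers, s, bandwidth_kb_per_ms), s)
                  for s in range(len(layers) + 1)]
    feasible = [(t, s) for t, s in candidates if t <= budget_ms]
    return min(feasible)[1] if feasible else None

# Toy three-layer network: deeper split points shrink the payload
# (reducing communication) at the cost of more on-device computation.
layers = [
    {"device_ms": 5.0,  "server_ms": 0.5, "in_kb": 150.0, "out_kb": 80.0},
    {"device_ms": 8.0,  "server_ms": 0.8, "out_kb": 20.0},
    {"device_ms": 12.0, "server_ms": 1.2, "out_kb": 4.0},
]
split = best_split(layers, bandwidth_kb_per_ms=2.0, budget_ms=40.0)
```

On this toy instance, splitting after the second layer beats both extremes (full offload pays for transmitting the large input; full on-device pays for all the computation), illustrating why the split point must be chosen jointly with the latency budget.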