Ubiquitous artificial intelligence (AI) is considered one of the key services in 6G systems. AI services typically rely on deep neural networks (DNNs), which require heavy computation. Hence, to support ubiquitous AI, it is crucial to offload or distribute the computational burden of DNNs, especially for end devices with limited resources. We develop a framework that assigns the computation tasks of DNN inference jobs to nodes with computing resources in the network, so as to reduce inference latency in the presence of limited computing power at end devices. To this end, we propose a layered graph model that reduces the problem of assigning the computation tasks of a single DNN inference job to a simple conventional routing problem. Using this model, we develop algorithms for routing DNN inference jobs over the distributed computing network. We show through numerical evaluations that our algorithms adaptively select nodes and paths according to the computational attributes of given DNN inference jobs, thereby reducing the end-to-end latency.
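To make the layered-graph reduction concrete, the following Python sketch (illustrative only; the node names, cost parameters, and exact edge weights are our assumptions, not the paper's specification) builds the layered graph implicitly and runs Dijkstra's algorithm. A vertex (v, k) means network node v holds the output of the first k DNN layers; a compute edge keeps the tensor at v while executing layer k+1, and a transfer edge ships the stage-k tensor across a physical link. A shortest path from (src, 0) to (dst, K) then jointly encodes the route and the per-layer node assignment.

```python
import heapq

def min_latency_route(compute_speed, link_delay, layer_cost, data_size, src, dst):
    """Shortest path on the layered graph (illustrative sketch).

    Vertex (v, k): node v holds the output of DNN layer k (k = 0 is the raw input).
    Edge weights:
      - compute edge (v, k) -> (v, k+1): layer_cost[k] / compute_speed[v]
      - transfer edge (v, k) -> (u, k): data_size[k] * link_delay[(v, u)]
    layer_cost has K entries (cost of each layer); data_size has K+1 entries
    (tensor size at each stage). Returns (total latency, path of (node, stage)).
    """
    K = len(layer_cost)
    start, goal = (src, 0), (dst, K)
    dist, parent = {start: 0.0}, {}
    pq = [(0.0, start)]
    while pq:
        d, state = heapq.heappop(pq)
        if state == goal:
            # reconstruct the joint route / layer-assignment path
            path = [state]
            while state in parent:
                state = parent[state]
                path.append(state)
            return d, path[::-1]
        if d > dist[state]:
            continue  # stale queue entry
        v, k = state
        edges = []
        if k < K and compute_speed.get(v, 0) > 0:
            # node v executes DNN layer k+1 locally
            edges.append(((v, k + 1), layer_cost[k] / compute_speed[v]))
        for (a, b), delay in link_delay.items():
            if a == v:
                # forward the stage-k intermediate tensor over link (v, b)
                edges.append(((b, k), data_size[k] * delay))
        for nxt, w in edges:
            if d + w < dist.get(nxt, float("inf")):
                dist[nxt] = d + w
                parent[nxt] = state
                heapq.heappush(pq, (d + w, nxt))
    return float("inf"), []
```

For instance, with compute_speed = {"dev": 1, "edge": 20, "cloud": 200} and per-link delays, the returned path indicates where each layer should run: a cheap early layer may stay on the device while heavier layers migrate toward the edge or cloud, which is exactly the adaptive node and path selection described above.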