Machine learning (ML) is a key technique for big-data-driven modelling and analysis in massive Internet of Things (IoT) based intelligent and ubiquitous computing. With fast-growing applications and data volumes, distributed learning is a promising emerging paradigm, since it is often impractical or inefficient to share/aggregate data from distinct locations at a centralized one. This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices and the learning algorithm runs on-device, with the aim of relieving the burden on a central entity/server. Although gossip-based approaches have been used for this purpose in different use cases, they suffer from high communication costs, especially when the number of devices is large. To mitigate this, incremental-based methods are proposed. We first introduce incremental block-coordinate descent (I-BCD) for decentralized ML, which reduces communication costs at the expense of running time. To accelerate convergence, an asynchronous parallel incremental BCD (API-BCD) method is then proposed, in which multiple devices/agents are active in an asynchronous fashion. We derive convergence properties for both proposed methods. Simulation results show that our API-BCD method outperforms the state of the art in terms of running time and communication costs.
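To illustrate the incremental idea behind I-BCD, the following is a minimal toy sketch, not the paper's algorithm: a model estimate visits the devices one at a time, and the single active device takes a gradient step on its own coordinate block using only its local data, so per-round communication is limited to handing the estimate to the next device. The separable least-squares losses, block partition, step size, and all variable names here are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, block = 4, 3  # 4 devices, each owning a 3-dimensional block of the model

# Hypothetical local losses f_i(x_i) = 0.5 * ||A_i x_i - b_i||^2 (separable toy problem)
A = [rng.standard_normal((8, block)) for _ in range(n_agents)]
b = [rng.standard_normal(8) for _ in range(n_agents)]

def global_loss(blocks):
    """Sum of the devices' local least-squares losses."""
    return sum(0.5 * np.linalg.norm(A[i] @ blocks[i] - b[i]) ** 2
               for i in range(n_agents))

blocks = [np.zeros(block) for _ in range(n_agents)]
loss_start = global_loss(blocks)

step = 0.02
for _ in range(300):                # incremental passes: the estimate travels a ring of devices
    for i in range(n_agents):       # exactly one device is active at a time
        grad = A[i].T @ (A[i] @ blocks[i] - b[i])  # local gradient w.r.t. device i's block
        blocks[i] -= step * grad                   # gradient step on that block only

loss_end = global_loss(blocks)
```

After the passes, `loss_end` is far below `loss_start`: each device drives its own block toward its local least-squares fit, and in this separable toy that also minimizes the global objective. The asynchronous API-BCD variant described above would instead let several devices perform such block updates concurrently, without waiting for the ring token.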