In this paper, we propose an over-the-air (OTA)-based approach for distributed matrix-vector multiplication in the context of distributed machine learning (DML). With OTA computation, column-wise partitioning of a large matrix enables efficient workload distribution among workers (i.e., local computing nodes) according to their computing capabilities. Moreover, because no additional bandwidth is required, the system remains scalable as the number of workers is increased to mitigate the impact of slow workers, known as stragglers. Nevertheless, some workers may still experience deep fading and become stragglers, which prevents them from transmitting their results. By analyzing the mean squared error (MSE), we demonstrate that incorporating more workers in the OTA-based approach reduces the MSE without requiring additional radio resources. Furthermore, we introduce an analog coding scheme to further enhance performance and compare it with conventional coded multiplication (CM) schemes. Simulations show that the OTA-based approach achieves performance comparable to CM schemes while potentially requiring fewer radio resources.
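The column-wise partitioning idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`split_columns`, `ota_matvec`), the Rayleigh fading model, the truncation threshold, and the assumption of perfect channel inversion at each transmitting worker are all hypothetical choices made for illustration. Columns are split evenly here for simplicity; a capability-based split would only change the block sizes. The analog coding scheme is not modeled.

```python
import math
import random


def split_columns(num_cols, num_workers):
    """Assign contiguous column indices to workers as evenly as possible."""
    base, extra = divmod(num_cols, num_workers)
    blocks, start = [], 0
    for k in range(num_workers):
        size = base + (1 if k < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def ota_matvec(A, x, num_workers, noise_std=0.0, fade_threshold=0.0, seed=0):
    """Approximate A @ x by summing the workers' partial products "over the air".

    Assumptions (illustrative only): each worker pre-compensates its own
    channel, so the receiver observes the plain sum of the partial products
    plus Gaussian noise; a worker whose fading gain falls below
    fade_threshold stays silent and acts as a straggler.
    """
    rng = random.Random(seed)
    num_rows = len(A)
    received = [0.0] * num_rows
    sigma = math.sqrt(0.5)  # gives a Rayleigh gain with unit average power
    for cols in split_columns(len(x), num_workers):
        gain = math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        if gain < fade_threshold:
            continue  # deep fade: this worker's partial product is lost
        # Worker computes A[:, cols] @ x[cols]; superposition adds it in.
        for i in range(num_rows):
            received[i] += sum(A[i][j] * x[j] for j in cols)
    # Receiver noise is added once, regardless of the number of workers.
    return [r + rng.gauss(0.0, noise_std) for r in received]
```

With `noise_std=0.0` and `fade_threshold=0.0`, the function returns the exact product, since the partial products over disjoint column blocks sum to `A @ x`. Raising `fade_threshold` drops workers and shows how a lost column block distorts the result.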