Convergence bounds are one of the main tools for obtaining information on the performance of a distributed machine learning task before running the task itself. In this work, we perform a set of experiments to assess to what extent, and in which ways, such bounds can predict and improve the performance of real-world distributed (namely, federated) learning tasks. We find that, as could be expected given how they are derived, the bounds are quite loose and their relative magnitude reflects the training rather than the testing loss. More unexpectedly, we find that some of the quantities appearing in the bounds turn out to be very useful for identifying the clients that are most likely to contribute to the learning process, without requiring the disclosure of any information about the quality or size of their datasets. This suggests that further research is warranted on the ways -- often counter-intuitive -- in which convergence bounds can be exploited to improve the performance of real-world distributed learning tasks.