A major driver behind the success of modern machine learning algorithms has been their ability to process ever-larger amounts of data. As a result, distributed systems have become increasingly prevalent in both research and production as a means of scaling to this growing volume of data. At the same time, however, distributing the learning process can drastically complicate the implementation of even simple algorithms. This is especially problematic because many machine learning practitioners are not well-versed in the design of distributed systems, let alone those with complicated communication topologies. In this work we introduce Launchpad, a programming model tailored to a machine learning audience that simplifies the process of defining and launching distributed systems. We describe our framework, its design philosophy, and its implementation, and give a number of examples of common learning algorithms whose designs are greatly simplified by this approach.