Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano). However, existing toolkits (both static and dynamic) require that the developer organize the computations into the batches necessary for exploiting high-performance algorithms and hardware. This batching task is generally difficult, but it becomes a major hurdle as architectures become complex. In this paper, we present an algorithm, and its implementation in the DyNet toolkit, for automatically batching operations. Developers simply write minibatch computations as aggregations of single-instance computations, and the batching algorithm seamlessly executes them, on the fly, using computationally efficient batched operations. On a variety of tasks, we obtain throughput similar to that obtained with manual batching, as well as comparable speedups over single-instance learning on architectures that are impractical to batch manually.
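The core idea (written independently of any single-instance computation graph) can be illustrated with a minimal sketch: operations whose dependencies are already computed and whose "signatures" (operation type plus relevant shapes) match are grouped and executed as one batched call. This is a simplified illustration, not DyNet's actual implementation; the function name `batch_schedule` and the dictionary-based graph encoding are assumptions for the example.

```python
from collections import defaultdict

def batch_schedule(ops, deps):
    """Sketch of signature-based automatic batching.

    ops:  {op_id: signature} -- signature stands in for op type + shapes.
    deps: {op_id: set of prerequisite op_ids}.
    Returns a list of batches; each batch is a list of op_ids that could
    be executed together as a single batched operation.
    """
    remaining = set(ops)
    done = set()
    schedule = []
    while remaining:
        # Ops whose prerequisites have all been computed are "ready".
        # Iterate in ops' insertion order so the result is deterministic.
        ready = [o for o in ops
                 if o in remaining and deps.get(o, set()) <= done]
        # Group ready ops by signature: each group becomes one batched call.
        groups = defaultdict(list)
        for o in ready:
            groups[ops[o]].append(o)
        for group in groups.values():
            schedule.append(sorted(group))
            done.update(group)
            remaining.difference_update(group)
    return schedule

# Two independent instances, each a lookup followed by an affine transform:
ops = {"x1": "lookup", "x2": "lookup", "h1": "affine", "h2": "affine"}
deps = {"h1": {"x1"}, "h2": {"x2"}}
print(batch_schedule(ops, deps))  # [['x1', 'x2'], ['h1', 'h2']]
```

The developer only describes the per-instance graphs (the `deps` above); the scheduler discovers that the two lookups, and then the two affine transforms, can each run as a single batched operation.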