Group convolution has been widely used to reduce the computation time of convolution, which dominates the training time of convolutional neural networks. However, it is well known that a large number of groups significantly degrades the performance of group convolution. In this paper, we propose a new convolution methodology called ``two-level'' group convolution that is robust to an increasing number of groups and suitable for multi-GPU parallel computation. We first observe that group convolution can be interpreted as a one-level block Jacobi approximation of standard convolution, a well-established notion in numerical analysis. In numerical analysis, two-level methods, which introduce an intergroup structure to resolve such performance degradation without disturbing parallel computation, have been studied extensively. Motivated by these methods, we introduce a coarse-level structure that promotes intergroup communication without becoming a bottleneck in the group convolution. We show that all the additional work induced by the coarse-level structure can be processed efficiently on a distributed memory system. We present numerical results verifying the robustness of the proposed method with respect to the number of groups. Moreover, we compare the proposed method with various approaches to group convolution to highlight its superiority in terms of execution time, memory efficiency, and performance.
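To make the two-level idea concrete, the following is a minimal PyTorch sketch: the fine level is an ordinary group convolution (the block Jacobi part, which keeps each group's channels isolated), and a cheap coarse-level term mixes information across groups. The specific coarse path used here (a pointwise $1\times 1$ convolution added to the group convolution output) is an illustrative assumption, not the paper's exact coarse-level construction.

\begin{verbatim}
# Hedged sketch of a two-level group convolution layer.
# The coarse branch (1x1 convolution across all channels) is a
# hypothetical stand-in for the paper's coarse-level structure.
import torch
import torch.nn as nn

class TwoLevelGroupConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size,
                 groups, padding=0):
        super().__init__()
        # Fine level: standard group convolution; each of the
        # `groups` blocks sees only its own slice of channels.
        self.fine = nn.Conv2d(in_channels, out_channels, kernel_size,
                              groups=groups, padding=padding)
        # Assumed coarse level: a pointwise convolution that couples
        # all groups; its cost is independent of the kernel size.
        self.coarse = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        # Two-level output = intragroup (fine) + intergroup (coarse) terms.
        return self.fine(x) + self.coarse(x)

if __name__ == "__main__":
    # Usage: 64 -> 64 channels, 3x3 kernel, 8 groups.
    layer = TwoLevelGroupConv2d(64, 64, 3, groups=8, padding=1)
    y = layer(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
\end{verbatim}

In a distributed setting, the fine branch partitions naturally across GPUs by group, while the small coarse branch is the only part requiring intergroup communication, which is the property the abstract emphasizes.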