In this work, we present an alternative to conventional residual connections that is inspired by maxout networks. Instead of adding the two branches, as a residual connection does, our approach propagates only the maximum value or, in the leaky formulation, a percentage of both. In our evaluation on several public data sets, we show that the presented approaches perform comparably to residual connections and have other interesting properties, such as better generalization with a constant batch normalization, faster learning, and the ability to generalize without additional activation functions. In addition, the proposed approaches work very well when combined with residual networks in ensembles.
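To make the contrast with residual addition concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the element-wise list representation, and in particular the leaky weighting (`alpha` times the larger value plus `1 - alpha` times the smaller one) are assumptions chosen for illustration, since the abstract only says that the leaky formulation propagates a percentage of both values.

```python
from typing import List


def residual_connection(x: List[float], fx: List[float]) -> List[float]:
    """Conventional residual connection: element-wise sum of input and branch output."""
    return [a + b for a, b in zip(x, fx)]


def max_connection(x: List[float], fx: List[float]) -> List[float]:
    """Maximum propagation: keep only the larger of the two values per element."""
    return [max(a, b) for a, b in zip(x, fx)]


def leaky_max_connection(
    x: List[float], fx: List[float], alpha: float = 0.9
) -> List[float]:
    """Leaky variant (ASSUMED form, not taken from the paper):
    alpha * larger value + (1 - alpha) * smaller value, per element."""
    return [
        alpha * max(a, b) + (1.0 - alpha) * min(a, b)
        for a, b in zip(x, fx)
    ]


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5]    # hypothetical skip-path activations
    fx = [0.5, 3.0, 0.5]    # hypothetical branch outputs F(x)
    print(residual_connection(x, fx))
    print(max_connection(x, fx))
    print(leaky_max_connection(x, fx, alpha=0.5))
```

Under this assumed weighting, `alpha = 1.0` reduces the leaky variant to pure maximum propagation, while smaller values let more of the weaker path through.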