One of the most fundamental design choices in neural networks is layer width: it affects how much a network can learn and determines the complexity of the learned solution. This latter property is often exploited when introducing information bottlenecks, forcing a network to learn compressed representations. However, such an architectural decision is typically immutable once training begins; switching to a more compressed architecture requires retraining. In this paper we present a new layer design, called Triangular Dropout, which does not have this limitation. After training, the layer can be arbitrarily reduced in width to exchange performance for narrowness. We demonstrate the construction and potential use cases of such a mechanism in three areas. First, we describe the formulation of Triangular Dropout in autoencoders, creating models with selectable compression after training. Second, we add Triangular Dropout to VGG19 on ImageNet, creating a powerful network whose parameter count can be significantly reduced without retraining. Lastly, we explore the application of Triangular Dropout to reinforcement learning (RL) policies on selected control problems.
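To make the core idea concrete, the sketch below shows one plausible way such a width-reducible layer could be realized; it is an illustrative assumption, not the paper's exact formulation. The class name TriangularDropoutSketch and the uniform sampling of the cutoff width are hypothetical choices: during training, each example sees only a random prefix of the layer's units, so every prefix is trained to function on its own, and after training the attribute `width` can be lowered to shrink the layer without retraining.

```python
import torch
import torch.nn as nn

class TriangularDropoutSketch(nn.Module):
    """Illustrative width-reducible activation mask (assumed formulation).

    During training, a cutoff w is sampled per example and all units beyond
    the first w are zeroed, so each prefix of the layer learns to work alone.
    At evaluation time, `width` can be set to any value <= features to trade
    performance for narrowness.
    """

    def __init__(self, features: int):
        super().__init__()
        self.features = features
        self.width = features  # evaluation-time width, adjustable after training

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Sample a cutoff uniformly from {1, ..., features} for each example.
            w = torch.randint(1, self.features + 1, (x.shape[0], 1), device=x.device)
        else:
            # Use the chosen post-training width for every example.
            w = torch.full((x.shape[0], 1), self.width, device=x.device)
        idx = torch.arange(self.features, device=x.device).unsqueeze(0)
        mask = (idx < w).to(x.dtype)  # 1 for kept units, 0 for truncated ones
        return x * mask
```

Used after the bottleneck layer of an autoencoder, for example, this would let the same trained model be evaluated at any bottleneck width, matching the selectable-compression behavior described above.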