In this paper, we investigate a constrained formulation of neural networks in which the output is a convex function of the input. We show that the convexity constraints can be enforced on both fully connected and convolutional layers, making them applicable to most architectures. The constraints consist of restricting the weights (for all but the first layer) to be non-negative and using a non-decreasing convex activation function. Albeit simple, these constraints have profound implications for the generalization ability of the network. We draw three valuable insights: (a) Input Output Convex Neural Networks (IOC-NNs) self-regularize and reduce the problem of overfitting; (b) although heavily constrained, they outperform the base multi-layer perceptrons and achieve performance comparable to the base convolutional architectures; and (c) IOC-NNs are robust to noise in the training labels. We demonstrate the efficacy of the proposed idea through thorough experiments and ablation studies on standard image classification datasets with three different neural network architectures.
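To make the stated constraints concrete, the following is a minimal sketch (not the authors' code) of a fully connected network whose output is convex in its input: all layers except the first carry non-negative weights (here enforced through a softplus reparametrization, one of several possible choices), and the activation is convex and non-decreasing (ReLU). The class name `IOCMLP` and the layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IOCMLP(nn.Module):
    """Sketch of an input-output convex MLP: unconstrained first layer,
    non-negative weights afterwards, convex non-decreasing activations."""

    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.first = nn.Linear(in_dim, hidden_dim)              # first layer: weights unconstrained
        self.w2 = nn.Parameter(0.1 * torch.randn(hidden_dim, hidden_dim))
        self.b2 = nn.Parameter(torch.zeros(hidden_dim))
        self.w3 = nn.Parameter(0.1 * torch.randn(out_dim, hidden_dim))
        self.b3 = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x):
        h = F.relu(self.first(x))                                # convex, non-decreasing activation
        h = F.relu(F.linear(h, F.softplus(self.w2), self.b2))   # softplus keeps weights non-negative
        return F.linear(h, F.softplus(self.w3), self.b3)        # non-negative output layer
```

Convexity follows from standard composition rules: an affine map of the input is convex, a convex non-decreasing function of a convex function is convex, and a non-negative weighted sum of convex functions plus a bias is convex.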