Feed-forward neural networks (NNs) are a staple machine learning method, widely used in many areas of science and technology. Although even a single-hidden-layer NN is a universal approximator, its expressive power is limited in practice by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow the use of fewer neurons and layers, thereby reducing computational cost and improving expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. We also introduce an approach that avoids non-linear fitting of the neural network parameters. The resulting method combines the robustness of linear regression with the higher expressive power of an NN. We demonstrate the approach by fitting the potential energy surface of the water molecule. Without requiring any non-linear optimization, the additive-GPR-based approach outperforms a conventional NN in the high-accuracy regime, where a conventional NN suffers more from overfitting.
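The idea of per-neuron activation functions obtained without non-linear fitting can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the inner weights `W` are fixed in advance (hand-picked here), that each neuron's shape is modeled with a one-dimensional squared-exponential kernel, and that a toy two-dimensional target stands in for the water potential energy surface. All function names and hyperparameters (`length`, `noise`) are hypothetical.

```python
import numpy as np

def rbf_1d(a, b, length):
    """Squared-exponential kernel between two 1-D coordinate arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def fit_additive_gpr(X, y, W, length=1.0, noise=1e-6):
    """Fit f(x) = sum_n f_n(w_n . x) using a sum of 1-D kernels.

    W is held fixed (an assumption of this sketch), so the only
    fitting step is the solution of one linear system.
    """
    Z = X @ W.T  # neuron inputs, shape (n_samples, n_neurons)
    K = sum(rbf_1d(Z[:, n], Z[:, n], length) for n in range(W.shape[0]))
    alpha = np.linalg.solve(K + noise * np.eye(len(y)), y)
    return Z, alpha

def neuron_outputs(X_new, W, Z_train, alpha, length=1.0):
    """Evaluate each neuron's learned activation f_n at new points."""
    Z_new = X_new @ W.T
    cols = [rbf_1d(Z_new[:, n], Z_train[:, n], length) @ alpha
            for n in range(W.shape[0])]
    return np.stack(cols, axis=1)  # shape (n_new, n_neurons)

def predict(X_new, W, Z_train, alpha, length=1.0):
    """Network output: the sum of the individual neuron outputs."""
    return neuron_outputs(X_new, W, Z_train, alpha, length).sum(axis=1)

# Toy demonstration on a target that is additive in the chosen directions.
def target(X):
    return np.sin(X[:, 0] + X[:, 1]) + 0.5 * (X[:, 0] - X[:, 1]) ** 2

rng = np.random.default_rng(0)
W = np.array([[1.0, 1.0], [1.0, -1.0], [1.0, 0.0]])  # fixed, hypothetical weights
X_train = rng.uniform(-1.0, 1.0, size=(200, 2))
X_test = rng.uniform(-1.0, 1.0, size=(100, 2))

Z_train, alpha = fit_additive_gpr(X_train, target(X_train), W)
H = neuron_outputs(X_test, W, Z_train, alpha)
y_pred = predict(X_test, W, Z_train, alpha)
rmse = float(np.sqrt(np.mean((y_pred - target(X_test)) ** 2)))
print("test RMSE:", rmse)
```

Each column of `H` is one neuron's activation function evaluated on the test points, and the shapes of these functions differ from neuron to neuron because each is built from its own one-dimensional kernel. Since `W` never changes, the fit reduces to a single call to `np.linalg.solve`, which is the sense in which the sketch avoids non-linear optimization.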