Unsupervised learning, which requires only raw data, is not only a fundamental function of the cerebral cortex but also a foundation for the next generation of artificial neural networks. However, a unified theoretical framework that treats sensory inputs, synapses, and neural activity together is still lacking. The computational obstacle originates from the discrete nature of synapses and the complex interactions among these three essential elements of learning. Here, we propose a variational mean-field theory in which only the distribution of synaptic weights is considered. Unsupervised learning can then be decomposed into two interwoven steps: a maximization step, carried out as gradient ascent on a lower bound of the data log-likelihood, and an expectation step, carried out as a message-passing procedure on an equivalent or dual neural network whose parameters are specified by the variational parameters of the weight distribution. Our framework therefore explains how data (or sensory inputs), synapses, and neural activities interact with each other to achieve the goal of extracting statistical regularities from sensory inputs. This variational framework is verified on restricted Boltzmann machines, both with planted synaptic weights and in learning handwritten digits.
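The two interwoven steps described above can be illustrated with a highly simplified sketch. This is not the authors' algorithm: it uses a restricted Boltzmann machine in which the variational means `m` of the synaptic-weight distribution are treated directly as the trainable parameters, a single mean-field sweep stands in for the full message-passing procedure of the expectation step, and a contrastive-divergence-style gradient stands in for the exact gradient of the lower bound. All variable names and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data: 8 visible units; the first four are perfectly
# correlated, giving a statistical regularity for the machine to extract.
n_v, n_h, n_data = 8, 4, 200
data = (rng.random((n_data, n_v)) < 0.3).astype(float)
data[:, :4] = data[:, [0]]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Variational parameters: the mean m of each synaptic weight's distribution.
m = 0.01 * rng.standard_normal((n_v, n_h))

lr = 0.05
for epoch in range(50):
    # Expectation step: a mean-field pass on the "dual" network whose
    # couplings are the variational means m (stand-in for message passing).
    h_pos = sigmoid(data @ m)        # data-clamped hidden means
    v_neg = sigmoid(h_pos @ m.T)     # one mean-field reconstruction sweep
    h_neg = sigmoid(v_neg @ m)
    # Maximization step: gradient ascent on the (approximate) lower bound.
    grad = (data.T @ h_pos - v_neg.T @ h_neg) / n_data
    m += lr * grad
```

In the paper's setting the weights are discrete, so `m` would parameterize a distribution over discrete synapses rather than being real-valued weights itself; the sketch only conveys how the two steps alternate.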