Generative learning, which takes into account the full distribution of the data, is not feasible with deep neural networks (DNNs) because they model only the conditional distribution of the outputs given the inputs. Current solutions either rely on joint probability models that face difficult estimation problems or learn two separate networks, mapping inputs to outputs (recognition) and vice versa (generation). We propose an intermediate approach. First, we show that forward computation in DNNs with logistic sigmoid activations corresponds to a simplified approximate Bayesian inference in a directed probabilistic multi-layer model. This connection allows us to interpret a DNN as a probabilistic model of the output and of all hidden units given the input. Second, we propose that, in order for the recognition and generation networks to be more consistent with the joint model of the data, the weights of the recognition and generation networks should be related by transposition. We demonstrate in a preliminary experiment that such a coupled pair can be learned generatively, modelling the full distribution of the data, and has enough capacity to perform well at both recognition and generation.
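To illustrate the weight-transposition coupling described above, the sketch below shows a single sigmoid layer whose recognition and generation directions share one weight matrix, with the generator using its transpose, in the spirit of tied-weight autoencoders. The layer sizes, bias names, and initialization are hypothetical and not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Hypothetical layer sizes, for illustration only.
n_visible, n_hidden = 784, 256

# One weight matrix shared by both directions: the recognition
# network uses W, the generation network uses its transpose W.T.
W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
b_rec = np.zeros(n_hidden)   # recognition (hidden-unit) bias
b_gen = np.zeros(n_visible)  # generation (visible-unit) bias

def recognize(x):
    """Forward (recognition) pass: activation probabilities of hidden units given x."""
    return sigmoid(W @ x + b_rec)

def generate(h):
    """Backward (generation) pass: reconstruction of x from h using the transposed weights."""
    return sigmoid(W.T @ h + b_gen)

# Round trip: encode an input, then decode it with the coupled network.
x = rng.random(n_visible)
h = recognize(x)
x_reconstructed = generate(h)
```

Because the two directions share a single parameter matrix, updating the recognition weights immediately constrains the generator (and vice versa), which is one way to keep the pair consistent with a single joint model of the data.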