Dropout has recently emerged as a powerful and simple method for training neural networks, preventing co-adaptation by stochastically omitting neurons. However, dropout is not grounded in explicit modelling assumptions, which has so far precluded its adoption in Bayesian modelling. Using Bayesian entropic reasoning, we show that dropout can be interpreted as optimal inference under constraints. We demonstrate this on an analytically tractable regression model, providing a Bayesian interpretation of its mechanism for regularizing and preventing co-adaptation, as well as of its connection to other Bayesian techniques. We also discuss two approximate techniques for applying Bayesian dropout to general models, one based on an analytical approximation and the other on stochastic variational techniques. These techniques are then applied to a Bayesian logistic regression problem and are shown to improve performance as the model becomes more misspecified. Our framework establishes dropout as a theoretically justified and practical tool for statistical modelling, allowing Bayesians to tap into the benefits of dropout training.
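As a minimal illustration of the dropout mechanism the abstract refers to (stochastic omission of neurons), the sketch below applies standard inverted dropout with a Bernoulli keep-mask. It is a hedged toy example, not the paper's Bayesian formulation; the function name, keep probability `p`, and NumPy setup are assumptions for illustration only.

```python
import numpy as np

def dropout_forward(h, p=0.5, rng=None):
    """Apply inverted dropout to a vector of activations h.

    Each unit is kept independently with probability p; surviving
    activations are rescaled by 1/p so the layer's expected output
    is unchanged. At test time the layer is used without masking.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(h.shape) < p  # Bernoulli(p) keep-mask
    return h * mask / p

# Example: stochastically omit units of a hidden layer during training.
h = np.array([0.2, -1.3, 0.7, 2.1])
print(dropout_forward(h, p=0.5, rng=np.random.default_rng(0)))
```

In this view each training pass samples a random sub-network, which is what discourages co-adaptation between units; the paper's contribution is to recast this sampling as optimal inference under constraints rather than as a heuristic.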