We develop a novel method for carrying out model selection for Bayesian autoencoders (BAEs) by means of prior hyper-parameter optimization. Inspired by the common practice of type-II maximum likelihood optimization and its equivalence to Kullback-Leibler divergence minimization, we propose to optimize the distributional sliced-Wasserstein distance (DSWD) between the output of the autoencoder and the empirical data distribution. The advantages of this formulation are that the DSWD can be estimated from samples and scales to high-dimensional problems. We carry out posterior estimation of the BAE parameters via stochastic gradient Hamiltonian Monte Carlo and turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space. As a result, we obtain a powerful alternative to variational autoencoders, which are the preferred choice in modern applications of autoencoders for representation learning with uncertainty. We evaluate our approach qualitatively and quantitatively through an extensive experimental campaign on a range of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results, outperforming multiple competitive baselines.
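To make the sample-based estimation concrete, the following is a minimal sketch of the plain sliced-Wasserstein estimator between two empirical distributions: project both sample sets onto random directions, sort the 1-D projections, and average the resulting one-dimensional Wasserstein costs. Note this is the uniform-slicing variant; the DSWD used in the paper additionally optimizes the distribution over slicing directions, which is not shown here. All function and variable names are illustrative, not from the paper's code.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, p=2, rng=None):
    """Monte Carlo estimate of the sliced p-Wasserstein distance between
    two empirical distributions given as sample matrices X, Y of shape
    (n, d). Assumes equal sample sizes for simplicity."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Draw random unit directions uniformly on the (d-1)-sphere.
    theta = rng.standard_normal((n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto each direction.
    Xp = X @ theta.T  # shape (n, n_proj)
    Yp = Y @ theta.T
    # The 1-D Wasserstein-p distance between empirical measures is the
    # L^p distance between sorted samples; average over all slices.
    Xp.sort(axis=0)
    Yp.sort(axis=0)
    return np.mean(np.abs(Xp - Yp) ** p) ** (1.0 / p)
```

In the model-selection setting described above, `X` would hold decoder outputs (samples pushed through the BAE) and `Y` a minibatch of training data; the estimator is differentiable almost everywhere, so it can drive gradient-based prior hyper-parameter optimization.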