We propose HyperVAE, a framework for encoding distributions of distributions. When each target distribution is modeled by a VAE, its neural-network parameters \theta are drawn from a distribution p(\theta), which is in turn modeled by a hyper-level VAE. We propose a variational inference scheme using Gaussian mixture models to implicitly encode the parameters \theta into a low-dimensional Gaussian distribution. Given a target distribution, we predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(\theta). In contrast to common hypernetwork practice, which generates only scale and bias vectors as target-network parameters, HyperVAE encodes the parameters \theta in full, and thus preserves much more information about the model for each task in the latent space. We analyze HyperVAE through the minimum description length (MDL) principle and show that this perspective explains how HyperVAE generalizes. We evaluate HyperVAE on density estimation, outlier detection, and discovery of novel design classes, demonstrating its efficacy.