Variational autoencoders (VAEs) are susceptible to adversarial attacks. An adversary can find a small perturbation of an input sample that changes its latent encoding non-smoothly, thereby compromising the reconstruction. A known reason for this vulnerability is the distortion of the latent space that arises from a mismatch between the approximate latent posterior and the prior distribution. Consequently, a slight change in the input leads to a significant change in the latent encoding. This paper demonstrates that the sensitivity around a data point is due to a directional bias of a stochastic pullback metric tensor induced by the encoder network. The pullback metric tensor measures the infinitesimal volume change from input to latent space. Thus, it can be viewed as a lens to analyse how small changes in the input lead to distortions in the latent space. We propose robustness evaluation scores based on the eigenspectrum of the pullback metric. Moreover, we empirically show that the scores correlate with the robustness parameter $\beta$ of the $\beta$-VAE.
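As a rough illustration of the idea, the sketch below computes a pullback metric $G(x) = J(x)^\top J(x)$ for a toy encoder and inspects its eigenspectrum. It rests on several assumptions that are not taken from the paper: the encoder is a hypothetical one-hidden-layer tanh network with random weights, only the deterministic encoder mean is used (so the stochastic part of the metric is ignored), and the latent space is given the Euclidean metric. It does not reproduce the proposed robustness scores.

```python
import numpy as np

def encoder_mean(x, W1, b1, W2, b2):
    """Toy encoder mean: a one-hidden-layer tanh MLP (hypothetical stand-in)."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def encoder_jacobian(x, W1, b1, W2, b2):
    """Analytic Jacobian of the toy encoder mean, shape (latent_dim, input_dim)."""
    h = np.tanh(W1 @ x + b1)
    # Chain rule: d tanh(a)/da = 1 - tanh(a)^2, applied elementwise.
    return W2 @ (np.diag(1.0 - h**2) @ W1)

def pullback_metric(J):
    """Pull the Euclidean latent metric back to input space: G = J^T J."""
    return J.T @ J

# Random weights and a random input point, purely for illustration.
rng = np.random.default_rng(0)
input_dim, hidden_dim, latent_dim = 6, 16, 2
W1 = rng.normal(size=(hidden_dim, input_dim))
b1 = rng.normal(size=hidden_dim)
W2 = rng.normal(size=(latent_dim, hidden_dim))
b2 = rng.normal(size=latent_dim)
x = rng.normal(size=input_dim)

J = encoder_jacobian(x, W1, b1, W2, b2)
G = pullback_metric(J)

# G is symmetric positive semi-definite, so eigh applies;
# eigenvalues are returned in ascending order.
eigvals, eigvecs = np.linalg.eigh(G)

# Input direction along which a small perturbation moves the latent code most.
most_sensitive_direction = eigvecs[:, -1]
```

For a small perturbation $\epsilon v$ of the input, the squared latent displacement is approximately $\epsilon^2 v^\top G(x) v$, so a spectrum dominated by a few large eigenvalues indicates a few input directions in which the encoding is especially sensitive.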