Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference, with important applications such as masked language modeling and image inpainting. In recent years, the family of Any-Order Autoregressive Models (AO-ARMs) -- which includes popular models such as XLNet -- has shown breakthrough performance on arbitrary conditional tasks across a sweeping range of domains. However, despite their success, we identify significant improvements to be made to previous formulations of AO-ARMs. First, we show that AO-ARMs suffer from redundancy in their probabilistic model, i.e., they define the same distribution in multiple different ways. We alleviate this redundancy by training on a smaller set of univariate conditionals that still maintains support for efficient arbitrary conditional inference. Second, we upweight the training loss for univariate conditionals that are evaluated more frequently during inference. Our method leads to improved performance with no compromises on tractability, achieving state-of-the-art likelihoods in arbitrary conditional modeling on text (Text8), image (CIFAR10, ImageNet32), and continuous tabular data domains.
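To make the described training setup concrete, the following is a minimal sketch of a generic any-order masked training step. It is an illustrative assumption throughout, not the paper's method: the function name `ao_arm_training_step`, the `model(x_masked, mask)` interface, the noise-ranking masking trick, and the uniform 1/t weighting are all stand-ins; the paper's contribution replaces the uniform weighting with one that upweights conditionals queried more often at inference.

```python
import torch
import torch.nn.functional as F

def ao_arm_training_step(model, x, vocab_size):
    # x: (B, D) LongTensor of discrete variables.
    # `model` is a hypothetical network taking (x_masked, mask) and
    # returning (B, D, vocab_size) logits for every position; it stands
    # in for an AO-ARM architecture and is NOT the paper's exact API.
    B, D = x.shape
    # Sample, per example, how many variables to mask (i.e., predict).
    t = torch.randint(1, D + 1, (B, 1))
    # Draw a uniformly random subset of exactly t positions per row by
    # ranking random noise -- a standard any-order masking trick.
    ranks = torch.rand(B, D).argsort(dim=1).argsort(dim=1)
    mask = ranks < t  # True = masked, to be predicted
    # Replace masked positions with an extra [MASK] token id; the
    # embedding table is assumed to have vocab_size + 1 entries.
    x_masked = x.masked_fill(mask, vocab_size)
    logits = model(x_masked, mask)  # (B, D, vocab_size)
    nll = F.cross_entropy(logits.transpose(1, 2), x, reduction="none")  # (B, D)
    # Average the NLL over masked positions only. The 1/t factor is one
    # common unbiased estimator of the any-order likelihood bound; the
    # paper's proposed reweighting would modify this uniform scheme.
    per_example = (nll * mask.float()).sum(dim=1) / t.squeeze(1).float()
    return per_example.mean()
```

Sampling the mask cardinality first and then a uniformly random subset of that size is what makes every univariate conditional reachable during training, which is the property the abstract's reweighting argument operates on.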