Learning the tail behavior of a distribution is a notoriously difficult problem. By definition, the number of samples from the tail is small, and deep generative models, such as normalizing flows, tend to concentrate on learning the body of the distribution. In this paper, we focus on improving the ability of normalizing flows to correctly capture the tail behavior and, thus, to form more accurate models. We prove that the marginal tailedness of an autoregressive flow can be controlled via the tailedness of the marginals of its base distribution. This theoretical insight leads us to a novel type of flow based on flexible base distributions and data-driven linear layers. An empirical analysis shows that the proposed method improves accuracy, especially on the tails of the distribution, and is able to generate heavy-tailed data. We demonstrate its application on a weather and climate example, in which capturing the tail behavior is essential.
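The claim that marginal tailedness is inherited from the base distribution can be illustrated with a toy sketch. This is not the proposed method: it assumes the simplest possible transform, an elementwise affine map with hypothetical `shift` and `scale` values, and a hypothetical `tail_ratio` statistic (an upper quantile of the absolute deviation from the median divided by a central one) as a crude, shift- and scale-invariant proxy for tailedness. The base has one Gaussian marginal and one Student-t marginal.

```python
import numpy as np


def tail_ratio(x, hi=0.999, lo=0.75):
    """Crude tailedness proxy (an assumption of this sketch, not from the paper).

    Ratio of a far quantile to a central quantile of |x - median(x)|.
    It is invariant to shifting and rescaling x.
    """
    dev = np.abs(x - np.median(x))
    return np.quantile(dev, hi) / np.quantile(dev, lo)


rng = np.random.default_rng(0)
n = 200_000

# Base distribution with mixed marginals: light-tailed and heavy-tailed.
z_light = rng.standard_normal(n)          # Gaussian marginal
z_heavy = rng.standard_t(df=2.0, size=n)  # Student-t marginal, heavy tails
z = np.stack([z_light, z_heavy], axis=1)

# Toy "flow": an elementwise affine map (hypothetical parameters).
shift = np.array([1.0, -2.0])
scale = np.array([0.5, 3.0])
x = shift + scale * z

# Each output marginal keeps the tail character of its base marginal.
ratios = np.array([tail_ratio(x[:, j]) for j in range(x.shape[1])])
print("tail ratio per output marginal:", ratios)
```

In this sketch the second output marginal shows a much larger tail ratio than the first, regardless of the affine parameters, which mirrors the idea that the choice of base marginals, rather than the transform, determines whether a marginal can be heavy-tailed.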