We study minimax convergence rates of nonparametric density estimation in the Huber contamination model, in which a proportion of the data comes from an unknown outlier distribution. We provide the first results for this problem under a large family of losses, called Besov integral probability metrics (IPMs), that includes $\mathcal{L}^p$, Wasserstein, Kolmogorov-Smirnov, and other common distances between probability distributions. Specifically, under a range of smoothness assumptions on the population and outlier distributions, we show that a re-scaled thresholding wavelet series estimator achieves minimax optimal convergence rates under a wide variety of losses. Finally, based on connections that have recently been shown between nonparametric density estimation under IPM losses and generative adversarial networks (GANs), we show that certain GAN architectures also achieve these minimax rates.
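For reference, the two central objects named in the abstract can be written out using their standard textbook definitions. The symbols $\epsilon$, $P$, $G$, $n$, and $\mathcal{F}$ are notation assumed here for illustration and are not taken from the abstract itself. In the Huber contamination model, the sample is drawn i.i.d. from a mixture of the population distribution $P$ and an unknown outlier distribution $G$, and an IPM measures the largest discrepancy in expectations over a class $\mathcal{F}$ of test functions:

$$
X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} (1 - \epsilon)\, P + \epsilon\, G, \qquad \epsilon \in [0, 1),
$$

$$
d_{\mathcal{F}}(P, Q) = \sup_{f \in \mathcal{F}} \left| \mathbb{E}_{X \sim P}[f(X)] - \mathbb{E}_{X \sim Q}[f(X)] \right|.
$$

Under these definitions, different choices of $\mathcal{F}$ recover familiar distances: taking $\mathcal{F}$ to be the 1-Lipschitz functions gives the Wasserstein-1 distance, and taking $\mathcal{F}$ to be indicators of half-lines $\{\mathbf{1}_{(-\infty, t]} : t \in \mathbb{R}\}$ gives the Kolmogorov-Smirnov distance. A Besov IPM, as the name suggests, corresponds to taking $\mathcal{F}$ to be a ball in a Besov space.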