使用有条件抽样对联合分发进行财产测试 (Property Testing of Joint Distributions using Conditional Samples)

In this paper, we consider the problem of testing properties of joint distributions under the Conditional Sampling framework. In the standard sampling model, the sample complexity of testing properties of joint distributions is exponential in the dimension, resulting in inefficient algorithms for practical use. While recent results achieve efficient algorithms for product distributions with significantly smaller sample complexity, no efficient algorithm is expected when the marginals are not independent. We initialize the study of conditional sampling in the multidimensional setting. We propose a subcube conditional sampling model where the tester can condition on an (adaptively) chosen subcube of the domain. Due to its simplicity, this model is potentially implementable in many practical applications, particularly when the distribution is a joint distribution over $\Sigma^n$ for some set $\Sigma$. We present algorithms for various fundamental properties of distributions in the subcube-conditioning model and prove that the sample complexity is polynomial in the dimension $n$ (and not exponential as in the traditional model). We present an algorithm for testing identity to a known distribution using $\tilde{\mathcal{O}}(n^2)$-subcube-conditional samples, an algorithm for testing identity between two unknown distributions using $\tilde{\mathcal{O}}(n^5)$-subcube-conditional samples and an algorithm for testing identity to a product distribution using $tilde{\mathcal{O}}(n^5)$-subcube-conditional samples. The central concept of our technique involves an elegant chain rule which can be proved using basic techniques of probability theory yet powerful enough to avoid the curse of dimensionality.

翻译：在本文中, 我们考虑在条件抽样框架内测试联合分布特性的问题。在标准抽样模型中, 联合分布的测试特性的样本复杂性在维度上呈指数指数化, 导致实际使用的算法效率低。虽然最近的结果在样本复杂性小得多的产品分销中实现了高效的算法, 但是当边际不独立时, 我们无法在多维环境下开始对有条件采样的研究。我们提出一个子管有条件采样模型, 测试者可以以( 适应的) 所选域的子立方块为条件。由于该模型的简单性, 联合分布的精度在多个实际应用应用程序中可能是指数化的, 特别是当该分布是在美元和Sigma_ 美元的基础上联合分配的, 而在子管调制模型中, 我们提出各种分布特性的精度是多元的( 而不是像传统模型那样的指数化 ) 。我们提出一种测试身份的算法, 使用 $\ cal_ cal{ ral_ ral_ (n_ x% b) 核心测试一个不为的身份标码。