The study of distribution testing has become ubiquitous in the area of property testing, both for its theoretical appeal, as well as for its applications in other fields of Computer Science as well as in various real-life statistical tasks. The original distribution testing model relies on samples drawn independently from the distribution to be tested. However, when testing distributions over the $n$-dimensional Hamming cube $\left\{0,1\right\}^{n}$ for a large $n$, even reading a few samples is infeasible. To address this, Goldreich and Ron [ITCS 2022] have defined a model called the \emph{huge object model}, in which the samples may only be queried in a few places. In this work, we initiate a study of a general class of huge object model properties, those that are invariant under a permutation of the indices of the vectors in $\left\{0,1\right\}^{n}$, while still not being necessarily fully symmetric as per the definition used in traditional distribution testing. In particular, we prove that every index-invariant property satisfying a bounded VC-dimension restriction admits a property tester with a number of queries independent of $n$. To complement this result, we argue that satisfying only index-variance or only a VC-dimension bound is insufficient to guarantee a test whose query complexity is independent of $n$. We also touch upon the question of the number of queries required for non-adaptive testing, showing that it can be at most quadratic in the number of queries required for an adaptive tester. This is in contrast with the tight (easily provable) exponential gap between adaptive and non-adaptive testers for general non-index-invariant properties.
翻译:在财产测试领域,分配测试的研究已经变得无处不在,既包括理论吸引力,也包括计算机科学其他领域的应用以及各种真实的统计任务。最初的分发测试模型依靠的是独立于要测试的分发量之外的样本。然而,当一个大美元(美元)的Hamming 立方元元值 $\ left ⁇ 0,1\right ⁇ n} 美元进行分配测试时,即使读取一些样本也是行不通的。为了解决这个问题,Goldreich和Ron[ITS 2022] 已经定义了一个叫做“emph{huge 对象模型”的复杂度模型,在这个模型中,样本只能在少数地方查询。在这项工作中,我们开始研究一个大型物体模型属性的一般类别,在以$( left ⁇ 0,1\right ⁇ n) 的矢量指数的变异度下,而对于传统的分发测试中所使用的定义,其数值不一定完全对等。特别是,我们证明,每个指数性差值的不易变度,在不易变数的数值测试结果中, 只能进行精确的VC号测试。