性质测试中的计算复杂性研究 (Computational Complexity in Property Testing)

We initiate a systematic study of the computational complexity of property testing, focusing on the relationship between query and time complexity. While traditional work in property testing has emphasized query complexity, relatively little is known about the computational hardness of property testers. Our goal is to chart the landscape of time-query interplay and develop tools for proving time complexity lower bounds. Our first contribution is a pair of time-query hierarchy theorems for property testing. For all suitable nondecreasing functions $q(n)$ and $t(n)$ with $t(n)\geq q(n)$, we construct properties with query complexity $\tilde{\Theta}(q(n))$ and time complexity $\tilde\Omega(t(n))$. Our weak hierarchy holds unconditionally, whereas the strong version-assuming the Strong Exponential Time Hypothesis-provides better control over the time complexity of the constructed properties. We then turn to halfspaces in $\mathbb{R}^d$, a fundamental class in property testing and learning theory. We study the problem of approximating the distance from the input function to the nearest halfspace within additive error $\epsilon$. For the distribution-free distance approximation problem, known algorithms achieve query complexity $O(d/\epsilon^2)$, but take time $\tilde{\Theta}(1/\epsilon^d)$. We provide a fine-grained justification for this gap: assuming the $k$-SUM conjecture, any algorithm must have running time ${\Omega}(1/\epsilon^{d/2})$. This fine-grained lower bound yields a provable separation between query and time complexity for a natural and well-studied (tolerant) testing problem. We also prove that any Statistical Query (SQ) algorithm under the standard Gaussian distribution requires $(1/\epsilon)^{\Omega(d)}$ queries if the queries are answered with additive error up to $\epsilon^{\Omega(d)}$, revealing a fundamental barrier even in the distribution-specific setting.

翻译：我们首次系统性地研究性质测试的计算复杂性，重点关注查询复杂性与时间复杂性之间的关系。传统性质测试研究主要强调查询复杂性，而对性质测试器的计算困难性知之甚少。我们的目标是描绘时间-查询交互的图景，并开发证明时间复杂性下界的方法。我们的首要贡献是提出了一对关于性质测试的时间-查询层次定理。对于所有满足$t(n)\\geq q(n)$的合适非递减函数$q(n)$和$t(n)$，我们构造了具有查询复杂性$\\tilde{\\Theta}(q(n))$和时间复杂性$\\tilde\\Omega(t(n))$的性质。弱层次定理无条件成立，而强版本——基于强指数时间假设——能更好地控制所构造性质的时间复杂性。接着我们转向$\\mathbb{R}^d$中的半空间，这是性质测试与学习理论中的基础类别。我们研究了在加性误差$\\epsilon$内逼近输入函数到最近半空间距离的问题。对于无分布距离逼近问题，已知算法可实现$O(d/\\epsilon^2)$的查询复杂性，但需要$\\tilde{\\Theta}(1/\\epsilon^d)$的时间。我们为这一差距提供了细粒度解释：基于$k$-SUM猜想，任何算法的运行时间必须为${\\Omega}(1/\\epsilon^{d/2})$。该细粒度下界为自然且被深入研究的（容忍性）测试问题提供了查询与时间复杂性的可证明分离。我们还证明，在标准高斯分布下，若查询应答允许高达$\\epsilon^{\\Omega(d)}$的加性误差，则任何统计查询（SQ）算法需要$(1/\\epsilon)^{\\Omega(d)}$次查询，这揭示了即使在特定分布设置中也存在根本性障碍。