匹配问题与推土机距离的紧致配对查询下界 (Tight Pair Query Lower Bounds for Matching and Earth Mover's Distance)

How many adjacency matrix queries (also known as pair queries) are required to estimate the size of a maximum matching in an $n$-vertex graph $G$? We study this fundamental question in this paper. On the upper bound side, an algorithm of Bhattacharya, Kiss, and Saranurak [FOCS'23] gives an estimate that is within $\epsilon n$ of the right bound with $n^{2-\Omega_\epsilon(1)}$ queries, which is subquadratic in $n$ (and thus sublinear in the matrix size) for any fixed $\epsilon > 0$. On the lower bound side, while there has been a lot of progress in the adjacency list model, no non-trivial lower bound has been established for algorithms with adjacency matrix query access. In particular, the only known lower bound is a folklore bound of $\Omega(n)$, leaving a huge gap. In this paper, we present the first superlinear in $n$ lower bound for this problem. In fact, we close the gap mentioned above entirely by showing that the algorithm of [BKS'23] is optimal. Formally, we prove that for any fixed $\delta > 0$, there is a fixed $\epsilon > 0$ such that an estimate that is within $\epsilon n$ of the true bound requires $\Omega(n^{2-\delta})$ adjacency matrix queries. Our lower bound also has strong implications for estimating the earth mover's distance between distributions. For this problem, Beretta and Rubinstein [STOC'24] gave an $n^{2-\Omega_\epsilon(1)}$ time algorithm that obtains an additive $\epsilon$-approximation and works for any distance function. Whether this can be improved generally, or even for metric spaces, had remained open. Our lower bound rules out the possibility of any improvements over this bound, even under the strong assumption that the underlying distances are in a (1, 2)-metric.

翻译：估计一个$n$顶点图$G$中最大匹配的大小需要多少次邻接矩阵查询（也称为配对查询）？本文研究这一基础问题。在上界方面，Bhattacharya、Kiss和Saranurak [FOCS'23]的算法能以$n^{2-\Omega_\epsilon(1)}$次查询给出与真实值相差$\epsilon n$以内的估计，这对任意固定$\epsilon > 0$都是$n$的次二次（即矩阵规模的次线性）复杂度。在下界方面，尽管邻接表模型已取得诸多进展，但针对具有邻接矩阵查询访问权限的算法尚未建立任何非平凡下界。特别地，目前仅知的$\Omega(n)$下界属于经典结论，存在巨大空白。本文首次给出了该问题的超线性下界。事实上，我们通过证明[BKS'23]算法的最优性完全填补了上述空白。严格来说，我们证明对于任意固定$\delta > 0$，存在固定$\epsilon > 0$使得获得与真实值相差$\epsilon n$以内的估计需要$\Omega(n^{2-\delta})$次邻接矩阵查询。我们的下界对估计分布间推土机距离也具有深刻意义。针对该问题，Beretta和Rubinstein [STOC'24]提出了$n^{2-\Omega_\epsilon(1)}$时间算法，可获得加法$\epsilon$近似且适用于任意距离函数。此界限能否在一般情况甚至度量空间中得到改进，此前一直悬而未决。我们的下界排除了改进此界限的可能性，即使强假设底层距离属于(1,2)-度量空间亦然。