Strong recovery of geometric planted matchings

We study the problem of efficiently recovering the matching between an unlabelled collection of $n$ points in $\mathbb{R}^d$ and a small random perturbation of those points. We consider a model where the initial points are i.i.d. standard Gaussian vectors, perturbed by adding i.i.d. Gaussian vectors with variance $\sigma^2$. In this setting, the maximum likelihood estimator (MLE) can be found in polynomial time as the solution of a linear assignment problem. We establish thresholds on $\sigma^2$ for the MLE to perfectly recover the planted matching (making no errors) and to strongly recover the planted matching (making $o(n)$ errors) both for $d$ constant and $d = d(n)$ growing arbitrarily. Between these two thresholds, we show that the MLE makes $n^{\delta + o(1)}$ errors for an explicit $\delta \in (0, 1)$. These results extend to the geometric setting a recent line of work on recovering matchings planted in random graphs with independently-weighted edges. Our proof techniques rely on careful analysis of the combinatorial structure of partial matchings in large, weakly dependent random graphs using the first and second moment methods.
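
The estimator described in the abstract can be illustrated on a toy instance. The sketch below assumes that, under the Gaussian model above, maximizing the likelihood amounts to minimizing the total squared Euclidean distance between matched points. All function names (`sample_instance`, `cost`, `mle_matching`, `num_errors`) are hypothetical and do not come from the paper. For simplicity it searches all permutations by brute force, which is only feasible for very small $n$; it is a stand-in for a polynomial-time linear assignment solver, not the authors' implementation.

```python
import itertools
import random


def sample_instance(n, d, sigma, rng):
    """Draw n standard Gaussian points in R^d and a shuffled noisy copy."""
    x = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    planted = list(range(n))
    rng.shuffle(planted)  # planted[i] = index in y of the noisy copy of x[i]
    y = [None] * n
    for i, j in enumerate(planted):
        y[j] = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x[i]]
    return x, y, planted


def cost(x, y, perm):
    """Total squared Euclidean distance when x[i] is matched to y[perm[i]]."""
    return sum(
        (a - b) ** 2
        for i, j in enumerate(perm)
        for a, b in zip(x[i], y[j])
    )


def mle_matching(x, y):
    """Brute-force minimizer of the assignment cost (assumed MLE; small n only)."""
    n = len(x)
    best = min(itertools.permutations(range(n)), key=lambda p: cost(x, y, p))
    return list(best)


def num_errors(estimate, planted):
    """Number of points matched differently from the planted matching."""
    return sum(1 for a, b in zip(estimate, planted) if a != b)


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, planted = sample_instance(n=6, d=2, sigma=0.05, rng=rng)
    estimate = mle_matching(x, y)
    print("errors:", num_errors(estimate, planted))
```

For larger $n$, the brute-force search would be replaced by an assignment solver such as the Hungarian algorithm (for example SciPy's `linear_sum_assignment`) applied to the matrix of pairwise squared distances.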

Related content

Maximum likelihood estimation

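Maximum likelihood estimation (MLE) is a method for estimating the unknown parameters of a statistical model. The idea was first put forward in 1821 by the German mathematician C. F. Gauss, but the method is usually credited to the British statistician R. A. Fisher. It is built on the maximum likelihood principle, whose intuition is as follows: if a random experiment has several possible outcomes A, B, C, ..., and outcome A occurs in a single trial, then the experimental conditions can be taken to favor A, that is, the probability $P(A)$ is comparatively large.

The following example illustrates the principle. Box 1 contains 99 white balls and 1 black ball; box 2 contains 1 white ball and 99 black balls. A box is chosen at random, a ball is drawn at random from that box, and the ball turns out to be black. A black ball is far more likely to be drawn from box 2 than from box 1, so we naturally believe that this ball came from box 2.

In general, the probability of an event A depends on some unknown parameter $\theta$, and different values of $\theta$ give different probabilities $P(A \mid \theta)$. When A occurs in a single trial, we take $\theta$ to be the value, among all its possible values, that maximizes $P(A \mid \theta)$. Maximum likelihood estimation chooses this value as the estimate of $\theta$, so that the observed sample is as likely as possible under the chosen population.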
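
The two-box example (99 white and 1 black ball versus 1 white and 99 black balls) can be worked through numerically. The sketch below is only an illustration: the labels `box 1` and `box 2` and the helper names are assumptions introduced here. It compares the likelihood of drawing a black ball under each box and, assuming each box is chosen with probability 1/2, applies Bayes' rule to get the probability that the ball came from each box.

```python
from fractions import Fraction

# Probability of drawing a black ball from each box, from the ball counts.
p_black = {"box 1": Fraction(1, 100), "box 2": Fraction(99, 100)}


def maximum_likelihood_box(likelihood):
    """Return the box under which the observed outcome is most likely."""
    return max(likelihood, key=likelihood.get)


def posterior(likelihood, prior):
    """Bayes' rule: probability of each box given the observed outcome."""
    evidence = sum(likelihood[b] * prior[b] for b in likelihood)
    return {b: likelihood[b] * prior[b] / evidence for b in likelihood}


# Assumption: the box is picked uniformly at random.
prior = {"box 1": Fraction(1, 2), "box 2": Fraction(1, 2)}
post = posterior(p_black, prior)

print(maximum_likelihood_box(p_black))  # box 2
print(post["box 2"])  # 99/100
```

Here the unknown parameter is simply which box was chosen, and the maximum likelihood estimate is the box that makes the observed black ball most probable.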
