We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise with unknown statistics. Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure; that is, a structure belonging to the equivalence class containing the true tree. This paper presents a systematic improvement of Katiyar et al. (2020). First, we present a novel impossibility result by deriving a bound on the necessary number of samples for partial recovery. Second, we derive a significantly improved sample complexity result in which the dependence on the minimum correlation $\rho_{\min}$ is $\rho_{\min}^{-8}$ instead of $\rho_{\min}^{-24}$. Finally, we propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery. We provide error exponent analyses and extensive numerical results on a variety of trees to show that the sample complexity of SGA is significantly better than the algorithm of Katiyar et al. (2020). SGA can be readily extended to Gaussian models and is shown via numerical experiments to be similarly superior.
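The observation model described above (tree-structured Ising samples whose node observations are independently flipped with node-specific, unknown probabilities) can be illustrated with a minimal sketch. The helper `sample_ising_tree`, the chain topology, and the noise levels `q` below are hypothetical choices for illustration only, not the authors' code or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ising_tree(parents, edge_corr, n_samples, rng):
    """Sample +/-1 spins from a zero-field Ising tree.

    parents[i] is the parent of node i (root has parent -1) and nodes are
    listed in topological order; edge_corr[i] = E[X_i * X_parent(i)].
    """
    p = len(parents)
    X = np.empty((n_samples, p), dtype=int)
    for i, par in enumerate(parents):
        if par < 0:
            X[:, i] = rng.choice([-1, 1], size=n_samples)
        else:
            # child agrees with its parent with probability (1 + rho) / 2
            agree = rng.random(n_samples) < (1 + edge_corr[i]) / 2
            X[:, i] = np.where(agree, X[:, par], -X[:, par])
    return X

# Hypothetical example: chain 0 - 1 - 2 - 3 with correlation 0.7 per edge.
parents = [-1, 0, 1, 2]
edge_corr = [0.0, 0.7, 0.7, 0.7]
X = sample_ising_tree(parents, edge_corr, n_samples=10_000, rng=rng)

# Independent, non-identically distributed noise with unknown statistics:
# node i is flipped with probability q[i]; only the noisy Y is observed.
q = np.array([0.05, 0.20, 0.00, 0.10])
flips = rng.random(X.shape) < q
Y = np.where(flips, -X, X)

print(np.corrcoef(Y, rowvar=False).round(2))
```

Because the flip probabilities are unknown and differ across nodes, the empirical correlations of `Y` no longer identify the true tree exactly, which is why only partial recovery (up to the equivalence class of the true tree) is possible.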