We provide high-probability sample complexity guarantees for exact structure recovery and accurate predictive learning using noise-corrupted samples from an acyclic (tree-shaped) graphical model. The hidden variables follow a tree-structured Ising model distribution, whereas the observable variables are generated by a binary symmetric channel taking the hidden variables as its input (flipping each bit independently with some constant probability $q\in [0,1/2)$). In the absence of noise, predictive learning on Ising models was recently studied by Bresler and Karzand (2020); this paper quantifies how noise in the hidden model impacts the tasks of structure recovery and marginal distribution estimation by proving upper and lower bounds on the sample complexity. Our results generalize state-of-the-art bounds reported in prior work, and they exactly recover the noiseless case ($q=0$). In fact, for any tree with $p$ vertices and probability of incorrect recovery $\delta>0$, the number of samples sufficient for both tasks remains logarithmic, as in the noiseless case, i.e., $\mathcal{O}(\log(p/\delta))$, while its dependence on $q$ is $\mathcal{O}\big( 1/(1-2q)^{4} \big)$. We also present a new equivalent of Isserlis' Theorem for sign-valued tree-structured distributions, yielding a new low-complexity algorithm for higher-order moment estimation.
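To make the setting concrete, the following is a minimal simulation sketch, not the paper's implementation. It assumes a zero-external-field Ising model on a randomly generated tree with a single common edge correlation, and it uses a Chow-Liu style estimator (a maximum-weight spanning tree on absolute empirical correlations), which the abstract does not name. All function names and parameter values are hypothetical. With sign-valued variables, the channel multiplies each hidden variable by an independent sign of mean $1-2q$, so for $i\neq j$ the observed correlation is $\mathbb{E}[Y_iY_j]=(1-2q)^2\,\mathbb{E}[X_iX_j]$; the sketch checks this attenuation empirically on one edge.

```python
import numpy as np


def random_tree(p, rng):
    """Hypothetical helper: parent of node i (i = 1..p-1) is drawn from {0..i-1}."""
    return [int(rng.integers(0, i)) for i in range(1, p)]


def sample_hidden(parents, edge_corr, n, rng):
    """Draw n samples in {-1,+1}^p from a zero-field tree Ising model (assumption)."""
    p = len(parents) + 1
    x = np.empty((n, p), dtype=np.int8)
    x[:, 0] = rng.choice([-1, 1], size=n)  # uniform root
    for i, par in enumerate(parents, start=1):
        # Child agrees with its parent with probability (1 + edge_corr) / 2,
        # which gives E[X_i X_par] = edge_corr.
        agree = rng.random(n) < (1 + edge_corr) / 2
        x[:, i] = np.where(agree, x[:, par], -x[:, par])
    return x


def binary_symmetric_channel(x, q, rng):
    """Flip each entry independently with probability q."""
    flips = rng.random(x.shape) < q
    return np.where(flips, -x, x)


def chow_liu_edges(samples):
    """Maximum-weight spanning tree on |empirical correlation| via Kruskal."""
    n, p = samples.shape
    s = samples.astype(float)
    corr = np.abs(s.T @ s) / n
    pairs = sorted(
        ((corr[i, j], i, j) for i in range(p) for j in range(i + 1, p)),
        reverse=True,
    )
    root = list(range(p))

    def find(u):
        # Union-find lookup with path halving.
        while root[u] != u:
            root[u] = root[root[u]]
            u = root[u]
        return u

    edges = set()
    for _, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            root[ri] = rj
            edges.add((i, j))
    return edges


# Illustrative parameter values only.
rng = np.random.default_rng(0)
p, q, edge_corr, n = 8, 0.1, 0.6, 20000

parents = random_tree(p, rng)
true_edges = {(min(i, par), max(i, par)) for i, par in enumerate(parents, start=1)}

x = sample_hidden(parents, edge_corr, n, rng)   # hidden Ising samples
y = binary_symmetric_channel(x, q, rng)         # noisy observations

recovered = chow_liu_edges(y)
i, j = next(iter(true_edges))
noisy_corr = float(np.mean(y[:, i].astype(float) * y[:, j]))

print("structure recovered:", recovered == true_edges)
print("noisy edge correlation:", round(noisy_corr, 3),
      "vs (1-2q)^2 * edge_corr =", round((1 - 2 * q) ** 2 * edge_corr, 3))
```

Because every pairwise correlation is scaled by the same factor $(1-2q)^2$, the ordering of correlations is preserved in expectation, but the gaps between them shrink, so more samples are needed to resolve them as $q$ approaches $1/2$. This is only a heuristic illustration of why the sample complexity degrades with $q$; the precise $\mathcal{O}\big(1/(1-2q)^4\big)$ rate is the paper's result, not something this sketch establishes.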