We introduce a new method for neighbourhood selection in linear structural equation models (SEMs) that improves over classical methods such as best subset selection (BSS) and the Lasso. Our method, called KL-BSS, exploits underlying structure in SEMs -- even when this structure is unknown -- and is easily implemented using existing solvers. Under weaker eigenvalue conditions than those required by BSS and the Lasso, KL-BSS provably recovers the support of linear models with fewer samples. We establish both the pointwise and minimax sample complexity of support recovery, which KL-BSS attains. Extensive experiments on both real and simulated data confirm the improvements offered by KL-BSS. While it is well known that the Lasso encounters difficulties under structured dependencies, it is less well known that BSS runs into trouble as well and can be substantially improved. These results have implications for structure learning in graphical models, which often relies on neighbourhood selection as a subroutine.