This paper investigates the cases in which continuous optimization for directed acyclic graph (DAG) structure learning succeeds or fails, explains why, and suggests directions for making the search procedure more reliable. Reisach et al. (2021) argued that the remarkable performance of several continuous structure learning approaches is primarily driven by a high agreement between the order of increasing marginal variances and the topological order, and demonstrated that these approaches perform poorly after data standardization. We analyze this phenomenon for continuous approaches that assume equal or non-equal noise variances, and show through counterexamples, justifications, and possible alternative explanations that the claim may not hold in either case. We further demonstrate that nonconvexity may be a main concern, especially for the non-equal noise variances formulation, and that recent advances in continuous structure learning fail to achieve improvement in this case. Our findings suggest that future work should take the non-equal noise variances formulation into account in order to handle more general settings and to enable a more comprehensive empirical evaluation. Lastly, we provide insights into other aspects of the search procedure, including thresholding and sparsity, and show that they play an important role in the final solutions.
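The variance-ordering phenomenon attributed to Reisach et al. (2021) can be illustrated with a minimal toy simulation (not from the paper): in a linear SEM with equal noise variances, marginal variances tend to grow along the topological order, and standardizing the data erases this signal. The chain graph, unit edge weights, and sample size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Toy linear SEM over a chain X1 -> X2 -> X3 with unit edge weights
# and equal (unit) noise variances — an illustrative assumption.
x1 = rng.normal(0.0, 1.0, n)
x2 = 1.0 * x1 + rng.normal(0.0, 1.0, n)
x3 = 1.0 * x2 + rng.normal(0.0, 1.0, n)
X = np.column_stack([x1, x2, x3])

# Marginal variances increase along the topological order
# (roughly 1, 2, 3 here), so sorting by variance recovers the order.
variances = X.var(axis=0)
print(variances)
assert variances[0] < variances[1] < variances[2]

# After standardization, every column has unit variance and the
# ordering signal is gone.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(Xs.var(axis=0))  # all approximately 1
```

This sketch only demonstrates the marginal-variance signal itself; the paper's analysis concerns how continuous structure learning methods behave with and without that signal.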