Causal structure discovery (CSD) models are making inroads into several domains, including Earth system sciences. Their widespread adoption is, however, hampered by the fact that the resulting models often do not take the experts' domain knowledge into account and frequently have to be modified iteratively. We present a workflow for taking this knowledge into account when applying CSD algorithms in Earth system sciences, and we describe open research questions that still need to be addressed. We present a way to interactively modify the outputs of CSD algorithms and argue that the user interaction can be modelled as a greedy search for a local maximum-a-posteriori solution, that is, a local maximum of the posterior, which combines the likelihood of the causal model with a prior distribution representing the knowledge of the expert user. Our examples use a real-world data set and were constructed in collaboration with our co-authors, who are domain experts. We show that finding maximally usable causal models in the Earth system sciences or other similar domains is a difficult task that raises many interesting open research questions. We argue that taking the domain knowledge into account has a substantial effect on the final causal models discovered.
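The greedy maximum-a-posteriori view of user interaction can be illustrated with a minimal sketch. It is not the authors' implementation: the variables `A`, `B`, `C`, the per-edge `GAIN` table standing in for a data log-likelihood, and the expert `BELIEF` probabilities are all hypothetical values chosen for illustration. Each step applies the single edge addition, deletion, or reversal that most increases log-likelihood plus log-prior, and the search stops at a local maximum.

```python
from itertools import permutations
from math import log

NODES = ["A", "B", "C"]  # hypothetical variables

# Hypothetical stand-in for a data log-likelihood: additive per-edge gains.
# Both directions of an edge score the same, so the data alone cannot orient it.
GAIN = {("A", "B"): 2.0, ("B", "A"): 2.0, ("B", "C"): 1.0, ("C", "B"): 1.0}

# Hypothetical expert prior: probability that each directed edge is present.
BELIEF = {("A", "B"): 0.9, ("B", "A"): 0.05, ("B", "C"): 0.7}

def log_lik(edges):
    # Edges without support in the (toy) data are penalised.
    return sum(GAIN.get(e, -1.0) for e in edges)

def log_prior(edges):
    # Independent edge beliefs; 0.5 means the expert has no opinion.
    total = 0.0
    for pair in permutations(NODES, 2):
        p = BELIEF.get(pair, 0.5)
        total += log(p) if pair in edges else log(1.0 - p)
    return total

def is_acyclic(nodes, edges):
    # Kahn's algorithm: the graph is a DAG iff every node can be removed.
    indeg = {n: 0 for n in nodes}
    for _, child in edges:
        indeg[child] += 1
    queue = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while queue:
        n = queue.pop()
        seen += 1
        for parent, child in edges:
            if parent == n:
                indeg[child] -= 1
                if indeg[child] == 0:
                    queue.append(child)
    return seen == len(nodes)

def neighbours(nodes, edges):
    # All acyclic graphs one edge deletion, reversal, or addition away.
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                 # deletion
            cand = (edges - {(a, b)}) | {(b, a)}   # reversal
        elif (b, a) not in edges:
            cand = edges | {(a, b)}                # addition
        else:
            continue
        if is_acyclic(nodes, cand):
            yield cand

def greedy_map(nodes, edges, log_lik, log_prior):
    # Hill-climb on the unnormalised log-posterior until no move improves it.
    score = lambda g: log_lik(g) + log_prior(g)
    current = frozenset(edges)
    while True:
        best = max(neighbours(nodes, current), key=score)
        if score(best) <= score(current):
            return current
        current = best

result = greedy_map(NODES, frozenset(), log_lik, log_prior)
print(sorted(result))
```

In this toy setting the likelihood cannot distinguish `A -> B` from `B -> A`, so the orientation in the returned graph is decided entirely by the prior, which mirrors the abstract's point that domain knowledge can substantially change the final model.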