Accurately inferring gene regulatory networks (GRNs) is a critical and challenging task in biology. GRNs model the activating and inhibitory interactions between genes and are inherently causal. Identifying them accurately requires perturbational data, yet most GRN discovery methods operate only on observational data. Recent neural network-based causal discovery methods have significantly advanced the field, with support for interventional data and improvements in performance and scalability. However, applying state-of-the-art (SOTA) causal discovery methods in biology poses challenges, such as noisy data and a large number of samples, so these methods must be adapted. In this paper, we introduce DiscoGen, a neural network-based GRN discovery method that can denoise gene expression measurements and handle interventional data. We demonstrate that our model outperforms SOTA neural network-based causal discovery methods.