The problem of inferring unknown graph edges from numerical data at a graph's nodes appears in many forms across machine learning. We study a version of this problem that arises in the field of \emph{landscape genetics}, where genetic similarity between organisms living in a heterogeneous landscape is explained by a weighted graph that encodes the ease of dispersal through that landscape. Our main contribution is an efficient algorithm for \emph{inverse landscape genetics}, the task of inferring this graph from measurements of genetic similarity at different locations (graph nodes). Inverse landscape genetics is important for discovering impediments to species dispersal that threaten biodiversity and long-term species survival. In particular, it is widely used to study the effects of climate change and human development. Drawing on influential work that models organism dispersal using graph \emph{effective resistances} (McRae 2006), we reduce the inverse landscape genetics problem to that of inferring graph edges from noisy measurements of these resistances, which can be obtained from genetic similarity data. Building on the NeurIPS 2018 work of Hoskins et al. on learning edges in social networks, we develop an efficient first-order optimization method for solving this problem. Although the problem is non-convex, experiments on synthetic and real genetic data establish that our method converges quickly and reliably, significantly outperforming existing heuristics used in the field. By providing researchers with a powerful, general-purpose algorithmic tool, we hope our work will help accelerate research in landscape genetics.
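For reference, the effective resistance used in this reduction has a standard closed form. The notation below is not taken from the abstract; the symbols $w$, $L_w$, $\chi_u$, $S$, and $\bar{R}_{uv}$ are assumed here for illustration. For a connected graph with nonnegative edge weights $w$ and weighted Laplacian $L_w$, the effective resistance between nodes $u$ and $v$ is
\[
R_{uv}(w) = (\chi_u - \chi_v)^\top L_w^{+} (\chi_u - \chi_v),
\]
where $\chi_u$ is the indicator vector of node $u$ and $L_w^{+}$ is the Moore-Penrose pseudoinverse of $L_w$. One natural way to pose the inference problem, given noisy measurements $\bar{R}_{uv}$ on a set $S$ of node pairs, is a least-squares fit:
\[
\min_{w \ge 0} \; \sum_{(u,v) \in S} \bigl( R_{uv}(w) - \bar{R}_{uv} \bigr)^2 .
\]
This objective is non-convex in $w$, which is consistent with the non-convexity noted above. It is only a sketch of the problem class; the exact objective, constraints, and any regularization used by the method may differ.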