Fast Bayesian inference for large occupancy data sets, using the Polya-Gamma scheme

In recent years, the study of species' occurrence has benefited from the increased availability of large-scale citizen-science data. Whilst abundance data from standardized monitoring schemes are biased towards well-studied taxa and locations, opportunistic data are available for many taxonomic groups, from a large number of locations and across long timescales. Hence, these data provide opportunities to measure species' changes in occurrence, particularly through the use of occupancy models, which account for imperfect detection. However, existing Bayesian occupancy models are extremely slow when applied to large citizen-science data sets. In this paper, we propose a novel framework for fast Bayesian inference in occupancy models that account for both spatial and temporal autocorrelation. We express the occupancy and detection processes within a logistic regression framework, which enables us to use the Polya-Gamma scheme to perform inference quickly and efficiently, even for very large data sets. Spatial and temporal random effects are modelled using Gaussian processes, allowing us to infer the strength of spatio-temporal autocorrelation from the data. We apply our model to data on two UK butterfly species, one common and widespread and one rare, using records from the Butterflies for the New Millennium database, producing occupancy indices spanning 45 years. Our framework can be applied to a wide range of taxa, providing measures of variation in species' occurrence, which are used to assess biodiversity change.
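To illustrate why a logistic formulation makes the Polya-Gamma scheme attractive, the following is a minimal sketch of Polya-Gamma data augmentation for the simplest possible case: a single logit-scale intercept with a Gaussian prior, fitted to binary outcomes. It is not the model proposed in the paper, which has separate occupancy and detection components and Gaussian-process random effects. The function names, the prior settings, the toy data, and the truncated-sum approximation to the Polya-Gamma draw are all assumptions made for illustration; an exact sampler would normally be used in practice.

```python
import math
import random


def sample_pg_approx(b, c, rng, n_terms=100):
    """Approximate draw from PG(b, c).

    Uses the infinite-sum-of-gammas representation of the Polya-Gamma
    distribution, truncated at n_terms (an approximation, assumed here
    for simplicity; it slightly underestimates the true draw).
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        g = rng.gammavariate(b, 1.0)
        total += g / ((k - 0.5) ** 2 + c * c / (4.0 * math.pi ** 2))
    return total / (2.0 * math.pi ** 2)


def gibbs_intercept_logit(y, n_iter=300, burn_in=50,
                          prior_mean=0.0, prior_var=4.0, seed=1):
    """Gibbs sampler for logit(psi) = beta with y_i ~ Bernoulli(psi).

    Hypothetical helper: alternates between
      omega_i | beta ~ PG(1, beta)           (augmentation step)
      beta | omega, y ~ Normal(mean, var)    (conjugate Gaussian update)
    """
    rng = random.Random(seed)
    kappa_sum = sum(yi - 0.5 for yi in y)  # kappa_i = y_i - 1/2
    beta = 0.0
    draws = []
    for it in range(n_iter):
        # One latent Polya-Gamma variable per observation.
        omega_sum = sum(sample_pg_approx(1.0, beta, rng) for _ in y)
        # Conditional on omega, the posterior of beta is Gaussian.
        post_var = 1.0 / (omega_sum + 1.0 / prior_var)
        post_mean = post_var * (kappa_sum + prior_mean / prior_var)
        beta = rng.gauss(post_mean, math.sqrt(post_var))
        if it >= burn_in:
            draws.append(beta)
    return draws


# Toy data (made up for illustration): 28 "occupied" outcomes out of 40.
y = [1] * 28 + [0] * 12
draws = gibbs_intercept_logit(y)
psi_hat = sum(1.0 / (1.0 + math.exp(-b)) for b in draws) / len(draws)
print("posterior mean of psi:", round(psi_hat, 3))
```

Conditional on the latent Polya-Gamma variables, the update for the regression coefficient is an ordinary Gaussian draw, so no tuning or accept/reject step is needed. With covariates, the scalar update would become a multivariate Gaussian draw of the same form.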
