Hidden Markov models (HMMs) are popular tools for analysing animal behaviour based on movement, acceleration and other sensor data. In particular, these models make it possible to infer how an animal's decision-making process interacts with internal and external drivers, by relating the probabilities of switching between distinct behavioural states to covariates. A key challenge in the statistical analysis of behavioural data using covariate-driven HMMs is interpreting the fitted models, especially when there are more than two states, because several functional relationships between state-switching probabilities and covariates then need to be interpreted jointly. The model-implied probabilities of occupying the different states, as a function of a covariate of interest, constitute a much simpler summary statistic. A pragmatic approximation of this state occupancy distribution, namely the hypothetical stationary distribution of the model's underlying Markov chain for fixed covariate values, has routinely been reported in HMM-based analyses of ecological data. However, we show that for stochastically varying covariates with relatively little persistence, this approximation can be severely biased, potentially invalidating ecological inference. We develop two alternative approaches for obtaining the state occupancy distribution as a function of a covariate of interest: one based on resampling of the covariate process, the other on regression analysis of the empirical state probabilities. The practical application of these approaches is demonstrated in simulations and in a case study on Galápagos tortoise (Chelonoidis niger) movement data. Our methods enable practitioners to conduct unbiased inference on the relationship between animal behaviour and general types of covariates, and thus to uncover the factors influencing behavioural decisions made by animals.
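To make the approximation discussed above concrete, the following minimal sketch computes the hypothetical stationary distribution of a covariate-driven Markov chain over a grid of fixed covariate values. It is an illustration under stated assumptions, not the authors' implementation: the three-state setup, the multinomial logit link, the coefficient matrices `BETA0` and `BETA1`, and the function names are all hypothetical. For contrast, it also evaluates the state occupancy in one limiting special case, a covariate that is independent across time (no persistence) and uniform on the grid, assuming the transition matrix at time t depends on the covariate at time t. In that case the previous state is independent of the current covariate, so the occupancy is the marginal state distribution multiplied by the transition matrix. This special case is not one of the two approaches proposed in the abstract.

```python
import numpy as np

# Hypothetical coefficients for a 3-state chain (illustrative values only).
# Off-diagonal entry (i, j) gives intercept/slope of the logit for switching i -> j.
BETA0 = np.array([[0.0, -2.0, -3.0],
                  [-2.0, 0.0, -2.0],
                  [-3.0, -2.0, 0.0]])
BETA1 = np.array([[0.0, 1.0, 0.5],
                  [-1.0, 0.0, 1.0],
                  [-0.5, -1.0, 0.0]])


def tpm(x, beta0=BETA0, beta1=BETA1):
    """Transition probability matrix at covariate value x (multinomial logit link)."""
    n = beta0.shape[0]
    eta = beta0 + beta1 * x            # linear predictors, shape (n, n)
    eta[np.diag_indices(n)] = 0.0      # staying in a state is the reference category
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)


def stationary(gamma):
    """Stationary distribution: solve delta @ gamma = delta with sum(delta) = 1."""
    n = gamma.shape[0]
    a = np.eye(n) - gamma + np.ones((n, n))
    return np.linalg.solve(a.T, np.ones(n))


x_grid = np.linspace(-2.0, 2.0, 41)

# Approximation: stationary distribution as if the covariate were held fixed at x.
approx = np.array([stationary(tpm(x)) for x in x_grid])

# Limiting special case (assumption): covariate i.i.d. and uniform on x_grid.
# The state process is then a Markov chain with the averaged transition matrix.
mean_gamma = np.mean([tpm(x) for x in x_grid], axis=0)
delta_marginal = stationary(mean_gamma)
occupancy_iid = np.array([delta_marginal @ tpm(x) for x in x_grid])

# Largest absolute gap between the two curves over the grid, per state.
max_gap = np.abs(approx - occupancy_iid).max(axis=0)
print(max_gap)
```

Plotting the columns of `approx` and `occupancy_iid` against `x_grid` shows how far the fixed-covariate curves can drift from the occupancy probabilities when the covariate has no persistence; the size of the gap depends entirely on the hypothetical coefficients chosen here.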