Hidden Markov models (HMMs) are a versatile statistical framework commonly used in ecology to characterize behavioural patterns from animal movement data. In HMMs, the observed data depend on a finite number of underlying hidden states, generally interpreted as the animal's unobserved behaviour. The number of states is a crucial parameter, controlling the trade-off between ecological interpretability of behaviours (fewer states) and the goodness of fit of the model (more states). Selecting the number of states, commonly referred to as order selection, is notoriously challenging. Common model selection metrics, such as AIC and BIC, often perform poorly in determining the number of states, particularly when models are misspecified. Building on existing methods for HMMs and mixture models, we propose a double penalized likelihood maximum estimate (DPMLE) for the simultaneous estimation of the number of states and parameters of non-stationary HMMs. The DPMLE differs from traditional information criteria by using two penalty functions on the stationary probabilities and state-dependent parameters. For non-stationary HMMs, forward and backward probabilities are used to approximate stationary probabilities. Using a simulation study that includes scenarios with additional complexity in the data, we compare the performance of our method with that of AIC and BIC. We also illustrate how the DPMLE differs from AIC and BIC using narwhal (Monodon monoceros) movement data. The proposed method outperformed AIC and BIC in identifying the correct number of states under model misspecification. Furthermore, its capacity to handle non-stationary dynamics allowed for more realistic modeling of complex movement data, offering deeper insights into narwhal behaviour. Our method is a powerful tool for order selection in non-stationary HMMs, with potential applications extending beyond the field of ecology.
翻译:暂无翻译