Hidden Markov models (HMMs) and their extensions have proven to be powerful tools for classifying observations that stem from systems with temporal dependence, because they take into account that observations close in time to one another are likely generated from the same state (i.e., class). In this paper, we provide details for the implementation of four models for classification in a supervised learning context: HMMs, hidden semi-Markov models (HSMMs), autoregressive-HMMs and autoregressive-HSMMs. Using simulations, we study classification performance under various degrees of model misspecification to characterize when it would be important to extend a basic HMM to an HSMM. As an application, we use these models to classify accelerometer data from Merino sheep into four behaviors of interest. In movement ecology in particular, collecting fine-scale animal movement data over time to identify behavioral states has become ubiquitous, necessitating models that can account for the dependence structure in the data. We demonstrate that, when the aim is classification, misspecification of the proposed model, even to varying degrees, may not impede good classification performance unless the state-dependent distributions overlap substantially.
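To make the supervised setting concrete, the following is a minimal sketch of classification with a basic HMM, not the implementation described in the paper. It assumes a univariate observation sequence, Gaussian state-dependent distributions, and two states; the function names (`fit_supervised_hmm`, `viterbi`, `simulate`) and all parameter values are hypothetical and chosen only for illustration. Parameters are estimated from a labeled training sequence by counting, and a new sequence is then classified by Viterbi decoding.

```python
import numpy as np

def fit_supervised_hmm(obs, states, n_states):
    """Estimate HMM parameters from one labeled sequence (illustrative only)."""
    obs = np.asarray(obs, dtype=float)
    states = np.asarray(states, dtype=int)
    # Initial distribution: smoothed empirical state frequencies (an assumption,
    # since a single sequence contains only one observed initial state).
    init = np.bincount(states, minlength=n_states) + 1.0
    init /= init.sum()
    # Transition matrix: counts of consecutive label pairs with add-one smoothing.
    trans = np.ones((n_states, n_states))
    np.add.at(trans, (states[:-1], states[1:]), 1.0)
    trans /= trans.sum(axis=1, keepdims=True)
    # Gaussian state-dependent distributions: per-state sample mean and std.
    # Every state must appear in the training labels with non-constant values.
    means = np.array([obs[states == k].mean() for k in range(n_states)])
    stds = np.array([obs[states == k].std() for k in range(n_states)])
    return init, trans, means, stds

def viterbi(obs, init, trans, means, stds):
    """Return the most likely state sequence under the fitted Gaussian HMM."""
    obs = np.asarray(obs, dtype=float)
    # Log-density of each observation under each state, shape (T, K).
    log_emis = (-0.5 * np.log(2 * np.pi) - np.log(stds)
                - 0.5 * ((obs[:, None] - means) / stds) ** 2)
    log_trans = np.log(trans)
    T, K = log_emis.shape
    score = np.log(init) + log_emis[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best log-score of a path in state i at t-1 moving to j at t.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emis[t]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

def simulate(n, trans, means, stds, rng):
    """Simulate states and observations from a Gaussian HMM."""
    states = np.empty(n, dtype=int)
    states[0] = rng.integers(len(means))
    for t in range(1, n):
        states[t] = rng.choice(len(means), p=trans[states[t - 1]])
    obs = rng.normal(means[states], stds[states])
    return obs, states

# Hypothetical two-state example with persistent states and overlapping densities.
rng = np.random.default_rng(0)
true_trans = np.array([[0.95, 0.05], [0.05, 0.95]])
true_means = np.array([0.0, 1.5])
true_stds = np.array([1.0, 1.0])

train_obs, train_states = simulate(2000, true_trans, true_means, true_stds, rng)
test_obs, test_states = simulate(2000, true_trans, true_means, true_stds, rng)

init_hat, trans_hat, means_hat, stds_hat = fit_supervised_hmm(train_obs, train_states, 2)
path = viterbi(test_obs, init_hat, trans_hat, means_hat, stds_hat)
acc = (path == test_states).mean()
print(f"classification accuracy: {acc:.3f}")
```

Because the labels are available during training, no iterative estimation (such as EM) is needed in this sketch. Moving the two means closer together is one way to explore how overlap between the state-dependent distributions affects accuracy; extending the sketch to an HSMM would additionally require an explicit dwell-time distribution for each state, which is not shown here.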