Hidden Markov models (HMMs) and their extensions have proven to be powerful tools for classifying observations from systems with temporal dependence, because they account for the fact that observations close in time are likely to be generated from the same state (i.e., class). When information on the classes of the observations is available in advance, supervised methods can be applied. In this paper, we provide details for the implementation of four models for classification in a supervised learning context: HMMs, hidden semi-Markov models (HSMMs), autoregressive-HMMs, and autoregressive-HSMMs. Using simulations, we study classification performance under various degrees of model misspecification to characterize when it is important to extend a basic HMM to an HSMM. As an application of these techniques, we use the models to classify accelerometer data from Merino sheep into four behaviors of interest. In movement ecology in particular, collecting fine-scale animal movement data over time to identify behavioral states has become ubiquitous, necessitating models that can account for the dependence structure in the data. We demonstrate that, when the aim is classification, various degrees of misspecification of the proposed model may not impede good classification performance unless there is high overlap between the state-dependent distributions, that is, unless the observation distributions of the different states are difficult to differentiate.
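To make the supervised setting concrete, the following is a minimal sketch of the simplest of the four models, a basic HMM, fitted to a labelled sequence and then used to classify new observations. It is not the implementation described in this paper: it assumes univariate Gaussian state-dependent distributions, uses add-one smoothing for the transition counts, and decodes with the Viterbi algorithm, which is only one possible decoding choice. The function names `fit_supervised_hmm` and `viterbi` are hypothetical, and every state is assumed to occur in the training labels.

```python
import numpy as np


def fit_supervised_hmm(obs, states, n_states):
    """Estimate HMM parameters from one labelled sequence.

    Assumes univariate Gaussian state-dependent distributions and that
    every state occurs at least once in `states` (illustrative assumptions).
    """
    obs = np.asarray(obs, dtype=float)
    states = np.asarray(states, dtype=int)

    # Transition counts, with add-one smoothing so no probability is zero.
    counts = np.ones((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    trans = counts / counts.sum(axis=1, keepdims=True)

    # Initial distribution taken from smoothed state frequencies.
    init = np.bincount(states, minlength=n_states) + 1.0
    init /= init.sum()

    # Per-state mean and standard deviation of the observations.
    means = np.array([obs[states == k].mean() for k in range(n_states)])
    sds = np.array([obs[states == k].std() + 1e-6 for k in range(n_states)])
    return init, trans, means, sds


def viterbi(obs, init, trans, means, sds):
    """Return the most likely state sequence for `obs` under the fitted HMM."""
    obs = np.asarray(obs, dtype=float)

    # Log Gaussian density of each observation under each state, shape (T, K).
    log_emis = (-0.5 * ((obs[:, None] - means) / sds) ** 2
                - np.log(sds) - 0.5 * np.log(2 * np.pi))
    log_trans = np.log(trans)
    n_obs, n_states = log_emis.shape

    score = np.log(init) + log_emis[0]
    back = np.zeros((n_obs, n_states), dtype=int)
    for t in range(1, n_obs):
        # cand[i, j]: best log score of a path in state i at t-1 moving to j.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emis[t]

    # Trace the best path backwards from the final time step.
    path = np.empty(n_obs, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_obs - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

Because the transition matrix rewards staying in the same state, an observation that falls in the overlap between two state-dependent distributions tends to be assigned the class of its neighbours in time, which is the temporal dependence the models exploit.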