For multivariate time series driven by underlying states, hidden Markov models (HMMs) provide a powerful framework that can be flexibly tailored to the situation at hand. In practice, however, it can be challenging to choose an adequate emission distribution for multivariate observation vectors. For example, the marginal data distribution may not immediately reveal the within-state distributional form, and the individual data streams may have different supports, which renders the common choice of a multivariate normal distribution inadequate. Here we explore nonparametric estimation of the emission distributions within a multivariate HMM based on tensor-product B-splines. In two simulation studies, we show the feasibility of our modelling approach and demonstrate potential pitfalls of inappropriate choices of parametric distributions. To illustrate the practical applicability, we present a case study in which we use an HMM to model the bivariate time series comprising the lengths and angles of goalkeeper passes during the UEFA EURO 2020, investigating the effect of match dynamics on the teams' tactics.