This work is motivated by an application in an industrial context, where the activity of sensors is recorded at a high frequency. The objective is to automatically detect abnormal measurement behaviour. Treating the sensor measurements as functional data, we are formally interested in detecting outliers in a multivariate functional data set. Because this data set is heterogeneous, the proposed contaminated mixture model both clusters the multivariate functional data into homogeneous groups and detects outliers. The main advantage of this procedure over its competitors is that it does not require the proportion of outliers to be specified. Model inference is performed through an Expectation-Conditional Maximization (ECM) algorithm, and the Bayesian information criterion (BIC) is used to select the number of clusters. Numerical experiments on simulated data demonstrate the high performance achieved by the inference algorithm and show that the proposed model outperforms its competitors. Applied to the real data that motivated this study, the model correctly detects abnormal behaviours.
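To make the idea of a contaminated mixture concrete, the sketch below shows how a single contaminated component can flag an outlier from a posterior probability. It is a deliberately simplified illustration, not the model of this work: it assumes a univariate scalar observation rather than multivariate functional data, it assumes the common contaminated-normal form in which outliers share the component mean but have an inflated variance, and the names `alpha` (proportion of regular points) and `eta` (variance inflation factor) are hypothetical. Here the parameters are fixed by hand, whereas in a fitted model they would be estimated, which is why the proportion of outliers need not be supplied in advance.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def contaminated_pdf(x, mean, var, alpha, eta):
    """Contaminated normal density (assumed form, for illustration only).

    alpha: proportion of regular observations, 0 < alpha < 1
    eta:   variance inflation factor for outliers, eta > 1
    """
    regular = alpha * normal_pdf(x, mean, var)
    outlying = (1.0 - alpha) * normal_pdf(x, mean, eta * var)
    return regular + outlying


def prob_regular(x, mean, var, alpha, eta):
    """Posterior probability that x comes from the regular (non-inflated) part."""
    regular = alpha * normal_pdf(x, mean, var)
    return regular / contaminated_pdf(x, mean, var, alpha, eta)


def is_outlier(x, mean, var, alpha, eta):
    """Flag x as an outlier when the regular part is the less likely source."""
    return prob_regular(x, mean, var, alpha, eta) < 0.5


if __name__ == "__main__":
    # Hand-picked parameters; a real analysis would estimate them from data.
    params = dict(mean=0.0, var=1.0, alpha=0.9, eta=25.0)
    for x in (0.0, 2.0, 6.0):
        print(x, round(prob_regular(x, **params), 4), is_outlier(x, **params))
```

In a full mixture, the same posterior would be computed within each cluster, so that every observation receives both a cluster membership and an outlier indicator.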