The rapid advancement of models based on artificial intelligence demands innovative monitoring techniques that can operate in real time at low computational cost. In machine learning, and particularly for neural networks (NNs) with deep-learning architectures, models are often trained in a supervised manner. Consequently, the learned relationship between input and output must remain valid during the model's deployment. If this stationarity assumption holds, we can conclude that the NN generates accurate predictions. Otherwise, the model must be retrained or rebuilt. We propose using the latent feature representation of the data (the "embedding") generated by the NN to determine the time point at which the data stream becomes nonstationary. Specifically, we monitor embeddings by applying multivariate control charts based on data depth and normalized ranks. The performance of the proposed method is evaluated using various NNs with different underlying data formats.
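To make the monitoring idea concrete, the following is a minimal sketch of a rank-based control chart on embeddings, not the implementation evaluated in this work. It rests on several assumptions that the abstract does not specify: the Mahalanobis depth is used as the depth function, a normalized rank is taken to be the fraction of in-control reference embeddings whose depth does not exceed that of the new embedding, and a signal is raised when this rank falls below a hypothetical threshold `alpha`. The function names and the simulated Gaussian "embeddings" are illustrative only.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` w.r.t. the `reference` sample.

    Assumption: the depth function is our choice for illustration.
    """
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse guards against singular covariance
    diff = points - mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distances
    return 1.0 / (1.0 + d2)


def normalized_ranks(new_embeddings, reference):
    """Fraction of reference depths that are <= the depth of each new embedding."""
    ref_depths = np.sort(mahalanobis_depth(reference, reference))
    new_depths = mahalanobis_depth(new_embeddings, reference)
    counts = np.searchsorted(ref_depths, new_depths, side="right")
    return counts / len(ref_depths)


def first_signal(ranks, alpha=0.05):
    """Index of the first rank below `alpha` (hypothetical limit), or None."""
    below = np.flatnonzero(ranks < alpha)
    return int(below[0]) if below.size else None


# Simulated stand-ins for NN embeddings: 50 in-control points, then a mean shift.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 8))            # in-control reference embeddings
stream = np.vstack([
    rng.normal(size=(50, 8)),                    # stationary part of the stream
    rng.normal(loc=3.0, size=(50, 8)),           # nonstationary part (shifted mean)
])
ranks = normalized_ranks(stream, reference)
print("first signal at stream index:", first_signal(ranks))
```

Small ranks indicate embeddings that are outlying relative to the reference sample. Because each rank is checked on its own, the expected false-alarm rate on in-control data is roughly `alpha` per observation, so an isolated signal inside the stationary part of the stream is not unusual, and in practice the threshold would be tuned to a target false-alarm rate.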