Safety is a top priority for civil aviation. New anomaly detection methods, primarily clustering methods, have been developed to monitor pilot operations and detect risks from flight data. However, all existing anomaly detection methods rely on offline learning: the models are trained once on historical data and used for all future predictions. In practice, airlines accumulate new flight data continuously and analyze them every month. Clustering such dynamically growing data is challenging for an offline method because re-training the model every time new data arrive is memory and time intensive. If the model is not re-trained, false alarms or missed detections may increase because the model cannot reflect changes in data patterns. To address this problem, we propose a novel incremental anomaly detection method based on the Gaussian Mixture Model (GMM) to identify common patterns and detect outliers in flight operations from digital flight data. It is a probabilistic clustering model of flight operations that incrementally updates its clusters based on new data rather than re-clustering all data from scratch. It trains an initial GMM on historical offline data, then continuously adapts to incoming data points via an expectation-maximization (EM) algorithm. To track changes in flight operation patterns, only the model parameters need to be saved. The proposed method was tested on three sets of simulation data and two sets of real-world flight data. Compared with the traditional offline GMM method, the proposed method generates similar clustering results with significantly reduced processing time (57% to 99% reduction on the test sets) and memory usage (91% to 95% reduction on the test sets). Preliminary results indicate that the incremental learning scheme is effective in dealing with dynamically growing data in flight data analytics.
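The incremental scheme described above (fit once offline, then adapt to each new batch while storing only model parameters) can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code below is an assumption rather than the proposed method: it uses a common one-pass variant in which each new batch gets a single E-step, and the resulting weighted moments are merged with the stored per-component counts, means, and covariances. The class `IncrementalGMM`, its methods, the `reg` term, and the synthetic two-cluster data are all hypothetical names and choices made for illustration.

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log density of N(mean, cov) at each row of X (shape (n, d))."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z * z, axis=0))


class IncrementalGMM:
    """Hypothetical helper: a GMM kept only as counts, means, and covariances."""

    def __init__(self, weights, means, covs, n_seen):
        # counts[k] is the effective number of points assigned to component k
        self.counts = np.asarray(weights, dtype=float) * n_seen
        self.means = np.array(means, dtype=float)
        self.covs = np.array(covs, dtype=float)

    @property
    def weights(self):
        return self.counts / self.counts.sum()

    def _log_joint(self, X):
        # log(weight_k) + log N(x | mean_k, cov_k) for every point and component
        return np.stack(
            [np.log(w) + log_gauss(X, m, c)
             for w, m, c in zip(self.weights, self.means, self.covs)], axis=1)

    def score_samples(self, X):
        """Per-point log-likelihood; low values suggest outliers."""
        lj = self._log_joint(np.asarray(X, dtype=float))
        mx = lj.max(axis=1, keepdims=True)
        return (mx + np.log(np.exp(lj - mx).sum(axis=1, keepdims=True))).ravel()

    def partial_fit(self, X, reg=1e-6):
        """One E-step on a new batch, then merge its moments into the model."""
        X = np.asarray(X, dtype=float)
        lj = self._log_joint(X)
        mx = lj.max(axis=1, keepdims=True)
        log_norm = mx + np.log(np.exp(lj - mx).sum(axis=1, keepdims=True))
        resp = np.exp(lj - log_norm)  # responsibilities, shape (n, K)
        eye = np.eye(X.shape[1])
        for k in range(len(self.counts)):
            n_new = resp[:, k].sum()
            if n_new < 1e-12:
                continue  # batch carries no evidence for this component
            mean_new = resp[:, k] @ X / n_new
            diff = X - mean_new
            cov_new = (resp[:, k, None] * diff).T @ diff / n_new
            # Merge old and new weighted moments; the old data are not needed.
            n_old, mean_old, cov_old = self.counts[k], self.means[k], self.covs[k]
            n_tot = n_old + n_new
            mean_tot = (n_old * mean_old + n_new * mean_new) / n_tot
            d_old, d_new = mean_old - mean_tot, mean_new - mean_tot
            cov_tot = (n_old * (cov_old + np.outer(d_old, d_old))
                       + n_new * (cov_new + np.outer(d_new, d_new))) / n_tot
            self.counts[k] = n_tot
            self.means[k] = mean_tot
            self.covs[k] = cov_tot + reg * eye  # small ridge keeps cov invertible
        return self


# Synthetic stand-in for flight data: two well-separated operating patterns.
rng = np.random.default_rng(0)
true_means = np.array([[0.0, 0.0], [6.0, 6.0]])


def sample(n):
    labels = rng.integers(0, 2, size=n)
    return true_means[labels] + rng.normal(size=(n, 2))


# Rough starting parameters stand in for the offline fit on historical data.
model = IncrementalGMM(weights=[0.5, 0.5], means=[[1.0, 1.0], [5.0, 5.0]],
                       covs=[np.eye(2) * 2.0] * 2, n_seen=50)
for _ in range(12):  # e.g. twelve monthly batches, each discarded after use
    model.partial_fit(sample(500))

# Compare a typical point with one far from both clusters.
scores = model.score_samples(np.array([[0.0, 0.0], [20.0, -20.0]]))
```

In this sketch the memory footprint is independent of how many batches have been processed, which mirrors the claim that only model parameters need to be saved. A threshold on `score_samples` would be one way to flag outliers; how the threshold is chosen is left open here.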