Anomaly detection plays a key role in air quality analysis by enhancing situational awareness and alerting users to potential hazards. However, existing anomaly detection approaches for air quality analysis are limited in parameter selection (e.g., they need extensive domain knowledge), computational expense, general applicability (e.g., they require labeled data), interpretability, and efficiency of analysis. Furthermore, collected air quality data are often of poor quality (inconsistently formatted and sometimes missing), which makes analysis substantially harder. In this paper, we systematically formulate design requirements for a system that can address these limitations and then propose AQEyes, an integrated visual analytics system for efficiently monitoring, detecting, and examining anomalies in air quality data. In particular, we propose a unified, end-to-end, tunable machine learning pipeline that includes several data pre-processors and featurizers to deal with data quality issues. The pipeline integrates an efficient unsupervised anomaly detection method that requires no labeled data and overcomes the limitations of existing approaches. Further, we develop an interactive visualization system to visualize the outputs of the pipeline. The system incorporates a set of novel visualization and interaction designs, allowing analysts to visually examine air quality dynamics and anomalous events at multiple scales and from multiple facets. We demonstrate the performance of the pipeline through a quantitative evaluation and show the effectiveness of the visualization system through qualitative case studies on real-world datasets.
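The abstract does not say which pre-processors, featurizers, or detector AQEyes uses, so the following is only a minimal sketch of the general shape such a pipeline could take: a pre-processor that fills missing readings, a featurizer that scores each reading against its recent history, and an unsupervised threshold that needs no labels. The function names, the linear interpolation, the median/MAD score, and the `window`, `threshold`, and `min_scale` defaults are all illustrative assumptions and are not taken from the paper.

```python
from statistics import median
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[float]:
    """Pre-processor (assumed): fill None gaps by linear interpolation.

    Gaps at either end of the series copy the nearest observed reading.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed readings")
    filled: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(values[right]))
        elif right is None:
            filled.append(float(values[left]))
        else:
            w = (i - left) / (right - left)
            filled.append(values[left] + w * (values[right] - values[left]))
    return filled


def robust_scores(series: List[float], window: int = 5,
                  min_scale: float = 1.0) -> List[float]:
    """Featurizer (assumed): distance from the median of the preceding
    `window` readings, scaled by their median absolute deviation (MAD)."""
    scores: List[float] = []
    for i, x in enumerate(series):
        context = series[max(0, i - window):i]
        if len(context) < 2:
            scores.append(0.0)  # too little history to judge this reading
            continue
        med = median(context)
        mad = median([abs(c - med) for c in context])
        # Floor the scale so a flat history does not turn tiny changes into alarms.
        scale = max(1.4826 * mad, min_scale)
        scores.append(abs(x - med) / scale)
    return scores


def detect_anomalies(values: List[Optional[float]], window: int = 5,
                     threshold: float = 3.5) -> List[int]:
    """Run the sketch end to end and return indices of flagged readings."""
    scores = robust_scores(interpolate_missing(values), window=window)
    return [i for i, s in enumerate(scores) if s > threshold]


# Made-up hourly pollutant readings with two gaps and one obvious spike.
readings = [12.0, 13.0, None, 12.5, 13.5, 12.0, 95.0, 13.0, 12.5, None, 13.0]
anomalies = detect_anomalies(readings)
print(anomalies)
```

On this made-up series the sketch flags only the spike at index 6. Median and MAD are used here because a single extreme reading barely shifts them, so the spike does not mask itself or cause its neighbors to be flagged; a real system would tune `window` and `threshold` per pollutant and sensor.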