Outlier Detection using AI: A Survey

An outlier is an event or observation, such as an unusual activity, an intrusion, or a suspicious data point, that lies at an irregular distance from the rest of a population. The definition of an outlier event, however, is subjective and depends on the application and the domain (energy, health, wireless networks, etc.). Detecting outlier events as carefully as possible is important for avoiding infrastructure failures, because anomalous events can cause minor to severe damage to infrastructure. For instance, an attack on a cyber-physical system such as a microgrid may trigger voltage or frequency instability, thereby damaging a smart inverter that is very expensive to repair. Unusual activities in microgrids can be mechanical faults, behavior changes in the system, human or instrument errors, or malicious attacks. Because outliers vary so widely, Outlier Detection (OD) is an ever-growing research field. In this chapter, we discuss the progress of OD methods that use AI techniques. To that end, we introduce the fundamental concepts of each OD model and group the broad range of OD methods into six major categories: statistical-based, distance-based, density-based, clustering-based, learning-based, and ensemble methods. For every category, we discuss recent state-of-the-art approaches, their application areas, and their performance. We then briefly discuss the advantages, disadvantages, and challenges of each technique and give recommendations on future research directions. This survey aims to help the reader better understand recent progress in OD methods for the assurance of AI.
