Modern software systems have become increasingly complex, which makes them difficult to test and validate. Detecting software partial anomalies in complex systems at runtime can assist with handling unintended software behaviors, avoiding catastrophic software failures and improving software runtime availability. These detection techniques aim to identify the manifestation of faults (anomalies) before they ultimately lead to unavoidable failures, thus, supporting the following runtime fault-tolerant techniques. In this work, we propose a novel anomaly detection method named DistAD, which is based on the distribution of software runtime dynamic execution traces. Unlike other existing works using key performance indicators, the execution trace is collected during runtime via intrusive instrumentation. Instrumentation are controlled following a sampling mechanism to avoid excessive overheads. Bi-directional Long Short-Term Memory (Bi-LSTM), an architecture of Recurrent Neural Network (RNN) is used to achieve the anomaly detection. The whole framework is constructed under a One-Class Neural Network (OCNN) learning mode which can help eliminate the limits of lacking for enough labeled samples and the data imbalance issues. A series of controlled experiments are conducted on a widely used database system named Cassandra to prove the validity and feasibility of the proposed method. Overheads brought about by the intrusive probing are also evaluated. The results show that DistAD can achieve more than 70% accuracy and 90% recall (in normal states) with no more than 2 times overheads compared with unmonitored executions.
翻译:现代软件系统变得日益复杂,因此难以测试和验证。在运行的复杂系统中检测软件部分异常现象有助于处理无意的软件行为,避免灾难性软件故障,并改进软件运行时间的可用性。这些检测技术的目的是在错误(异常)最终导致不可避免的故障之前查明其表现(异常),从而支持随后的运行时间错误容忍技术。在这项工作中,我们建议采用名为DistAD的新式异常检测方法,该方法的基础是分发软件运行时间动态执行痕迹。与其他使用关键性能指标的现有工作不同,在运行期间通过侵入性仪器收集执行跟踪。在取样机制之后对仪器进行控制,以避免过度的间接费用。双向长期短期内存(BI-LSTM)是用来识别错误(异常)的常规神经网络(NNN)的架构,用于实现异常检测。整个框架是根据一个名为“Clas Neural”网络(OCNN)的学习模式构建的,该模式可以帮助消除缺少足够标签样品和数据不平衡问题的限度。一系列受控实验是在广泛使用的数据库系统上进行,而没有使用一个名为Cassandrabrodument rodialal-dealation 也能够以证明70次展示结果。