Constructing an ensemble from a heterogeneous set of unsupervised anomaly detection methods is challenging because the class labels, or ground truth, are unknown. Thus, traditional ensemble techniques that rely on the response variable or class labels cannot be used to construct an ensemble for unsupervised anomaly detection. We use Item Response Theory (IRT), a class of models used in educational psychometrics to assess student and test question characteristics, to construct an unsupervised anomaly detection ensemble. IRT's latent trait computation lends itself to anomaly detection because the latent trait can be used to uncover the hidden ground truth. Using a novel mapping of IRT to the anomaly detection problem, we construct an ensemble that downplays noisy, non-discriminatory methods and accentuates sharper methods. We demonstrate the effectiveness of the IRT ensemble on an extensive data repository by comparing its performance to that of other ensemble techniques.