Today there is no shortage of outlier detection algorithms in the literature, yet the complementary and critical problem of unsupervised outlier model selection (UOMS) is vastly understudied. In this work we propose ELECT, a new approach for selecting an effective candidate model, i.e., an outlier detection algorithm and its hyperparameter(s), to employ on a new dataset without any labels. At its core, ELECT is based on meta-learning: it transfers prior knowledge (e.g., model performance) from historical datasets that are similar to the new one to facilitate UOMS. Uniquely, it employs a dataset similarity measure that is performance-based, which is more direct and goal-driven than other measures used in the past. ELECT adaptively searches for similar historical datasets and, as such, can serve an output on demand, accommodating varying time budgets. Extensive experiments show that ELECT significantly outperforms a wide range of basic UOMS baselines, including no model selection (always using the same popular model such as iForest) as well as more recent selection strategies based on meta-features.
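To make the setup concrete, the following is a minimal, illustrative sketch of meta-learning-based model selection in this spirit; it is not the ELECT algorithm itself. In particular, the similarity proxy used here (Kendall-tau agreement between a historical dataset's true performance vector and pseudo-performances computed on the new data against a consensus of candidate scores) is an assumption made purely for illustration, and the candidate pool, helper names, and use of scikit-learn detectors are likewise hypothetical.

```python
# Illustrative sketch only: a toy meta-learning selector for unsupervised
# outlier model selection, NOT the ELECT algorithm. The performance-based
# similarity proxy below is an assumption made for illustration.
import numpy as np
from scipy.stats import kendalltau
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import LocalOutlierFactor


def candidate_models():
    """Candidate (detector, hyperparameter) configurations."""
    return {
        "iforest_50":  IsolationForest(n_estimators=50, random_state=0),
        "iforest_200": IsolationForest(n_estimators=200, random_state=0),
        "lof_10":      LocalOutlierFactor(n_neighbors=10),
        "lof_40":      LocalOutlierFactor(n_neighbors=40),
    }


def outlier_scores(model, X):
    """Fit a candidate and return scores where higher = more outlying."""
    model.fit(X)
    if isinstance(model, LocalOutlierFactor):
        return -model.negative_outlier_factor_
    return -model.score_samples(X)  # IsolationForest: lower = more abnormal


def true_performances(X, y):
    """Meta-train side: ROC-AUC of every candidate (labels are available)."""
    return {name: roc_auc_score(y, outlier_scores(m, X))
            for name, m in candidate_models().items()}


def pseudo_performances(X):
    """New-dataset side: no labels, so score each candidate against the
    consensus of all candidates' rank-normalized outlier scores."""
    scores = {n: outlier_scores(m, X) for n, m in candidate_models().items()}
    ranks = {n: np.argsort(np.argsort(s)) / len(s) for n, s in scores.items()}
    consensus = np.mean(list(ranks.values()), axis=0)
    return {n: np.corrcoef(r, consensus)[0, 1] for n, r in ranks.items()}


def select_model(history, X_new):
    """history: list of (X, y) labeled meta-train datasets.
    Returns the name of the best candidate on the most similar dataset."""
    pseudo = pseudo_performances(X_new)
    names = sorted(pseudo)
    best_sim, best_model = -np.inf, None
    for X_hist, y_hist in history:
        perf = true_performances(X_hist, y_hist)
        sim, _ = kendalltau([pseudo[n] for n in names],
                            [perf[n] for n in names])
        if sim > best_sim:
            best_sim, best_model = sim, max(perf, key=perf.get)
    return best_model
```

In this sketch, the historical datasets carry ground-truth labels (so true detector performance is known there), while the new dataset does not; similarity is therefore estimated from performance-like signals rather than from hand-crafted meta-features, mirroring the performance-based, goal-driven flavor described in the abstract.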