Novelty detection methods aim to partition the test units into already observed and previously unseen patterns. However, two significant issues arise: there may be considerable interest in identifying specific structures within the novelty, and contamination in the known classes could completely blur the actual separation between manifest and new groups. Motivated by these problems, we propose a two-stage Bayesian semiparametric novelty detector that builds upon prior information robustly extracted from a set of complete learning units. We devise a general-purpose multivariate methodology, which we also extend to handle functional data objects. We provide insights into the model behavior by investigating the theoretical properties of the associated semiparametric prior. From a computational point of view, we propose a suitable $\boldsymbol{\xi}$-sequence to construct an independent slice-efficient sampler that takes into account the difference between manifest and novelty components. We showcase our model's performance through an extensive simulation study and applications to both multivariate and functional datasets, in which diverse and distinctive unknown patterns are discovered.