Conventional supervised learning, or classification, assumes that test samples are drawn from the same distribution as the training samples; this setting is called closed set learning or classification. In many practical scenarios the assumption fails because the test data contain samples from unknown (unseen) classes. This is the open set scenario, and the unknowns need to be detected. The resulting problem, known as open set recognition, is important in safety-critical applications. We propose to detect unknowns (unseen class samples) by learning pairwise similarities. The proposed method works in two steps. It first learns a closed set classifier on the seen classes, that is, the classes that appear in training, and then learns to compare seen classes with pseudo-unseen samples (automatically generated unseen class samples). The pseudo-unseen samples are generated by applying distribution-shifting augmentations to the seen (training) samples. We call our method OPG (Open set recognition based on Pseudo unseen data Generation). Experimental evaluation shows that the learned similarity-based features can successfully distinguish seen from unseen samples on benchmark datasets for open set recognition.
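The pseudo-unseen generation and pairing idea can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: rotation by a multiple of 90 degrees stands in for a distribution-shifting augmentation, images are square arrays, and the pair labels (1 for seen vs. seen, 0 for seen vs. pseudo-unseen) are a hypothetical choice for training a similarity model. The function names `make_pseudo_unseen` and `make_pairs` are illustrative and are not taken from the OPG method itself.

```python
import numpy as np


def make_pseudo_unseen(images, rng):
    """Turn seen images into pseudo-unseen ones by rotation.

    Assumption: rotation by 90, 180, or 270 degrees is used here as an
    example of a distribution-shifting augmentation; the abstract does not
    say which augmentations are used.
    images: array of shape (N, H, W) with H == W (square images).
    """
    # Draw 1, 2, or 3 quarter turns per image (never 0, so every image changes pose).
    quarter_turns = rng.integers(1, 4, size=len(images))
    return np.stack([np.rot90(img, k) for img, k in zip(images, quarter_turns)])


def make_pairs(seen, pseudo_unseen):
    """Build training pairs for a similarity model (hypothetical labeling).

    Label 1: both members are seen samples.
    Label 0: one seen sample paired with one pseudo-unseen sample.
    """
    n = len(seen)
    shifted = np.roll(np.arange(n), 1)  # pair each seen sample with its neighbour
    left = np.concatenate([seen, seen])
    right = np.concatenate([seen[shifted], pseudo_unseen])
    labels = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    return left, right, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seen_images = rng.random((4, 8, 8))  # toy stand-in for training images
    pseudo = make_pseudo_unseen(seen_images, rng)
    left, right, labels = make_pairs(seen_images, pseudo)
    print(left.shape, right.shape, labels)
```

In a full pipeline, the pairs would presumably be passed through features from the closed set classifier before a similarity score is learned; that stage is omitted here because the abstract does not describe it in detail.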