One-shot learning has become an important research topic in the last decade, with many real-world applications. The goal of one-shot learning is to classify unlabeled instances when only one labeled example is available per class. The conventional problem setting mainly focuses on data that is already represented in a feature space (such as images). However, data instances in real-world applications are often more complex, and feature vectors may not be available. In this paper, we study the problem of one-shot learning on attributed sequences, where each instance is composed of a set of attributes (e.g., a user profile) and a sequence of categorical items (e.g., a clickstream). This problem is important for a variety of real-world applications, ranging from fraud prevention to network intrusion detection. It is also more challenging than conventional one-shot learning because there are dependencies between attributes and sequences. We design a deep learning framework, OLAS, to tackle this problem. OLAS utilizes a twin network to generalize the features from pairwise attributed sequence examples. Empirical results on real-world datasets demonstrate that OLAS can outperform state-of-the-art methods under a rich variety of parameter settings.
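To make the twin-network idea concrete, the following is a minimal sketch of one-shot classification over attributed sequences. It is not the OLAS architecture, which the abstract does not specify. The names `TwinEncoder`, `distance`, and `classify_one_shot` are hypothetical, the weights are random and untrained, and the toy attribute vectors and item IDs are made up for illustration. The sketch only shows the structural point: a single encoder is shared by both branches, the attributes condition every step of the sequence encoding, and a query is assigned the label of the closest single labeled example.

```python
import math
import random


class TwinEncoder:
    """Tiny recurrent encoder; one weight set is reused for both twin branches.

    Illustrative assumption only: weights are random and untrained.
    """

    def __init__(self, n_attrs, n_items, hidden=8, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.hidden = hidden
        self.w_attr = mat(hidden, n_attrs)  # attributes -> hidden
        self.w_item = mat(hidden, n_items)  # one column per categorical item
        self.w_rec = mat(hidden, hidden)    # previous hidden state -> hidden

    def encode(self, attrs, seq):
        # The attribute contribution is added at every step, so the sequence
        # embedding depends on the attributes as well as on the items.
        attr_part = [sum(w * a for w, a in zip(row, attrs)) for row in self.w_attr]
        h = [0.0] * self.hidden
        for item in seq:
            h = [
                math.tanh(
                    attr_part[i]
                    + self.w_item[i][item]
                    + sum(self.w_rec[i][j] * h[j] for j in range(self.hidden))
                )
                for i in range(self.hidden)
            ]
        return h


def distance(u, v):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify_one_shot(encoder, support, query):
    """Return the label whose single labeled example is closest to the query."""
    q = encoder.encode(*query)
    return min(support, key=lambda label: distance(encoder.encode(*support[label]), q))


# Toy data: each instance is (attribute vector, sequence of item IDs).
encoder = TwinEncoder(n_attrs=2, n_items=4)
support = {
    "normal": ([0.2, 1.0], [0, 1, 2, 1]),
    "fraud": ([0.9, 0.0], [3, 3, 0, 3]),
}
query = ([0.25, 0.9], [0, 1, 2, 2])
print(classify_one_shot(encoder, support, query))
```

In a trained system the shared weights would be learned from pairs of attributed sequences, for example with a contrastive objective that pulls same-class pairs together and pushes different-class pairs apart. That training step is omitted here.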