Polythetic classifications, based on shared patterns of features that need neither be universal nor constant among members of a class, are common in the natural world and greatly outnumber monothetic classifications over a set of features. We show that threshold meta-learners, such as Prototypical Networks, require an embedding dimension that is exponential in the number of task-relevant features to emulate these functions. In contrast, attentional classifiers, such as Matching Networks, are polythetic by default and able to solve these problems with a linear embedding dimension. However, we find that in the presence of task-irrelevant features, inherent to meta-learning problems, attentional models are susceptible to misclassification. To address this challenge, we propose a self-attention feature-selection mechanism that adaptively dilutes non-discriminative features. We demonstrate the effectiveness of our approach in meta-learning Boolean functions, and synthetic and real-world few-shot learning tasks.
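To make the threshold-versus-attentional distinction concrete, here is a minimal sketch (hypothetical illustration, not the paper's code) using XOR parity, the simplest polythetic Boolean function over two task-relevant features. With an identity embedding, a Prototypical-Network-style classifier collapses both class means to the same point, while a Matching-Network-style attentional classifier recovers the label from its nearest support examples:

```python
import numpy as np

# XOR parity as a polythetic task: no single feature separates the classes.
support_x = np.array([[0., 0.], [1., 1.],   # class 0 (XOR = 0)
                      [0., 1.], [1., 0.]])  # class 1 (XOR = 1)
support_y = np.array([0, 0, 1, 1])

def prototype_predict(query):
    # Threshold classifier (Prototypical-Network style): nearest class mean.
    protos = np.stack([support_x[support_y == c].mean(0) for c in (0, 1)])
    # Both prototypes collapse to (0.5, 0.5) here, so the distances tie
    # and parity cannot be recovered without a higher-dimensional embedding.
    d = np.linalg.norm(protos - query, axis=1)
    return int(d.argmin())

def attention_predict(query, temp=1.0):
    # Attentional classifier (Matching-Network style): softmax attention
    # over support similarities, followed by a weighted label vote.
    sims = -np.linalg.norm(support_x - query, axis=1) / temp
    w = np.exp(sims - sims.max())
    w /= w.sum()
    return int(round(w @ support_y))

q = np.array([1., 0.])        # true XOR label: 1
print(prototype_predict(q))   # unreliable: prototypes coincide, ties to 0
print(attention_predict(q))   # 1: the nearest support points dominate
```

In this sketch the attentional model is polythetic "for free" because each query is compared against individual support examples rather than a pooled class summary; appending task-irrelevant noise features to `support_x` would degrade its similarities, which is the failure mode the paper's self-attention feature-selection mechanism is designed to mitigate.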