Nowadays, transformer-based models have gradually become the default choice for artificial intelligence practitioners, and they show superiority even in few-shot scenarios. In this paper, we revisit classical methods and propose a new few-shot alternative. Specifically, we investigate the few-shot one-class problem, which takes a known sample as a reference to determine whether an unknown instance belongs to the same class. This problem can be studied from the perspective of sequence matching. We show that with meta-learning, the classical sequence matching method, i.e., Compare-Aggregate, significantly outperforms transformer-based models while requiring a much lower training cost. Furthermore, we perform an empirical comparison between the two kinds of sequence matching approaches under simple fine-tuning and meta-learning, and find that meta-learning causes the transformer models' features to have highly correlated dimensions, an effect closely related to the number of layers and heads in the transformer models. Experimental code and data are available at https://github.com/hmt2014/FewOne
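To make the few-shot one-class setup concrete, the following is a minimal sketch (not the authors' implementation) of the sequence-matching view described above: a known reference sentence and a query sentence are encoded, softly aligned, compared, and aggregated into a single same-class score in the Compare-Aggregate style. All dimensions, module names, and the thresholding rule are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CompareAggregateMatcher(nn.Module):
    """Toy Compare-Aggregate scorer for one reference / one query pair."""

    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, hidden_dim)
        # Compare step: combine each query token with its aligned reference summary.
        self.compare = nn.Linear(2 * hidden_dim, hidden_dim)
        # Aggregate step: pool the compared vectors and score the pair.
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, reference: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # reference, query: (batch, seq_len) token-id tensors
        r = torch.relu(self.proj(self.embed(reference)))      # (B, Lr, H)
        q = torch.relu(self.proj(self.embed(query)))          # (B, Lq, H)
        # Soft alignment: each query token attends over the reference tokens.
        attn = torch.softmax(q @ r.transpose(1, 2), dim=-1)   # (B, Lq, Lr)
        aligned = attn @ r                                     # (B, Lq, H)
        compared = torch.relu(self.compare(torch.cat([q, aligned], dim=-1)))
        pooled = compared.mean(dim=1)                          # aggregate over query tokens
        return self.score(pooled).squeeze(-1)                  # same-class logit


if __name__ == "__main__":
    torch.manual_seed(0)
    model = CompareAggregateMatcher(vocab_size=1000)
    ref = torch.randint(0, 1000, (2, 12))   # the known sample used as reference
    qry = torch.randint(0, 1000, (2, 15))   # the unknown instance to be verified
    logit = model(ref, qry)
    # Decide "same class" when the match probability exceeds 0.5 (illustrative rule).
    print(torch.sigmoid(logit) > 0.5)
```

In a meta-learning setting, such a matcher would be trained over many episodes, each built from a different class's reference/query pairs, so that the match score generalizes to unseen classes.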