Natural language understanding (NLU) has made massive progress driven by large benchmarks, but benchmarks often leave a long tail of infrequent phenomena underrepresented. We reflect on the question: have transfer learning methods sufficiently addressed the poor performance of benchmark-trained models on the long tail? We conceptualize the long tail using macro-level dimensions (e.g., underrepresented genres, topics), and perform a qualitative meta-analysis of 100 representative papers on transfer learning research for NLU. Our analysis asks three questions: (i) Which long tail dimensions do transfer learning studies target? (ii) Which properties of adaptation methods help improve performance on the long tail? (iii) Which methodological gaps have the greatest negative impact on long tail performance? Our answers highlight major avenues for future research in transfer learning for the long tail. Lastly, using our meta-analysis framework, we perform a case study comparing the performance of various adaptation methods on clinical narratives, which provides insights that may help make progress along these avenues.