Model-agnostic meta-learning (MAML) has recently been put forth as a strategy for learning resource-poor languages in a sample-efficient fashion. Nevertheless, the properties of these languages are often not well represented by those available during training. Hence, we argue that the i.i.d. assumption ingrained in MAML makes it ill-suited for cross-lingual NLP. In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as the Bayes criterion. To increase its robustness to outlier languages, we create two variants of MAML based on alternative criteria: Minimax MAML reduces the maximum risk across languages, while Neyman-Pearson MAML constrains the risk in each language to a maximum threshold. Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium. We evaluate both model variants on two popular NLP tasks, part-of-speech tagging and question answering. We report gains for their average and minimum performance across low-resource languages in zero- and few-shot settings, compared to joint multi-source transfer and vanilla MAML.
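For concreteness, the three criteria can be sketched in standard decision-theoretic notation; the symbols below ($\mathcal{R}_\ell(\theta)$ for the risk of parameters $\theta$ on training language $\ell \in \mathcal{L}$, and $\epsilon$ for a per-language risk threshold) are illustrative notation, not definitions taken from this work:
$$
\text{Bayes:}\;\; \min_{\theta} \frac{1}{|\mathcal{L}|}\sum_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta),
\qquad
\text{Minimax:}\;\; \min_{\theta}\, \max_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta),
\qquad
\text{Neyman--Pearson:}\;\; \min_{\theta} \frac{1}{|\mathcal{L}|}\sum_{\ell \in \mathcal{L}} \mathcal{R}_\ell(\theta)
\;\;\text{s.t.}\;\; \mathcal{R}_\ell(\theta) \le \epsilon \;\; \forall \ell \in \mathcal{L}.
$$
Under this reading, the Bayes criterion averages risk uniformly over training languages, whereas the two variants either target the worst-case language or enforce an explicit risk bound per language.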