In low-resource settings, model transfer can help to overcome a lack of labeled data for many tasks and domains. However, predicting useful transfer sources is a challenging problem, as even the most similar sources might lead to unexpected negative transfer results. Thus, ranking methods based on task and text similarity may not be sufficient to identify promising sources. To tackle this problem, we propose a method to automatically determine which sources, and how many, should be exploited. To this end, we study the effects of model transfer on sequence labeling across various domains and tasks and show that our methods based on model similarity and support vector machines are able to predict promising sources, resulting in performance increases of up to 24 F1 points.