The rise of pre-trained language models has yielded substantial progress in the vast majority of Natural Language Processing (NLP) tasks. However, a generic approach towards the pre-training procedure can naturally be sub-optimal in some cases. In particular, fine-tuning a pre-trained language model on a source domain and then applying it to a different target domain results in a sharp performance decline of the eventual classifier for many source-target domain pairs. Moreover, in some NLP tasks, the output categories substantially differ between domains, making adaptation even more challenging. This happens, for example, in the task of aspect extraction, where the aspects of interest in reviews of, e.g., restaurants and electronic devices can be very different. This paper presents a new fine-tuning scheme for BERT, which aims to address the above challenges. We name this scheme DILBERT: Domain Invariant Learning with BERT, and customize it for aspect extraction in the unsupervised domain adaptation setting. DILBERT harnesses the categorical information of both the source and the target domains to guide the pre-training process towards a more domain- and category-invariant representation, thus closing the gap between the domains. We show that DILBERT yields substantial improvements over state-of-the-art baselines while using a fraction of the unlabeled data, particularly in more challenging domain adaptation setups.
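To make the general idea concrete, below is a minimal, hypothetical sketch of category-guided masked-language-model fine-tuning with HuggingFace `transformers`. The category term list, the masking rule, and the single training step are illustrative assumptions for exposition only, not the exact DILBERT procedure described in the paper.

```python
# Hypothetical sketch: steer BERT's intermediate fine-tuning with domain category
# information by masking (and predicting) only category-related tokens, so the
# model's representation focuses on information shared across domains.
# The category set and hyperparameters below are illustrative assumptions.

import torch
from transformers import BertTokenizerFast, BertForMaskedLM

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.train()

# Hypothetical category vocabularies drawn from a source (restaurants)
# and a target (laptops) domain.
category_terms = {"food", "service", "ambience", "battery", "screen", "keyboard"}

def mask_category_tokens(text):
    """Mask only tokens that match the (hypothetical) category-related terms;
    all other positions are ignored by the MLM loss (label -100)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)
    tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
    for i, tok in enumerate(tokens):
        if tok in category_terms:
            labels[0, i] = input_ids[0, i]          # predict the original token
            input_ids[0, i] = tokenizer.mask_token_id  # replace it with [MASK]
    enc["input_ids"] = input_ids
    return enc, labels

# One illustrative training step on an unlabeled target-domain review sentence.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
enc, labels = mask_category_tokens("the battery life is great but the screen is dim")
loss = model(**enc, labels=labels).loss
loss.backward()
optimizer.step()
```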