Pre-trained Language Models (PLMs) have achieved remarkable performance gains across numerous downstream tasks in natural language understanding. Various Chinese PLMs have been successively proposed to learn better Chinese language representations. However, most current models use Chinese characters as inputs and are unable to encode the semantic information contained in Chinese words. While recent pre-trained models incorporate both words and characters simultaneously, they usually suffer from deficient semantic interactions and fail to capture the semantic relations between words and characters. To address these issues, we propose a simple yet effective PLM, CLOWER, which adopts Contrastive Learning Over Word and charactER representations. In particular, CLOWER implicitly encodes coarse-grained information (i.e., words) into fine-grained representations (i.e., characters) through contrastive learning over multi-grained information. CLOWER is of great value in realistic scenarios because it can be easily incorporated into any existing fine-grained PLM without modifying production pipelines. Extensive experiments conducted on a range of downstream tasks demonstrate the superior performance of CLOWER over several state-of-the-art baselines.
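The abstract describes contrastive learning that aligns coarse-grained (word) and fine-grained (character) representations, but it does not spell out the objective. A common instantiation of such an alignment is an in-batch InfoNCE loss; the sketch below is a hypothetical illustration only, and the names `word_repr`, `char_repr`, and `temperature` are assumptions rather than the CLOWER implementation.

```python
# Minimal sketch of contrastive alignment between word-level and
# character-level sentence representations (InfoNCE with in-batch negatives).
# This is an illustrative assumption, not the paper's exact objective.
import torch
import torch.nn.functional as F

def multi_grained_info_nce(word_repr: torch.Tensor,
                           char_repr: torch.Tensor,
                           temperature: float = 0.05) -> torch.Tensor:
    """word_repr, char_repr: (batch, hidden) pooled representations of the
    same sentences at word and character granularity."""
    # L2-normalize so the dot product becomes a cosine similarity.
    w = F.normalize(word_repr, dim=-1)
    c = F.normalize(char_repr, dim=-1)
    # Similarity of every character-view sentence with every word-view sentence.
    logits = c @ w.t() / temperature          # (batch, batch)
    # The matching pair lies on the diagonal; other columns act as negatives.
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)

# Usage (hypothetical encoders):
# loss = multi_grained_info_nce(word_encoder_output, char_encoder_output)
```

Pulling the word-view and character-view of the same sentence together while pushing apart other sentences in the batch is one way to inject coarse-grained semantics into character-level representations without changing the character-based inference pipeline.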