With the advent of the Transformer, introduced for machine translation in 2017, attention-based architectures began to attract attention. After the emergence of BERT, which builds on and strengthens the Transformer's encoder for NLU tasks, and the GPT architecture, which builds on and strengthens the decoder for NLG tasks, various methodologies, datasets, and models for training Pretrained Language Models (PLMs) began to appear. Moreover, in the past three years, a variety of PLMs specialized for Korean have been released. In this paper, we compare and analyze, both quantitatively and qualitatively, the various Korean PLMs that have been released to the public.