Measuring sentence similarity is a fundamental problem in natural language processing (NLP), and it must be measured accurately. Among the many existing approaches, deep learning achieves state-of-the-art performance in many NLP tasks and is widely used for sentence similarity measurement. However, the structure of a sentence, and of the words that make it up, is also important in NLP. In this study, we propose a method that combines a deep learning model with a method that considers lexical relationships. We evaluate with the Pearson and Spearman correlation coefficients. The proposed method outperforms current approaches on KorSTS, a standard Korean benchmark dataset, and improves performance by up to 65% over using deep learning alone. Experiments show that the proposed method generally performs better than a deep learning model used on its own.
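For readers unfamiliar with the two evaluation metrics, the following is a minimal sketch of how Pearson and Spearman correlation can be computed between gold similarity scores and model predictions. The function names and the sample scores are illustrative assumptions, not taken from the study; it is not the evaluation code used in the experiments.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold labels and model outputs (made-up values for illustration).
gold = [0.0, 1.2, 2.5, 3.8, 5.0]
pred = [0.1, 0.2, 0.4, 0.8, 1.6]

print("Pearson:", pearson(gold, pred))
print("Spearman:", spearman(gold, pred))
```

In this toy example the predictions preserve the ordering of the gold scores but not their spacing, so the Spearman coefficient is higher than the Pearson coefficient. This is one reason both metrics are commonly reported together.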