Comprehensive Analysis of Aspect Term Extraction Methods using Various Text Embeddings

Recently, a variety of model designs and methods have emerged in sentiment analysis. However, wide and comprehensive studies of aspect-based sentiment analysis (ABSA) are still lacking. To fill this gap, we propose a comparison, with ablation analysis, of aspect term extraction using various text embedding methods. We focus in particular on architectures based on long short-term memory (LSTM) with an optional conditional random field (CRF) enhancement and different pre-trained word embeddings. Moreover, we analyze how extending the word vectorization step with character embeddings influences performance. The experimental results on SemEval datasets reveal that not only does bi-directional long short-term memory (BiLSTM) outperform regular LSTM, but word embedding coverage and its source also strongly affect aspect detection performance. An additional CRF layer consistently improves the results as well.
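
The abstract does not define "word embedding coverage". As a minimal sketch, assuming it means the share of dataset tokens that have a vector in the pre-trained vocabulary, it could be measured as below. The function name `embedding_coverage`, the lowercasing step, and the toy vocabulary are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, Set


def embedding_coverage(tokens: Iterable[str], vocab: Set[str]) -> float:
    """Fraction of tokens that have an entry in a pre-trained vocabulary.

    Assumption: tokens are lowercased before lookup; real embedding
    sets may be case-sensitive.
    """
    tokens = list(tokens)
    if not tokens:
        return 0.0
    covered = sum(1 for tok in tokens if tok.lower() in vocab)
    return covered / len(tokens)


# Hypothetical review sentence and a toy vocabulary.
sentence = ["The", "battery", "life", "is", "gr8"]
toy_vocab = {"the", "battery", "life", "is"}
coverage = embedding_coverage(sentence, toy_vocab)
```

Tokens without a vector (here the informal "gr8") are the ones a character-level embedding could still represent, which is one reason the two are often combined.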

Related content

Conditional random fields

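Conditional random fields (CRFs) are a discriminative probabilistic model and a type of random field, commonly used to label or parse sequential data such as natural-language text or biological sequences. Like a Markov random field, a CRF is an undirected graphical model: the vertices represent random variables, and the edges between vertices represent dependencies between those variables. In a CRF, the distribution of the random variables Y is a conditional probability, and the given observations are the random variables X. In principle the layout of the graph can be arbitrary, but the most commonly used layout is a chain structure, for which efficient algorithms exist for training, inference, and decoding.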
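
To make the decoding step for the chain structure concrete, here is a minimal sketch of Viterbi decoding over per-position label scores and label-to-label transition scores. It assumes the scores are already given (in a trained CRF they would come from learned feature weights); the function name, the three-label O/B/I scheme, and the numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence for a chain-structured model.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    if not emissions:
        return []
    n_labels = len(emissions[0])
    # best[j] = best score of any path that ends in label j at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_labels):
            # Pick the previous label that maximizes the score of reaching j.
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    last = max(range(n_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical labels: 0 = O (outside), 1 = B (begin aspect), 2 = I (inside aspect).
emissions = [[2.0, 1.0, 0.0],
             [0.0, 1.0, 1.5],
             [1.0, 0.0, 0.5]]
# Heavily penalize the O -> I transition; all other transitions are neutral.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
path = viterbi_decode(emissions, transitions)  # [0, 1, 0]
```

Picking the best label at each position independently would give O, I, O here, which contains the penalized O to I step. Decoding over the whole chain returns O, B, O instead, and runs in time linear in the sequence length and quadratic in the number of labels.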

[2020 keyword extraction] Keyword extraction and structuralization of medical reports

[Peking University] Jointly Modeling Aspect and Sentiment with Dynamic Heterogeneous Graph Neural Networks
