Sentiment analysis on software engineering (SE) texts is widely used in SE research, for example to evaluate app reviews or to analyze developers' sentiments in commit messages. To better support automated sentiment analysis for SE tasks, researchers have built an SE-domain-specific sentiment dictionary to further improve the accuracy of the results. Unfortunately, recent work has reported that current mainstream sentiment analysis tools still cannot provide reliable results on SE texts. We suggest that the reason is that sentiments are expressed in SE texts very differently from how they are expressed in social network posts or movie comments. In this paper, we propose to improve sentiment analysis on SE texts by using sentence structures, a different perspective from building a domain dictionary. Specifically, we use sentence structures first to identify whether the author is expressing her sentiment in a given clause of an SE text, and then to adjust the calculation of the sentiments that are confirmed in that clause. An empirical evaluation on four different datasets shows that our approach outperforms two dictionary-based baseline approaches and is more generalizable than a learning-based baseline approach.
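To make the two-step idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It first decides, clause by clause, whether a sentiment is being expressed at all, and then adjusts the dictionary score of the clauses that pass. Everything specific here is an assumption made for illustration: the toy `LEXICON`, the `NEGATIONS` list, the rule that questions and clauses opening with "if", "whether", or "unless" express no sentiment, and the rule that a negation word directly before a sentiment word flips its polarity.

```python
import re

# Hypothetical toy lexicon; a real system would use a full sentiment dictionary.
LEXICON = {"great": 2, "love": 3, "nice": 2,
           "bad": -2, "hate": -3, "ugly": -2, "broken": -2}

# Assumed negation words that flip the polarity of the next sentiment word.
NEGATIONS = {"not", "never", "no", "don't", "doesn't", "isn't", "can't"}

# Assumed clause openers signalling that no sentiment is being expressed.
NON_EXPRESSIVE_OPENERS = {"if", "whether", "unless"}


def split_clauses(text):
    """Split after . ! ? ; or , while keeping the punctuation on each clause."""
    parts = re.split(r"(?<=[.!?;,])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(clause):
    """Lowercase word tokens; apostrophes are kept so "don't" stays whole."""
    return re.findall(r"[a-z']+", clause.lower())


def expresses_sentiment(clause):
    """Step 1: structural filter deciding whether the clause counts at all."""
    tokens = tokenize(clause)
    if not tokens:
        return False
    if clause.rstrip().endswith("?"):          # questions are skipped
        return False
    if tokens[0] in NON_EXPRESSIVE_OPENERS:    # conditional clauses are skipped
        return False
    return True


def clause_score(clause):
    """Step 2: dictionary score, flipped when directly preceded by a negation."""
    tokens = tokenize(clause)
    score = 0
    for i, tok in enumerate(tokens):
        value = LEXICON.get(tok, 0)
        if value and i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def text_score(text):
    """Sum the adjusted scores of the clauses that pass the structural filter."""
    return sum(clause_score(c) for c in split_clauses(text)
               if expresses_sentiment(c))
```

Under these assumed rules, `text_score("If the build is broken, ping me.")` ignores the conditional clause, whereas a plain dictionary lookup would count "broken" as negative. This kind of case, where a sentiment word appears without any sentiment being expressed, is what the structural filter is meant to catch.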