There is a growing need to model fine-grained opinion shifts of social media users, as concerns about potential polarizing social effects grow. However, the lack of publicly available datasets suitable for the task presents a major challenge. In this paper, we introduce an innovative annotated dataset for modeling subtle opinion fluctuations and detecting fine-grained stances. The dataset includes a sufficient number of stance polarity and intensity labels per user, over time and within entire conversational threads, making subtle opinion fluctuations detectable in both the long term and the short term. All posts are annotated by non-experts, and a significant portion of the data is also annotated by experts. We provide a strategy for recruiting suitable non-experts. Our analysis of inter-annotator agreement shows that the annotations obtained from the majority vote of the non-experts are of comparable quality to those of the experts. We provide analyses of stance evolution at the short-term and long-term levels, a comparison of language usage between users with vacillating and resolute attitudes, and fine-grained stance detection baselines.