Millions of people post opinions in forums and social networks, driven by the growing number of users who turn to Web 2.0 platforms to comment on brands and organizations. For enterprises and government agencies, it is almost impossible to track what people say, which produces a gap between user needs and expectations and the actions organizations take. To bridge this gap we created Viscovery, a platform for opinion summarization and trend tracking that analyzes a stream of opinions retrieved from forums. Viscovery uses dynamic topic models, which uncover the hidden structure of topics behind opinions and characterize how vocabulary changes over time. We extend dynamic topic models to support incremental learning, a key requirement in Viscovery for updating the model in near real time. Viscovery also includes sentiment analysis, which separates positive and negative words for a specific topic at different levels of granularity. The platform visualizes representative opinions and terms for each topic. At a coarse level of granularity, topic dynamics can be analyzed using a 2D topic embedding, which suggests where topics merge or split over time. In this paper we report our experience developing the platform, sharing lessons learned and opportunities that arise from using sentiment analysis and topic modeling in real-world applications.