Research on Strategies for Extracting Representative Opinions from Product Reviews for Enterprises

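Project title: Research on Strategies for Extracting Representative Opinions from Product Reviews for Enterprises (面向企业的商品评论代表性意见提取策略研究)

Project number: No.71302158

Project type: Young Scientists Fund

Year approved: 2014

Discipline: Management Science

Project author: Ren Ming (任明)

Affiliation: Renmin University of China

Funding: RMB 190,000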

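Abstract (translated from the Chinese): The boom in online shopping and the rise of Web 2.0 applications have brought a flood of product reviews. These reviews reflect a product's reputation: they help customers make purchasing decisions and give enterprises a new channel for understanding customer opinions. How to extract representative opinions from a large body of product reviews has attracted wide attention in academia and industry. At present, product reviews are typically ranked by indicators such as popularity, freshness, and helpfulness to serve customers' needs, but such rankings do not necessarily surface a rich and diverse range of opinions. Starting from enterprises' application needs, this study aims to extract a set of representative opinions that covers as much of the information in the overall body of opinions as possible with as little redundancy as possible. The research takes information extraction strategies as its main thread, supplemented by methods from opinion mining and sentiment analysis, to deepen the understanding of opinion text so that the extracted representative opinions effectively reflect opinions on different aspects. The work is organized around four areas: (1) a theoretical framework for representative opinion extraction; (2) the representation of opinion text; (3) methods for representative opinion extraction; (4) validation of representative opinions. The research emphasizes validation on real data and through user experiments, and the work has both theoretical novelty and practical value.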

Keywords (translated from the Chinese): representative information; information extraction; coverage; redundancy; online reviews

English abstract: With the rapid growth of e-commerce and Web 2.0 applications, an enormous number of product reviews have emerged. They not only help consumers find product information when making decisions, but also enable enterprises to listen to consumers and improve their products. Because reading through all the reviews is neither practical nor interesting, it has become meaningful to extract a small set of reviews. In practice, many online information search services rank reviews by criteria such as hotness, freshness, and usefulness and present the highly ranked ones to consumers. However, such an ordered list does not necessarily represent all the different viewpoints (e.g., positive vs. negative) on the products. This study attempts to extract, for enterprises, representative opinions from product reviews that cover the information content of the reviews as much as possible while minimizing redundancy. The extraction of representative opinions is then formulated as an optimization problem based on an aggregate measure of coverage and redundancy. The study focuses on four aspects: the framework for extracting representative opinions, the formulation of opinions in terms of feature and opinion polarity, the extraction algorithm, and the evaluation of the approach.
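
To make the "aggregate measure of coverage and redundancy" concrete, here is a minimal illustrative sketch. It is not the project's actual method: the abstract does not give the measure or the algorithm. The sketch assumes each review is reduced to a set of (feature, polarity) pairs, defines coverage as the share of distinct pairs in the corpus that the selected reviews mention, defines redundancy as the share of repeated mentions within the selection, and combines them as `coverage - lam * redundancy`. The greedy search, the function names, and the weight `lam` are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Assumed representation: (product feature, polarity), e.g. ("battery", "+").
Opinion = Tuple[str, str]


def score(reviews: List[Set[Opinion]], idx: List[int],
          universe: Set[Opinion], lam: float) -> float:
    """Hypothetical aggregate measure: coverage minus weighted redundancy."""
    covered: Set[Opinion] = set().union(*(reviews[i] for i in idx))
    mentions = sum(len(reviews[i]) for i in idx)
    # Coverage: fraction of all distinct opinions that the selection mentions.
    cov = len(covered) / len(universe) if universe else 0.0
    # Redundancy: fraction of mentions in the selection that are repeats.
    red = 1.0 - len(covered) / mentions if mentions else 0.0
    return cov - lam * red


def select_representative(reviews: List[Set[Opinion]], k: int,
                          lam: float = 0.5) -> List[int]:
    """Greedily pick up to k review indices that improve the aggregate score."""
    universe: Set[Opinion] = set().union(*reviews)
    selected: List[int] = []
    current = 0.0  # score of the empty selection
    while len(selected) < k:
        best_i, best_score = None, current
        for i in range(len(reviews)):
            if i in selected:
                continue
            s = score(reviews, selected + [i], universe, lam)
            if s > best_score:  # strict: ties keep the earlier review
                best_i, best_score = i, s
        if best_i is None:  # no remaining review improves the score
            break
        selected.append(best_i)
        current = best_score
    return selected


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    reviews = [
        {("battery", "+"), ("screen", "+")},
        {("battery", "+")},
        {("screen", "+"), ("price", "-")},
        {("battery", "-")},
        {("price", "-")},
    ]
    print(select_representative(reviews, k=3))
```

A greedy heuristic is used here only because it is short and easy to follow; it does not guarantee an optimal selection, and other definitions of coverage and redundancy would lead to different results.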

English keywords: Representative information; Information extraction; Coverage; Redundancy; Online reviews
