Reviews in motion: a large scale, longitudinal study of review recommendations on Yelp

The United Nations Consumer Protection Guidelines list "access ... to adequate information ... to make informed choices" as a core consumer protection right. However, problematic online reviews, and imperfections in the algorithms that detect them, pose obstacles to fulfilling this right. Research on reviews and review platforms often derives insights from a single web crawl, but the decisions such a crawl observes may not be static: a platform may feature a review one day and filter it from view the next. An appreciation of these dynamics is necessary to understand how a platform chooses which reviews consumers encounter and which reviews may be unhelpful or suspicious. We introduce a novel longitudinal angle to the study of reviews, focusing on "reclassification," wherein a platform changes its filtering decision for a review. To that end, we perform repeated web crawls of Yelp to create three longitudinal datasets that highlight the platform's dynamic treatment of reviews. In total, we compile over 12.5M reviews (more than 2M of them unique) across over 10k businesses. Our datasets are available for researchers to use. This longitudinal approach gives us a unique perspective on Yelp's classifier and allows us to explore reclassification. We find that reviews routinely move between Yelp's two main classifier classes ("Recommended" and "Not Recommended"), with up to 8% moving over eight years, which raises concerns about prior works' use of Yelp's classes as ground truth. These changes have impacts at small scales; for example, a business going from a 3.5 to a 4.5 star rating despite receiving no new reviews. Some reviews move multiple times: we observed up to five reclassifications in eleven months. Our data suggest demographic disparities in reclassifications, with more changes in lower density and low-middle income areas.
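
To make the notion of reclassification concrete, the following is a minimal sketch of how label changes could be counted across repeated crawls, and how they could shift a displayed star rating without any new reviews. It is not the authors' pipeline. The snapshot layout, the function names `count_reclassifications` and `displayed_rating`, and the rule that the displayed rating is the mean of "Recommended" reviews rounded to the nearest half star are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def count_reclassifications(snapshots):
    """Count label changes per review across time-ordered crawl snapshots.

    Assumed layout: each snapshot is a dict mapping review_id -> label,
    where label is "Recommended" or "Not Recommended". A review missing
    from a crawl is skipped and later compared with its last seen label.
    Only reviews with at least one change appear in the result.
    """
    last_label = {}
    changes = defaultdict(int)
    for snapshot in snapshots:
        for review_id, label in snapshot.items():
            previous = last_label.get(review_id)
            if previous is not None and previous != label:
                changes[review_id] += 1
            last_label[review_id] = label
    return dict(changes)


def displayed_rating(reviews):
    """Mean star value of "Recommended" reviews, rounded to a half star.

    Assumption for illustration only: the platform's real aggregation
    rule is not specified in the text above.
    `reviews` is a list of (stars, label) pairs.
    """
    stars = [s for s, label in reviews if label == "Recommended"]
    if not stars:
        return None
    mean = sum(stars) / len(stars)
    # floor(x + 0.5) avoids Python's round-half-to-even behaviour.
    return math.floor(mean * 2 + 0.5) / 2


# Three hypothetical crawls of the same business.
snapshots = [
    {"r1": "Recommended", "r2": "Not Recommended"},
    {"r1": "Not Recommended", "r2": "Not Recommended", "r3": "Recommended"},
    {"r1": "Recommended", "r3": "Recommended"},
]
reclassified = count_reclassifications(snapshots)  # r1 changes twice

# Same four hypothetical reviews before and after two are filtered.
before = [(5, "Recommended"), (4, "Recommended"),
          (2, "Recommended"), (3, "Recommended")]
after = [(5, "Recommended"), (4, "Recommended"),
         (2, "Not Recommended"), (3, "Not Recommended")]
rating_before = displayed_rating(before)  # mean 3.5
rating_after = displayed_rating(after)    # mean 4.5
```

In this toy case the business's displayed rating moves from 3.5 to 4.5 purely because two existing reviews change class, mirroring the kind of small-scale impact described above.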
