With the exponential growth of online marketplaces and user-generated content therein, aspect-based sentiment analysis has become more important than ever. In this work, we critically review a representative sample of the models published during the past six years through the lens of a practitioner, with an eye towards deployment in production. First, our rigorous empirical evaluation reveals poor reproducibility: an average 4-5% drop in test accuracy across the sample. Second, to further bolster our confidence in empirical evaluation, we report experiments on two challenging data slices, and observe a consistent 12-55% drop in accuracy. Third, we study the possibility of transfer across domains and observe that as little as 10-25% of the domain-specific training dataset, when used in conjunction with datasets from other domains within the same locale, largely closes the gap between complete cross-domain and complete in-domain predictive performance. Lastly, we open-source two large-scale annotated review corpora from a large e-commerce portal in India in order to aid the study of replicability and transfer, with the hope that it will fuel further growth of the field.