Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews to determine (a) the target entity being reviewed, (b) the high-level aspect to which it belongs, and (c) the sentiment expressed toward the targets and the aspects. Corpora for ABSA are numerous yet scattered, which makes it difficult for researchers to quickly identify the corpora best suited to a specific ABSA subtask. This study presents a database of corpora that can be used to train and assess autonomous ABSA systems. We also provide an overview of the major corpora for ABSA and its subtasks and highlight several corpus features that researchers should consider when selecting a corpus. We conclude that further large-scale ABSA corpora are required. Moreover, because each corpus is constructed differently, experimenting with a novel ABSA algorithm on many corpora is time-consuming, so researchers often employ just one or a few. The field would benefit from an agreement on a data standard for ABSA corpora. Finally, we discuss the advantages and disadvantages of current collection approaches and make recommendations for future ABSA dataset gathering.
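To make the three outputs named above concrete, the following is a minimal sketch of how a single annotated review might be represented. It is not the data standard the abstract calls for, and it is not the format of any particular corpus: the class names, field names, category labels, and example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Opinion:
    """One (target, aspect, sentiment) annotation. Hypothetical schema."""
    target: Optional[str]  # target expression in the text; None if implicit
    aspect: str            # high-level aspect category (illustrative labels)
    polarity: str          # "positive", "negative", or "neutral"


@dataclass
class AnnotatedReview:
    """A review text together with its opinion annotations."""
    text: str
    opinions: List[Opinion]


def polarity_counts(review: AnnotatedReview) -> Dict[str, int]:
    """Count how many opinions of each polarity a review contains."""
    counts: Dict[str, int] = {}
    for opinion in review.opinions:
        counts[opinion.polarity] = counts.get(opinion.polarity, 0) + 1
    return counts


# Invented example sentence; labels are assumptions, not corpus data.
review = AnnotatedReview(
    text="The pasta was delicious but the service was slow.",
    opinions=[
        Opinion(target="pasta", aspect="FOOD#QUALITY", polarity="positive"),
        Opinion(target="service", aspect="SERVICE#GENERAL", polarity="negative"),
    ],
)

print(polarity_counts(review))
```

A shared representation along these lines is one way an agreed data standard could reduce the per-corpus conversion effort described above, since an algorithm written against one schema could then be run on any corpus converted to it.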