Tables are a powerful and popular tool for organizing and manipulating data. The Web contains a vast number of tables, which together constitute a valuable knowledge resource. The objective of this survey is to synthesize and present two decades of research on web tables. In particular, we organize existing literature into six main categories of information access tasks: table extraction, table interpretation, table search, question answering, knowledge base augmentation, and table augmentation. For each of these tasks, we identify and describe seminal approaches, present relevant resources, and point out interdependencies among the different tasks.