Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling AI-driven scientific discovery. However, the current process is difficult, error-prone, and laborious due to (1) the enormous amount of scientific literature available; (2) the highly specialized scientific domains; (3) the diverse modalities of information (text, figures, tables); and (4) the silos of scientific knowledge in different publications with inconsistent formats and structures. Informed by a formative study and iterated with participatory design workshops, we designed and developed KnowledgeShovel, an AI-in-the-Loop document annotation system for researchers to construct scientific knowledge bases. The design of KnowledgeShovel introduces a multi-step, multi-modal human-AI collaboration pipeline that aligns with users' existing workflows to improve data accuracy while reducing the human burden. A follow-up user evaluation with 7 geoscience researchers shows that KnowledgeShovel can enable efficient construction of scientific knowledge bases with satisfactory accuracy.