Concerns about reproducibility in artificial intelligence (AI) have grown as researchers report unsuccessful attempts to directly reproduce published findings in the field. Replicability, the ability to affirm a finding using the same procedures on new data, has not been well studied. In this paper, we examine both the reproducibility and the replicability of a corpus of 16 papers on table structure recognition (TSR), an AI task that aims to identify the locations of table cells in digital documents. We first attempt to reproduce published results using the code and datasets provided by the original authors. We then examine replicability using a dataset similar to the original as well as a new dataset, GenTSR, consisting of 386 annotated tables extracted from scientific papers. Of the 16 papers studied, we reproduce results consistent with the original in only four. Two of these four papers are identified as replicable on the similar dataset for certain intersection over union (IoU) values. No paper is identified as replicable on the new dataset. We offer observations on the causes of irreproducibility and irreplicability. All code and data are available on Codeocean at https://codeocean.com/capsule/6680116/tree.
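Because the replicability result depends on the IoU value used, it may help to see how such a value typically enters a TSR score. The following is a minimal sketch, not the evaluation protocol of this paper or of any of the 16 studied papers: it assumes cells are axis-aligned boxes given as (x_min, y_min, x_max, y_max), that predictions are matched greedily one-to-one to ground-truth cells, and that a match counts when the IoU meets a chosen threshold. The names `iou` and `f1_at_threshold` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format (not specified in the abstract): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def f1_at_threshold(pred: List[Box], truth: List[Box], threshold: float) -> float:
    """F1 score under greedy one-to-one matching (an illustrative assumption)."""
    unmatched = list(truth)
    true_positives = 0
    for p in pred:
        # Pick the still-unmatched ground-truth cell that overlaps p the most.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy example: the second predicted cell is slightly too narrow.
    truth_cells = [(0, 0, 10, 10), (10, 0, 20, 10)]
    pred_cells = [(0, 0, 10, 10), (11, 0, 20, 10)]
    for thr in (0.8, 0.95):
        print(thr, f1_at_threshold(pred_cells, truth_cells, thr))
```

In this toy case the same predictions score differently at a loose and at a strict threshold, which illustrates why a conclusion about replicability can hold for some IoU values and not for others.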