Printed documents continue to be a challenge for blind, low-vision, and other print-disabled (BLV) individuals. In this paper, we focus on the specific problem of the (in)accessibility of internal references to citations, footnotes, figures, tables, and equations. While sighted users can flip to the referenced content and flip back in seconds, the linear audio narration that BLV individuals rely on makes following these references extremely hard. We propose a vision-based technique to locate the referenced content and extract the metadata needed to (in subsequent work) inline a content summary into the audio narration. We apply our technique to citations in scientific documents and find that it works well on both born-digital and scanned documents.