We propose V-Doc, a question-answering tool that works on document images and PDFs, aimed mainly at researchers and at non-experts in deep learning who want to generate, process, and understand document visual question answering tasks. V-Doc supports generating and using both extractive and abstractive question-answer pairs from document images. Extractive QA selects a subset of tokens or phrases from the document contents as the predicted answer, while abstractive QA recognises the language in the content and generates the answer using a trained model. Both aspects are crucial to understanding documents, especially documents in image format. We include a detailed scenario of question generation for the abstractive QA task. V-Doc supports a wide range of datasets and models, and is highly extensible through a declarative, framework-agnostic platform.