Towards Automatic Parsing of Structured Visual Content through the Use of Synthetic Data

Lukas Scholch, Jonas Steinhauser, Maximilian Beichter, Constantin Seibold, Kailun Yang, Merlin Knäble, Thorsten Schwarz, Alexander Mädche, Rainer Stiefelhagen

From arXiv, 7 pages

Structured Visual Content (SVC), such as graphs and flow charts, is used by authors to illustrate various concepts. While such depictions help the average reader understand the content, images containing SVC are typically not machine-readable. This hinders not only automated knowledge aggregation but also the perception of the displayed information by visually impaired people. In this work, we propose a synthetic dataset containing SVC images together with their ground truths. We demonstrate the use of this dataset with an application that automatically extracts a graph representation from an SVC image, by training a model with common supervised learning methods. As no large-scale public datasets currently exist for the detailed analysis of SVC, we propose the Synthetic SVC (SSVC) dataset, comprising 12,000 images with corresponding bounding box annotations and detailed graph representations. Our dataset enables the development of strong models for the interpretation of SVC while skipping time-consuming dense data annotation. We evaluate our model on both synthetic and manually annotated data and show the transferability from synthetic to real data via various metrics, given the presented application. We find that this proof of concept is feasible to some extent and establish a solid baseline for this task. We discuss the limitations of our approach to guide further improvements. The metrics we use can serve as a tool for future comparisons in this domain. To enable further research on this task, the dataset is publicly available at https://bit.ly/3jN1pJJ
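To make the idea of "bounding box annotations and detailed graph representations" more concrete, here is a minimal Python sketch of how one annotated SVC image might be represented and how an extracted graph could be compared against its ground truth. Everything in it is an assumption for illustration only: the `Node` and `SVCGraph` classes, the greedy IoU matching, the 0.5 threshold, and the edge precision/recall score are hypothetical and are not the SSVC annotation format or the metrics used in the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One graph node with its bounding box (hypothetical schema)."""
    node_id: int
    label: str
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class SVCGraph:
    """Graph representation of one SVC image (hypothetical schema)."""
    nodes: list[Node] = field(default_factory=list)
    edges: set[tuple[int, int]] = field(default_factory=set)  # (source_id, target_id)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_nodes(pred, truth, threshold=0.5):
    """Greedily map each predicted node to an unused ground-truth node by IoU."""
    mapping = {}
    used = set()
    for p in pred.nodes:
        best, best_iou = None, threshold
        for t in truth.nodes:
            if t.node_id in used:
                continue
            score = iou(p.box, t.box)
            if score >= best_iou:
                best, best_iou = t, score
        if best is not None:
            mapping[p.node_id] = best.node_id
            used.add(best.node_id)
    return mapping


def edge_precision_recall(pred, truth, threshold=0.5):
    """Score predicted edges after translating them into ground-truth node ids."""
    mapping = match_nodes(pred, truth, threshold)
    mapped = {
        (mapping[s], mapping[d])
        for s, d in pred.edges
        if s in mapping and d in mapping
    }
    correct = len(mapped & truth.edges)
    precision = correct / len(pred.edges) if pred.edges else 0.0
    recall = correct / len(truth.edges) if truth.edges else 0.0
    return precision, recall
```

Matching nodes by box overlap before comparing edges is one simple way to score a predicted graph whose node identifiers do not line up with the annotation; other matching strategies would give different numbers.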
