The development of digitization methods for line drawings, especially in electrical engineering, depends on publicly available training and evaluation data. This paper presents such an image set along with annotations. The dataset consists of 1152 images of 144 circuits drawn by 12 drafters, with 48 563 annotations in total. Each image depicts an electrical circuit diagram photographed with a consumer-grade camera under varying lighting conditions and perspectives. A variety of pencil types and surface materials were used. For each image, every electrical component is annotated with a bounding box and one of 45 class labels. To simplify graph extraction, helper symbols such as junction points and crossovers are introduced, and texts are annotated as well. The paper describes the geometric and taxonomic problems arising from this task, the classes themselves, and statistics of their occurrence. The performance of a standard Faster R-CNN on the dataset is provided as an object detection baseline.
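As a rough illustration of the annotation scheme described in the abstract, the following Python sketch models one annotation as a bounding box plus a class label and derives a few averages from the stated dataset totals. The record layout, the field names, the `class_histogram` helper, and the example label strings are assumptions made for illustration only; they are not taken from the dataset's actual file format or class list. The per-circuit and per-drafter figures are plain averages and assume nothing about how images are actually distributed.

```python
from collections import Counter
from dataclasses import dataclass

# Totals quoted in the abstract.
NUM_IMAGES = 1152
NUM_CIRCUITS = 144
NUM_DRAFTERS = 12
NUM_ANNOTATIONS = 48563


@dataclass(frozen=True)
class Annotation:
    """One annotated object: a bounding box and a class label.

    Hypothetical layout; the real dataset may store boxes differently.
    """
    label: str  # one of the 45 class labels (names below are illustrative)
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def class_histogram(annotations):
    """Count how often each class label occurs in a list of annotations."""
    return Counter(a.label for a in annotations)


# Averages implied by the totals (not a statement about the actual split).
images_per_circuit = NUM_IMAGES / NUM_CIRCUITS           # 8.0
circuits_per_drafter = NUM_CIRCUITS / NUM_DRAFTERS       # 12.0
mean_annotations_per_image = NUM_ANNOTATIONS / NUM_IMAGES  # about 42.2

# Tiny made-up example: two components, one helper symbol, one text.
example = [
    Annotation("resistor", 10, 20, 60, 40),
    Annotation("resistor", 120, 20, 170, 40),
    Annotation("junction", 58, 28, 64, 34),
    Annotation("text", 15, 5, 55, 18),
]
histogram = class_histogram(example)
```

On the made-up example, `histogram` maps `"resistor"` to 2 and the other two labels to 1 each. The same helper applied to the full annotation set would yield per-class occurrence counts of the kind the abstract mentions.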