The architecture, engineering and construction (AEC) sector relies extensively on documents that support product and process development. Organisations must therefore manage big data comprising hundreds, or even thousands, of strongly interlinked technical documents, including CAD designs of industrial plants, equipment purchase orders, quality certificates, and part material analyses. Analysing such records is daunting for users, because sifting through hundreds of documents to establish valuable relationships quickly becomes complicated. This paper addresses how knowledge extracted from linked engineering documents contributes to industrial digitalisation under IT/OT convergence. The proposed system, GraphLED, performs data processing, graph-based modelling, and colourful visualisation of related documents. The graph-based approach supports an improved understanding of linked information, because a graph structure is a promising tool for modelling the underlying data properties of engineering documents. Preliminary system validation indicates that quality improvements are possible in the OCR-based data (85.9% of ambiguous text data removed). This work has the potential to benefit industry by improving the reliability and resilience of industrial production systems through automated summaries of large quantities of documents and their linkage.