The materials science literature contains up-to-date and comprehensive scientific knowledge of materials. However, its content is unstructured and diverse, which leaves a significant gap in the information available for material design and synthesis. To address this, we used natural language processing (NLP) and computer vision (CV) techniques based on convolutional neural networks (CNNs) to extract valuable experiment-based information about nanomaterials and synthesis methods from publications on energy materials. Our first system, TextMaster, extracts opinions from text and classifies them into challenges and opportunities, achieving 94% and 92% accuracy, respectively. Our second system, GraphMaster, extracts data from tables and figures in publications with 98.3% classification accuracy and a 4.3% data extraction mean squared error. Our results show that these systems could assess the suitability of materials for a given application by evaluating synthesis insights and analyzing cases with detailed references. This work offers a fresh perspective on mining knowledge from scientific literature, opening a wide swath of opportunities to accelerate nanomaterial research with CNNs.