The scientific literature in materials science holds a wealth of cutting-edge knowledge, along with useful data (e.g., numerical results from experiments, material properties, and structures). These data are critical for data-driven machine learning (ML) and deep learning (DL) methods that aim to accelerate materials discovery. Because the number of publications is large and growing, it is difficult for humans to retrieve and retain this knowledge manually. In this context, we investigate a deep neural network model based on a bidirectional long short-term memory network (Bi-LSTM) to retrieve knowledge from published scientific articles. The proposed model achieves an F1 score of ~97% on the Material Named Entity Recognition (MNER) task. The study covers the motivation, related work, methodology, hyperparameters, and overall performance evaluation. The analysis provides insight into the experimental results and points to future directions for this research.