April 6, 2023

Large Language Models as Master Key: Unlocking the Secrets of Materials Science with GPT

Tong Xie, Yuwei Wan, Wei Huang, Yufei Zhou, Yixuan Liu, Qingyuan Linghu, Shaozhou Wang, Chunyu Kit, Clara Grazian, Bram Hoex

This article presents a new NLP task called structured information inference (SII) to address the complexities of device-level information extraction in materials science. We accomplished this task by fine-tuning GPT-3 on an existing perovskite solar cell FAIR (Findable, Accessible, Interoperable, Reusable) dataset, achieving an F1-score of 91.8, and we updated the dataset with all related scientific papers published to date. The resulting dataset is formatted and normalized, so it can be used directly as input for subsequent data analysis. This will enable materials scientists to develop their own models by selecting high-quality review papers within their domain. Furthermore, we designed experiments that use an LLM to predict the electrical performance of solar cells and to reverse-predict parameters on both material gene and FAIR datasets. We obtained performance comparable to traditional machine learning methods without feature selection, which demonstrates the potential of large language models to judge materials and design new materials like a materials scientist.
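The abstract reports an F1-score for the extraction task but does not say how predicted fields are matched against reference fields. As a minimal sketch, assuming exact-match scoring over field/value pairs, the metric could be computed as below. The function name `field_f1`, the field names, and the example values are all hypothetical illustrations, not taken from the paper or its dataset.

```python
from typing import Dict, Tuple


def field_f1(gold: Dict[str, str], pred: Dict[str, str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exact-match field/value pairs.

    Assumption: a predicted field counts as correct only when both its
    name and its value match the reference record exactly.
    """
    gold_pairs = set(gold.items())
    pred_pairs = set(pred.items())
    true_pos = len(gold_pairs & pred_pairs)

    precision = true_pos / len(pred_pairs) if pred_pairs else 0.0
    recall = true_pos / len(gold_pairs) if gold_pairs else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical device record: reference annotation vs. model output.
gold = {"absorber": "MAPbI3", "etl": "TiO2", "htl": "Spiro-OMeTAD", "pce": "18.2"}
pred = {"absorber": "MAPbI3", "etl": "SnO2", "htl": "Spiro-OMeTAD", "pce": "18.2"}

precision, recall, f1 = field_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here three of the four predicted fields match the reference, so precision, recall, and F1 all come out to 0.75. A real evaluation would likely need value normalization (units, chemical formula spelling) before comparison, which this sketch leaves out.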
