ChemCrow: Augmenting large-language models with chemistry tools - Zhuanzhi (专知) Papers

Language models · GPT-4 · Tools · Collaboration · Materials design

April 11, 2023

ChemCrow: Augmenting large-language models with chemistry tools

Andres M Bran, Sam Cox, Andrew D White, Philippe Schwaller

Large-language models (LLMs) have recently shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 13 expert-designed tools, ChemCrow augments the LLM performance in chemistry, and new capabilities emerge. Our evaluation, including both LLM and expert human assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and GPT-4 + ChemCrow performance. There is a significant risk of misuse of tools like ChemCrow and we discuss their potential harms. Employed responsibly, ChemCrow not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.
