BloombergGPT: 金融大语言模型 (BloombergGPT: A Large Language Model for Finance) - 专知论文

会员服务 ·

0

BloombergGPT · 金融 · 基准测试 · 语言模型 · 基准 ·

2023 年 3 月 30 日

BloombergGPT: A Large Language Model for Finance

翻译：BloombergGPT: 金融大语言模型

Shijie Wu,Ozan Irsoy,Steven Lu,Vadim Dabravolski,Mark Dredze,Sebastian Gehrmann,Prabhanjan Kambadur,David Rosenberg,Gideon Mann

The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. As a next step, we plan to release training logs (Chronicles) detailing our experience in training BloombergGPT.

翻译：自然语言处理在金融科技领域中的应用十分广泛和复杂，包括情感分析和实体识别等任务。大型语言模型（LLMs）已被证明在各种任务上非常有效；然而，领域特定的金融语言模型在文献中尚未被报道。在本研究中，我们提出了BloombergGPT，这是一个基于大量金融数据训练的500亿参数语言模型。我们构建了一个基于Bloomberg丰富的数据源的3630亿标记数据集，这可能是目前最大的特定领域数据集，并增加了来自通用数据集的3450亿标记。我们在标准LLM基准测试、公开金融基准测试和一套最能反映我们预期用途的内部基准测试上验证了BloombergGPT。我们的混合数据集训练导致一个在金融任务上表现出显著优势的模型，而不牺牲在通用LLM基准测试上的性能。此外，我们解释了我们的建模选择、训练过程和评估方法。作为下一步，我们计划发布训练日志（Chronicles），详细说明我们在训练BloombergGPT过程中的经验。

0

相关内容

BloombergGPT

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

专知会员服务

104+阅读 · 2023年4月28日

ChatGPT如何垂直化？彭博发布《BloombergGPT-500亿参数的金融大型语言模型》论文，65页pdf详述模型优异性能（附中英文版论文下载）

ChatGPT如何垂直化？彭博发布《BloombergGPT-500亿参数的金融大型语言模型》论文，65页pdf详述模型优异性能（附中英文版论文下载）

专知会员服务

143+阅读 · 2023年3月31日

不可错过！普林斯顿陈丹琦最新《大语言模型理解》2022课程！全面讲述BERT、GPT、T5等大模型，附Slides

不可错过！普林斯顿陈丹琦最新《大语言模型理解》2022课程！全面讲述BERT、GPT、T5等大模型，附Slides

专知会员服务

142+阅读 · 2022年10月19日

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

专知会员服务

68+阅读 · 2022年3月20日

【人工智能+人力资源】人力资源专业人士的工具箱，Human-Centred Artificial Intelligence for Human Resources: A Toolkit for Human Resources Professionals

【人工智能+人力资源】人力资源专业人士的工具箱，Human-Centred Artificial Intelligence for Human Resources: A Toolkit for Human Resources Professionals

专知会员服务

29+阅读 · 2022年2月17日

亚马逊人工智能公平性与可解释性白皮书，17页pdf

专知会员服务

60+阅读 · 2021年5月20日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

专知

39+阅读 · 2019年9月27日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

逆天语言模型GPT-2最新开源：345M预训练模型和1.5B参数都来了

逆天语言模型GPT-2最新开源：345M预训练模型和1.5B参数都来了

量子位

18+阅读 · 2019年5月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

microRNA介导Vaspin调控动脉钙化的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA参与Arc调控海马神经元突触重塑在癫痫发生中的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

脆性X综合症模型小鼠雌激素ER-β调节突触可塑性异常的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

几类半群在图论和形式语言学中的应用

国家自然科学基金

0+阅读 · 2013年12月31日

缺氧诱导的库操纵钙通道分子表达及相互作用——肺血管壁重塑的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Realized GARCH框架的波动率和相关性模型理论和应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

糖尿病血管钙化的新机制：高糖诱导内皮细胞－成骨细胞转分化的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Ghrelin对胰岛β细胞分泌胰岛素和增殖的影响及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

远志总皂苷对阿尔茨海默病模型大鼠认知的影响及其突触机制

国家自然科学基金

0+阅读 · 2011年12月31日

脑脉通联合骨髓间充质干细胞移植对脑缺血大鼠神经细胞的保护机制

国家自然科学基金

0+阅读 · 2009年12月31日

Understanding HTML with Large Language Models

Understanding HTML with Large Language Models

Arxiv

0+阅读 · 2023年5月19日

Extending Memory for Language Modelling

Arxiv

0+阅读 · 2023年5月19日

Semantic Anomaly Detection with Large Language Models

Arxiv

0+阅读 · 2023年5月18日

VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

Arxiv

0+阅读 · 2023年5月18日

The Web Can Be Your Oyster for Improving Large Language Models

Arxiv

0+阅读 · 2023年5月18日

DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition

Arxiv

0+阅读 · 2023年5月17日

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Arxiv

33+阅读 · 2023年2月18日

A Survey of Knowledge-Enhanced Text Generation

Arxiv

18+阅读 · 2020年10月9日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

VIP会员

文章信息

相关主题

相关VIP内容

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

专知会员服务

104+阅读 · 2023年4月28日

ChatGPT如何垂直化？彭博发布《BloombergGPT-500亿参数的金融大型语言模型》论文，65页pdf详述模型优异性能（附中英文版论文下载）

ChatGPT如何垂直化？彭博发布《BloombergGPT-500亿参数的金融大型语言模型》论文，65页pdf详述模型优异性能（附中英文版论文下载）

专知会员服务

143+阅读 · 2023年3月31日

不可错过！普林斯顿陈丹琦最新《大语言模型理解》2022课程！全面讲述BERT、GPT、T5等大模型，附Slides

不可错过！普林斯顿陈丹琦最新《大语言模型理解》2022课程！全面讲述BERT、GPT、T5等大模型，附Slides

专知会员服务

142+阅读 · 2022年10月19日

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

【Meta AI】多模态理解研究进展，Advances in multimodal understanding research at Meta AI

专知会员服务

68+阅读 · 2022年3月20日

【人工智能+人力资源】人力资源专业人士的工具箱，Human-Centred Artificial Intelligence for Human Resources: A Toolkit for Human Resources Professionals

【人工智能+人力资源】人力资源专业人士的工具箱，Human-Centred Artificial Intelligence for Human Resources: A Toolkit for Human Resources Professionals

专知会员服务

29+阅读 · 2022年2月17日

亚马逊人工智能公平性与可解释性白皮书，17页pdf

专知会员服务

60+阅读 · 2021年5月20日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NTU博士论文】反事实推理在多模态对话生成中的应用

基于强化学习的智能体化搜索全面综述：基础、角色、优化、评估与应用

ICCV最佳论文出炉，朱俊彦团队用砖块积木摘得桂冠

面向具身操作的高效视觉–语言–动作模型：系统综述

相关资讯

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

专知

39+阅读 · 2019年9月27日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

逆天语言模型GPT-2最新开源：345M预训练模型和1.5B参数都来了

逆天语言模型GPT-2最新开源：345M预训练模型和1.5B参数都来了

量子位

18+阅读 · 2019年5月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

谷歌发表的史上最强NLP模型BERT的官方代码和预训练模型可以下载了

AINLP

12+阅读 · 2018年11月1日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

相关论文

Understanding HTML with Large Language Models

Understanding HTML with Large Language Models

Arxiv

0+阅读 · 2023年5月19日

Extending Memory for Language Modelling

Arxiv

0+阅读 · 2023年5月19日

Semantic Anomaly Detection with Large Language Models

Arxiv

0+阅读 · 2023年5月18日

VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

Arxiv

0+阅读 · 2023年5月18日

The Web Can Be Your Oyster for Improving Large Language Models

Arxiv

0+阅读 · 2023年5月18日

DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition

Arxiv

0+阅读 · 2023年5月17日

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

Arxiv

33+阅读 · 2023年2月18日

A Survey of Knowledge-Enhanced Text Generation

Arxiv

18+阅读 · 2020年10月9日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

相关基金

microRNA介导Vaspin调控动脉钙化的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA参与Arc调控海马神经元突触重塑在癫痫发生中的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

脆性X综合症模型小鼠雌激素ER-β调节突触可塑性异常的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

几类半群在图论和形式语言学中的应用

国家自然科学基金

0+阅读 · 2013年12月31日

缺氧诱导的库操纵钙通道分子表达及相互作用——肺血管壁重塑的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Realized GARCH框架的波动率和相关性模型理论和应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

糖尿病血管钙化的新机制：高糖诱导内皮细胞－成骨细胞转分化的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Ghrelin对胰岛β细胞分泌胰岛素和增殖的影响及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

远志总皂苷对阿尔茨海默病模型大鼠认知的影响及其突触机制

国家自然科学基金

0+阅读 · 2011年12月31日

脑脉通联合骨髓间充质干细胞移植对脑缺血大鼠神经细胞的保护机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员