Most applications of transformers to mathematics, from integration to theorem proving, focus on symbolic computation. In this paper, we show that transformers can be trained to perform numerical calculations with high accuracy. We consider problems of linear algebra: matrix transposition, addition, multiplication, eigenvalues and vectors, singular value decomposition, and inversion. Training small transformers (up to six layers) over datasets of random matrices, we achieve high accuracies (over 90%) on all problems. We also show that trained models can generalize out of their training distribution, and that out-of-domain accuracy can be greatly improved by working from more diverse datasets (in particular, by training from matrices with non-independent and identically distributed coefficients). Finally, we show that few-shot learning can be leveraged to re-train models to solve larger problems.
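As an informal illustration of the setup described above (small transformers trained on sequences derived from random matrices), the sketch below shows one plausible way to flatten a matrix into a token sequence. The sign/mantissa/exponent encoding and the helper names `encode_number` and `encode_matrix` are assumptions made for illustration; they are not taken from the paper, which may use a different tokenization of matrix coefficients.

```python
import numpy as np

def encode_number(x, mantissa_digits=3):
    """Encode a float as sign / mantissa / exponent tokens.

    Hypothetical illustrative encoding, e.g. -3.14 -> ['-', '314', 'E-2'].
    """
    sign = '+' if x >= 0 else '-'
    ax = abs(x)
    if ax == 0.0:
        return [sign, '0' * mantissa_digits, 'E0']
    exp = int(np.floor(np.log10(ax))) - (mantissa_digits - 1)
    mant = int(round(ax / 10.0 ** exp))
    return [sign, str(mant), f'E{exp}']

def encode_matrix(m):
    """Flatten a matrix into a token sequence, row by row, with dimension tokens."""
    tokens = [f'V{m.shape[0]}', f'V{m.shape[1]}']
    for x in m.flatten():
        tokens.extend(encode_number(float(x)))
    return tokens

# Example: a 5x5 matrix with i.i.d. coefficients in [-10, 10],
# paired with its transpose as the target sequence.
rng = np.random.default_rng(0)
A = rng.uniform(-10, 10, size=(5, 5))
src = encode_matrix(A)
tgt = encode_matrix(A.T)
print(src[:8], '...', len(src), 'source tokens')
```

Source/target pairs of this kind (for instance a matrix and its transpose, sum, product, or inverse) could then be used to train a standard sequence-to-sequence transformer, with the input distribution varied (e.g. non-i.i.d. coefficients) to study out-of-domain generalization.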