ArGoT: A Glossary of Terms extracted from the arXiv


September 7, 2021


From arXiv. In Proceedings SCSS 2021, arXiv:2109.02501

We introduce ArGoT, a data set of mathematical terms extracted from the articles hosted on the arXiv website. A term is any mathematical concept defined in an article. Using labels in the articles' source code and examples from other popular math websites, we mine all the terms in the arXiv data and compile a comprehensive vocabulary of mathematical terms. The terms can then be organized in a dependency graph by using each term's definitions and the arXiv's metadata. Using both hyperbolic and standard word embeddings, we demonstrate how this structure is reflected in the text's vector representation and how the embeddings capture relations of entailment between mathematical concepts. This data set is part of an ongoing effort to align natural mathematical text with existing Interactive Theorem Prover Libraries (ITPs) of formally verified statements.
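To make the idea of a dependency graph over terms concrete, here is a minimal sketch. It is not the authors' pipeline: the glossary entries are made up for illustration, and the rule "term A depends on term B if B appears in A's definition" is a simplifying assumption. The abstract also mentions arXiv metadata as an input, which this sketch ignores.

```python
from collections import defaultdict

# Toy glossary (hypothetical entries, not taken from the ArGoT data set).
GLOSSARY = {
    "group": "a set with an associative binary operation, an identity and inverses",
    "abelian group": "a group whose operation is commutative",
    "ring": "an abelian group with a second associative operation that distributes over addition",
    "field": "a ring in which every nonzero element has a multiplicative inverse",
}


def build_dependency_graph(glossary):
    """Map each term to the sorted list of other glossary terms found in its definition.

    Assumption: a whole-word (space-delimited) match of another term inside a
    definition counts as a dependency. Real definitions would need tokenization
    and handling of plurals and punctuation.
    """
    graph = defaultdict(set)
    for term, definition in glossary.items():
        padded = f" {definition.lower()} "
        for other in glossary:
            if other != term and f" {other} " in padded:
                graph[term].add(other)
    return {term: sorted(graph[term]) for term in glossary}


deps = build_dependency_graph(GLOSSARY)
for term, used in deps.items():
    print(f"{term} -> {used}")
```

In this toy example "ring" depends on both "abelian group" and "group", while "group" has no dependencies, so the graph orders the more basic concepts before the ones built on them.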



