Semantic Representations of Mathematical Expressions in a Continuous Vector Space

Mathematical notation makes up a large portion of STEM literature, yet finding semantic representations for formulae remains a challenging problem. Because mathematical notation is precise and its meaning can change significantly with a small change in characters, methods that work for natural-language text do not necessarily work well for mathematical expressions. In this work, we describe an approach for representing mathematical expressions in a continuous vector space. We use the encoder of a sequence-to-sequence architecture, trained on visually different but mathematically equivalent expressions, to generate vector representations (embeddings). We compare this approach with an autoencoder and show that the former is better at capturing mathematical semantics. Finally, to expedite future projects, we publish a corpus of equivalent transcendental and algebraic expression pairs.
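To make the notion of "visually different but mathematically equivalent" concrete, the following is a minimal sketch that probes two expression strings at random points and reports whether they agree numerically. It is only an illustration of the kind of pair the abstract refers to, not the authors' corpus-generation or training procedure. The helper names (`SAFE_NAMES`, `evaluate`, `numerically_equivalent`), the single variable `x`, and the sampling range are all assumptions made for this example.

```python
import math
import random

# Hypothetical whitelist of functions the expression strings may use.
SAFE_NAMES = {"sin": math.sin, "cos": math.cos, "exp": math.exp, "log": math.log}


def evaluate(expr: str, x: float) -> float:
    """Evaluate a single-variable expression string at the point x."""
    # Builtins are emptied so that only whitelisted names and x resolve.
    return eval(expr, {"__builtins__": {}}, {**SAFE_NAMES, "x": x})


def numerically_equivalent(a: str, b: str, trials: int = 50, seed: int = 0) -> bool:
    """Return True if a and b agree at every randomly sampled point.

    Agreement at sampled points is evidence of equivalence, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(0.1, 3.0)  # positive range keeps log(x) defined
        if not math.isclose(evaluate(a, x), evaluate(b, x),
                            rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


# Visually different, mathematically equivalent.
print(numerically_equivalent("sin(x)**2 + cos(x)**2", "1"))
# Visually similar (one character apart), mathematically different.
print(numerically_equivalent("x**2", "x**3"))
```

The second pair differs by a single character yet denotes a different function, which is the sensitivity to small character changes that the abstract says makes natural-language methods a poor fit for formulae.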

