While machine learning offers many methods for comparing character strings, most are designed for strings that represent natural language, and their performance degrades on strings that encode mathematical expressions. Building on the graphical representation of the expression tree, we propose a simple method for encoding such expressions that is sensitive only to their structural properties and invariant to the specifics that can vary between two seemingly different but semantically similar mathematical expressions.
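The abstract does not spell out the encoding itself, so the following is only a minimal sketch of the general idea of a structure-only encoding, not the proposed method. It assumes expressions are written in Python syntax, parses them with the standard `ast` module, and replaces every variable name with `VAR` and every literal with `CONST`, so that only the shape of the expression tree and its operators remain. The names `structural_encoding` and `_encode` are hypothetical, introduced here for illustration.

```python
import ast


def structural_encoding(expression: str) -> str:
    """Encode an expression by the shape of its tree only.

    Illustrative assumption: variable names and literal values are the
    "specifics" to be ignored; operators and nesting are the structure.
    """
    tree = ast.parse(expression, mode="eval")
    return _encode(tree.body)


def _encode(node: ast.AST) -> str:
    # Leaves: drop the concrete name or value, keep only the kind of leaf.
    if isinstance(node, ast.Name):
        return "VAR"
    if isinstance(node, ast.Constant):
        return "CONST"
    # Internal nodes: keep the operator type and recurse into the children.
    if isinstance(node, ast.BinOp):
        op = type(node.op).__name__
        return f"{op}({_encode(node.left)},{_encode(node.right)})"
    if isinstance(node, ast.UnaryOp):
        op = type(node.op).__name__
        return f"{op}({_encode(node.operand)})"
    if isinstance(node, ast.Call):
        # Treat a plain function name (e.g. sin) as an operator label.
        func = node.func.id if isinstance(node.func, ast.Name) else "CALL"
        args = ",".join(_encode(arg) for arg in node.args)
        return f"{func}({args})"
    raise ValueError(f"unsupported node type: {type(node).__name__}")


if __name__ == "__main__":
    # Same tree shape, different variable names: identical encodings.
    print(structural_encoding("a*x + b"))
    print(structural_encoding("p*q + r"))
    # Different tree shape: a different encoding.
    print(structural_encoding("a*(x + b)"))
```

Under these assumptions, `a*x + b` and `p*q + r` receive the same encoding because their trees have the same shape, while `a*(x + b)` does not. This sketch does not account for algebraic equivalences such as commutativity, and it reads `**` (not `^`) as exponentiation because it relies on Python's grammar.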