We propose neural string edit distance, a model for string-pair matching and string transduction based on learnable string edit distance. We recast the original expectation-maximization-learned edit distance algorithm as a differentiable loss function, allowing us to integrate it into a neural network that provides a contextual representation of the input. We evaluate on cognate detection, transliteration, and grapheme-to-phoneme conversion, and show that we can trade off between performance and interpretability in a single framework. Using contextual representations, which are difficult to interpret, we match the performance of state-of-the-art string-pair matching models. Using static embeddings and a slightly different loss function, we enforce interpretability at the cost of a drop in accuracy.
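The classical learnable edit distance underlying this approach can be illustrated with its forward algorithm. The sketch below (not the authors' code; the function names and toy operation probabilities are illustrative assumptions) computes the log-likelihood of a string pair by summing, in log space, over all edit sequences with per-operation log-probabilities; in the neural variant those log-probabilities would be produced by a network rather than fixed.

```python
# Minimal sketch of the forward algorithm for a stochastic edit distance
# model. log_sub, log_ins, log_del are hypothetical callables returning
# log-probabilities of substitution, insertion, and deletion operations.
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x) for x in xs)).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_prob(s, t, log_sub, log_ins, log_del):
    # alpha[i][j] = log-probability of generating the prefixes s[:i], t[:j]
    # by any sequence of edit operations.
    n, m = len(s), len(t)
    alpha = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0:            # delete s[i-1]
                terms.append(alpha[i - 1][j] + log_del(s[i - 1]))
            if j > 0:            # insert t[j-1]
                terms.append(alpha[i][j - 1] + log_ins(t[j - 1]))
            if i > 0 and j > 0:  # substitute s[i-1] -> t[j-1]
                terms.append(alpha[i - 1][j - 1] + log_sub(s[i - 1], t[j - 1]))
            alpha[i][j] = logsumexp(terms)
    return alpha[n][m]
```

Because the dynamic program is built from sums and products of probabilities (log-sum-exp in log space), it is differentiable with respect to the operation scores, which is what makes the neural integration possible.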