We introduce a framework for automatic differentiation with weighted finite-state transducers (WFSTs), allowing them to be used dynamically at training time. Through the separation of graphs from operations on graphs, this framework enables the exploration of new structured loss functions, which in turn eases the encoding of prior knowledge into learning algorithms. We show how the framework can combine pruning and back-off in transition models with various sequence-level loss functions. We also show how to learn over the latent decomposition of phrases into word pieces. Finally, to demonstrate that WFSTs can be used in the interior of a deep neural network, we propose a convolutional WFST layer which maps lower-level representations to higher-level representations and can be used as a drop-in replacement for a traditional convolution. We validate these algorithms with experiments in handwriting recognition and speech recognition.
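To make the "graphs separated from operations on graphs" idea concrete, the sketch below shows how a sequence-level loss might be assembled entirely from differentiable graph operations. It is a minimal illustration only, written against a `gtn`-style Python API; the function names (`Graph`, `add_node`, `add_arc`, `compose`, `forward_score`, `subtract`, `backward`) and all graph sizes, labels, and scores are assumptions made for the example, not a definitive implementation of the paper's framework.

```python
import gtn  # assumed: an autodiff-enabled WFST library with a gtn-style API

# Emission graph: a small lattice with one state per time step and one
# weighted arc per output label. Weights would come from a neural network.
emissions = gtn.Graph(calc_grad=True)
emissions.add_node(start=True)
emissions.add_node()
emissions.add_node(accept=True)
for t in range(2):                                # two time steps
    for label, score in enumerate([0.1, -0.3]):   # two labels, toy scores
        emissions.add_arc(t, t + 1, label, label, score)

# Target graph: accepts only the label sequence [0, 1]; unweighted, so it
# acts as a hard constraint when composed with the emission lattice.
target = gtn.Graph(calc_grad=False)
target.add_node(start=True)
target.add_node()
target.add_node(accept=True)
target.add_arc(0, 1, 0)
target.add_arc(1, 2, 1)

# Loss: negative log-probability of the target under the lattice, i.e.
# log-sum-exp over all paths minus log-sum-exp over target-consistent paths.
constrained = gtn.compose(emissions, target)
loss = gtn.subtract(gtn.forward_score(emissions),
                    gtn.forward_score(constrained))
gtn.backward(loss)  # gradients w.r.t. arc weights land in emissions.grad()
```

Because the loss is built by composing and scoring graphs rather than by a hand-derived criterion, swapping the `target` graph (e.g., for one encoding multiple word-piece decompositions) changes the loss function without touching the gradient code.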