Weighted finite-state machines are a fundamental building block of NLP systems. They have withstood the test of time -- from their early use in noisy channel models in the 1990s up to modern-day neurally parameterized conditional random fields. This work examines the computation of higher-order derivatives of the normalization constant of weighted finite-state machines with respect to their transition weights. We provide a general algorithm for evaluating derivatives of all orders, which has not previously been described in the literature. In the case of second-order derivatives, our scheme runs in $\mathcal{O}(A^2 N^4)$ time, where $A$ is the alphabet size and $N$ is the number of states; this is optimal, since it matches the size of the output, and significantly faster than prior algorithms. Additionally, our approach yields a significantly faster algorithm for computing second-order expectations, such as covariance matrices and gradients of first-order expectations.
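To make the object of study concrete, the following is a minimal sketch of the quantities involved, not the paper's algorithm. It assumes a toy weighted finite-state machine whose normalization constant $Z$ (the total weight of all accepting paths) admits the closed form $Z = \lambda^\top (I - T)^{-1} \rho$, where $T = \sum_a T_a$ sums the per-symbol transition matrices and $\lambda$, $\rho$ are start and end weight vectors; all names and sizes below are illustrative assumptions. The Hessian it computes by brute-force autodiff has $A^2 N^4$ entries -- the object whose computation the paper's $\mathcal{O}(A^2 N^4)$ scheme makes optimal.

```python
import jax
import jax.numpy as jnp

N, A = 3, 2  # number of states, alphabet size (toy values, not from the paper)

def Z(theta, lam, rho):
    """Normalization constant of a toy WFSM.

    theta: (A, N, N) nonnegative per-symbol transition weights.
    Assumes the path weights converge, i.e. (I - T) is invertible.
    """
    T = theta.sum(axis=0)                      # summed transition matrix
    v = jnp.linalg.solve(jnp.eye(N) - T, rho)  # (I - T)^{-1} rho
    return lam @ v                             # lam^T (I - T)^{-1} rho

key = jax.random.PRNGKey(0)
theta = 0.1 * jax.random.uniform(key, (A, N, N))  # small weights keep (I - T) invertible
lam = jnp.ones(N) / N
rho = jnp.ones(N)

# Generic autodiff yields the first- and second-order derivatives of Z.
# jax.hessian is a brute-force baseline here, not the paper's algorithm.
dZ = jax.grad(Z)(theta, lam, rho)      # shape (A, N, N)
d2Z = jax.hessian(Z)(theta, lam, rho)  # shape (A, N, N, A, N, N): A^2 N^4 entries
print(dZ.shape, d2Z.shape)
```

Note that the second-derivative tensor already has $A^2 N^4$ entries, which is why no algorithm can beat $\mathcal{O}(A^2 N^4)$ time for materializing it in full.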