Greedy algorithms for NLP, such as transition-based parsing, are prone to error propagation. One way to overcome this problem is to allow the algorithm to backtrack and explore an alternative solution when new evidence contradicts the solution explored so far. To implement this behavior, we use reinforcement learning and let the algorithm backtrack whenever that action receives a higher reward than continuing to explore the current solution. We test this idea on both POS tagging and dependency parsing and show that backtracking is an effective means of combating error propagation.
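The following is a minimal sketch, not the authors' implementation, of how a greedy transition-based decoder could be augmented with a backtracking action. The `State` type, `score` (standing in for the learned policy or value function), `legal_actions`, `apply_action`, and the `BACKTRACK` label are all assumed placeholder names for illustration.

```python
# Hypothetical sketch: greedy decoding with an optional BACKTRACK action.
# All function arguments below are assumed interfaces, not a real library API.
from typing import Callable, Dict, List, Set, Tuple

BACKTRACK = "BACKTRACK"


def decode_with_backtracking(
    initial_state: object,
    score: Callable[[object, str], float],          # learned scoring function (assumed)
    legal_actions: Callable[[object], List[str]],   # regular transitions available in a state
    apply_action: Callable[[object, str], object],  # transition function
    max_steps: int = 1000,
) -> object:
    """Greedily follow the highest-scoring action; if BACKTRACK scores highest,
    revert the most recent transition and explore an alternative from there."""
    history: List[Tuple[object, str]] = []  # (state, action) pairs already applied
    undone: Dict[int, Set[str]] = {}        # actions already reverted at each depth

    state = initial_state
    for _ in range(max_steps):
        depth = len(history)
        candidates = [a for a in legal_actions(state)
                      if a not in undone.get(depth, set())]
        if not candidates:
            break
        # Offer BACKTRACK only when there is a previous configuration to return to.
        if history:
            candidates = candidates + [BACKTRACK]

        best = max(candidates, key=lambda a: score(state, a))
        if best == BACKTRACK:
            state, reverted = history.pop()
            # Forbid re-applying the reverted action so decoding cannot loop.
            undone.setdefault(len(history), set()).add(reverted)
        else:
            history.append((state, best))
            undone.pop(depth + 1, None)  # entering a new branch: clear stale bans
            state = apply_action(state, best)
    return state
```

In an actual reinforcement-learning setup, `score` would be trained so that backtracking is rewarded when the current partial analysis contradicts new evidence; the loop-prevention bookkeeping shown here is only one simple way to keep the sketch well-behaved.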