Most deep learning research has focused on developing new models and training procedures. The training objective, on the other hand, has usually been restricted to combinations of standard losses. When the objective aligns well with the evaluation metric, this is not a major issue. However, when dealing with complex structured outputs, the ideal objective can be hard to optimize, and the efficacy of the usual objectives as proxies for the true objective can be questionable. In this work, we argue that the existing inference-network-based structure prediction methods (Tu and Gimpel 2018; Tu, Pang, and Gimpel 2020) are indirectly learning to optimize a dynamic loss objective parameterized by the energy model. We then explore using implicit-gradient-based techniques to learn the corresponding dynamic objectives. Our experiments show that implicitly learning a dynamic loss landscape is an effective method for improving model performance in structure prediction.