In machine learning we often encounter structured output prediction problems (SOPPs), i.e., problems where the output space admits a rich internal structure. Application domains where SOPPs naturally occur include natural language processing, speech recognition, and computer vision. Typical SOPPs have an extremely large label set, whose cardinality grows exponentially in the size of the output. Existing generalization analyses imply bounds with at least a square-root dependency on the cardinality $d$ of the label set, which can be vacuous in practice. In this paper, we significantly improve the state of the art by developing novel high-probability bounds with a logarithmic dependency on $d$. Moreover, we leverage the lens of algorithmic stability to develop generalization bounds in expectation without any dependency on $d$. Our results therefore build a solid theoretical foundation for learning in large-scale SOPPs. Furthermore, we extend our results to learning with weakly dependent data.
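To make the scaling concrete, here is a minimal worked example under an assumed sequence-labeling setup (the per-position alphabet size $k$ and output length $L$ are illustrative choices, not quantities from the paper). If each of the $L$ output positions takes one of $k$ labels, then
\[
d = k^L, \qquad \sqrt{d} = k^{L/2}, \qquad \log d = L \log k,
\]
so a bound with a $\sqrt{d}$ dependency grows exponentially in the output length, while a $\log d$ dependency grows only linearly. For instance, with $k = 10$ and $L = 20$ we get $\sqrt{d} = 10^{10}$ but $\log d \approx 46$ (natural logarithm), and a bound with no dependency on $d$ is unaffected by the output length entirely.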