In this work, we establish risk bounds for Empirical Risk Minimization (ERM) under data-generating processes that are both dependent and heavy-tailed. We do so by extending the seminal works of Mendelson [Men15, Men18] on the analysis of ERM with heavy-tailed but independent and identically distributed observations to the strictly stationary, exponentially $\beta$-mixing setting. Our analysis rests on explicitly controlling the multiplier process arising from the interaction between the noise and the function evaluations at the inputs. It allows this interaction to be even polynomially heavy-tailed, which covers a significantly larger class of heavy-tailed models than those analyzed in the learning theory literature. We illustrate our results by deriving rates of convergence for high-dimensional linear regression with dependent and heavy-tailed data.
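For reference, standard formulations of the two objects named above are as follows; the notation here is a conventional sketch, not necessarily the paper's own. For a strictly stationary sequence $(X_t)_{t \in \mathbb{Z}}$, the $\beta$-mixing (absolute regularity) coefficient at lag $k$ and the exponential-mixing condition read

\[
\beta(k) \;=\; \mathbb{E}\Bigl[\,\sup_{B \in \sigma(X_{t+k}, X_{t+k+1}, \dots)} \bigl|\mathbb{P}\bigl(B \mid \sigma(\dots, X_{t-1}, X_t)\bigr) - \mathbb{P}(B)\bigr|\,\Bigr],
\qquad
\beta(k) \;\le\; c\, e^{-\gamma k} \quad \text{for some } c, \gamma > 0,
\]

while the multiplier process indexed by a function class $\mathcal{F}$, with noise variables $\xi_i$, is the centered empirical process

\[
f \;\longmapsto\; \frac{1}{n} \sum_{i=1}^{n} \Bigl( \xi_i f(X_i) - \mathbb{E}\bigl[\xi\, f(X)\bigr] \Bigr), \qquad f \in \mathcal{F}.
\]

Controlling the supremum of this process over $\mathcal{F}$ is what drives the risk bounds for ERM when the products $\xi_i f(X_i)$ may be only polynomially heavy-tailed.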