Regularization schemes for regression have been widely studied in learning theory and inverse problems. In this paper, we study distribution regression (DR), which involves two stages of sampling and aims at regressing from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS). Recently, theoretical analyses of DR have been carried out via kernel ridge regression, and several learning behaviors have been observed. However, the topic has not been explored or understood beyond least-squares-based DR. By introducing a robust loss function $l_{\sigma}$ for two-stage sampling problems, we present a novel robust distribution regression (RDR) scheme. With a windowing function $V$ and a scaling parameter $\sigma$, both of which can be appropriately chosen, $l_{\sigma}$ covers a wide range of commonly used loss functions, thereby enriching the theme of DR. Moreover, the loss $l_{\sigma}$ is not necessarily convex, which substantially enlarges the class of regression losses (previously restricted to least squares) in the DR literature. Learning rates under different regularity ranges of the regression function $f_{\rho}$ are comprehensively studied and derived via integral operator techniques. The scaling parameter $\sigma$ is shown to be crucial in providing both robustness and satisfactory learning rates for RDR.
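As an illustrative sketch (not necessarily the exact definition used in this paper), losses of this windowed, scaled type are commonly written in the robust regression literature as
\[
  l_{\sigma}(t) \;=\; \sigma^{2}\, V\!\left(\frac{t^{2}}{\sigma^{2}}\right), \qquad t = y - f(x),
\]
where $V$ is a windowing function and $\sigma > 0$ a scaling parameter. For example, the choice $V(u) = u$ recovers the least squares loss, while $V(u) = 1 - e^{-u}$ yields a bounded, generally non-convex loss of Welsch/correntropy type; these particular choices are given here only to indicate how a single family $l_{\sigma}$ can cover both convex and non-convex losses.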