Real-world machine learning applications often involve deploying neural networks to domains not seen at training time. Hence, we need to understand the extrapolation of nonlinear models -- under what conditions on the distributions and function class can models be guaranteed to extrapolate to new test distributions. This question is challenging because even two-layer neural networks cannot be guaranteed to extrapolate outside the support of the training distribution without further assumptions on the domain shift. This paper takes some initial steps toward analyzing the extrapolation of nonlinear models under structured domain shift. We primarily consider settings where the marginal distribution of each coordinate of the data (or of each subset of coordinates) does not shift significantly across the training and test distributions, but the joint distribution may shift much more. We prove that the family of nonlinear models of the form $f(x)=\sum_i f_i(x_i)$, where $f_i$ is an arbitrary function on the subset of features $x_i$, can extrapolate to unseen distributions if the covariance of the features is well-conditioned. To the best of our knowledge, this is the first result that goes beyond linear models and the bounded density ratio assumption, even though the assumptions on the distribution shift and function class are stylized.
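As a toy illustration of the setting described above (a sketch of our own, not code from the paper; the data-generating choices, basis, and distributions are all assumptions), one can fit an additive model $f(x)=\sum_i f_i(x_i)$ on a training distribution with correlated coordinates and evaluate it on a test distribution that keeps the same Gaussian marginals but flips the correlation -- a joint shift under which the additive model is expected to extrapolate, since the training covariance is well-conditioned:

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, degree=3):
    # Per-coordinate polynomial basis: the additive model f(x) = sum_i f_i(x_i)
    # is linear in these features, so least squares fits all f_i jointly.
    cols = [np.ones((x.shape[0], 1))]  # intercept
    cols += [x ** d for d in range(1, degree + 1)]
    return np.hstack(cols)

def target(x):
    # Ground truth is itself additive: y = sin(x_1) + x_2^2.
    return np.sin(x[:, 0]) + x[:, 1] ** 2

# Training distribution: strongly correlated coordinates (rho = 0.9),
# but the feature covariance is still well-conditioned.
cov_train = np.array([[1.0, 0.9], [0.9, 1.0]])
x_train = rng.multivariate_normal([0.0, 0.0], cov_train, size=5000)
y_train = target(x_train)

# Fit the additive model by ordinary least squares on the basis features.
w, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)

# Test distribution: identical standard-normal marginals per coordinate,
# but an anti-correlated joint (rho = -0.9) -- a large joint shift.
cov_test = np.array([[1.0, -0.9], [-0.9, 1.0]])
x_test = rng.multivariate_normal([0.0, 0.0], cov_test, size=5000)
y_test = target(x_test)

mse = np.mean((features(x_test) @ w - y_test) ** 2)
print(f"test MSE under joint shift: {mse:.4f}")
```

Because the learned per-coordinate functions depend only on the (unchanged) marginals' support, the test error under the shifted joint distribution stays small, in contrast to an unconstrained nonlinear model, which could behave arbitrarily in the region of the new joint distribution that the training data never covered.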