One method for obtaining generalizable solutions to machine learning tasks when presented with diverse training environments is to find \textit{invariant representations} of the data. These are representations of the covariates such that the best model on top of the representation is invariant across training environments. In the context of linear Structural Equation Models (SEMs), invariant representations might allow us to learn models with out-of-distribution guarantees, i.e., models that are robust to interventions in the SEM. To address the invariant representation problem in a {\em finite-sample} setting, we consider the notion of $\epsilon$-approximate invariance. We study the following question: If a representation is approximately invariant with respect to a given number of training interventions, will it continue to be approximately invariant on a larger collection of unseen SEMs? This larger collection of SEMs is generated through a parameterized family of interventions. Inspired by PAC learning, we obtain finite-sample out-of-distribution generalization guarantees for approximate invariance that hold \textit{probabilistically} over a family of linear SEMs without faithfulness assumptions. Our results show bounds that do not scale with the ambient dimension when intervention sites are restricted to lie in a constant-size subset of in-degree-bounded nodes. We also show how to extend our results to a linear indirect observation model that incorporates latent variables.
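For concreteness, one plausible way to formalize $\epsilon$-approximate invariance in the linear setting is sketched below; the symbols $\Phi$ (a linear representation of the covariates $X$), $\beta^{e}(\Phi)$ (the best linear predictor on top of $\Phi$ in environment $e$), and the squared-loss formulation are illustrative assumptions, not necessarily the exact definitions used in the body of the paper:
\begin{align*}
  \beta^{e}(\Phi) &\in \arg\min_{\beta}\; \mathbb{E}_{(X,Y)\sim e}\!\left[\big(Y - \beta^{\top}\Phi X\big)^{2}\right],\\
  \Phi \text{ is $\epsilon$-approximately invariant} \quad &\Longleftrightarrow\quad \big\|\beta^{e}(\Phi) - \beta^{e'}(\Phi)\big\| \le \epsilon \quad \text{for all training environments } e, e'.
\end{align*}
Under a formalization of this kind, the out-of-distribution question above asks whether the same bound (possibly with a larger tolerance) continues to hold when $e'$ ranges over the unseen SEMs generated by the parameterized family of interventions.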