Conventional supervised learning methods, especially deep ones, are known to be sensitive to out-of-distribution (OOD) examples, largely because the learned representation mixes the semantic factor with the variation factor owing to their domain-specific correlation, even though only the semantic factor causes the output. To address this problem, we propose a Causal Semantic Generative model (CSG), grounded in causal reasoning, that models the two factors separately, and we develop methods to learn it on a single training domain and to predict in a test domain either without test-domain data (OOD generalization) or with unsupervised test-domain data (domain adaptation). We prove that, under proper conditions, CSG identifies the semantic factor by learning from training data, and that this semantic identification guarantees a bounded OOD generalization error and the success of adaptation. The methods and theory build on the invariance principle of causal generative mechanisms, which is fundamental and general. The methods are based on variational Bayes, with a novel design for both efficient learning and easy prediction. Empirical studies demonstrate improved test accuracy for both OOD generalization and domain adaptation.
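The separation of the two factors can be written as a factorization of the joint distribution. The following is only a sketch of one reading of the abstract, not the paper's exact formulation; the symbols s (semantic factor), v (variation factor), x (input), and y (output) are assumed notation that does not appear in the text above.

```latex
% Assumed notation (not from the abstract):
%   s = semantic factor, v = variation factor, x = input, y = output.
% Only s causes y; s and v jointly generate x; s and v may be
% correlated through a domain-specific prior p(s, v).
\[
  p(s, v, x, y) \;=\; p(s, v)\, p(x \mid s, v)\, p(y \mid s)
\]
```

Under this reading, the domain-specific correlation sits in the prior p(s, v), while the generative mechanisms p(x | s, v) and p(y | s) are the parts that the invariance principle would keep fixed across the training and test domains.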