There is often a dilemma between ease of optimization and robust out-of-distribution (OoD) generalization. For instance, many OoD methods rely on penalty terms whose optimization is challenging. They are either too strong to optimize reliably or too weak to achieve their goals. We propose to initialize the networks with a rich representation containing a palette of potentially useful features, ready to be used by even simple models. On the one hand, a rich representation provides a good initialization for the optimizer. On the other hand, it also provides an inductive bias that helps OoD generalization. Such a representation is constructed with the Rich Feature Construction (RFC) algorithm, also called the Bonsai algorithm, which consists of a succession of training episodes. During the discovery episodes, we craft a multi-objective optimization criterion and its associated datasets in a manner that prevents the network from using the features constructed in the previous iterations. During the synthesis episodes, we use knowledge distillation to force the network to simultaneously represent all the previously discovered features. Initializing the networks with Bonsai representations consistently helps six OoD methods achieve top performance on the ColoredMNIST benchmark. The same technique substantially outperforms comparable results on the Wilds Camelyon17 task, eliminates the high result variance that plagues other methods, and makes hyperparameter tuning and model selection more reliable.
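To make the alternation of discovery and synthesis episodes concrete, the following is a minimal, self-contained sketch of how an RFC/Bonsai-style loop could be organized; it is not the paper's actual implementation. It assumes a simple example-reweighting scheme as a stand-in for the multi-objective discovery criterion, and plain regression on concatenated logits as a stand-in for the knowledge-distillation synthesis step. Helper names such as make_net, discovery_rounds, and synthesize are hypothetical.

# Hedged sketch of a Rich Feature Construction (Bonsai)-style training loop.
# Assumptions (not specified in the abstract): discovery reweights examples so
# each new round focuses on points the previously discovered features do not
# solve, and synthesis distills all discovered networks into one representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net(in_dim=2, hidden=32, out_dim=2):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

def train(net, x, y, weights, steps=200, lr=1e-2):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        # per-example loss weighted so the network cannot ignore hard examples
        loss = (weights * F.cross_entropy(net(x), y, reduction="none")).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return net

def discovery_rounds(x, y, n_rounds=3):
    # Each round reweights the data so that the newly trained network must rely
    # on features different from those captured in earlier rounds.
    nets, weights = [], torch.ones(len(y))
    for _ in range(n_rounds):
        net = train(make_net(x.shape[1]), x, y, weights)
        nets.append(net)
        with torch.no_grad():
            wrong = (net(x).argmax(1) != y).float()
        weights = 0.5 * weights + wrong          # emphasize still-unsolved examples
        weights = weights / weights.mean()
    return nets

def synthesize(nets, x, steps=300, lr=1e-2):
    # Distill the concatenated outputs of all discovered networks into a single
    # network, so the final representation carries every discovered feature.
    with torch.no_grad():
        targets = torch.cat([net(x) for net in nets], dim=1)
    student = make_net(x.shape[1], hidden=64, out_dim=targets.shape[1])
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(student(x), targets)
        opt.zero_grad(); loss.backward(); opt.step()
    return student

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(512, 2)
    y = ((x[:, 0] * x[:, 1]) > 0).long()         # toy XOR-like labels
    nets = discovery_rounds(x, y)
    rich_init = synthesize(nets, x)              # would initialize the downstream OoD model
    print("synthesized representation output dim:", rich_init(x[:4]).shape[1])

In this sketch, the synthesized network plays the role of the rich initialization handed to the downstream OoD method; in the paper, the discovery criterion and the distillation target are more elaborate than the stand-ins used here.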