Factorizable joint shift was recently proposed as a type of dataset shift for which the characteristics can be estimated from observed data. For the multinomial (multi-class) classification setting, we derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features. On the basis of this result, we propose alternatives to joint importance aligning, at the same time pointing out the limitations encountered when making an assumption of factorizable joint shift. Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift. In addition, we investigate the consequences of assuming factorizable joint shift for the bias caused by sample selection.
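As a rough illustration of what a posterior correction under factorizable joint shift can look like, suppose the target-to-source density ratio factorizes into a feature factor g(x) and a class factor b(y). The feature factor then cancels when the class probabilities are renormalised, so the target posterior is proportional to b(y) times the source posterior. The sketch below shows this under that assumption only. The function name `correct_posterior` and the weight vector `class_weights` are hypothetical names chosen for illustration, not the paper's notation. Estimating the class factor from data is not shown.

```python
import numpy as np

def correct_posterior(source_posterior, class_weights):
    """Reweight source posteriors by a class factor b(y) and renormalise.

    Assumption: the density ratio target/source factorizes as g(x) * b(y),
    so g(x) cancels in the normalisation and only b(y) is needed.
    """
    p = np.asarray(source_posterior, dtype=float)   # shape (n_samples, n_classes)
    b = np.asarray(class_weights, dtype=float)      # shape (n_classes,)
    weighted = p * b                                # broadcast b over the rows
    return weighted / weighted.sum(axis=-1, keepdims=True)

# Hypothetical source posteriors for two instances and three classes.
source_posterior = np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.1, 0.8]])
# Hypothetical class factor b(y); values are made up for the example.
class_weights = np.array([0.5, 1.0, 2.0])

target_posterior = correct_posterior(source_posterior, class_weights)
print(target_posterior)
```

With a constant class factor the correction leaves the source posterior unchanged, which matches the case in which only the feature distribution shifts.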