We develop a method for generating predictive regions that cover a multivariate response variable with a user-specified probability. The method has two components. First, we use a deep generative model to learn a representation of the response that has a unimodal distribution. Existing multiple-output quantile regression approaches are effective in such cases, so we apply them to the learned representation and then transform the solution back to the original space of the response. This process yields a flexible and informative region that can have an arbitrary shape, a property that existing methods lack. Second, we propose an extension of conformal prediction to the multivariate response setting that modifies any method to return sets with a pre-specified coverage level. The desired coverage is theoretically guaranteed in the finite-sample case for any distribution. Experiments on both real and synthetic data show that our method constructs regions that are significantly smaller than those produced by existing techniques.
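To make the finite-sample coverage idea concrete, here is a minimal sketch of generic split conformal calibration for a two-dimensional response. It is not the calibration procedure proposed in this work: the region is a plain Euclidean ball around a point prediction, and the names `conformal_radius`, `sample`, and `predict`, along with the synthetic data, are assumptions made purely for illustration.

```python
import numpy as np


def conformal_radius(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points for this coverage level
    return np.sort(scores)[k - 1]


rng = np.random.default_rng(0)


def sample(n):
    # Hypothetical synthetic data: a 2-D response that depends on a scalar feature.
    x = rng.uniform(-1.0, 1.0, size=(n, 1))
    y = np.hstack([x, x**2]) + 0.1 * rng.standard_normal((n, 2))
    return x, y


def predict(x):
    # Hypothetical base predictor (assumption: the true conditional mean is known).
    return np.hstack([x, x**2])


alpha = 0.1
x_cal, y_cal = sample(500)
x_test, y_test = sample(2000)

# Nonconformity score: Euclidean distance between response and prediction.
scores = np.linalg.norm(y_cal - predict(x_cal), axis=1)
radius = conformal_radius(scores, alpha)

# Predictive region for a test point: ball of this radius around predict(x).
covered = np.linalg.norm(y_test - predict(x_test), axis=1) <= radius
coverage = covered.mean()
print(f"radius={radius:.3f}, empirical coverage={coverage:.3f}")
```

The `(n + 1)` correction in the quantile rank is what gives the marginal coverage guarantee under exchangeability. A ball is the simplest possible region shape; the arbitrary-shaped regions described above would need a different score function.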