In several applications, we want not only a generative model of the data but also a model that is useful for specific downstream tasks. Mixture models are useful for identifying discrete components in the data, but when misspecified they may not identify components that are useful for downstream tasks; further, current inference techniques often fail to overcome misspecification even when a supervisory signal is provided. We introduce the prediction-focused mixture model, which selects and models the input features relevant to predicting the targets. We demonstrate that our approach identifies relevant signal from the inputs even when the model is highly misspecified.