Probabilistic models help us encode latent structure that both models the data and, ideally, is useful for specific downstream tasks. Among these, mixture models and their time-series counterparts, hidden Markov models, identify discrete components in the data. In this work, we focus on a constrained-capacity setting in which we want to learn a model with relatively few components (e.g., for interpretability). To maintain prediction performance, we introduce prediction-focused modeling for mixtures, which automatically selects the input dimensions relevant to the prediction task. Our approach identifies relevant signal in the input, outperforms models that are not prediction-focused, and is easy to optimize; we also characterize when prediction-focused modeling can be expected to work.
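To make the core intuition concrete, here is a minimal NumPy sketch, not the paper's method: it builds a toy dataset in which one dimension carries the predictive signal while other dimensions contain strong but label-irrelevant bimodal structure that would distract a small unsupervised mixture, then scores each dimension's relevance with a simple supervised F-like statistic (a hypothetical stand-in for the automatic selection the abstract describes).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5
y = rng.integers(0, 2, n)  # binary prediction target

X = rng.normal(0.0, 1.0, (n, d))
# dim 0 carries the predictive signal: its mean shifts with the label
X[:, 0] += 3.0 * y
# dims 1-2 have strong bimodal structure that is independent of the label;
# a capacity-limited mixture fit to all dims could spend its components here
distract = rng.integers(0, 2, n)
X[:, 1] += 3.0 * distract
X[:, 2] += 3.0 * distract

def relevance(X, y):
    """Between-label mean separation normalized by per-dimension spread.

    A crude, illustrative relevance score; the paper instead learns
    relevance jointly with the mixture.
    """
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    s = X.std(0) + 1e-12
    return np.abs(m1 - m0) / s

scores = relevance(X, y)
selected = int(np.argmax(scores))
print(selected)  # the predictive dimension (dim 0) scores highest
```

A mixture with few components fit only to the selected dimensions can then devote all of its capacity to structure that matters for prediction, which is the trade-off the abstract highlights.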