A fundamental challenge for any intelligent system is prediction: given some inputs, can you predict corresponding outcomes? Most work on supervised learning has focused on producing accurate marginal predictions for each input. However, we show that for a broad class of decision problems, accurate joint predictions are required to deliver good performance. In particular, we establish several results pertaining to combinatorial decision problems, sequential predictions, and multi-armed bandits to elucidate the essential role of joint predictive distributions. Our treatment of multi-armed bandits introduces an approximate Thompson sampling algorithm and analytic techniques that lead to a new kind of regret bound.
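The gap between marginal and joint predictions can be made concrete with a toy example. The sketch below is not taken from the paper; the two-outcome setup and all names (`TRUE_JOINT`, `INDEPENDENT`, `COUPLED`, and the helper functions) are assumptions chosen for illustration. It compares two hypothetical predictors whose marginals are identical but whose joint distributions differ, and shows that only the joint log-loss tells them apart.

```python
import math
from itertools import product

# Assumed toy world: two binary outcomes that are perfectly correlated.
TRUE_JOINT = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}

# Hypothetical predictor A: correct marginals, but outcomes treated as independent.
INDEPENDENT = {ys: 0.25 for ys in product((0, 1), repeat=2)}

# Hypothetical predictor B: captures the dependence between the two outcomes.
COUPLED = dict(TRUE_JOINT)


def marginal(joint, index):
    """Probability that outcome `index` equals 1 under `joint`."""
    return sum(p for ys, p in joint.items() if ys[index] == 1)


def expected_marginal_log_loss(pred, truth):
    """Average per-outcome cross-entropy (nats), using marginals only."""
    total = 0.0
    for i in range(2):
        q1, p1 = marginal(truth, i), marginal(pred, i)
        total -= q1 * math.log(p1) + (1 - q1) * math.log(1 - p1)
    return total / 2


def expected_joint_log_loss(pred, truth):
    """Cross-entropy (nats) of the joint prediction over both outcomes."""
    return -sum(q * math.log(pred[ys]) for ys, q in truth.items() if q > 0)


for name, pred in (("independent", INDEPENDENT), ("coupled", COUPLED)):
    print(
        name,
        round(expected_marginal_log_loss(pred, TRUE_JOINT), 4),
        round(expected_joint_log_loss(pred, TRUE_JOINT), 4),
    )
```

Under these assumptions both predictors score the same on marginal log-loss (log 2 per outcome), while the independent predictor pays log 4 on joint log-loss and the coupled predictor pays only log 2. A decision that depends on both outcomes at once would therefore be served differently by the two, even though a marginal evaluation cannot distinguish them.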