We propose a new variant of online learning that we call "ambiguous online learning". In this setting, the learner is allowed to produce multiple predicted labels. Such an "ambiguous prediction" is considered correct when at least one of the labels is correct and none of the labels is "predictably wrong". The definition of "predictably wrong" comes from a hypothesis class in which hypotheses are also multi-valued; thus, a label is "predictably wrong" if it is not allowed by the (unknown) true hypothesis. This setting arises naturally in the context of multi-valued dynamical systems, recommendation algorithms, and lossless compression. It is also strongly related to the so-called "apple tasting" feedback model. We show that in this setting there is a trichotomy of mistake bounds: up to logarithmic factors, every hypothesis class has an optimal mistake bound of either Theta(1), Theta(sqrt(N)), or N, where N is the number of rounds.
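To make the correctness criterion concrete, here is one possible formalization; the notation (h*, the prediction sets, and the realized labels) is ours rather than the paper's. Write h* : X -> 2^Y for the unknown true multi-valued hypothesis, let the learner's ambiguous prediction at round t be a subset of the label space, and assume the realized label is consistent with h*. A prediction is then correct exactly when it contains the realized label and stays inside the set of labels the true hypothesis allows:

```latex
% A sketch of the correctness criterion, under our assumed notation
% (not necessarily the paper's):
%   h^{*} : \mathcal{X} \to 2^{\mathcal{Y}}   -- true multi-valued hypothesis
%   \hat{Y}_t \subseteq \mathcal{Y}           -- ambiguous prediction at round t
%   y_t \in h^{*}(x_t)                        -- realized label, consistent with h^{*}
\[
  \mathrm{correct}_t
    \;\iff\;
    y_t \in \hat{Y}_t
    \;\wedge\;
    \hat{Y}_t \subseteq h^{*}(x_t),
  \qquad
  M_N \;=\; \sum_{t=1}^{N} \mathbf{1}\bigl[\lnot\,\mathrm{correct}_t\bigr].
\]
```

Under this reading, predicting a larger set makes the first condition easier to satisfy (some label is more likely to be correct) but the second harder (some label is more likely to be disallowed by the true hypothesis); this is the tension the mistake-bound trichotomy quantifies.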