We investigate the problem of minimizing the excess generalization error with respect to the best expert prediction in a finite family in the stochastic setting, under limited access to information. We assume that the learner has access to the advice of only a limited number of experts per training round, as well as for prediction. Assuming that the loss function is Lipschitz and strongly convex, we show that if we are allowed to see the advice of only one expert per round for $T$ rounds in the training phase, or to use the advice of only one expert for prediction in the test phase, the worst-case excess risk is $\Omega(1/\sqrt{T})$ with probability lower bounded by a constant. However, if we are allowed to see the advice of at least two actively chosen experts per training round and to use at least two experts for prediction, the fast rate $O(1/T)$ can be achieved. We design novel algorithms achieving this rate in this setting, as well as in the setting where the learner has a budget constraint on the total number of expert queries, and give precise instance-dependent bounds on the number of training rounds and queries needed to achieve a given generalization error precision.
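
For readers who want the quantities in symbols, the following is a minimal sketch of the objects the abstract refers to. The notation ($F_1,\dots,F_K$ for the experts, $\ell$ for the loss, $R$ for the risk, $\hat f_T$ for the learner's output, and the constants $c$, $p_0$, $C$) is assumed here for illustration and is not taken from the paper; the dependence of the constants on the number of experts, the confidence level, and the loss parameters is deliberately left unspecified.

```latex
% Assumed notation (not from the source): experts F_1, ..., F_K,
% loss \ell, i.i.d. data (X, Y), learner output \hat f_T after T rounds.
\[
  R(f) = \mathbb{E}\,\ell\bigl(f(X), Y\bigr),
  \qquad
  \mathcal{E}(\hat f_T) = R(\hat f_T) - \min_{1 \le i \le K} R(F_i).
\]
% One expert per training round, or one expert at prediction time:
% in the worst case, for unspecified constants c > 0 and p_0 > 0,
\[
  \mathbb{P}\Bigl( \mathcal{E}(\hat f_T) \ge \frac{c}{\sqrt{T}} \Bigr) \ge p_0 .
\]
% At least two actively chosen experts per round and for prediction:
% for an unspecified constant C,
\[
  \mathcal{E}(\hat f_T) \le \frac{C}{T}.
\]
```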