In this paper, we study oracle-efficient algorithms for beyond-worst-case analysis of online learning. We focus on two settings. First, the smoothed analysis setting of [RST11, HRS22], in which an adversary is constrained to generating samples from distributions whose density is upper bounded by $1/\sigma$ times the uniform density. Second, the setting of $K$-hint transductive learning, in which the learner is given access to $K$ hints per time step that are guaranteed to include the true instance. We give the first known oracle-efficient algorithms for both settings that depend only on the pseudo (or VC) dimension $d$ of the class and the parameters $\sigma$ and $K$ that capture the power of the adversary. In particular, we achieve oracle-efficient regret bounds of $\widetilde{O} ( \sqrt{T d\sigma^{-1}} )$ and $\widetilde{O} ( \sqrt{T dK} )$ for learning real-valued functions, and $O ( \sqrt{T d\sigma^{-\frac{1}{2}} } )$ for learning binary-valued functions. For the smoothed analysis setting, our results give the first oracle-efficient algorithm for online learning with smoothed adversaries [HRS22]. This contrasts with the computational separation between online learning with worst-case adversaries and offline learning established by [HK16]. Our algorithms also achieve improved regret bounds for the worst-case setting with small domains. In particular, we give an oracle-efficient algorithm with regret $O ( \sqrt{T(d |\mathcal{X}|)^{1/2}} )$, which refines the earlier $O ( \sqrt{T|\mathcal{X}|} )$ bound of [DS16].
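For concreteness, here is a minimal formal statement of the smoothness constraint described above, following [RST11, HRS22]; the notation $\mathcal{X}$ for the instance domain and $\mu$ for the uniform measure on it is introduced here for illustration. A distribution $\mathcal{D}$ over $\mathcal{X}$ is $\sigma$-smooth if
$$
\mathcal{D}(A) \;\le\; \frac{\mu(A)}{\sigma} \qquad \text{for every measurable } A \subseteq \mathcal{X},
$$
equivalently, if the density of $\mathcal{D}$ with respect to $\mu$ is at most $1/\sigma$ pointwise. The smoothed adversary may pick a different $\sigma$-smooth distribution at each time step, adaptively, and the instance is then drawn from that distribution.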