We provide new insights on the eluder dimension, a complexity measure that has been extensively used to bound the regret of algorithms for online bandits and reinforcement learning with function approximation. First, we study the relationship between the eluder dimension of a function class and a generalized notion of rank, defined for any monotone "activation" $\sigma : \mathbb{R}\to \mathbb{R}$, which corresponds to the minimal dimension required to represent the class as a generalized linear model. It is known that when $\sigma$ has derivatives bounded away from $0$, the $\sigma$-rank gives rise to an upper bound on the eluder dimension of any function class; however, we show that the eluder dimension can be exponentially smaller than the $\sigma$-rank. We also show that the condition on the derivative is necessary; namely, when $\sigma$ is the $\mathsf{relu}$ activation, the eluder dimension can be exponentially larger than the $\sigma$-rank. For binary-valued function classes, we obtain a characterization of the eluder dimension in terms of the star number and the threshold dimension, quantities which are relevant to active learning and online learning, respectively.
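For concreteness, here is a minimal sketch of the two quantities being compared, following the standard definitions; the notation ($\mathcal{F}$, $\mathcal{X}$, $\phi$, $w_f$) is introduced here for illustration and is not fixed by the abstract. The $\sigma$-rank of a function class $\mathcal{F} \subseteq (\mathcal{X} \to \mathbb{R})$ is the smallest embedding dimension admitting a generalized linear representation:
\[
\mathrm{rank}_\sigma(\mathcal{F}) \;=\; \min\Bigl\{\, d \;:\; \exists\, \phi : \mathcal{X} \to \mathbb{R}^d,\ \{w_f\}_{f \in \mathcal{F}} \subseteq \mathbb{R}^d \ \text{ s.t. }\ f(x) = \sigma\bigl(\langle w_f, \phi(x) \rangle\bigr)\ \ \forall f \in \mathcal{F},\, x \in \mathcal{X} \,\Bigr\}.
\]
The eluder dimension at scale $\varepsilon$, in the sense of Russo and Van Roy (2013), is the length of the longest sequence $x_1, \dots, x_n \in \mathcal{X}$ in which each $x_k$ is $\varepsilon$-independent of its predecessors, i.e., some pair $f, f' \in \mathcal{F}$ is nearly indistinguishable on the prefix yet differs at $x_k$:
\[
\sqrt{\textstyle\sum_{i < k} \bigl(f(x_i) - f'(x_i)\bigr)^2} \;\le\; \varepsilon
\qquad \text{and} \qquad
\bigl|f(x_k) - f'(x_k)\bigr| \;>\; \varepsilon .
\]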