Optimal Rates of Teaching and Learning Under Uncertainty

In this paper, we consider a recently-proposed model of teaching and learning under uncertainty, in which a teacher receives independent observations of a single bit corrupted by binary symmetric noise, and sequentially transmits to a student through another binary symmetric channel based on the bits observed so far. After a given number $n$ of transmissions, the student outputs an estimate of the unknown bit, and we are interested in the exponential decay rate of the error probability as $n$ increases. We propose a novel block-structured teaching strategy in which the teacher encodes the number of 1s received in each block, and show that the resulting error exponent is the binary relative entropy $D\big(\frac{1}{2}\|\max(p,q)\big)$, where $p$ and $q$ are the noise parameters. This matches a trivial converse result based on the data processing inequality, and settles two conjectures of [Jog and Loh, 2021] and [Huleihel \emph{et al.}, 2019]. In addition, we show that the computation time required by the teacher and student is linear in $n$. We also study a more general setting in which the binary symmetric channels are replaced by general binary-input discrete memoryless channels. We provide an achievability bound and a converse bound, and show that the two coincide in certain cases, including (i) when the two channels are identical, and (ii) when the student-teacher channel is a binary symmetric channel. More generally, we give sufficient conditions under which our achievable learning rate is the best possible for block-structured protocols.
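As a rough numerical illustration of the stated error exponent $D\big(\frac{1}{2}\|\max(p,q)\big)$, the sketch below evaluates the binary relative entropy for a few noise parameters. The function names, the use of natural logarithms (nats), and the restriction $p, q \in (0, \frac{1}{2})$ are assumptions made for this example and are not taken from the paper.

```python
import math


def binary_relative_entropy(a: float, b: float) -> float:
    """D(a || b) between Bernoulli(a) and Bernoulli(b), in nats (assumed unit)."""
    if not (0.0 <= a <= 1.0 and 0.0 < b < 1.0):
        raise ValueError("need 0 <= a <= 1 and 0 < b < 1")
    total = 0.0
    # Sum over the two outcomes; a term with zero mass contributes nothing.
    for x, y in ((a, b), (1.0 - a, 1.0 - b)):
        if x > 0.0:
            total += x * math.log(x / y)
    return total


def error_exponent(p: float, q: float) -> float:
    """Exponent D(1/2 || max(p, q)) from the abstract.

    Assumes both crossover probabilities lie in (0, 1/2); this range is an
    assumption of the sketch, not a statement from the paper.
    """
    if not (0.0 < p < 0.5 and 0.0 < q < 0.5):
        raise ValueError("this sketch assumes 0 < p, q < 1/2")
    return binary_relative_entropy(0.5, max(p, q))


if __name__ == "__main__":
    for p, q in [(0.1, 0.1), (0.1, 0.2), (0.3, 0.05)]:
        print(f"p={p}, q={q}: exponent = {error_exponent(p, q):.4f} nats")
```

Because the exponent depends only on $\max(p,q)$, swapping the two noise levels leaves it unchanged, and the noisier of the two channels alone determines the rate. For the first argument equal to $\frac{1}{2}$ the expression also reduces to $-\frac{1}{2}\log\big(4b(1-b)\big)$ with $b = \max(p,q)$.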
