We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$ is the distance from $f$ to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound of Blais et al. (RANDOM '15). We also give an algorithm that, in time $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, estimates the distance of an unknown function $f$ to the class of monotone functions up to additive error $\varepsilon$. Previously, sample-efficient algorithms were known for both of these problems, but those algorithms were not run-time efficient. Our work thus closes this gap between run-time and sample complexity. This work builds upon the improper learning algorithm of Bshouty and Tamon (JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld, and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued hypothesis and then ``corrects'' it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than $2\mathrm{opt} + \varepsilon$ information-theoretically; we bypass this barrier by (a) augmenting the improper learner with a convex optimization step, and (b) learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the ``poset sorting'' problem of [LRV22] for functions over general posets with non-Boolean labels.
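The final rounding step relies on a simple fact: thresholding a monotone real-valued function yields a monotone Boolean-valued one. The following sketch (illustrative only; the function $g$ and the threshold choice are ours, not the paper's) checks this on a small cube by brute force:

```python
from itertools import product

def leq(x, y):
    """Coordinatewise partial order on {-1,+1}^n."""
    return all(a <= b for a, b in zip(x, y))

def is_monotone(f, n):
    """Brute-force monotonicity check over all comparable pairs."""
    pts = list(product([-1, 1], repeat=n))
    return all(f(x) <= f(y) for x in pts for y in pts if leq(x, y))

n = 3
g = lambda x: sum(x) / n                  # monotone, real-valued (hypothetical example)
h = lambda x: 1 if g(x) >= 0 else -1      # rounded to {-1,+1} by thresholding at 0

# If g is monotone, h inherits monotonicity for any fixed threshold.
assert is_monotone(g, n)
assert is_monotone(h, n)
```

Note that this only illustrates why rounding preserves monotonicity; the hard part addressed in the paper is obtaining a real-valued function that is both monotone and close to $f$ in the first place.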