We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $\eta(x) \leq \eta$, where $\eta<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to $\eta$. Prior to our work, no non-trivial booster was known in this setting. Moreover, we show that this error upper bound is best possible for polynomial-time black-box boosters, under standard cryptographic assumptions. Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise. As a simple application of our positive result, we give the first efficient Massart learner for unions of high-dimensional rectangles.
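To make the noise model concrete, here is a minimal simulation sketch (not from the paper, and the flip-rate function `eta` below is a hypothetical choice): labels of a 1-D threshold concept are flipped independently with an example-dependent probability $\eta(x)$ bounded by a constant $\eta < 1/2$.

```python
import random

ETA_MAX = 0.4  # Massart noise bound eta; must be < 1/2

def target(x):
    # clean target concept: a 1-D threshold
    return 1 if x >= 0 else -1

def eta(x):
    # hypothetical example-dependent flip rate: more noise near the
    # decision boundary, always bounded by ETA_MAX
    return ETA_MAX / (1.0 + abs(x))

def massart_sample(rng, n):
    # draw n examples and flip each label independently with prob. eta(x)
    samples = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = target(x)
        if rng.random() < eta(x):
            y = -y  # Massart flip
        samples.append((x, y))
    return samples

rng = random.Random(0)
data = massart_sample(rng, 10_000)
# empirical disagreement between noisy labels and the clean target;
# by construction it is at most ETA_MAX in expectation
err = sum(1 for x, y in data if target(x) != y) / len(data)
```

Note that the clean target itself incurs error roughly $\mathbf{E}[\eta(x)] \leq \eta$ on such data, which is why misclassification error close to $\eta$ is the natural benchmark the paper's booster achieves.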