We study the problem of crowdsourced PAC learning of threshold functions with pairwise comparisons. This is a challenging problem and only recently have query-efficient algorithms been established in the scenario where the majority of the crowd are perfect. In this work, we investigate the significantly more challenging case that the majority are incorrect, which in general renders learning impossible. We show that under the semi-verified model of Charikar~et~al.~(2017), where we have (limited) access to a trusted oracle who always returns the correct annotation, it is possible to PAC learn the underlying hypothesis class while drastically mitigating the labeling cost via the more easily obtained comparison queries. Orthogonal to recent developments in semi-verified or list-decodable learning that crucially rely on data distributional assumptions, our PAC guarantee holds by exploring the wisdom of the crowd.
翻译:我们用对比方法研究众包PAC学习临界函数的问题。 这是一个具有挑战性的问题,直到最近才在大多数人群完美无缺的情景中建立了查询效率算法。 在这项工作中,我们调查了多数人不正确、一般来说无法学习的更具挑战性的案例。我们证明,在半验证的Charikar~et~al.~(2017)模式下,我们有机会(有限)接触一个总是返回正确注释的可信神器,因此,PAC可以了解基本假设类,同时通过更容易获得的比较查询大幅度降低标签成本。对于关键依赖数据分布假设的半核实或列表可辨别学习的最新发展,我们PAC保证探索人群的智慧。