We consider the stochastic score classification problem. There are several binary tests, where each test $i$ is associated with a probability $p_i$ of being positive and a cost $c_i$. The score of an outcome is a weighted sum of all positive tests, and the range of possible scores is partitioned into intervals corresponding to different classes. The goal is to perform tests sequentially (and possibly adaptively) so as to identify the class at the minimum expected cost. We provide the first constant-factor approximation algorithm for this problem, which improves over the previously-known logarithmic approximation ratio. Moreover, our algorithm is $non$ $adaptive$: it just involves performing tests in a $fixed$ order until the class is identified. Our approach also extends to the $d$-dimensional score classification problem and the "explainable" stochastic halfspace evaluation problem (where we want to evaluate some function on $d$ halfspaces). We obtain an $O(d^2\log d)$-approximation algorithm for both these extensions. Finally, we perform computational experiments that demonstrate the practical performance of our algorithm for score classification. We observe that, for most instances, the cost of our algorithm is within $50\%$ of an information-theoretic lower bound on the optimal value.
翻译:我们考虑的是随机分数分类问题。 有好几项二进制测试, 每一次测试的美元均与正值和正值的概率美元和成本美元挂钩。 结果的分数是所有正值测试的加权总和, 可能分数的范围被分割成与不同类别相应的间隔。 目标是按顺序进行测试( 并可能进行适应性调整), 以便以最低预期成本来辨别该类。 我们为这一问题提供了第一个常数- 方位近似算法, 这个问题比以前已知的对数近比提高了。 此外, 我们的算法是 $non $dadoptive $: 它只是以美元固定的顺序进行测试, 直至确定该类。 我们的方法还扩展到了美元分数的分数分类问题和“ 可解释性”半空间评价问题( 我们想在半空里评估某些函数 $ d2\ d) 。 我们为这两个扩展获得了一个 $O( d2\ log) $- approcolation logationalationalation logalationaltigrational orgal) 。 最后, 我们为这些分级进行计算法中最低的算法。