We study Non-Homogeneous Sequential Hypothesis Testing (NHSHT), in which a single active Decision-Maker (DM) selects actions with heterogeneous positive costs to identify the true hypothesis under an average error constraint \(\delta\) while minimizing the expected total cost. Using standard arguments, we show that the objective decomposes into the product of the mean number of samples and the mean per-action cost induced by the policy. This leads to a key design principle: one should optimize the ratio of expectations (expected information gain per expected cost) rather than the expectation of the per-step information-per-cost ratio ("bit-per-buck"), which can be suboptimal. We adapt the Chernoff scheme to NHSHT, preserving its classical \(\log(1/\delta)\) scaling. In simulations, the adapted scheme reduces mean cost by up to 50\% relative to the classic Chernoff policy and by up to 90\% relative to the naive bit-per-buck heuristic.
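The gap between the two objectives can be illustrated with a minimal sketch. It assumes a Chernoff-style max-min formulation in which each action provides a fixed amount of information against each alternative hypothesis. The matrix `D`, the vector `COST`, and all function names are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
# Hypothetical toy instance (not from the paper): two actions, two alternatives.
# D[a][j]  = information action a provides against alternative j (assumed values).
# COST[a]  = positive cost of taking action a (assumed values).
D = [[1.0, 0.0], [0.0, 2.0]]
COST = [1.0, 4.0]


def worst_case_gain(q, gains):
    """Smallest expected gain over the alternatives under action distribution q."""
    n_alt = len(gains[0])
    return min(sum(q[a] * gains[a][j] for a in range(len(q))) for j in range(n_alt))


def ratio_of_expectations(q):
    """Worst-case expected information divided by expected cost."""
    expected_cost = sum(q[a] * COST[a] for a in range(len(q)))
    return worst_case_gain(q, D) / expected_cost


def expectation_of_ratios(q):
    """Worst-case expected per-step information-per-cost ("bit-per-buck")."""
    per_buck = [[d / COST[a] for d in D[a]] for a in range(len(D))]
    return worst_case_gain(q, per_buck)


# Grid search over randomized policies q = (p, 1 - p).
grid = [(k / 3000, 1 - k / 3000) for k in range(3001)]
q_ratio = max(grid, key=ratio_of_expectations)
q_bpb = max(grid, key=expectation_of_ratios)

print("ratio-of-expectations policy:", q_ratio, ratio_of_expectations(q_ratio))
print("bit-per-buck policy:         ", q_bpb, ratio_of_expectations(q_bpb))
```

In this toy instance the ratio-of-expectations policy puts about 2/3 of its mass on the cheap action and gains roughly 1/3 unit of worst-case information per unit of cost. The bit-per-buck policy puts only about 1/3 on the cheap action and gains roughly 1/9. If total cost scales inversely with this rate, as the \(\log(1/\delta)\) scaling suggests, the heuristic would pay about three times more here.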