Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations

Sequential testing, always-valid $p$-values, and confidence sequences promise flexible statistical inference and on-the-fly decision making. However, unlike fixed-$n$ inference based on asymptotic normality, existing sequential tests either make parametric assumptions and end up under-covering/over-rejecting when these fail or use non-parametric but conservative concentration inequalities and end up over-covering/under-rejecting. To circumvent these issues, we sidestep exact at-least-$\alpha$ coverage and focus on asymptotically exact coverage and asymptotic optimality. That is, we seek sequential tests whose probability of ever rejecting a true hypothesis asymptotically approaches $\alpha$ and whose expected time to reject a false hypothesis approaches a lower bound on all tests with asymptotic coverage at least $\alpha$, both under an appropriate asymptotic regime. We permit observations to be both non-parametric and dependent and focus on testing whether the observations form a martingale difference sequence. We propose the universal sequential probability ratio test (uSPRT), a slight modification to the normal-mixture sequential probability ratio test, where we add a burn-in period and adjust thresholds accordingly. We show that even in this very general setting, the uSPRT is asymptotically optimal under mild generic conditions. We apply the results to stabilized estimating equations to test means, treatment effects, etc. Our results also provide corresponding guarantees for the implied confidence sequences. Numerical simulations verify our guarantees and the benefits of the uSPRT over alternatives.
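To make the construction concrete, here is a minimal sketch of a normal-mixture sequential probability ratio test with a burn-in period, testing whether a stream of observations has conditional mean zero. It is an illustration under stated assumptions, not the uSPRT as specified in the paper: the function names `mixture_log_lr` and `sprt_with_burn_in`, the mixing variance `tau2`, the burn-in length, the running second-moment variance estimate, and the unadjusted threshold $1/\alpha$ are all assumptions chosen for the example. The abstract says the uSPRT adjusts its thresholds to account for the burn-in; that adjustment is not reproduced here.

```python
import math


def mixture_log_lr(s, n, sigma2=1.0, tau2=1.0):
    """Log of the normal-mixture likelihood ratio against H0: mean = 0.

    s      : running sum of the observations
    n      : number of observations so far
    sigma2 : (estimated) variance of a single observation
    tau2   : variance of the N(0, tau2) mixing distribution over the mean
             (a tuning choice assumed for this sketch)
    """
    denom = sigma2 + n * tau2
    return 0.5 * math.log(sigma2 / denom) + (tau2 * s * s) / (2.0 * sigma2 * denom)


def sprt_with_burn_in(xs, alpha=0.05, burn_in=30, tau2=1.0):
    """Return the first index n >= burn_in at which H0 is rejected, else None.

    Assumptions of this sketch (not taken from the paper):
    - the variance is estimated by the running mean of squares, which is
      the natural estimate when the mean is zero under H0;
    - the rejection threshold is the unadjusted log(1 / alpha).
    """
    threshold = math.log(1.0 / alpha)
    s = 0.0   # running sum
    ss = 0.0  # running sum of squares
    for n, x in enumerate(xs, start=1):
        s += x
        ss += x * x
        if n < burn_in:
            continue  # no rejection is allowed during the burn-in period
        sigma2 = ss / n
        if sigma2 <= 0.0:
            continue  # degenerate variance estimate; wait for more data
        if mixture_log_lr(s, n, sigma2, tau2) >= threshold:
            return n
    return None
```

For example, `sprt_with_burn_in([1.0] * 100, burn_in=20)` stops as soon as the burn-in ends, because the evidence against a zero mean is already overwhelming, while a stream that alternates between `1.0` and `-1.0` is never rejected. The burn-in delays any decision until the variance estimate has had time to stabilize, which is the role the abstract attributes to it.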
