This paper develops a model-free sequential test for conditional independence. The proposed test lets researchers analyze an incoming i.i.d. data stream with an arbitrary dependency structure and safely conclude whether a feature is conditionally associated with the response under study. Data points are processed online as soon as they arrive, and data acquisition stops once a significant result is detected, while the type-I error rate remains rigorously controlled. The test can be paired with any machine learning algorithm, however sophisticated, to improve data efficiency as far as possible. The method is inspired by two statistical frameworks. The first is the model-X conditional randomization test, a test for conditional independence that is valid in offline settings where the sample size is fixed in advance. The second is testing by betting, a "game-theoretic" approach to sequential hypothesis testing. We conduct synthetic experiments to show the advantage of our test over out-of-the-box sequential tests that account for the multiplicity of tests over the time horizon, and we demonstrate the practicality of our proposal by applying it to real-world tasks.
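To make the combination of the two frameworks concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes that the conditional distribution of X given Z is known (the model-X assumption), that the predictor is a simple online linear model, and that the bet size is fixed. All names (`betting_score`, `sequential_ci_test`, `make_stream`, `sample_x_given_z`) and the toy Gaussian data are hypothetical choices for illustration. At each step a dummy feature is drawn from P(X | Z), the bettor wagers that the real feature predicts the response better than the dummy, and the null is rejected once the wealth reaches 1/α.

```python
import numpy as np


def betting_score(loss_dummy, loss_real):
    """Antisymmetric payoff in [-1, 1]; positive when the real feature predicts better."""
    return float(np.tanh(loss_dummy - loss_real))


def sequential_ci_test(stream, sample_x_given_z, alpha=0.05, bet=0.5, lr=0.01, seed=0):
    """Return (rejected, stopping_time, wealth) for H0: X independent of Y given Z."""
    rng = np.random.default_rng(seed)
    w = np.zeros(2)   # online linear predictor of y from (x, z); an illustrative choice
    wealth = 1.0      # initial capital of the bettor
    t = 0
    for t, (x, y, z) in enumerate(stream, start=1):
        # Model-X step: draw a dummy feature from the (assumed known) P(X | Z = z).
        x_dummy = sample_x_given_z(z, rng)
        loss_real = (y - w @ np.array([x, z])) ** 2
        loss_dummy = (y - w @ np.array([x_dummy, z])) ** 2
        # The bet uses a predictor fit on past data only. Under H0 the real and
        # dummy features are exchangeable, so the score has conditional mean zero
        # and the wealth process is a nonnegative martingale.
        wealth *= 1.0 + bet * betting_score(loss_dummy, loss_real)
        if wealth >= 1.0 / alpha:  # threshold justified by Ville's inequality
            return True, t, wealth
        # Update the predictor only after the bet has been placed.
        feats = np.array([x, z])
        w += lr * (y - w @ feats) * feats
    return False, t, wealth


def sample_x_given_z(z, rng):
    """Toy conditional model, assumed known: X | Z = z ~ N(z, 1)."""
    return z + rng.normal()


def make_stream(n, effect, seed=1):
    """Toy data: effect = 0 gives the null, effect != 0 gives a conditional association."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        z = rng.normal()
        x = sample_x_given_z(z, rng)
        y = effect * x + z + 0.1 * rng.normal()
        yield x, y, z


if __name__ == "__main__":
    print(sequential_ci_test(make_stream(2000, effect=2.0), sample_x_given_z))
    print(sequential_ci_test(make_stream(2000, effect=0.0), sample_x_given_z))
```

In this sketch the type-I error guarantee depends only on the dummy being drawn from the true conditional distribution and on the bet being fixed before the current data point is seen. The quality of the predictor affects power, not validity, which is why any learning algorithm could replace the linear model.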