We propose confidence sequences -- sequences of confidence intervals which are valid uniformly over time -- for quantiles of any distribution over a complete, fully-ordered set, based on a stream of i.i.d. observations. We give methods both for tracking a fixed quantile and for tracking all quantiles simultaneously. Specifically, we provide explicit expressions with small constants for intervals whose widths shrink at the fastest possible $\sqrt{t^{-1} \log\log t}$ rate, along with a non-asymptotic concentration inequality for the empirical distribution function which holds uniformly over time with the same rate. The latter strengthens Smirnov's empirical process law of the iterated logarithm and extends the Dvoretzky-Kiefer-Wolfowitz inequality to hold uniformly over time. We give a new algorithm and sample complexity bound for selecting an arm with an approximately best quantile in a multi-armed bandit framework. In simulations, our method requires fewer samples than existing methods by a factor of five to fifty.
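To make the notion of a confidence sequence for a quantile concrete, the following is a minimal sketch of a naive baseline, not the method proposed here. It assumes the fixed-sample Dvoretzky-Kiefer-Wolfowitz inequality is applied at every time $t$ with error budget $\alpha_t = \alpha / (t(t+1))$, so that a union bound over $t$ gives validity uniformly over time. The resulting widths shrink only at a $\sqrt{t^{-1} \log t}$ rate, slower than the $\sqrt{t^{-1} \log\log t}$ rate described above. The function names `dkw_radius`, `quantile_interval`, and `running_intervals` are hypothetical and chosen for illustration.

```python
import bisect
import math
import random


def dkw_radius(t, alpha):
    """Radius eps_t with 2 * exp(-2 * t * eps_t**2) = alpha / (t * (t + 1)).

    Assumption: the per-time budgets alpha / (t * (t + 1)) sum to alpha over
    t = 1, 2, ..., so a union bound yields time-uniform coverage.
    """
    alpha_t = alpha / (t * (t + 1))
    return math.sqrt(math.log(2.0 / alpha_t) / (2.0 * t))


def quantile_interval(sorted_xs, p, alpha):
    """Interval for the p-quantile from the t sorted observations seen so far.

    On the event sup_x |F_t(x) - F(x)| <= eps, the true p-quantile lies
    between the empirical quantiles at levels p - eps and p + eps.
    """
    t = len(sorted_xs)
    eps = dkw_radius(t, alpha)
    lo_rank = math.ceil(t * (p - eps))  # 1-indexed order statistic
    hi_rank = math.ceil(t * (p + eps))
    lower = sorted_xs[lo_rank - 1] if lo_rank >= 1 else -math.inf
    upper = sorted_xs[hi_rank - 1] if hi_rank <= t else math.inf
    return lower, upper


def running_intervals(stream, p=0.5, alpha=0.05):
    """Yield one interval per observation, keeping the sample sorted."""
    xs = []
    for x in stream:
        bisect.insort(xs, x)
        yield quantile_interval(xs, p, alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = (rng.random() for _ in range(1000))
    for t, (lo, hi) in enumerate(running_intervals(draws), start=1):
        if t in (10, 100, 1000):
            print(t, lo, hi)
```

Early intervals are vacuous (infinite on both sides) because the radius exceeds one for small $t$. The union bound over time is what makes this baseline loose compared with intervals built directly for uniform-in-time validity.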