Randomization tests re-randomize existing data to obtain data-dependent critical values, which lead to exact hypothesis tests under special circumstances. However, it is not always possible to re-randomize data in accordance with the physical randomization from which the data have been obtained. As a consequence, most statistical tests cannot control the type I error probability. Still, similarly to the bootstrap, data re-randomization can be used to improve type I error control. No general asymptotic theory under weak null hypotheses has yet been developed for such randomization tests. The aim of this paper is to provide a conveniently applicable theory on the asymptotic validity of randomization tests with asymptotically normal test statistics. Confidence intervals will be developed in the same way. This will be achieved by creating a link between two well-established fields in mathematical statistics: empirical processes and inference based on randomization via algebraic groups. A broadly applicable conditional weak convergence theorem is developed for empirical processes that are based on randomized observations. Random elements of an algebraic group are applied to the data vectors, and the randomized version of a statistic is derived from the transformed data. By combining a variant of the functional delta-method with a suitable studentization of the statistic, asymptotically exact hypothesis tests are deduced, while the finite sample exactness property under group-invariant sub-hypotheses is preserved. The methodology is exemplified with the Pearson correlation coefficient, a Mann-Whitney effect based on right-censored paired data, and a competing risks analysis. The practical usefulness of the approaches is assessed through simulation studies and an application to data from patients suffering from diabetic retinopathy.
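As a rough illustration of the kind of studentized randomization test described above, the following Python sketch tests for zero Pearson correlation by permuting one coordinate of the paired data. It is not the paper's implementation: the function names `studentized_corr` and `permutation_test`, the choice of the permutation group, the fourth-moment studentization, and the Monte Carlo approximation with `n_perm` random permutations are all assumptions made for this example.

```python
import numpy as np


def studentized_corr(x, y):
    """Studentized sample correlation sqrt(n) * r / tau_hat.

    Assumption: tau_hat^2 = mu22 / (mu20 * mu02), a common moment-based
    estimate of the asymptotic variance of sqrt(n) * r when the true
    correlation is zero. This is one possible studentization, chosen
    here for illustration only.
    """
    n = len(x)
    xc = x - x.mean()
    yc = y - y.mean()
    mu20 = np.mean(xc ** 2)
    mu02 = np.mean(yc ** 2)
    mu22 = np.mean(xc ** 2 * yc ** 2)
    r = np.mean(xc * yc) / np.sqrt(mu20 * mu02)
    tau_hat = np.sqrt(mu22 / (mu20 * mu02))
    return np.sqrt(n) * r / tau_hat


def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test of zero correlation.

    Random permutations of y play the role of random group elements
    applied to the data; the statistic is recomputed (including its
    studentization) on every randomized sample.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t_obs = studentized_corr(x, y)
    t_perm = np.array(
        [studentized_corr(x, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    p_value = (1 + np.sum(np.abs(t_perm) >= abs(t_obs))) / (n_perm + 1)
    return t_obs, p_value
```

Recomputing the studentization inside every permutation is the design choice that matters here: if the pairs are independent, the test is exact up to Monte Carlo error, while the studentization is what is intended to keep the test asymptotically valid when the data are merely uncorrelated.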