We develop and compare e-variables for testing whether $k$ samples of data are drawn from the same distribution, the alternative being that they come from different elements of an exponential family. We consider the GRO (growth-rate optimal) e-variables for (1) a 'small' null inside the same exponential family and (2) a 'large' nonparametric null, as well as (3) an e-variable arrived at by conditioning on the sum of the sufficient statistics. E-variables (2) and (3) are efficiently computable, and extend ideas from Turner et al. [2021] and Wald [1947], respectively, from the Bernoulli case to general exponential families. We provide theoretical and simulation-based comparisons of these e-variables in terms of their logarithmic growth rate. We find that for small effects all of these e-variables behave surprisingly similarly; for the Gaussian location and Poisson families, e-variables (1) and (3) coincide; for Bernoulli, (1) and (2) coincide; but in general, whether (2) or (3) grows faster against the small null is family-dependent. We furthermore discuss algorithms for numerically approximating (1).