In large-scale industry applications, standard A/B testing approaches are mostly based on the t-test. These approaches, however, suffer from low statistical power in business settings because of small sample sizes, non-Gaussian distributions, or return-on-investment (ROI) considerations. In this paper, we (i) show the statistical efficiency of using estimating equations and U-statistics, which can address these issues separately; and (ii) propose a novel doubly robust generalized U-statistic that allows a flexible definition of the treatment effect and can handle small samples, distributional robustness, ROI, and confounding in one framework. We provide theoretical results on asymptotics and efficiency bounds, together with insights into the efficiency gain drawn from the theoretical analysis. We further conduct comprehensive simulation studies, apply the methods to multiple real A/B tests at LinkedIn, and share results and learnings that are broadly useful.