eBay's experimentation platform runs hundreds of A/B tests on any given day. The platform integrates with the tracking infrastructure and customer experience servers, provides the sampling service for experiments, and is responsible for monitoring the progress of each A/B test. Ensuring experiment quality at this scale poses many challenges. We discuss two automated processes and methodologies for monitoring test quality: randomization validation using the population stability index (PSI), and sample ratio mismatch (a.k.a. sample delta) detection using sequential analysis. These automated processes help the experimentation platform run high-quality, trustworthy tests both effectively at large scale and efficiently, by minimizing false-positive monitoring alarms sent to experimenters.
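To make the first of the two checks concrete, here is a minimal sketch of the textbook population stability index computed over pre-bucketed counts. It is not the platform's implementation. The function name, the bucketing of users into categories, and the `eps` floor used to avoid taking the logarithm of zero are all assumptions made for illustration.

```python
import math
from typing import Sequence


def population_stability_index(
    expected_counts: Sequence[float],
    actual_counts: Sequence[float],
    eps: float = 1e-6,  # assumed floor for empty buckets; not from the source
) -> float:
    """Textbook PSI between two bucketed distributions.

    PSI = sum_i (q_i - p_i) * ln(q_i / p_i), where p_i and q_i are the
    bucket shares of the expected and actual populations respectively.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both inputs must have the same number of buckets")

    total_expected = sum(expected_counts)
    total_actual = sum(actual_counts)
    if total_expected <= 0 or total_actual <= 0:
        raise ValueError("each population must contain at least one observation")

    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Convert counts to shares, flooring at eps so the log is defined.
        p = max(expected / total_expected, eps)
        q = max(actual / total_actual, eps)
        psi += (q - p) * math.log(q / p)
    return psi


# Hypothetical example: users per device-type bucket in control vs. treatment.
control = [5000, 3000, 2000]
treatment = [4950, 3050, 2000]
print(population_stability_index(control, treatment))
```

A value near zero suggests the two groups have similar bucket distributions, which is what a sound randomization should produce. Any alerting threshold would have to be chosen by the platform and is not specified here.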