Online Experimentation with Surrogate Metrics: Guidelines and a Case Study - 专知 Papers


June 2, 2021

Online Experimentation with Surrogate Metrics: Guidelines and a Case Study


Weitao Duan, Shan Ba, Chunzhe Zhang

A/B tests have been widely adopted across industries as the golden rule that guides decision making. However, the long-term true north metrics we ultimately want to drive through A/B tests may take a long time to mature. In these situations, a surrogate metric that predicts the long-term metric is often used instead to conclude whether the treatment is effective. However, because the surrogate rarely predicts the true north perfectly, a regular A/B test based on surrogate metrics tends to have a high false positive rate, and the treatment variant deemed favorable by the test may not be the winning one. In this paper, we discuss how to adjust the A/B testing comparison to ensure that experiment results are trustworthy. We also provide practical guidelines on the choice of good surrogate metrics. To provide a concrete example of how to leverage surrogate metrics for fast decision making, we present a case study on developing and evaluating the predicted confirmed hire surrogate metric in the LinkedIn job marketplace.
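The false positive inflation described in the abstract can be illustrated with a small simulation. This is a toy sketch under assumptions that are not taken from the paper: the true north effect is exactly zero in every experiment, the surrogate's own treatment effect is drawn from a normal distribution with standard deviation `tau` (a hypothetical stand-in for imperfect prediction), and the per-user variance is known. The function name and all parameter values are illustrative only.

```python
import math
import random


def surrogate_false_positive_rate(tau, sigma=1.0, n_per_arm=10_000,
                                  n_experiments=20_000, seed=0):
    """Share of simulated null experiments declared significant at the 5% level.

    Toy assumptions (not from the paper): the true north effect is exactly 0
    in every experiment, while the surrogate's treatment effect is drawn from
    N(0, tau^2) to mimic a surrogate that tracks the true north imperfectly.
    """
    rng = random.Random(seed)
    # Standard error of the difference in surrogate means between two arms.
    se = sigma * math.sqrt(2.0 / n_per_arm)
    z_crit = 1.959963984540054  # two-sided 5% critical value of N(0, 1)
    rejections = 0
    for _ in range(n_experiments):
        # Surrogate movement that does not correspond to any true north change.
        surrogate_effect = rng.gauss(0.0, tau)
        # Draw the observed difference in means from its sampling distribution
        # instead of simulating every user individually.
        observed_diff = rng.gauss(surrogate_effect, se)
        if abs(observed_diff / se) > z_crit:
            rejections += 1
    return rejections / n_experiments


perfect = surrogate_false_positive_rate(tau=0.0)
imperfect = surrogate_false_positive_rate(tau=0.02)
print(f"perfect surrogate:   {perfect:.3f}")
print(f"imperfect surrogate: {imperfect:.3f}")
```

With `tau=0.0` the rejection rate should sit near the nominal 5%, while a nonzero `tau` pushes it well above that level even though the true north metric never moves. The adjustment proposed in the paper is not reproduced here; the sketch only shows why an unadjusted comparison on the surrogate can mislead.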

