以模型为基础的衡量标准:预测模型亚人口群业绩抽样有效估计 (Model-based metrics: Sample-efficient estimates of predictive model subpopulation performance) - 专知论文

会员服务 ·

0

Performer · 估计/估计量 · MoDELS · 方差 · 子采样 ·

2021 年 4 月 25 日

Model-based metrics: Sample-efficient estimates of predictive model subpopulation performance

翻译：以模型为基础的衡量标准:预测模型亚人口群业绩抽样有效估计

Andrew C. Miller,Leon A. Gatys,Joseph Futoma,Emily B. Fox

from arxiv, 27 pages, 8 figures

Machine learning models $-$ now commonly developed to screen, diagnose, or predict health conditions $-$ are evaluated with a variety of performance metrics. An important first step in assessing the practical utility of a model is to evaluate its average performance over an entire population of interest. In many settings, it is also critical that the model makes good predictions within predefined subpopulations. For instance, showing that a model is fair or equitable requires evaluating the model's performance in different demographic subgroups. However, subpopulation performance metrics are typically computed using only data from that subgroup, resulting in higher variance estimates for smaller groups. We devise a procedure to measure subpopulation performance that can be more sample-efficient than the typical subsample estimates. We propose using an evaluation model $-$ a model that describes the conditional distribution of the predictive model score $-$ to form model-based metric (MBM) estimates. Our procedure incorporates model checking and validation, and we propose a computationally efficient approximation of the traditional nonparametric bootstrap to form confidence intervals. We evaluate MBMs on two main tasks: a semi-synthetic setting where ground truth metrics are available and a real-world hospital readmission prediction task. We find that MBMs consistently produce more accurate and lower variance estimates of model performance for small subpopulations.

翻译：目前通常为筛选、诊断或预测健康状况而开发的机器学习模型以美元为单位,用各种性能衡量对美元进行评估。评估模型实际效用的一个重要第一步是评价其在整个受关注人群中的平均性能。在许多环境下,模型还必须在预先界定的亚群中作出良好的预测。例如,显示模型公平或公平需要评估模型在不同人口分组中的性能。然而,子人口性能衡量标准通常仅使用该分组的数据来计算,从而得出较小群体的差异估计值。我们设计了一个程序,以衡量比典型的子抽样估计值更高效的子人口性能。我们建议使用一个评价模型来说明预测模型的有条件分布,在预先界定的亚群群群中,以美元作为基于模型的估计值。我们的程序包括模型的检查和验证,我们建议对传统的非参数性能测距进行计算效率的近似近度,以形成信任度。我们评估了两个主要任务:一种半合成的基底线测量值比典型的亚群性能效率更高。我们建议使用一个模型,说明一种有条件的模型,用以得出更精确的地测算和低实际的医院的性能。

0

相关内容

Performer

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

专知会员服务

17+阅读 · 2020年4月10日

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

专知会员服务

23+阅读 · 2019年11月21日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

机器学习经典—理论与算法 [王立威北京大学] 2019年中国计算机大会计算机经典算法回顾与展望——机器学习与数据挖掘论坛

机器学习经典—理论与算法 [王立威北京大学] 2019年中国计算机大会计算机经典算法回顾与展望——机器学习与数据挖掘论坛

专知会员服务

36+阅读 · 2019年10月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Nonparametric Empirical Bayes Estimation and Testing for Sparse and Heteroscedastic Signals

Arxiv

0+阅读 · 2021年6月16日

An Empirical Characterization of Fair Machine Learning For Clinical Risk Prediction

Arxiv

0+阅读 · 2021年6月15日

Inference with generalizable classifier predictions

Arxiv

0+阅读 · 2021年6月14日

Estimating the size of a closed population by modeling latent and observed heterogeneity

Arxiv

0+阅读 · 2021年6月13日

Parameter Estimation and Model-Based Clustering with Spherical Normal Distribution on the Unit Hypersphere

Arxiv

0+阅读 · 2021年6月11日

Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?

Arxiv

0+阅读 · 2021年6月11日

An Effective Baseline for Robustness to Distributional Shift

Arxiv

0+阅读 · 2021年5月15日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Latent nested nonparametric priors

Arxiv

4+阅读 · 2018年1月15日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

【ACL2020-Google】学习鲁棒度量的文本生成，BLEURT: Learning Robust Metrics for Text Generation

专知会员服务

17+阅读 · 2020年4月10日

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

专知会员服务

23+阅读 · 2019年11月21日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

机器学习经典—理论与算法 [王立威北京大学] 2019年中国计算机大会计算机经典算法回顾与展望——机器学习与数据挖掘论坛

机器学习经典—理论与算法 [王立威北京大学] 2019年中国计算机大会计算机经典算法回顾与展望——机器学习与数据挖掘论坛

专知会员服务

36+阅读 · 2019年10月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美国海军陆战队软件定义网络应用案例：分布式防火墙自动化系统》148页

《多体环境下定位导航授时（PNT）系统研究》228页

软件定义无线电（SDR）：商业与军事领域的技术、应用及未来趋势

《攻势防空作战中无人追击者/规避者最优轨迹研究（含动态交战区建模）》95页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Nonparametric Empirical Bayes Estimation and Testing for Sparse and Heteroscedastic Signals

Arxiv

0+阅读 · 2021年6月16日

An Empirical Characterization of Fair Machine Learning For Clinical Risk Prediction

Arxiv

0+阅读 · 2021年6月15日

Inference with generalizable classifier predictions

Arxiv

0+阅读 · 2021年6月14日

Estimating the size of a closed population by modeling latent and observed heterogeneity

Arxiv

0+阅读 · 2021年6月13日

Parameter Estimation and Model-Based Clustering with Spherical Normal Distribution on the Unit Hypersphere

Arxiv

0+阅读 · 2021年6月11日

Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?

Arxiv

0+阅读 · 2021年6月11日

An Effective Baseline for Robustness to Distributional Shift

Arxiv

0+阅读 · 2021年5月15日

Object-centric Forward Modeling for Model Predictive Control

Object-centric Forward Modeling for Model Predictive Control

Arxiv

5+阅读 · 2019年10月8日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Latent nested nonparametric priors

Arxiv

4+阅读 · 2018年1月15日

微信扫码咨询专知VIP会员