Improving Screening Processes via Calibrated Subset Selection - 专知 (Zhuanzhi) Papers


June 13, 2022

Improving Screening Processes via Calibrated Subset Selection


Lequn Wang, Thorsten Joachims, Manuel Gomez Rodriguez

From arXiv; International Conference on Machine Learning (ICML) 2022

Many selection processes such as finding patients qualifying for a medical trial or retrieval pipelines in search engines consist of multiple stages, where an initial screening stage focuses the resources on shortlisting the most promising candidates. In this paper, we investigate what guarantees a screening classifier can provide, independently of whether it is constructed manually or trained. We find that current solutions do not enjoy distribution-free theoretical guarantees -- we show that, in general, even for a perfectly calibrated classifier, there always exist specific pools of candidates for which its shortlist is suboptimal. Then, we develop a distribution-free screening algorithm -- called Calibrated Subset Selection (CSS) -- that, given any classifier and some amount of calibration data, finds near-optimal shortlists of candidates that contain a desired number of qualified candidates in expectation. Moreover, we show that a variant of CSS that calibrates a given classifier multiple times across specific groups can create shortlists with provable diversity guarantees. Experiments on US Census survey data validate our theoretical results and show that the shortlists provided by our algorithm are superior to those provided by several competitive baselines.
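The abstract describes choosing a shortlist from classifier scores and calibration data so that it contains a desired number of qualified candidates in expectation. The Python sketch below illustrates one way such a threshold rule could look; it is not the authors' implementation. The function names (`select_threshold`, `shortlist`), the parameters, and in particular the DKW-style slack term are assumptions made here for illustration, and the paper should be consulted for the actual algorithm and its guarantees.

```python
import math


def select_threshold(scores, labels, pool_size, target_qualified, alpha=0.1):
    """Return the largest score threshold t such that a conservative
    estimate of P(score >= t and qualified) is at least
    target_qualified / pool_size.

    scores, labels: calibration data (labels are 1 for qualified, else 0).
    The slack term is an assumed DKW-style confidence margin, not
    necessarily the bound used in the paper.
    """
    n = len(scores)
    slack = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    needed = target_qualified / pool_size + slack

    # Walk through calibration points from the highest score downwards.
    ordered = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    qualified_so_far = 0
    i = 0
    while i < n:
        t = ordered[i][0]
        # Absorb every calibration point tied at this score.
        while i < n and ordered[i][0] == t:
            qualified_so_far += ordered[i][1]
            i += 1
        if qualified_so_far / n >= needed:
            return t
    # No threshold satisfies the margin: fall back to shortlisting everyone.
    return float("-inf")


def shortlist(pool_scores, threshold):
    """Indices of pool candidates whose score reaches the threshold."""
    return [i for i, s in enumerate(pool_scores) if s >= threshold]
```

In use, `select_threshold` would be run once on held-out calibration data, and `shortlist` would then be applied to each new pool of candidates. A larger calibration set shrinks the slack term, which permits a higher threshold and therefore a shorter shortlist.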

