Streaming algorithms for evaluating noisy judges on unlabeled data -- binary classification - 专知论文

会员服务 ·

0

相互独立的 · 相关系数 · binary · 估计/估计量 · 未标记 ·

2023 年 6 月 2 日

Streaming algorithms for evaluating noisy judges on unlabeled data -- binary classification

翻译：暂无翻译

Andrés Corrada-Emmanuel

from arxiv, 23 pages, 5 figures

The evaluation of noisy binary classifiers on unlabeled data is treated as a streaming task: given a data sketch of the decisions by an ensemble, estimate the true prevalence of the labels as well as each classifier's accuracy on them. Two fully algebraic evaluators are constructed to do this. Both are based on the assumption that the classifiers make independent errors. The first is based on majority voting. The second, the main contribution of the paper, is guaranteed to be correct. But how do we know the classifiers are independent on any given test? This principal/agent monitoring paradox is ameliorated by exploiting the failures of the independent evaluator to return sensible estimates. A search for nearly error independent trios is empirically carried out on the \texttt{adult}, \texttt{mushroom}, and \texttt{two-norm} datasets by using the algebraic failure modes to reject evaluation ensembles as too correlated. The searches are refined by constructing a surface in evaluation space that contains the true value point. The algebra of arbitrarily correlated classifiers permits the selection of a polynomial subset free of any correlation variables. Candidate evaluation ensembles are rejected if their data sketches produce independent estimates too far from the constructed surface. The results produced by the surviving ensembles can sometimes be as good as 1\%. But handling even small amounts of correlation remains a challenge. A Taylor expansion of the estimates produced when independence is assumed but the classifiers are, in fact, slightly correlated helps clarify how the independent evaluator has algebraic `blind spots'.

翻译：暂无翻译

0

相关内容

相互独立的

相互独立的

【ICDM 2022教程】图挖掘中的公平性:度量、算法和应用

【ICDM 2022教程】图挖掘中的公平性:度量、算法和应用

专知会员服务

28+阅读 · 2022年12月26日

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

Con A修饰型抗菌肽脂质体对单增李斯特菌生物被膜的靶向作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

自适应全变分模型的高分辨率遥感影像房屋提取及后处理方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

采用pinball loss的MEE算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

酸雨区污染农田土壤重金属面源输出季节特征及驱动机制

国家自然科学基金

0+阅读 · 2013年12月31日

二向性反射分布函数的先验知识耦合式融合方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

纳米脂质体在肿瘤亚细胞靶器官分布动力学计算模型的设计

国家自然科学基金

0+阅读 · 2012年12月31日

页岩气储层微观结构及岩石物理响应数值模拟研究

国家自然科学基金

0+阅读 · 2012年12月31日

Toll 样受体介导的巨噬细胞对prion清除的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

城市轨道交通中直线感应电机的非参数建模与反步最优控制研究

国家自然科学基金

0+阅读 · 2008年12月31日

A review on Bayesian model-based clustering

Arxiv

0+阅读 · 2023年7月24日

Risk-optimized Outlier Removal for Robust Point Cloud Classification

Arxiv

0+阅读 · 2023年7月20日

Multilevel latent class analysis with covariates: Analysis of cross-national citizenship norms with a two-stage approach

Arxiv

0+阅读 · 2023年7月20日

Consistent Group selection using Global-local prior in High dimensional setup

Arxiv

0+阅读 · 2023年7月20日

Correcting Underrepresentation and Intersectional Bias for Fair Classification

Arxiv

0+阅读 · 2023年7月19日

Sequential Predictive Two-Sample and Independence Testing

Arxiv

0+阅读 · 2023年7月19日

A Survey of Learning on Small Data

Arxiv

19+阅读 · 2022年7月29日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Text Classification Algorithms: A Survey

Arxiv

16+阅读 · 2020年5月20日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

VIP会员

文章信息

相关主题

相互独立的

估计/估计量

相关VIP内容

【ICDM 2022教程】图挖掘中的公平性:度量、算法和应用

【ICDM 2022教程】图挖掘中的公平性:度量、算法和应用

专知会员服务

28+阅读 · 2022年12月26日

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

72+阅读 · 2022年7月11日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

相关论文

A review on Bayesian model-based clustering

Arxiv

0+阅读 · 2023年7月24日

Risk-optimized Outlier Removal for Robust Point Cloud Classification

Arxiv

0+阅读 · 2023年7月20日

Multilevel latent class analysis with covariates: Analysis of cross-national citizenship norms with a two-stage approach

Arxiv

0+阅读 · 2023年7月20日

Consistent Group selection using Global-local prior in High dimensional setup

Arxiv

0+阅读 · 2023年7月20日

Correcting Underrepresentation and Intersectional Bias for Fair Classification

Arxiv

0+阅读 · 2023年7月19日

Sequential Predictive Two-Sample and Independence Testing

Arxiv

0+阅读 · 2023年7月19日

A Survey of Learning on Small Data

Arxiv

19+阅读 · 2022年7月29日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Text Classification Algorithms: A Survey

Arxiv

16+阅读 · 2020年5月20日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

相关基金

Con A修饰型抗菌肽脂质体对单增李斯特菌生物被膜的靶向作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

自适应全变分模型的高分辨率遥感影像房屋提取及后处理方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

采用pinball loss的MEE算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

酸雨区污染农田土壤重金属面源输出季节特征及驱动机制

国家自然科学基金

0+阅读 · 2013年12月31日

二向性反射分布函数的先验知识耦合式融合方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

纳米脂质体在肿瘤亚细胞靶器官分布动力学计算模型的设计

国家自然科学基金

0+阅读 · 2012年12月31日

页岩气储层微观结构及岩石物理响应数值模拟研究

国家自然科学基金

0+阅读 · 2012年12月31日

Toll 样受体介导的巨噬细胞对prion清除的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

城市轨道交通中直线感应电机的非参数建模与反步最优控制研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员