弱受监督学习的数据一致性 (Data Consistency for Weakly Supervised Learning) - 专知论文

会员服务 ·

0

标注 · 监督 · 学成 · 监督学习 · 约束优化 ·

2022 年 2 月 8 日

Data Consistency for Weakly Supervised Learning

翻译：弱受监督学习的数据一致性

Chidubem Arachie,Bert Huang

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We propose a novel weak supervision algorithm that processes noisy labels, i.e., weak signals, while also considering features of the training data to produce accurate labels for training. Our method searches over classifiers of the data representation to find plausible labelings. We call this paradigm data consistent weak supervision. A key facet of our framework is that we are able to estimate labels for data examples low or no coverage from the weak supervision. In addition, we make no assumptions about the joint distribution of the weak signals and true labels of the data. Instead, we use weak signals and the data features to solve a constrained optimization that enforces data consistency among the labels we generate. Empirical evaluation of our method on different datasets shows that it significantly outperforms state-of-the-art weak supervision methods on both text and image classification tasks.

翻译：在许多应用中,培训机器学习模式涉及使用大量人文附加说明的数据。为数据获取精确标签费用昂贵。相反,在监管薄弱的情况下,培训提供了低成本的替代办法。我们建议采用新的薄弱监督算法,处理吵闹标签,即弱信号,同时考虑培训数据的特点,以生成准确的培训标签。我们的方法搜索数据代表的分类人员,以寻找可信的标签。我们称这种模式数据为持续薄弱的监管。我们框架的一个关键方面是,我们能够估计数据示例的标签低或没有来自薄弱监管的覆盖。此外,我们没有假设数据弱信号和真实标签的联合分布。相反,我们使用弱信号和数据特征来解决限制优化的问题,以强化我们所制作的标签的数据一致性。我们对不同数据集的方法进行的经验性评估表明,它大大超越了文本和图像分类工作方面的最新监管方法。

0

相关内容

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

专知

19+阅读 · 2018年6月1日

基于缺失数据分析和信息几何理论的SAR图像自动目标识别研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于似然函数的统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

基于Exemplar-Classifier思想的高分辨率光学遥感影像目标识别研究

国家自然科学基金

2+阅读 · 2013年12月31日

相依样本下的经验似然推断

国家自然科学基金

0+阅读 · 2012年12月31日

基于核方法的非局部图像处理

国家自然科学基金

0+阅读 · 2012年12月31日

基于约束的高维数据聚类

国家自然科学基金

2+阅读 · 2012年12月31日

基于耦合判别和协作稀疏表示的图像表征和标注研究

国家自然科学基金

1+阅读 · 2012年12月31日

赋值理论与几何不等式的研究

国家自然科学基金

1+阅读 · 2011年12月31日

重载齿轮箱复杂工况多源激励下复合故障耦合机理及诊断方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

复杂数据的统计分析与建模

国家自然科学基金

5+阅读 · 2011年12月31日

Active Few-Shot Learning with FASL

Arxiv

0+阅读 · 2022年4月20日

Self-supervised Learning for Sonar Image Classification

Arxiv

0+阅读 · 2022年4月20日

Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence

Arxiv

1+阅读 · 2022年4月19日

Weakly-supervised Temporal Path Representation Learning with Contrastive Curriculum Learning -- Extended Version

Weakly-supervised Temporal Path Representation Learning with Contrastive Curriculum Learning -- Extended Version

Arxiv

0+阅读 · 2022年4月15日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

VIP会员

文章信息

相关主题

相关VIP内容

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

【论文推荐】最新七篇图像分割相关论文—域适应深度表示学习、循环残差卷积、二值分割、图像合成、无监督跨模态

专知

19+阅读 · 2018年6月1日

相关论文

Active Few-Shot Learning with FASL

Arxiv

0+阅读 · 2022年4月20日

Self-supervised Learning for Sonar Image Classification

Arxiv

0+阅读 · 2022年4月20日

Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence

Arxiv

1+阅读 · 2022年4月19日

Weakly-supervised Temporal Path Representation Learning with Contrastive Curriculum Learning -- Extended Version

Weakly-supervised Temporal Path Representation Learning with Contrastive Curriculum Learning -- Extended Version

Arxiv

0+阅读 · 2022年4月15日

Bayesian Deep Learning for Graphs

Arxiv

23+阅读 · 2022年2月24日

Spatially Consistent Representation Learning

Arxiv

14+阅读 · 2021年3月10日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

相关基金

基于缺失数据分析和信息几何理论的SAR图像自动目标识别研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于似然函数的统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

基于Exemplar-Classifier思想的高分辨率光学遥感影像目标识别研究

国家自然科学基金

2+阅读 · 2013年12月31日

相依样本下的经验似然推断

国家自然科学基金

0+阅读 · 2012年12月31日

基于核方法的非局部图像处理

国家自然科学基金

0+阅读 · 2012年12月31日

基于约束的高维数据聚类

国家自然科学基金

2+阅读 · 2012年12月31日

基于耦合判别和协作稀疏表示的图像表征和标注研究

国家自然科学基金

1+阅读 · 2012年12月31日

赋值理论与几何不等式的研究

国家自然科学基金

1+阅读 · 2011年12月31日

重载齿轮箱复杂工况多源激励下复合故障耦合机理及诊断方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

复杂数据的统计分析与建模

国家自然科学基金

5+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员