Billions of distributed, heterogeneous and resource-constrained IoT devices deploy on-device machine learning (ML) for private, fast and offline inference on personal data. On-device ML is highly context dependent, and sensitive to user, usage, hardware and environment attributes. This sensitivity, together with the propensity of ML towards bias, makes it important to study bias in on-device settings. Our study is one of the first investigations of bias in this emerging domain, and it lays important foundations for building fairer on-device ML. We apply a software engineering lens, investigating how bias propagates through design choices in on-device ML workflows. We first identify reliability bias as a source of unfairness and propose a measure to quantify it. We then conduct empirical experiments on a keyword spotting task to show how complex and interacting technical design choices amplify and propagate reliability bias. Our results validate that design choices made during model training, such as the sample rate and input feature type, and choices made to optimize models, such as light-weight architectures, the pruning learning rate and pruning sparsity, can result in disparate predictive performance across male and female groups. Based on our findings, we suggest low-effort strategies for engineers to mitigate bias in on-device ML.