Billions of distributed, heterogeneous and resource-constrained smart consumer devices deploy on-device machine learning (ML) to deliver private, fast and offline inference on personal data. On-device ML systems are highly context dependent, and sensitive to user, usage, hardware and environmental attributes. Despite this sensitivity and the propensity towards bias in ML, bias in on-device ML has not been studied. This paper studies the propagation of bias through design choices in on-device ML development workflows. We position \emph{reliability bias}, which arises from disparate device failures across demographic groups, as a source of unfairness in on-device ML settings, and quantify metrics to evaluate it. We then identify complex and interacting technical design choices in the on-device ML workflow that can lead to disparate performance across user groups, and thus \emph{reliability bias}. Finally, we show with an empirical case study that seemingly innocuous design choices, such as the data sample rate, the pre-processing parameters used to construct input features and pruning hyperparameters, propagate \emph{reliability bias} through an audio keyword spotting development workflow. We leverage our insights to suggest strategies for developers to develop fairer on-device ML.