Billions of distributed, heterogeneous and resource-constrained smart consumer devices deploy on-device machine learning (ML) to deliver private, fast and offline inference on personal data. On-device ML systems are highly context dependent and sensitive to user, usage, hardware and environmental attributes. Despite this sensitivity and the propensity towards bias in ML, bias in on-device ML has not been studied. This paper studies the propagation of bias through design choices in on-device ML development workflows. We position reliability bias, which arises from disparate device failures across demographic groups, as a source of unfairness in on-device ML settings and define metrics to quantify it. We then identify complex and interacting technical design choices in the on-device ML workflow that can lead to disparate performance across user groups, and thus to reliability bias. Finally, we show with an empirical case study that seemingly innocuous design choices, such as the data sample rate, the pre-processing parameters used to construct input features, and pruning hyperparameters, propagate reliability bias through an audio keyword spotting development workflow. We leverage our insights to suggest strategies for developers to build fairer on-device ML.
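To make the notion of reliability bias concrete, the sketch below computes per-group performance of a classifier and a simple disparity score over those group results. This is a minimal, hypothetical illustration: the group labels, toy data, and the exact aggregation (sum of absolute log-ratios between each group's accuracy and the mean accuracy) are assumptions for exposition, not necessarily the paper's definitive formulation.

```python
# Minimal sketch of a group-disparity ("reliability bias") style score.
# Group names, toy data, and the aggregation below are illustrative assumptions.
import numpy as np

def group_accuracies(y_true, y_pred, groups):
    """Per-group accuracy of a classifier's predictions."""
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = float(np.mean(y_true[mask] == y_pred[mask]))
    return accs

def reliability_bias(accs):
    """Sum of absolute log-ratios between each group's accuracy and the
    mean accuracy across groups; 0 means all groups perform equally."""
    mean_acc = np.mean(list(accs.values()))
    return float(sum(abs(np.log(a / mean_acc)) for a in accs.values()))

# Toy example: keyword-spotting predictions for two hypothetical speaker groups.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
groups = np.array(["female", "female", "female", "female",
                   "male", "male", "male", "male"])

accs = group_accuracies(y_true, y_pred, groups)
print(accs)                    # {'female': 0.75, 'male': 0.75}
print(reliability_bias(accs))  # 0.0 when all groups perform identically
```

Under this kind of score, a developer can re-evaluate the same trained model after each design decision (e.g. changing the sample rate or applying pruning) and observe whether the disparity between user groups grows, which is the workflow-level propagation effect the paper studies.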