On-device machine learning is becoming a reality thanks to the availability of powerful hardware and model compression techniques. Typically, these models are pretrained on large GPU clusters and have enough parameters to generalise across a wide variety of inputs. In this work, we observe that a much smaller, personalised model can be employed to fit a specific scenario, resulting in both higher accuracy and faster execution. Nevertheless, on-device training is extremely challenging, imposing excessive computational and memory requirements even on flagship smartphones. At the same time, on-device data availability might be limited and samples are most frequently unlabelled. To this end, we introduce PersEPhonEE, a framework that attaches early exits to the model and personalises them on-device. These exits allow the model to progressively bypass a larger part of the computation as more personalised data become available. Moreover, we introduce an efficient on-device algorithm that trains the early exits in a semi-supervised manner, at a fraction of the whole network's personalisation time. Results show that PersEPhonEE boosts accuracy by up to 15.9% while reducing training cost by up to 2.2x and inference latency by 2.2-3.2x on average for the same accuracy, depending on the availability of labels on-device.
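To make the early-exit mechanism concrete, the sketch below shows one plausible way to attach lightweight exit heads to a staged backbone and to personalise only those heads on-device, using the final exit's prediction as a pseudo-label when no ground-truth label is available. This is a minimal illustration under assumed names (`EarlyExitNet`, `personalise_exits`, the confidence threshold), not the paper's actual implementation or training algorithm.

```python
# Hypothetical early-exit sketch in PyTorch; not the authors' code.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Backbone split into stages, each followed by a lightweight exit head."""
    def __init__(self, num_classes=10, width=32):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, width, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU()),
        ])
        self.exits = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(width, num_classes))
            for _ in self.stages
        ])

    def forward(self, x, threshold=0.9):
        # Run stages sequentially; return the first exit whose softmax
        # confidence clears the threshold, skipping the rest of the backbone.
        for stage, exit_head in zip(self.stages, self.exits):
            x = stage(x)
            logits = exit_head(x)
            conf = torch.softmax(logits, dim=-1).max(dim=-1).values
            if bool((conf >= threshold).all()):
                return logits
        return logits

def personalise_exits(model, loader, steps=100, lr=1e-3):
    """Semi-supervised personalisation sketch: freeze the backbone and train
    only the exit heads; unlabelled samples (y is None) use the final exit's
    prediction as a pseudo-label. Assumed scheme, not the paper's algorithm."""
    for p in model.stages.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(model.exits.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for step, (x, y) in enumerate(loader):
        if step >= steps:
            break
        feats, logits_per_exit = x, []
        for stage, exit_head in zip(model.stages, model.exits):
            feats = stage(feats)
            logits_per_exit.append(exit_head(feats))
        with torch.no_grad():
            target = y if y is not None else logits_per_exit[-1].argmax(dim=-1)
        # Train the earlier exits against labels or pseudo-labels.
        loss = sum(loss_fn(logits, target) for logits in logits_per_exit[:-1])
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Because only the small exit heads are updated while the backbone stays frozen, a scheme of this kind keeps on-device training cost low, and a higher exit-confidence threshold trades latency savings for accuracy at inference time.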