Many applications that utilize sensors in mobile devices and apply machine learning to provide novel services have emerged. However, various factors, such as different users, devices, environments, and hyperparameters, affect the performance of such applications, making domain shift (i.e., the distribution shift of a target user from the training source dataset) an important problem. Although recent domain adaptation techniques attempt to solve this problem, the complex interplay between the diverse factors often limits their effectiveness. We argue that accurately estimating the performance in untrained domains could significantly reduce performance uncertainty. We present DAPPER (Domain AdaPtation Performance EstimatoR), which estimates the adaptation performance in a target domain with only unlabeled target data. Our intuition is that the outputs of a model on the target data provide clues about the model's actual performance in the target domain. DAPPER neither requires expensive labeling nor involves additional training after deployment. Our evaluation with four real-world sensing datasets, compared against four baselines, shows that DAPPER outperforms the baselines by 17% on average in estimation accuracy. Moreover, our on-device experiment shows that DAPPER incurs up to 216X less computation overhead than the baselines.
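To make the intuition about model outputs concrete, the following is a minimal sketch of one simple output-based proxy: the average maximum softmax confidence over unlabeled target samples. This is an illustrative assumption, not DAPPER's actual estimator, which the abstract does not specify; the function names `softmax` and `estimate_accuracy_from_outputs` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_accuracy_from_outputs(batch_logits):
    """Hypothetical proxy: mean max-softmax confidence on unlabeled data.

    NOTE: this is an assumed stand-in for illustration only. It uses no
    labels and no extra training, but it is not the DAPPER estimator.
    """
    if not batch_logits:
        raise ValueError("need at least one unlabeled sample")
    confidences = [max(softmax(logits)) for logits in batch_logits]
    return sum(confidences) / len(confidences)


# Usage: logits produced by a deployed model on unlabeled target-user data.
target_logits = [[4.0, 0.5, 0.1], [0.2, 0.3, 0.25], [3.0, 2.9, 0.0]]
print(f"estimated accuracy proxy: {estimate_accuracy_from_outputs(target_logits):.2f}")
```

A proxy of this kind needs only forward-pass outputs, which is why output-based estimation can stay cheap on a device. Raw confidence is often miscalibrated under domain shift, however, so it should be read as a rough signal rather than an accuracy guarantee.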