The performance of a machine learning model degrades when it is applied to data from a domain that is similar to, but different from, the domain it was trained on. The goal of domain adaptation (DA) is to mitigate this domain shift by searching for an optimal feature transformation that yields a domain-invariant representation. Such a domain shift can appear in handwriting recognition (HWR) applications, where the motion pattern of the hand, and with it the motion pattern of the pen, differs between writing on paper and writing on a tablet. The shift becomes visible in the sensor data for online handwriting (OnHW) recorded by pens with integrated inertial measurement units. This paper proposes a supervised DA approach to enhance learning for OnHW recognition between tablet and paper data. Our method exploits loss functions such as maximum mean discrepancy and correlation alignment to learn a domain-invariant feature representation (i.e., similar covariances between tablet and paper features). We use a triplet loss that takes negative samples from the auxiliary domain (i.e., paper samples) to increase the number of samples of the tablet dataset. We conduct an evaluation on novel sequence-based OnHW datasets (i.e., words) and show an improvement on the paper domain with an early fusion strategy by using pairwise learning.
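The three loss terms named above can be illustrated with a minimal NumPy sketch. It uses the common textbook formulations: correlation alignment as the squared Frobenius distance between feature covariances, a biased RBF-kernel estimate of squared maximum mean discrepancy, and a Euclidean triplet loss with a margin. The abstract does not specify the exact variants, kernels, or weights used in the paper, so the function names (`coral_loss`, `mmd_loss`, `triplet_loss`), the `gamma` and `margin` defaults, and the synthetic "tablet" and "paper" features are all assumptions made for illustration.

```python
import numpy as np


def coral_loss(source, target):
    """Correlation alignment: squared Frobenius distance between the
    feature covariance matrices of two domains, scaled by 1 / (4 d^2).
    Inputs have shape (n_samples, d) with d >= 2."""
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)
    cov_t = np.cov(target, rowvar=False)
    return np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d)


def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd_loss(source, target, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy (RBF kernel).
    gamma is an assumed bandwidth, not a value from the paper."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin) with Euclidean
    distances. The negatives could be drawn from the auxiliary domain."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()


# Synthetic stand-ins for tablet and paper feature embeddings (hypothetical).
rng = np.random.default_rng(0)
tablet_feats = rng.normal(size=(64, 8))
paper_feats = 3.0 * rng.normal(size=(64, 8)) + 2.0

print("CORAL:", coral_loss(tablet_feats, paper_feats))
print("MMD^2:", mmd_loss(tablet_feats, paper_feats))
print("Triplet:", triplet_loss(tablet_feats, tablet_feats, paper_feats))
```

In a training setup, terms like these would typically be added to the recognition loss with weighting factors. How the paper combines them is not stated in the abstract.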