Given a set of heterogeneous source datasets with their classifiers, how can we quickly find the most useful source dataset for a specific target task? We address the problem of measuring transferability between source and target datasets that have different feature spaces and distributions. We propose Transmeter, a fast and accurate method to estimate the transferability between two heterogeneous multivariate datasets. Measuring this transferability poses three challenges: reducing running time, minimizing the domain gap, and extracting meaningful homogeneous representations. To address them, Transmeter uses a pre-trained source model, an adversarial network, and an encoder-decoder architecture. Extensive experiments on heterogeneous multivariate datasets show that Transmeter gives the most accurate transferability measurement while running up to 10.3 times faster than its competitor. We also show that selecting the best source dataset with Transmeter and then performing a full transfer yields the best transfer accuracy and the fastest running time.
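The source-selection workflow described in the abstract (score every candidate source against the target, then run a full transfer only from the top-scoring one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_best_source` and `estimate_transferability` are hypothetical names, and `toy_scorer` is a trivial stand-in that does not reproduce Transmeter's pre-trained source model, adversarial network, or encoder-decoder.

```python
from typing import Callable, Dict, Tuple, TypeVar

S = TypeVar("S")  # a source dataset (hypothetical container type)
T = TypeVar("T")  # the target dataset (hypothetical container type)


def select_best_source(
    sources: Dict[str, S],
    target: T,
    estimate_transferability: Callable[[S, T], float],
) -> Tuple[str, Dict[str, float]]:
    """Score each candidate source against the target; return the best name and all scores.

    `estimate_transferability` is an assumed plug-in point where a
    Transmeter-style estimator would go; a higher score means more transferable.
    """
    if not sources:
        raise ValueError("at least one source dataset is required")
    scores = {
        name: estimate_transferability(src, target)
        for name, src in sources.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def toy_scorer(source, target):
    """Stand-in scorer (NOT Transmeter): negative gap between the two means."""
    def mean(xs):
        return sum(xs) / len(xs)
    return -abs(mean(source) - mean(target))


# Toy data, invented purely for illustration.
toy_sources = {
    "A": [0.0, 1.0, 2.0],
    "B": [4.0, 5.0, 6.0],
    "C": [9.0, 10.0, 11.0],
}
toy_target = [4.5, 5.5]

best_name, all_scores = select_best_source(toy_sources, toy_target, toy_scorer)
print(best_name)
```

Only the selected source would then go through the expensive full transfer, which is where a fast estimator saves time compared with transferring from every candidate.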