Multi-domain learning (MDL) refers to learning a set of models simultaneously, where each model is specialized to perform a task in a particular domain. MDL generally requires a high labeling effort, as data must be labeled by human experts for every domain. Active learning (AL) can be used in MDL to reduce the labeling effort by using only the most informative data. The resulting paradigm is termed multi-domain active learning (MDAL). In this work, we provide an exhaustive literature review of the fields relevant to MDAL, including AL, cross-domain information sharing schemes, and cross-domain instance evaluation approaches. We find that the few studies conducted directly on MDAL cannot serve as off-the-shelf solutions for more general MDAL tasks. To fill this gap, we construct an MDAL pipeline and present a comprehensive comparative study of thirty algorithms, obtained by combining six representative MDL models with five commonly used AL strategies. We evaluate the algorithms on six datasets involving textual and visual classification tasks. In most cases, AL brings notable improvements to MDL, and the naive BvSB (best vs. second best) uncertainty strategy performs competitively with state-of-the-art AL strategies. Moreover, BvSB with the MAN (multinomial adversarial networks) model consistently achieves top or above-average performance on all the datasets. Furthermore, we qualitatively analyze the behaviors of the well-performing strategies and models, shedding light on their superior performance in the comparison. Finally, we recommend using BvSB with the MAN model in MDAL applications because of their good performance in the experiments.
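
For readers unfamiliar with the BvSB strategy named above, the following is a minimal sketch of the general idea: score each unlabeled instance by the gap between its two highest predicted class probabilities and query the instances with the smallest gap. The function names, the toy probabilities, and the use of a single shared pool are illustrative assumptions; the abstract does not specify how the score is computed or how queries are distributed across domains in the evaluated pipeline.

```python
from typing import List, Sequence


def bvsb_margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = uncertain)."""
    if len(probs) < 2:
        raise ValueError("BvSB needs at least two classes")
    best, second = sorted(probs, reverse=True)[:2]
    return best - second


def select_bvsb(pool_probs: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return indices of the `budget` pool instances with the smallest margins."""
    margins = [bvsb_margin(p) for p in pool_probs]
    ranked = sorted(range(len(margins)), key=lambda i: margins[i])
    return ranked[:budget]


# Hypothetical softmax outputs for four unlabeled instances and three classes.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # top two classes are close
    [0.50, 0.30, 0.20],
    [0.34, 0.33, 0.33],  # nearly uniform, smallest margin
]
queried = select_bvsb(pool, budget=2)
print(queried)  # [3, 1]
```

In a multi-domain setting, one could apply the same scoring per domain or over the union of all domain pools; which variant the study uses is not stated here, so the sketch should be read only as an illustration of the margin criterion.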