Conversational analysis systems are trained using noisy human labels and often require heavy preprocessing during multi-modal feature extraction. Using noisy labels in single-task learning increases the risk of over-fitting. Auxiliary tasks could improve performance on the primary task when both are learned during the same training; this approach sits at the intersection of transfer learning and multi-task learning (MTL). In this paper, we explore how the preprocessed data used for feature engineering can be re-used as auxiliary tasks, thereby promoting the productive use of data. Our main contributions are: (1) the identification of sixteen beneficial auxiliary tasks, (2) a study of how to distribute learning capacity between the primary and auxiliary tasks, and (3) a study of the relative supervision hierarchy between the primary and auxiliary tasks. Extensive experiments on the IEMOCAP and SEMAINE datasets validate the improvements over single-task approaches and suggest that the approach may generalize across multiple primary tasks.
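The abstract does not state the training objective. As a minimal sketch of the general idea of training a primary task alongside auxiliary tasks, the snippet below combines one primary loss with the mean of several auxiliary losses. The function name `combined_loss`, the `aux_weight` parameter, and the task names in the usage example are all hypothetical and not taken from the paper; loss weighting is only one simple way to balance tasks and may differ from the capacity-distribution method the authors study.

```python
from typing import Mapping


def combined_loss(primary_loss: float,
                  aux_losses: Mapping[str, float],
                  aux_weight: float = 0.5) -> float:
    """Blend a primary-task loss with the mean auxiliary-task loss.

    aux_weight is an assumed hyperparameter in [0, 1]: 0 ignores the
    auxiliary tasks, 1 ignores the primary task.
    """
    if not 0.0 <= aux_weight <= 1.0:
        raise ValueError("aux_weight must lie in [0, 1]")
    if not aux_losses:
        # No auxiliary tasks: fall back to plain single-task learning.
        return primary_loss
    aux_mean = sum(aux_losses.values()) / len(aux_losses)
    return (1.0 - aux_weight) * primary_loss + aux_weight * aux_mean


if __name__ == "__main__":
    # Hypothetical per-batch losses; the task names are illustrative only.
    aux = {"aux_task_a": 1.0, "aux_task_b": 3.0}
    print(combined_loss(primary_loss=2.0, aux_losses=aux, aux_weight=0.25))
```

In a real system the scalar losses would come from task-specific output heads on a shared network, and the weight would be tuned on held-out data.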