Curriculum Modeling the Dependence among Targets with Multi-task Learning for Financial Marketing

Multi-task learning in real-world applications often involves tasks with a logical sequential dependence. For example, in online marketing, the cascade behavior pattern $impression \rightarrow click \rightarrow conversion$ is usually modeled as multiple tasks trained jointly, and existing works connect the sequentially dependent tasks either through an explicitly defined function or through implicitly transferred information. These methods alleviate the data sparsity problem for long-path sequential tasks, where positive feedback becomes sparser along the task sequence. However, error accumulation and negative transfer can be severe problems for downstream tasks. In particular, at the early stage of training, the parameters of the upstream tasks have not yet converged, so the information they transfer to downstream tasks is harmful. In this paper, we propose a prior information merged model (\textbf{PIMM}), which explicitly models the logical dependence among tasks with a novel prior information merged (\textbf{PIM}) module and learns multiple sequentially dependent tasks in a curriculum manner. Specifically, during training the PIM module randomly selects either the true label or the prediction of the prior task, using a soft sampling strategy, and transfers it to the downstream task. Following an easy-to-difficult curriculum paradigm, we dynamically adjust the sampling probability so that the downstream task receives effective information throughout training. Offline experiments on both public and product datasets verify that PIMM outperforms state-of-the-art baselines. Moreover, we deploy PIMM on a large-scale FinTech platform, and online experiments also demonstrate its effectiveness.
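The sampling idea in the abstract can be illustrated with a small sketch. The abstract does not specify the schedule or the exact form of the "soft" sampling, so everything below is an assumption made for illustration: the function names `label_sampling_prob` and `pim_select` are hypothetical, the schedule is a simple linear decay, and the choice between label and prediction is a plain Bernoulli draw. It also assumes that the "easy" phase means feeding the true label early in training and the "difficult" phase means relying on the upstream prediction later. It is not the authors' implementation.

```python
import random


def label_sampling_prob(step, total_steps, p_start=1.0, p_end=0.0):
    """Assumed linear schedule: probability of passing the true label downstream.

    Starts at p_start (easy: mostly true labels) and moves to p_end
    (difficult: mostly upstream predictions). total_steps must be positive.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return p_start + (p_end - p_start) * frac


def pim_select(true_label, upstream_pred, p_label, rng):
    """Pick the signal handed to the downstream task for one sample.

    With probability p_label the true label of the prior task is used,
    otherwise the prior task's predicted probability is used.
    """
    if rng.random() < p_label:
        return float(true_label)
    return float(upstream_pred)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
total_steps = 1000
for step in (0, 500, 1000):
    p = label_sampling_prob(step, total_steps)
    # Toy sample: the user clicked (label 1) but the click model predicts 0.3.
    picks = [pim_select(1, 0.3, p, rng) for _ in range(5)]
    print(step, p, picks)
```

Under these assumptions the downstream task sees only true labels at step 0 and only upstream predictions at the final step, with a mixture in between. A real model would apply the selection per sample inside the training loop and might blend the two signals rather than choosing one.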

Multi-task learning

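Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time while exploiting the commonalities and differences across tasks. Compared with training the models separately, this can improve the learning efficiency and prediction accuracy of the task-specific models. Multi-task learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel with a shared representation, so that what is learned for each task can help the other tasks be learned better.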
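As a rough illustration of learning tasks in parallel with a shared representation, the sketch below runs a forward pass through one shared layer followed by one small head per task. The layer sizes, the random weights, the helper names (`linear`, `shared_bottom_forward`), and the task names `click` and `conversion` are all assumptions chosen for the example. It shows only the structure, not any training procedure.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def shared_bottom_forward(x, shared, heads):
    """Apply one shared ReLU layer, then one logistic head per task."""
    hidden = [max(0.0, h) for h in linear(shared["W"], shared["b"], x)]
    return {
        name: sigmoid(linear([head["w"]], [head["b"]], hidden)[0])
        for name, head in heads.items()
    }


rng = random.Random(0)
n_in, n_hidden = 4, 3

# Shared parameters: every task reads the same hidden representation.
shared = {
    "W": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    "b": [0.0] * n_hidden,
}
# Task-specific parameters: one small head per task (hypothetical task names).
heads = {
    task: {"w": [rng.uniform(-1, 1) for _ in range(n_hidden)], "b": 0.0}
    for task in ("click", "conversion")
}

preds = shared_bottom_forward([0.5, -1.0, 2.0, 0.0], shared, heads)
print(preds)
```

Because both heads read the same hidden vector, a gradient from either task's loss would update the shared weights, which is how one task's training signal can act as an inductive bias for the other.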
