Existing image captioning models are usually trained with cross-entropy (XE) loss and reinforcement learning (RL), which set ground-truth words as hard targets and force the captioning model to learn from them. However, these widely adopted training strategies suffer from misalignment in XE training and inappropriate reward assignment in RL training. To tackle these problems, we introduce a teacher model that serves as a bridge between the ground-truth caption and the caption model by generating easier-to-learn word proposals as soft targets. The teacher model is constructed by incorporating the ground-truth image attributes into the baseline caption model. To learn effectively from the teacher model, we propose Teacher-Critical Training Strategies (TCTS) for both XE and RL training to facilitate better learning processes for the caption model. Experimental evaluations of several widely adopted caption models on the benchmark MSCOCO dataset show that the proposed TCTS comprehensively improves most evaluation metrics, especially the Bleu and Rouge-L scores, in both training stages. TCTS achieves the best published single-model Bleu-4 and Rouge-L performances to date, 40.2% and 59.4% respectively, on the MSCOCO Karpathy test split. Our code and pre-trained models will be open-sourced.
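The soft-target idea above can be illustrated with a distillation-style XE loss: instead of training only against the one-hot ground-truth word, the student's target distribution is a mixture of the hard target and the teacher's word-proposal distribution. The sketch below is a minimal NumPy illustration of this general principle, not the paper's exact formulation; the mixing weight `alpha` and function names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_target_xe(logits, hard_idx, teacher_probs, alpha=0.5):
    """Cross-entropy of the student's predictions against a mixture of
    the one-hot ground-truth word and the teacher's proposal distribution.
    logits: (T, V) student logits over the vocabulary at each step.
    hard_idx: (T,) ground-truth word indices.
    teacher_probs: (T, V) teacher's soft targets (rows sum to 1).
    alpha: interpolation weight; alpha=0 recovers standard XE training.
    """
    probs = softmax(logits)
    vocab_size = logits.shape[-1]
    hard = np.eye(vocab_size)[hard_idx]          # one-hot hard targets
    target = (1.0 - alpha) * hard + alpha * teacher_probs
    return float(-(target * np.log(probs + 1e-12)).sum(axis=-1).mean())
```

With `alpha=0` this reduces to ordinary XE against the ground-truth words; increasing `alpha` shifts weight toward the teacher's easier-to-learn proposals.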