Novel object captioning (NOC) aims to describe images containing objects whose ground truth captions are never observed during training. Due to the absence of caption annotations, captioning models cannot be directly optimized via sequence-to-sequence training or CIDEr optimization. To this end, we present Paraphrasing-to-Captioning (P2C), a two-stage learning framework for NOC that heuristically optimizes output captions via paraphrasing. With P2C, the captioning model first learns paraphrasing from a language model pre-trained on a text-only corpus, allowing it to expand its word bank and improve linguistic fluency. To further enforce that output captions sufficiently describe the visual content of the input image, we introduce fidelity and adequacy objectives under which the captioning model performs self-paraphrasing. Since no ground truth captions of novel object images are available during training, our P2C leverages cross-modality (image-text) association modules to ensure that the above caption characteristics are properly preserved. In the experiments, we not only show that P2C achieves state-of-the-art performance on the nocaps and COCO Caption datasets, but also verify the effectiveness and flexibility of our learning framework by replacing the language and cross-modality association models for NOC. Implementation details and code are available in the supplementary materials.
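To make the roles of the language model and the cross-modality association module concrete, the following is a minimal, hypothetical sketch, not the authors' implementation: CLIP stands in for the image-text association module scoring fidelity/adequacy, GPT-2 perplexity stands in for the language-model fluency signal, and the helper prefer_paraphrase accepts a paraphrased caption only when neither score degrades.

```python
# Hypothetical sketch of scoring self-paraphrased captions. CLIP and GPT-2
# are illustrative stand-ins for the association and language models; the
# paper's actual modules and objectives may differ.
import torch
from PIL import Image
from transformers import (CLIPModel, CLIPProcessor,
                          GPT2LMHeadModel, GPT2TokenizerFast)

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
lm_tok = GPT2TokenizerFast.from_pretrained("gpt2")

@torch.no_grad()
def fidelity_score(image: Image.Image, caption: str) -> float:
    """Image-text association score: how well the caption matches the image."""
    inputs = clip_proc(text=[caption], images=image,
                       return_tensors="pt", padding=True)
    return clip(**inputs).logits_per_image.item()

@torch.no_grad()
def fluency_score(caption: str) -> float:
    """Negative language-model loss: higher means more fluent text."""
    ids = lm_tok(caption, return_tensors="pt").input_ids
    return -lm(input_ids=ids, labels=ids).loss.item()

def prefer_paraphrase(image: Image.Image, original: str, paraphrase: str) -> str:
    """Keep a paraphrase only if it loses neither visual fidelity nor
    fluency -- a toy version of the fidelity/adequacy constraints."""
    if (fidelity_score(image, paraphrase) >= fidelity_score(image, original)
            and fluency_score(paraphrase) >= fluency_score(original)):
        return paraphrase
    return original
```

In this toy setup, the association score guards adequacy (the paraphrase must still describe the image content) while the language-model score guards fluency, mirroring the two caption characteristics the framework aims to preserve.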