This work aims at generating captions for soccer videos using deep learning. In this context, this paper introduces a dataset, a model, and a three-level evaluation. The dataset consists of 22k caption-clip pairs and three visual features (images, optical flow, inpainting) for ~500 hours of \emph{SoccerNet} videos. The model is divided into three parts: a transformer learns language, ConvNets learn vision, and a fusion of linguistic and visual features generates captions. The paper proposes evaluating generated captions at three levels: syntax (commonly used evaluation metrics such as BLEU and CIDEr), meaning (the quality of descriptions as judged by a domain expert), and corpus (the diversity of generated captions). The paper shows that semantics-related losses that prioritize selected words improve the diversity of generated captions (from 0.07 to 0.18). Together, the semantics-related losses and the use of additional visual features (optical flow, inpainting) improve the normalized captioning score by 28\%. The web page of this work: https://sites.google.com/view/soccercaptioning
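As a concrete illustration of the three-part design (ConvNet vision, transformer language, fusion for caption generation), the following is a minimal PyTorch-style sketch. The layer sizes, feature dimensions, and module choices are assumptions for exposition only, not the authors' configuration.

\begin{verbatim}
import torch
import torch.nn as nn

class SoccerCaptioner(nn.Module):
    """Sketch of the three-part model: ConvNet vision, transformer
    language, and a fusion step that decodes captions.
    All sizes below are illustrative assumptions."""

    def __init__(self, vocab_size, d_model=512, cnn_dim=2048):
        super().__init__()
        # Vision: per-stream ConvNet features (images, optical flow,
        # inpainting), assumed precomputed and concatenated per frame.
        self.visual_proj = nn.Linear(3 * cnn_dim, d_model)
        # Language: caption tokens embedded and decoded by a transformer
        # that cross-attends to the fused visual memory.
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, visual_feats, tokens):
        # visual_feats: (batch, frames, 3*cnn_dim); tokens: (batch, length)
        memory = self.visual_proj(visual_feats)  # fuse the three streams
        tgt = self.embed(tokens)
        # Causal mask: each position attends only to earlier tokens.
        L = tokens.size(1)
        mask = torch.triu(torch.full((L, L), float("-inf"),
                                     device=tokens.device), diagonal=1)
        h = self.decoder(tgt, memory, tgt_mask=mask)
        return self.out(h)  # next-token logits over the vocabulary
\end{verbatim}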
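The corpus-level diversity score quoted above (0.07 to 0.18) is defined in the full paper. As a rough illustration of this class of metric only, the distinct-n ratio below counts unique n-grams against total n-grams over all generated captions; it is a common diversity proxy, not the paper's exact measure.

\begin{verbatim}
from collections import Counter

def distinct_n(captions, n=1):
    """Corpus-level diversity proxy: unique n-grams / total n-grams.
    Illustrative stand-in; the paper defines its own diversity score."""
    ngrams, total = Counter(), 0
    for cap in captions:
        toks = cap.split()
        for i in range(len(toks) - n + 1):
            ngrams[tuple(toks[i:i + n])] += 1
            total += 1
    return len(ngrams) / max(total, 1)

# Low value: many repeated captions; higher value: more varied output.
print(distinct_n(["goal by player", "goal by player",
                  "yellow card shown"]))
\end{verbatim}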