利用语义相似度计量器改进音频控制 (Improving Audio Captioning Using Semantic Similarity Metrics) - 专知论文

会员服务 ·

0

语义相似度 · 相似度 · 图像字幕 · Machine Translation · 确切的 ·

2022 年 10 月 29 日

Improving Audio Captioning Using Semantic Similarity Metrics

翻译：利用语义相似度计量器改进音频控制

Rehana Mahfuz,Yinyi Guo,Erik Visser

Audio captioning quality metrics which are typically borrowed from the machine translation and image captioning areas measure the degree of overlap between predicted tokens and gold reference tokens. In this work, we consider a metric measuring semantic similarities between predicted and reference captions instead of measuring exact word overlap. We first evaluate its ability to capture similarities among captions corresponding to the same audio file and compare it to other established metrics. We then propose a fine-tuning method to directly optimize the metric by backpropagating through a sentence embedding extractor and audio captioning network. Such fine-tuning results in an improvement in predicted captions as measured by both traditional metrics and the proposed semantic similarity captioning metric.

翻译：通常从机器翻译和图像说明区借用的音频字幕质量度量标准,通常用来测量预测的象征物和黄金参考象征物之间的重叠程度。在这项工作中,我们考虑测量预测和参考说明之间的语义相似性,而不是测量准确的文字重叠。我们首先评估其捕捉与同一音频文件相对应的字幕相似性的能力,并将其与其他既定指标进行比较。然后我们提出微调方法,通过嵌入句子的提取器和音频说明网络进行反插,直接优化计量。这种微调结果改进了以传统指标和拟议的语义相似性说明度衡量的预测说明。

0

相关内容

语义相似度

语义相似度

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Triptolide诱导c-FLIP选择性剪切在调控TRAIL耐药胰腺癌细胞凋亡中的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长效抗中毒低铂/掺杂型TiN催化剂的可控合成及甲醇氧化催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

SIRPa对心肌肥厚的影响及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

硫化物/石墨烯纳米复合材料的室温固相合成及光催化性能

国家自然科学基金

0+阅读 · 2012年12月31日

EGFR受体介导的肺肿瘤靶向siRNA类脂质载体的构建及转运调控机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于胸腺恢复的AIDS患者T淋巴细胞受体库多样性研究

国家自然科学基金

1+阅读 · 2012年12月31日

ECF转运蛋白的结构与转运机制

国家自然科学基金

0+阅读 · 2012年12月31日

具有铁电磁功能性的配位聚合物的合成、结构和性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

天然多组分体系中极性不饱和物质的π-配位萃取研究

国家自然科学基金

0+阅读 · 2011年12月31日

Extrinsic Evaluation of Machine Translation Metrics

Arxiv

0+阅读 · 2022年12月20日

What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes

Arxiv

0+阅读 · 2022年12月20日

Optimizing Prompts for Text-to-Image Generation

Arxiv

0+阅读 · 2022年12月19日

Context Label Learning: Improving Background Class Representations in Semantic Segmentation

Arxiv

0+阅读 · 2022年12月16日

MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks

Arxiv

0+阅读 · 2022年12月15日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

Image Captioning

Arxiv

11+阅读 · 2018年5月13日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Arxiv

14+阅读 · 2018年3月14日

VIP会员

文章信息

相关主题

语义相似度

Machine Translation

相关VIP内容

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据要素发展报告(2025年)：附下载

人工智能代理提升战时舰船战备水平

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Extrinsic Evaluation of Machine Translation Metrics

Arxiv

0+阅读 · 2022年12月20日

What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes

Arxiv

0+阅读 · 2022年12月20日

Optimizing Prompts for Text-to-Image Generation

Arxiv

0+阅读 · 2022年12月19日

Context Label Learning: Improving Background Class Representations in Semantic Segmentation

Arxiv

0+阅读 · 2022年12月16日

MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks

Arxiv

0+阅读 · 2022年12月15日

From Show to Tell: A Survey on Image Captioning

Arxiv

15+阅读 · 2021年7月14日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

Image Captioning

Arxiv

11+阅读 · 2018年5月13日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Arxiv

14+阅读 · 2018年3月14日

相关基金

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Triptolide诱导c-FLIP选择性剪切在调控TRAIL耐药胰腺癌细胞凋亡中的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

长效抗中毒低铂/掺杂型TiN催化剂的可控合成及甲醇氧化催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

SIRPa对心肌肥厚的影响及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

硫化物/石墨烯纳米复合材料的室温固相合成及光催化性能

国家自然科学基金

0+阅读 · 2012年12月31日

EGFR受体介导的肺肿瘤靶向siRNA类脂质载体的构建及转运调控机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于胸腺恢复的AIDS患者T淋巴细胞受体库多样性研究

国家自然科学基金

1+阅读 · 2012年12月31日

ECF转运蛋白的结构与转运机制

国家自然科学基金

0+阅读 · 2012年12月31日

具有铁电磁功能性的配位聚合物的合成、结构和性能研究

国家自然科学基金

0+阅读 · 2011年12月31日

天然多组分体系中极性不饱和物质的π-配位萃取研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员