SMURRF: 通过典型分析进行指挥评价的语文和语言不易分解 (SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis) - 专知论文

会员服务 ·

0

可理解性 · Extensibility · 异常点 · INFORMS · state-of-the-art ·

2021 年 6 月 2 日

SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis

翻译：SMURRF: 通过典型分析进行指挥评价的语文和语言不易分解

Joshua Feinglass,Yezhou Yang

The open-ended nature of visual captioning makes it a challenging area for evaluation. The majority of proposed models rely on specialized training to improve human-correlation, resulting in limited adoption, generalizability, and explainabilty. We introduce "typicality", a new formulation of evaluation rooted in information theory, which is uniquely suited for problems lacking a definite ground truth. Typicality serves as our framework to develop a novel semantic comparison, SPARCS, as well as referenceless fluency evaluation metrics. Over the course of our analysis, two separate dimensions of fluency naturally emerge: style, captured by metric SPURTS, and grammar, captured in the form of grammatical outlier penalties. Through extensive experiments and ablation studies on benchmark datasets, we show how these decomposed dimensions of semantics and fluency provide greater system-level insight into captioner differences. Our proposed metrics along with their combination, SMURF, achieve state-of-the-art correlation with human judgment when compared with other rule-based evaluation metrics.

翻译：视觉字幕的开放性使得它成为评估的一个挑战领域。大多数拟议模型依靠专门培训来改进人类关系,从而导致有限的采纳、普遍性和解释性。我们引入了“典型性”——一种基于信息理论的评价新提法,这种新提法特别适合缺乏明确地面事实的问题。典型性作为我们制定新的语义比较的框架,SPARCS和不参考的流利评价指标。在我们的分析过程中,流利自然出现了两个不同的层面:风格,由SPURTS衡量,和语法,以语法外值惩罚的形式捕捉。通过对基准数据集的广泛试验和通俗化研究,我们展示了这些分解的语义和流利的维度如何在系统层面对字幕差异提供更深入的洞察力。我们提议的衡量尺度及其组合SMURF,与其他有章的评价指标相比,实现了与人类判断的状态和艺术相关性。

0

相关内容

可理解性

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

专知会员服务

23+阅读 · 2021年6月3日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

还在修改博士论文？这份《博士论文写作技巧》为你指南

还在修改博士论文？这份《博士论文写作技巧》为你指南

专知会员服务

165+阅读 · 2020年6月9日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【斯坦福大学】深度学习技巧速查清单《CS 230 - Deep Learning Tips and Tricks Cheatsheet》

【斯坦福大学】深度学习技巧速查清单《CS 230 - Deep Learning Tips and Tricks Cheatsheet》

专知会员服务

29+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机类 | 11月截稿会议信息9条

计算机类 | 11月截稿会议信息9条

Call4Papers

6+阅读 · 2018年10月14日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary

Arxiv

0+阅读 · 2021年7月26日

Language Models as Zero-shot Visual Semantic Learners

Arxiv

0+阅读 · 2021年7月26日

MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation

Arxiv

0+阅读 · 2021年7月24日

Multimodal Semantic Attention Network for Video Captioning

Arxiv

4+阅读 · 2019年5月8日

Exploring the Semantics for Visual Relationship Detection

Arxiv

3+阅读 · 2019年4月3日

Deep learning evaluation using deep linguistic processing

Arxiv

3+阅读 · 2018年5月12日

Improved Image Captioning with Adversarial Semantic Alignment

Arxiv

6+阅读 · 2018年4月30日

Jointly Localizing and Describing Events for Dense Video Captioning

Arxiv

5+阅读 · 2018年4月23日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation

Arxiv

3+阅读 · 2017年8月3日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

专知会员服务

23+阅读 · 2021年6月3日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

还在修改博士论文？这份《博士论文写作技巧》为你指南

还在修改博士论文？这份《博士论文写作技巧》为你指南

专知会员服务

165+阅读 · 2020年6月9日

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

【视频描述综述论文】Video Description: A Survey of Methods, Datasets, and Evaluation Metrics

专知会员服务

65+阅读 · 2020年5月12日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

【SIGMOD2020】知识图谱补全方法的现实再评价，Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study

专知会员服务

33+阅读 · 2020年3月23日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【斯坦福大学】深度学习技巧速查清单《CS 230 - Deep Learning Tips and Tricks Cheatsheet》

【斯坦福大学】深度学习技巧速查清单《CS 230 - Deep Learning Tips and Tricks Cheatsheet》

专知会员服务

29+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型幻觉：系统综述

《分析与预测陆军战斗体能测试表现：统计与机器学习方法》2025最新137页

【博士论文】数据与任务的物理学：深度学习中的局部性与组合性理论

代理式人工智能时代的决策优势

相关资讯

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机类 | 11月截稿会议信息9条

计算机类 | 11月截稿会议信息9条

Call4Papers

6+阅读 · 2018年10月14日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary

Arxiv

0+阅读 · 2021年7月26日

Language Models as Zero-shot Visual Semantic Learners

Arxiv

0+阅读 · 2021年7月26日

MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation

Arxiv

0+阅读 · 2021年7月24日

Multimodal Semantic Attention Network for Video Captioning

Arxiv

4+阅读 · 2019年5月8日

Exploring the Semantics for Visual Relationship Detection

Arxiv

3+阅读 · 2019年4月3日

Deep learning evaluation using deep linguistic processing

Arxiv

3+阅读 · 2018年5月12日

Improved Image Captioning with Adversarial Semantic Alignment

Arxiv

6+阅读 · 2018年4月30日

Jointly Localizing and Describing Events for Dense Video Captioning

Arxiv

5+阅读 · 2018年4月23日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation

Arxiv

3+阅读 · 2017年8月3日

微信扫码咨询专知VIP会员