Movie story analysis requires understanding characters' emotions and mental states. Towards this goal, we formulate emotion understanding as predicting a diverse, multi-label set of emotions at the level of a movie scene and for each character. We propose EmoTx, a multimodal Transformer-based architecture that ingests videos, multiple characters, and dialog utterances to make joint predictions. By leveraging annotations from the MovieGraphs dataset, we aim to predict classic emotions (e.g. happy, angry) and other mental states (e.g. honest, helpful). We conduct experiments on the 10 and 25 most frequently occurring labels, as well as a mapping that clusters 181 labels into 26. Ablation studies and comparisons against adapted state-of-the-art emotion recognition approaches show the effectiveness of EmoTx. Analyzing EmoTx's self-attention scores reveals that expressive emotions often attend to character tokens, while other mental states rely on video and dialog cues.
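To make the joint, multi-label formulation concrete, the sketch below shows one plausible way to wire it up; it is an illustrative assumption, not the authors' implementation, and all module names, feature dimensions, and the number of Transformer layers are placeholders. The key ideas it reflects are (i) projecting pre-extracted video, character, and dialog features into a shared token space, (ii) adding one classification token for the scene and one per character, and (iii) scoring each label independently with a sigmoid so multiple emotions can co-occur.

```python
# Minimal sketch (assumed, not the EmoTx code): Transformer over multimodal
# tokens with per-target CLS tokens and independent sigmoid label scores.
import torch
import torch.nn as nn

class MultimodalEmotionTagger(nn.Module):
    def __init__(self, d_model=512, num_labels=25, max_characters=4,
                 video_dim=2048, face_dim=512, text_dim=768):
        super().__init__()
        # Project each modality's pre-extracted features into a shared space.
        self.video_proj = nn.Linear(video_dim, d_model)
        self.face_proj = nn.Linear(face_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)
        # Learnable classification tokens: one for the scene, one per character slot.
        self.cls_tokens = nn.Parameter(torch.randn(1 + max_characters, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # One head scoring num_labels labels; sigmoid (not softmax) because
        # several emotions/mental states can hold at the same time.
        self.classifier = nn.Linear(d_model, num_labels)

    def forward(self, video_feats, face_feats, text_feats):
        # video_feats: (B, Tv, video_dim)   frame-level features
        # face_feats:  (B, C, Tc, face_dim) per-character face-track features
        # text_feats:  (B, Tu, text_dim)    dialog utterance embeddings
        B, C = face_feats.shape[0], face_feats.shape[1]
        tokens = torch.cat([
            self.cls_tokens.unsqueeze(0).expand(B, -1, -1),
            self.video_proj(video_feats),
            self.face_proj(face_feats.flatten(1, 2)),
            self.text_proj(text_feats),
        ], dim=1)
        x = self.encoder(tokens)
        # The first 1 + C outputs correspond to the scene and character CLS tokens.
        logits = self.classifier(x[:, : 1 + C])
        return torch.sigmoid(logits)  # (B, 1 + C, num_labels)

# Example: one scene with 2 characters, scored over a top-25 label set.
model = MultimodalEmotionTagger(num_labels=25, max_characters=2)
scores = model(torch.randn(1, 16, 2048), torch.randn(1, 2, 8, 512),
               torch.randn(1, 5, 768))
print(scores.shape)  # torch.Size([1, 3, 25])
```

Keeping a single shared classifier over all CLS tokens is one design choice among several; the point is simply that scene-level and character-level predictions fall out of the same joint forward pass.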