Pretrained Language Models as Visual Planners for Human Assistance - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · Extensibility · INFORMS · AIM ·

2023 年 4 月 27 日

Pretrained Language Models as Visual Planners for Human Assistance

翻译：暂无翻译

Dhruvesh Patel,Hamid Eghbalzadeh,Nitin Kamra,Michael Louis Iuzzolino,Unnat Jain,Ruta Desai

from arxiv, No change in the main text; added acknowledgements

To make progress towards multi-modal AI assistants which can guide users to achieve complex multi-step goals, we propose the task of Visual Planning for Assistance (VPA). Given a goal briefly described in natural language, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to obtain a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc., to achieve the goal. This requires assessing the user's progress from the untrimmed video, and relating it to the requirements of underlying goal, i.e., relevance of actions and ordering dependencies amongst them. Consequently, this requires handling long video history, and arbitrarily complex action dependencies. To address these challenges, we decompose VPA into video action segmentation and forecasting. We formulate the forecasting step as a multi-modal sequence modeling problem and present Visual Language Model based Planner (VLaMP), which leverages pre-trained LMs as the sequence model. We demonstrate that VLaMP performs significantly better than baselines w.r.t all metrics that evaluate the generated plan. Moreover, through extensive ablations, we also isolate the value of language pre-training, visual observations, and goal information on the performance. We will release our data, model, and code to enable future research on visual planning for assistance.

翻译：暂无翻译

0

相关内容

语言模型化

语言模型化

ICLR2021有什么值得关注的投稿？这些高赞论文先睹为快

专知会员服务

46+阅读 · 2020年10月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

DNA甲基化调节SOX9基因表达对性别决定的作用

国家自然科学基金

0+阅读 · 2012年12月31日

联合活体生物光学成像及超声微血管显像动态监测FGF及VEGF调控乳腺癌新生血管

国家自然科学基金

0+阅读 · 2012年12月31日

GmMADS1在大豆花发育中的调控机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images

Arxiv

0+阅读 · 2023年6月12日

Valley: Video Assistant with Large Language model Enhanced abilitY

Arxiv

0+阅读 · 2023年6月12日

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Arxiv

0+阅读 · 2023年6月12日

QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Arxiv

0+阅读 · 2023年6月11日

Augmented Large Language Models with Parametric Knowledge Guiding

Arxiv

20+阅读 · 2023年5月8日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

ICLR2021有什么值得关注的投稿？这些高赞论文先睹为快

专知会员服务

46+阅读 · 2020年10月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

美海军作战管理系统：变革战场空间的二十年

《任务与武器驱动美海军舰队设计》报告

俄罗斯“沙希德”/“天竺葵”攻击无人机

《利用动态图对网络攻击进行建模与仿真：在云安全评估中的应用》90页

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images

Arxiv

0+阅读 · 2023年6月12日

Valley: Video Assistant with Large Language model Enhanced abilitY

Arxiv

0+阅读 · 2023年6月12日

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Arxiv

0+阅读 · 2023年6月12日

QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Arxiv

0+阅读 · 2023年6月11日

Augmented Large Language Models with Parametric Knowledge Guiding

Arxiv

20+阅读 · 2023年5月8日

相关基金

Copine VII在阿尔茨海默病中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

DNA甲基化调节SOX9基因表达对性别决定的作用

国家自然科学基金

0+阅读 · 2012年12月31日

联合活体生物光学成像及超声微血管显像动态监测FGF及VEGF调控乳腺癌新生血管

国家自然科学基金

0+阅读 · 2012年12月31日

GmMADS1在大豆花发育中的调控机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员