[GitHub] A Multimodal Machine Learning Reading List

August 15, 2019 · Zhuanzhi (专知)

【Overview】 Students in the multimedia computing group at CMU LTI maintain a reading list for multimodal machine learning on GitHub, covering tutorials, surveys, courses, papers, and more, with the material further organized by the field's main research directions. If you are interested in multimodal machine learning, don't miss it.

GitHub repository:

https://github.com/pliang279/awesome-multimodal-ml

Author:

Paul Pu Liang


【Main Research Directions】

Representation Learning:

  • VisualBERT: A Simple and Performant Baseline for Vision and Language, arXiv 2019

  • ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, arXiv 2019

  • OmniNet: A Unified Architecture for Multi-modal Multi-task Learning, arXiv 2019

  • Learning Representations by Maximizing Mutual Information Across Views, arXiv 2019 

  • Deep Multimodal Representation Learning: A Survey, arXiv 2019

  • VideoBERT: A Joint Model for Video and Language Representation Learning, ICCV 2019

  • Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations, CVPR 2019

  • Multi-Task Learning of Hierarchical Vision-Language Representation, CVPR 2019

  • Learning Factorized Multimodal Representations, ICLR 2019

  • A Probabilistic Framework for Multi-view Feature Learning with Many-to-many Associations via Neural Networks, ICML 2018

  • Do Neural Network Cross-Modal Mappings Really Bridge Modalities?, ACL 2018

  • Learning Robust Visual-Semantic Embeddings, ICCV 2017

  • Deep Multimodal Representation Learning from Temporal Data, CVPR 2017

  • Is an Image Worth More than a Thousand Words? On the Fine-Grain Semantic Differences between Visual and Linguistic Representations, COLING 2016

  • Combining Language and Vision with a Multimodal Skip-gram Model, NAACL 2015

  • Deep Fragment Embeddings for Bidirectional Image Sentence Mapping, NeurIPS 2014

  • Multimodal Learning with Deep Boltzmann Machines, JMLR 2014

  • Learning Grounded Meaning Representations with Autoencoders, ACL 2014

  • DeViSE: A Deep Visual-Semantic Embedding Model, NeurIPS 2013 (a minimal sketch of this joint-embedding recipe follows this list)

  • Multimodal Deep Learning, ICML 2011
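
Many of the embedding papers above, DeViSE in particular, share one training recipe: project images into a word-vector space and push each image closer to its own label vector than to wrong labels. Below is a minimal PyTorch sketch of that hinge ranking loss; the dimensions, margin, and random tensors standing in for CNN features and word vectors are purely illustrative, not any paper's actual configuration.

```python
# DeViSE-style joint embedding: project visual features into word-vector
# space and train with a margin ranking (hinge) loss on cosine similarity.
# All dimensions and the margin are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_dim, txt_dim, margin = 2048, 300, 0.1
project = nn.Linear(img_dim, txt_dim)  # visual -> semantic space
optimizer = torch.optim.Adam(project.parameters(), lr=1e-4)

def hinge_rank_loss(img_feats, label_vecs, wrong_vecs):
    """Each image should be closer to its label vector than to a wrong one."""
    v = F.normalize(project(img_feats), dim=1)
    pos = (v * F.normalize(label_vecs, dim=1)).sum(dim=1)
    neg = (v * F.normalize(wrong_vecs, dim=1)).sum(dim=1)
    return torch.clamp(margin - pos + neg, min=0).mean()

# One toy step on random stand-ins for CNN features and word vectors.
imgs = torch.randn(32, img_dim)
labels, wrongs = torch.randn(32, txt_dim), torch.randn(32, txt_dim)
optimizer.zero_grad()
loss = hinge_rank_loss(imgs, labels, wrongs)
loss.backward()
optimizer.step()
```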



Multimodal Fusion:

  • MFAS: Multimodal Fusion Architecture Search, CVPR 2019

  • The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision, ICLR 2019

  • Efficient Low-rank Multimodal Fusion with Modality-Specific Factors, ACL 2018

  • Memory Fusion Network for Multi-view Sequential Learning, AAAI 2018

  • Tensor Fusion Network for Multimodal Sentiment Analysis, EMNLP 2017 (a minimal sketch of its outer-product fusion follows this list)

  • Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework, AAAI 2015
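
As noted above, the Tensor Fusion Network entry fuses modalities with an outer product over per-modality embeddings, each padded with a constant 1 so the fused tensor keeps unimodal and bimodal terms alongside the trimodal interactions. A minimal PyTorch sketch with illustrative dimensions:

```python
# Outer-product fusion in the spirit of the Tensor Fusion Network entry
# above; the batch size and per-modality dimensions are illustrative.
import torch

def tensor_fusion(z_text, z_audio, z_video):
    one = torch.ones(z_text.size(0), 1)
    t = torch.cat([z_text,  one], dim=1)   # (B, dt+1)
    a = torch.cat([z_audio, one], dim=1)   # (B, da+1)
    v = torch.cat([z_video, one], dim=1)   # (B, dv+1)
    # Batched outer product: (B, dt+1, da+1, dv+1), flattened for a classifier.
    fused = torch.einsum('bi,bj,bk->bijk', t, a, v)
    return fused.flatten(start_dim=1)

z = tensor_fusion(torch.randn(8, 16), torch.randn(8, 8), torch.randn(8, 8))
print(z.shape)  # torch.Size([8, 1377]) = (8, 17*9*9)
```

Note that the fused dimension grows multiplicatively with the modality sizes; that cost is exactly what the low-rank fusion entry above (ACL 2018) factorizes away.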


Multimodal Alignment:

  • Multimodal Transformer for Unaligned Multimodal Language Sequences, ACL 2019

  • Temporal Cycle-Consistency Learning, CVPR 2019

  • See, Hear, and Read: Deep Aligned Representations, arXiv 2017

  • On Deep Multi-View Representation Learning, ICML 2015

  • Unsupervised Alignment of Natural Language Instructions with Video Segments, AAAI 2014

  • Multimodal Alignment of Videos, MM 2014

  • Deep Canonical Correlation Analysis, ICML 2013 (a shallow CCA sketch follows this list)
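
Deep CCA generalizes classical CCA, which finds linear projections of two views that are maximally correlated, by learning the projections with neural networks. As a shallow illustration of the shared objective, here is scikit-learn's linear CCA on two synthetic views; all data here is synthetic and illustrative.

```python
# Linear CCA as a shallow stand-in for the Deep CCA idea above: project two
# "views" (e.g., image and text features) so their projections correlate.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 5))                    # latent shared signal
X = shared @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(500, 50))
Y = shared @ rng.normal(size=(5, 40)) + 0.1 * rng.normal(size=(500, 40))

cca = CCA(n_components=5)
Xc, Yc = cca.fit_transform(X, Y)
# Per-component correlation between the projected views (high for this data).
corrs = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(5)]
print(np.round(corrs, 3))
```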


Missing/Imperfect Modalities:

  • Factorized Inference in Deep Markov Models for Incomplete Multimodal Time Series, arXiv 2019

  • Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization, ACL 2019

  • Multimodal Deep Learning for Robust RGB-D Object Recognition, IROS 2015



Knowledge Graphs and Knowledge Bases:

  • MMKG: Multi-Modal Knowledge Graphs, ESWC 2019

  • Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs, AKBC 2019

  • Embedding Multimodal Relational Data for Knowledge Base Completion, EMNLP 2018

  • A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning, *SEM 2018

  • Order-Embeddings of Images and Language, ICLR 2016

  • Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries, arXiv 2015


Interpretable Learning:

  • Multimodal Explanations by Predicting Counterfactuality in Videos, CVPR 2019

  • Multimodal Explanations: Justifying Decisions and Pointing to the Evidence, CVPR 2018

  • Do Explanations make VQA Models more Predictable to a Human?, EMNLP 2018

  • Towards Transparent AI Systems: Interpreting Visual Question Answering Models, ICML Workshop on Visualization for Deep Learning 2016


Generative Learning:

  • Multimodal Generative Models for Scalable Weakly-Supervised Learning, NeurIPS 2018

  • Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models, CVPR 2018

  • The Multi-Entity Variational Autoencoder, NeurIPS 2017


Semi-supervised Learning:

  • Semi-supervised Vision-language Mapping via Variational Learning, ICRA 2017

  • Semi-supervised Multimodal Hashing, arXiv 2017

  • Semi-Supervised Multimodal Deep Learning for RGB-D Object Recognition, IJCAI 2016

  • Multimodal Semi-supervised Learning for Image Classification, CVPR 2010


Self-supervised Learning:

  • Self-Supervised Learning from Web Data for Multimodal Retrieval, arXiv 2019

  • Self-Supervised Learning of Visual Features through Embedding Images into Text Topic Spaces, CVPR 2017

  • Multimodal Dynamics: Self-supervised Learning in Perceptual and Motor Systems, 2016


Language Models:

  • Neural Language Modeling with Visual Features, arXiv 2019

  • Learning Multi-Modal Word Representation Grounded in Visual Context, AAAI 2018

  • Visual Word2Vec (vis-w2v): Learning Visually Grounded Word Embeddings Using Abstract Scenes (https://arxiv.org/abs/1511.07067), CVPR 2016

  • Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, ICML 2014


Adversarial Attacks:

  • Attend and Attack: Attention Guided Adversarial Attacks on Visual Question Answering Models, NeurIPS Workshop on Visually Grounded Interaction and Language 2018

  • Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning, ACL 2018

  • Fooling Vision and Language Models Despite Localization and Attention Mechanism, CVPR 2018


Zero-Shot Learning:

  • Zero-Shot Learning - The Good, the Bad and the Ugly, CVPR 2017

  • Zero-Shot Learning Through Cross-Modal Transfer, NeurIPS 2013 (a minimal sketch of this cross-modal recipe follows this list)
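
Both entries rest on the cross-modal recipe referenced above: embed an image into a word-vector space, then label it by the nearest class word vector, so classes never seen during training remain addressable as long as they have word vectors. A minimal NumPy sketch with hypothetical class names and random stand-in vectors:

```python
# Zero-shot classification via nearest class word vector (cosine similarity).
# The class names, vectors, and "image embedding" are illustrative stand-ins.
import numpy as np

def zero_shot_classify(img_embedding, class_word_vecs, class_names):
    """Return the class whose word vector is nearest in cosine similarity."""
    v = img_embedding / np.linalg.norm(img_embedding)
    W = class_word_vecs / np.linalg.norm(class_word_vecs, axis=1, keepdims=True)
    return class_names[int(np.argmax(W @ v))]

names = np.array(["cat", "dog", "truck"])        # "truck" unseen in training
word_vecs = np.random.default_rng(1).normal(size=(3, 300))
img = word_vecs[2] + 0.05 * np.random.default_rng(2).normal(size=300)
print(zero_shot_classify(img, word_vecs, names))  # -> "truck"
```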


【Main Applications】

Language and Visual QA:

  • GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering, CVPR 2019 

  • OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge, CVPR 2019 

  • MUREL: Multimodal Relational Reasoning for Visual Question Answering, CVPR 2019 

  • Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence, CVPR 2019 

  • Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering, ICML 2019 

  • Learning to Count Objects in Natural Images for Visual Question Answering, ICLR 2018

  • Overcoming Language Priors in Visual Question Answering with Adversarial Regularization, NeurIPS 2018

  • Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding, NeurIPS 2018 

  • RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes, EMNLP 2018 

  • TVQA: Localized, Compositional Video Question Answering, EMNLP 2018 

  • Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, CVPR 2018 

  • Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering, CVPR 2018 

  • Stacked Latent Attention for Multimodal Reasoning, CVPR 2018

  • Learning to Reason: End-to-End Module Networks for Visual Question Answering, ICCV 2017 

  • CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, CVPR 2017

  • Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension, CVPR 2017 

  • Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, EMNLP 2016 

  • MovieQA: Understanding Stories in Movies through Question-Answering, CVPR 2016 

  • VQA: Visual Question Answering, ICCV 2015 


Language Grounding in Vision:

  • Grounded Video Description, CVPR 2019

  • Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions, CVPR 2019

  • Multilevel Language and Vision Integration for Text-to-Clip Retrieval, AAAI 2019 

  • Binary Image Selection (BISON): Interpretable Evaluation of Visual Grounding, arXiv 2019 

  • Finding “It”: Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos, CVPR 2018

  • SCAN: Learning Hierarchical Compositional Visual Concepts, ICLR 2018

  • Visual Coreference Resolution in Visual Dialog using Neural Module Networks, ECCV 2018 

  • Gated-Attention Architectures for Task-Oriented Language Grounding, AAAI 2018

  • Using Syntax to Ground Referring Expressions in Natural Images, AAAI 2018 

  • Grounding language acquisition by training semantic parsers using captioned videos, ACL 2018

  • Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts, NeurIPS 2017

  • Localizing Moments in Video with Natural Language, ICCV 2017

  • What are you talking about? Text-to-Image Coreference, CVPR 2014

  • Grounded Language Learning from Video Described with Sentences, ACL 2013

  • Grounded Compositional Semantics for Finding and Describing Images with Sentences, TACL 2013


Language Grounding in Navigation:

  • Vision-and-Dialog Navigation, arXiv 2019 

  • Hierarchical Decision Making by Generating and Following Natural Language Instructions, arXiv 2019 

  • Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation, ACL 2019

  • Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation, ACL 2019

  • Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments, CVPR 2019 

  • Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation, CVPR 2019

  • Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation, CVPR 2019

  • The Regretful Navigation Agent for Vision-and-Language Navigation, CVPR 2019 

  • Multi-modal Discriminative Model for Vision-and-Language Navigation, NAACL SpLU-RoboNLP Workshop 2019

  • Self-Monitoring Navigation Agent via Auxiliary Progress Estimation, ICLR 2019 

  • From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following, ICLR 2019

  • Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos, AAAI 2019

  • Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout, NAACL 2019 

  • Attention Based Natural Language Grounding by Navigating Virtual Environment, IEEE WACV 2019

  • Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction, EMNLP 2018 

  • Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments, CVPR 2018 

  • Embodied Question Answering, CVPR 2018 

  • Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation, ECCV 2018


Multimodal Machine Translation:

  • VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research, ICCV 2019 

  • Latent Variable Model for Multi-modal Translation, ACL 2019

  • Distilling Translations with Visual Awareness, ACL 2019

  • Probing the Need for Visual Context in Multimodal Machine Translation, NAACL 2019

  • Emergent Translation in Multi-Agent Communication, ICLR 2018

  • Zero-Resource Neural Machine Translation with Multi-Agent Communication Game, AAAI 2018

  • Learning Translations via Images with a Massively Multilingual Image Dataset, ACL 2018

  • A Visual Attention Grounding Neural Model for Multimodal Machine Translation, EMNLP 2018

  • Adversarial Evaluation of Multimodal Machine Translation, EMNLP 2018

  • Doubly-Attentive Decoder for Multi-modal Neural Machine Translation, ACL 2017

  • An empirical study on the effectiveness of images in Multimodal Neural Machine Translation, EMNLP 2017

  • Incorporating Global Visual Features into Attention-based Neural Machine Translation, EMNLP 2017

  • Multimodal Pivots for Image Caption Translation, ACL 2016

  • Multi30K: Multilingual English-German Image Descriptions, ACL Workshop on Language and Vision 2016

  • Does Multimodality Help Human and Machine for Translation and Image Captioning?, ACL WMT 2016


Multi-agent Communication:

  • Emergence of Compositional Language with Deep Generational Transmission, ICML 2019

  • On the Pitfalls of Measuring Emergent Communication, AAMAS 2019 

  • Emergent Translation in Multi-Agent Communication, ICLR 2018 

  • Emergent Communication in a Multi-Modal, Multi-Step Referential Game, ICLR 2018 

  • Emergence of Linguistic Communication From Referential Games with Symbolic and Pixel Input, ICLR 2018

  • Emergent Communication through Negotiation, ICLR 2018 

  • Emergence of Grounded Compositional Language in Multi-Agent Populations, AAAI 2018

  • Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols, NeurIPS 2017

  • Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog, EMNLP 2017

  • Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning, ICCV 2017

  • Multi-agent Cooperation and the Emergence of (natural) Language, ICLR 2017

  • Learning to Communicate with Deep Multi-agent Reinforcement Learning, NeurIPS 2016

  • Learning Multiagent Communication with Backpropagation, NeurIPS 2016

  • The Emergence of Compositional Structures in Perceptually Grounded Language Games, AI 2005


Commonsense Reasoning:

  • SocialIQA: Commonsense Reasoning about Social Interactions, arXiv 2019

  • From Recognition to Cognition: Visual Commonsense Reasoning, CVPR 2019 

  • CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge, NAACL 2019


Multimodal Reinforcement Learning:

  • Habitat: A Platform for Embodied AI Research, arXiv 2019 

  • Embodied Multimodal Multitask Learning, arXiv 2019

  • Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog, SIGDIAL 2018

  • Mapping Instructions and Visual Observations to Actions with Reinforcement Learning, EMNLP 2017

  • Reinforcement Learning for Mapping Instructions to Actions, ACL 2009


Multimodal Dialog:

  • MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations, ACL 2019 

  • CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog, NAACL 2019 

  • Talk the Walk: Navigating New York City through Grounded Dialogue, arXiv 2018

  • Dialog-based Interactive Image Retrieval, NeurIPS 2018 

  • Towards Building Large Scale Multimodal Domain-Aware Conversation Systems, arXiv 2017 

  • Visual Dialog, CVPR 2017 


Language and Audio:

  • Lattice Transformer for Speech Translation, ACL 2019

  • Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation, ACL 2019

  • Audio Caption: Listen and Tell, ICASSP 2019

  • Audio-Linguistic Embeddings for Spoken Sentences, ICASSP 2019

  • From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings, arXiv 2019

  • From Audio to Semantics: Approaches To End-to-end Spoken Language Understanding, arXiv 2018

  • Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions, ICASSP 2018 (a quick mel-spectrogram illustration follows this list)

  • Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning, ICLR 2018

  • Deep Voice 2: Multi-Speaker Neural Text-to-Speech, NeurIPS 2017

  • Deep Voice: Real-time Neural Text-to-Speech, ICML 2017

  • Text-to-Speech Synthesis, 2009
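
As flagged in the Tacotron-style entry above, many neural TTS systems first predict a log mel spectrogram and then vocode it into a waveform. A quick librosa illustration of that intermediate representation on a synthetic tone; the 80 mel bands and 22.05 kHz rate are common defaults, not any specific paper's setup.

```python
# Compute the log mel spectrogram that mel-conditioned TTS systems predict
# as an intermediate acoustic target. The input is a synthetic 440 Hz tone.
import numpy as np
import librosa

sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)        # 1 s of a 440 Hz tone

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80)
log_mel = librosa.power_to_db(mel)             # log compression, as TTS models use
print(log_mel.shape)                           # (80, n_frames)
```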


Audio and Visual:

  • Reconstructing Faces from Voices, arXiv 2019

  • Learning Individual Styles of Conversational Gesture, CVPR 2019 

  • Speech2Face: Learning the Face Behind a Voice, CVPR 2019 

  • Capture, Learning, and Synthesis of 3D Speaking Styles, CVPR 2019 

  • Disjoint Mapping Network for Cross-modal Matching of Voices and Faces, ICLR 2019

  • Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks, ICASSP 2019 

  • Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input, ECCV 2018 

  • Seeing Voices and Hearing Faces: Cross-modal Biometric Matching, CVPR 2018 

  • Learning to Separate Object Sounds by Watching Unlabeled Video, CVPR 2018

  • Deep Audio-Visual Speech Recognition, IEEE TPAMI 2018

  • Look, Listen and Learn, ICCV 2017

  • Unsupervised Learning of Spoken Language with Visual Context, NeurIPS 2016

  • SoundNet: Learning Sound Representations from Unlabeled Video, NeurIPS 2016 


Media Description:

  • Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph, CVPR 2019 

  • Joint Event Detection and Description in Continuous Video Streams, WACVW 2019

  • Learning to Compose and Reason with Language Tree Structures for Visual Grounding, TPAMI 2019

  • Neural Baby Talk, CVPR 2018 

  • Grounding Referring Expressions in Images by Variational Context, CVPR 2018

  • Video Captioning via Hierarchical Reinforcement Learning, CVPR 2018

  • Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos, CVPR 2018 

  • Neural Motifs: Scene Graph Parsing with Global Context, CVPR 2018 

  • No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling, ACL 2018

  • Generating Descriptions with Grounded and Co-Referenced People, CVPR 2017

  • DenseCap: Fully Convolutional Localization Networks for Dense Captioning, CVPR 2016

  • Review Networks for Caption Generation, NeurIPS 2016 

  • Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding, ECCV 2016 

  • Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, TPAMI 2016 

  • Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML 2015 (a minimal sketch of its soft attention follows this list)

  • Deep Visual-Semantic Alignments for Generating Image Descriptions, CVPR 2015 

  • Show and Tell: A Neural Image Caption Generator, CVPR 2015 

  • A Dataset for Movie Description, CVPR 2015 

  • What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision, NAACL 2015 

  • Microsoft COCO: Common Objects in Context, ECCV 2014 
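
As referenced in the Show, Attend and Tell entry above, soft visual attention lets a caption decoder re-weight the CNN's spatial feature grid at every decoding step: the hidden state scores each region, and a softmax-weighted average of the region features becomes the context vector. A minimal PyTorch sketch with illustrative dimensions:

```python
# Soft visual attention over a spatial feature grid, in the spirit of
# "Show, Attend and Tell". All dimensions here are illustrative.
import torch
import torch.nn as nn

feat_dim, hid_dim, regions = 512, 256, 49      # e.g., a 7x7 feature map

score = nn.Sequential(nn.Linear(feat_dim + hid_dim, 128), nn.Tanh(),
                      nn.Linear(128, 1))

def soft_attention(features, hidden):
    # features: (B, regions, feat_dim); hidden: (B, hid_dim)
    h = hidden.unsqueeze(1).expand(-1, features.size(1), -1)
    logits = score(torch.cat([features, h], dim=2)).squeeze(2)  # (B, regions)
    alpha = torch.softmax(logits, dim=1)                        # attention map
    return torch.bmm(alpha.unsqueeze(1), features).squeeze(1)   # (B, feat_dim)

ctx = soft_attention(torch.randn(4, regions, feat_dim), torch.randn(4, hid_dim))
print(ctx.shape)  # torch.Size([4, 512])
```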


Visual Generation from Language:

  • Image Generation from Scene Graphs, CVPR 2018

  • Learning to Color from Language, NAACL 2018

  • Generative Adversarial Text to Image Synthesis, ICML 2016


Affect Recognition and Multimodal Language:

  • Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper), ACL 2019

  • Multimodal Language Analysis with Recurrent Multistage Fusion, EMNLP 2018

  • Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph, ACL 2018 

  • Multi-attention Recurrent Network for Human Communication Comprehension, AAAI 2018 

  • AMHUSE - A Multimodal dataset for HUmor SEnsing, ICMI 2017 

  • Decoding Children’s Social Behavior, CVPR 2013 

  • Collecting Large, Richly Annotated Facial-Expression Databases from Movies, IEEE Multimedia 2012 

  • The Interactive Emotional Dyadic Motion Capture (IEMOCAP) Database, 2008 


Healthcare:

  • Leveraging Medical Visual Question Answering with Supporting Facts, arXiv 2019

  • Unsupervised Multimodal Representation Learning across Medical Images and Reports, ML4H 2018

  • Multimodal Medical Image Retrieval based on Latent Topic Modeling, ML4H 2018

  • Improving Hospital Mortality Prediction with Medical Named Entities and Multimodal Learning, ML4H 2018

  • Knowledge-driven Generative Subspaces for Modeling Multi-view Dependencies in Medical Data, ML4H 2018

  • Multimodal Depression Detection: Fusion Analysis of Paralinguistic, Head Pose and Eye Gaze Behaviors, TAC 2018

  • Learning the Joint Representation of Heterogeneous Temporal Events for Clinical Endpoint Prediction, AAAI 2018

  • Understanding Coagulopathy using Multi-view Data in the Presence of Sub-Cohorts: A Hierarchical Subspace Approach, MLHC 2017

  • Machine Learning in Multimodal Medical Imaging, 2017

  • Cross-modal Recurrent Models for Weight Objective Prediction from Multimodal Time-series Data, ML4H 2017

  • SimSensei Kiosk: A Virtual Human Interviewer for Healthcare Decision Support, AAMAS 2014

  • Dyadic Behavior Analysis in Depression Severity Assessment Interviews, ICMI 2014

  • Audiovisual Behavior Descriptors for Depression Assessment, ICMI 2013


Robotics:

  • See, Feel, Act: Hierarchical Learning for Complex Manipulation Skills with Multi-sensory Fusion, Science Robotics 2019

  • Early Fusion for Goal Directed Robotic Vision, IROS 2019

  • Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup, RSS 2019

  • Probabilistic Multimodal Modeling for Human-Robot Interaction Tasks, RSS 2019

  • Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks, ICRA 2019

  • Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multi-Objective Evolutionary Algorithm, arXiv 2018

  • Multimodal Probabilistic Model-Based Planning for Human-Robot Interaction, arXiv 2017

  • Perching and Vertical Climbing: Design of a Multimodal Robot, ICRA 2014

  • Multi-Modal Scene Understanding for Robotic Grasping, 2011

  • Strategies for Multi-Modal Scene Exploration, IROS 2010


For more content, please visit the GitHub repository.



-END-
