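Speaker: Yan Huang (Institute of Automation, Chinese Academy of Sciences)
Time: Wednesday, December 6, 2017, 20:00 (Beijing time)
Title: Improving Image and Sentence Matching with Multimodal Attention and Visual Attributes
Host: Chuanxian Ren (Sun Yat-sen University)
Abstract: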
Effective image and sentence matching depends on how well their global visual-semantic similarity is measured. Based on the observation that such a global similarity arises from a complex aggregation of multiple local similarities between pairwise instances of image (objects) and sentence (words), we propose a selective multimodal Long Short-Term Memory network (sm-LSTM) for instance-aware image and sentence matching. At each timestep, the sm-LSTM applies a multimodal context-modulated attention scheme that selectively attends to a pair of image and sentence instances by predicting pairwise instance-aware saliency maps for the image and the sentence. Measuring multiple local similarities in this way over a few timesteps, the sm-LSTM sequentially aggregates them with its hidden states to obtain a final matching score as the desired global similarity. Extensive experiments show that our model can match images and sentences with complex content well, and that it achieves state-of-the-art results on two public benchmark datasets. In addition, this talk will introduce our recent progress on using visual attributes for instance-aware image and sentence matching.
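The aggregation described in the abstract can be illustrated with a small NumPy sketch. This is not the speaker's implementation: the function names (`attend`, `sm_lstm_score`), the additive attention form, the use of mean-pooled features as global context, the random untrained weights, and all dimensions are assumptions made purely for illustration. At each timestep it predicts a saliency distribution over image regions and over words, conditioned on a global context and the previous hidden state, and feeds the attended pair into an LSTM cell whose final hidden state yields a scalar score.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attend(local, context, h_prev, W_l, W_c, W_h, w):
    """Saliency over local instances, modulated by a global context
    vector and the previous hidden state (assumed additive form)."""
    # local: (n, d), context: (d,), h_prev: (k,)
    e = np.tanh(local @ W_l + context @ W_c + h_prev @ W_h) @ w  # (n,)
    p = softmax(e)               # saliency map over the n instances
    return p @ local, p          # attended feature (d,), saliency (n,)

def sm_lstm_score(regions, words, T=3, k=16, seed=0):
    """Hypothetical sketch: aggregate T attended region/word pairs
    with an LSTM cell and map the last hidden state to a score."""
    rng = np.random.default_rng(seed)
    d_i, d_t = regions.shape[1], words.shape[1]
    init = lambda *s: rng.normal(scale=0.1, size=s)  # untrained weights
    img_p = (init(d_i, k), init(d_i, k), init(k, k), init(k))
    txt_p = (init(d_t, k), init(d_t, k), init(k, k), init(k))
    W_gates, b = init(d_i + d_t + k, 4 * k), np.zeros(4 * k)
    w_out = init(k)

    # Assumption: global context is the mean of the local features.
    img_ctx, txt_ctx = regions.mean(axis=0), words.mean(axis=0)
    h, c = np.zeros(k), np.zeros(k)
    saliency = []
    for _ in range(T):
        a_img, p_img = attend(regions, img_ctx, h, *img_p)
        a_txt, p_txt = attend(words, txt_ctx, h, *txt_p)
        # Standard LSTM cell update on the attended pair.
        z = np.concatenate([a_img, a_txt, h]) @ W_gates + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        saliency.append((p_img, p_txt))
    return float(h @ w_out), saliency

rng = np.random.default_rng(1)
regions = rng.normal(size=(5, 8))   # 5 image regions, 8-dim features
words = rng.normal(size=(7, 8))     # 7 words, 8-dim features
score, saliency = sm_lstm_score(regions, words)
```

With trained weights, `score` would play the role of the global similarity, and each entry of `saliency` would show which region and which word the model focused on at that timestep. Here the weights are random, so the values carry no meaning beyond demonstrating the data flow.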
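Speaker biography:
Yan Huang is an Assistant Researcher. He received his bachelor's degree from the University of Electronic Science and Technology of China in 2012 and his Ph.D. from the University of Chinese Academy of Sciences in 2017. In July 2017 he joined the National Laboratory of Pattern Recognition at the Institute of Automation, Chinese Academy of Sciences. His research interests are deep learning, computer vision, and pattern recognition. He has published multiple papers in leading conferences and journals in these fields, including TPAMI, TIP, TMM, NIPS, ICCV, and CVPR. His awards include the CVPR 2014 Deep Vision Workshop Best Paper Award, the ICPR 2014 Best Student Paper Award, the RACV 2016 Best Poster Award, the Chinese Academy of Sciences President's Special Award, and the Baidu Scholarship.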
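Special thanks to the main organizers of this Webinar:
VOOC responsible committee member: Chuanxian Ren (Sun Yat-sen University)
VODB coordinating director: Xiaoqiang Lu (Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences)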
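How to participate:
1. VALSE Webinar sessions are held entirely online through the "Group Video" feature of the VALSE QQ groups. During the session the speaker uploads the slides or shares the screen; attendees can see the slides, hear the speaker, and interact with the speaker by text or voice.
2. To take part, you need to join a VALSE QQ group. Groups A, B, C, D, E, and F are currently full, so apart from speakers and other guests, applicants can only join VALSE Group G, group number: 669280237. Your application must state your name, affiliation, and role; all three are required. After joining, please use your real name in the form name, role, affiliation. Role codes: T for staff of universities and research institutes; I for industry R&D; D for Ph.D. students; M for master's students.
3. To take part, please download and install the latest version of QQ for Windows. Group Video does not support non-Windows systems such as Mac or Linux. Mobile QQ can play the audio but cannot display the video slides.
4. About 10 minutes before the session starts, the host will open Group Video and send a link inviting members of every group to join; participants simply click the link to enter.
5. During the session, please do not send flowers, lollipops, or other virtual gifts, and do not post unrelated messages, so that the session can proceed without disruption.
6. If you cannot hear the audio or see the video during the session, try leaving and re-entering; this usually resolves the problem.
7. Please join from a fast network connection, preferably a wired one.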