VALSE ICCV 2017 Special Sessions: The biennial vision extravaganza ICCV 2017 is about to begin. To better promote academic exchange, VALSE Webinar will hold three consecutive ICCV pre-conference special sessions, serving up the freshest ICCV 2017 papers and igniting this year's ICCV excitement ahead of time.
Speaker 1: Heng Fan (Temple University)
Time: 20:00, Wednesday, September 27, 2017 (Beijing time)
Title: Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking
Host: Hang Su (Tsinghua University)
Abstract:
Being intensively studied, visual tracking has seen great recent advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features). Real-time and high-accuracy tracking algorithms, however, remain scarce. In this paper we study the problem from a new perspective and present a novel parallel tracking and verifying (PTAV) framework, taking advantage of the ubiquity of multithreading and borrowing from the success of parallel tracking and mapping in visual SLAM. Our PTAV framework consists of two components, a tracker T and a verifier V, working in parallel on two separate threads. The tracker T aims to provide super real-time tracking inference and is expected to perform well most of the time; by contrast, the verifier V checks the tracking results and corrects T when needed. The key innovation is that V does not work on every frame but only upon requests from T; in turn, T may adjust the tracking according to the feedback from V. With such collaboration, PTAV enjoys both the high efficiency provided by T and the strong discriminative power of V. In our extensive experiments on popular benchmarks including OTB2013, OTB2015, TC128 and UAV20L, PTAV achieves the best tracking accuracy among all real-time trackers, and in fact performs even better than many deep learning based solutions. Moreover, as a general framework, PTAV is very flexible, with great room for improvement and generalization.
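To make the division of labor concrete, below is a minimal Python sketch (ours, not the authors' code) of the PTAV control flow: a fast tracker thread that only occasionally asks a slower verifier thread to check its output, and applies corrections when they arrive. The functions fast_track, verify, and detect_and_correct are hypothetical stubs standing in for the correlation-filter tracker and the deep verifier.

```python
import queue
import threading

# Hypothetical stubs for PTAV's real components: a fast
# correlation-filter tracker (T) and a slower deep-feature verifier (V).
def fast_track(frame, box):
    return box

def verify(frame, box):
    return True

def detect_and_correct(frame, box):
    return box

request_q = queue.Queue()    # T -> V: (t, frame, box) to be checked
feedback_q = queue.Queue()   # V -> T: corrected boxes

def tracker_thread(frames, init_box, results, every_k=10):
    box = init_box
    for t, frame in enumerate(frames):
        try:
            box = feedback_q.get_nowait()        # apply pending correction
        except queue.Empty:
            pass
        box = fast_track(frame, box)             # super real-time inference
        results.append(box)
        if t % every_k == 0:                     # V runs only upon request
            request_q.put((t, frame, box))
    request_q.put(None)                          # tell V to stop

def verifier_thread():
    while (req := request_q.get()) is not None:
        t, frame, box = req
        if not verify(frame, box):               # strong but slow check
            feedback_q.put(detect_and_correct(frame, box))

results = []
v = threading.Thread(target=verifier_thread)
trk = threading.Thread(target=tracker_thread,
                       args=(range(100), (0, 0, 10, 10), results))
v.start(); trk.start(); trk.join(); v.join()
```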
References:
[1] Heng Fan and Haibin Ling, "Parallel Tracking and Verifying: A Framework for Real-Time and High Accuracy Visual Tracking", in IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.
Speaker Bio:
Heng Fan obtained his B.E. degree from Huazhong Agricultural University in 2013. He is currently a second-year Ph.D. student at Temple University, advised by Prof. Haibin Ling. His research interests include computer vision, pattern recognition, and machine learning.
Speaker 2: Lingxi Xie (The Johns Hopkins University)
Time: 20:30, Wednesday, September 27, 2017 (Beijing time)
Title: SORT: Second-Order Response Transform for Visual Recognition & Genetic CNN
Host: Hang Su (Tsinghua University)
Abstract:
1. In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks. We propose a novel approach named Second-Order Response Transform (SORT), which appends an element-wise product transform to the linear sum of a two-branch network module. A direct advantage of SORT is to facilitate cross-branch response propagation, so that each branch can update its weights based on the current status of the other branch. Moreover, SORT augments the family of transform operations and increases the nonlinearity of the network, making it possible to learn flexible functions that fit the complicated distribution of the feature space. SORT can be applied to a wide range of network architectures, including branched variants of chain-styled networks and residual networks, with very lightweight modifications. We observe consistent accuracy gains on both small (CIFAR10, CIFAR100 and SVHN) and large (ILSVRC2012) datasets. In addition, SORT is very efficient, as the extra computational overhead is less than 5%.
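As a rough illustration of the combination rule the abstract describes, here is a minimal PyTorch sketch (our reading, not the authors' code): the second-order term is simply the element-wise product appended to the usual linear sum of the two branch responses. The module and variable names are ours, and the paper may include stabilizing details not shown here.

```python
import torch
import torch.nn as nn

class SORTMerge(nn.Module):
    """Fuse two branch responses with a second-order term: the
    element-wise product x1 * x2 is appended to the linear sum,
    letting each branch's gradient depend on the other's response."""
    def forward(self, x1, x2):
        return x1 + x2 + x1 * x2

# Usage: merge the two outputs of any two-branch module.
merge = SORTMerge()
x1, x2 = torch.randn(8, 64, 32, 32), torch.randn(8, 64, 32, 32)
out = merge(x1, x2)   # same shape as each branch
```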
2. In this paper, we discuss the possibility of learning deep network structures automatically. Note that the number of possible network structures increases exponentially with the number of layers in the network, which motivates us to adopt the genetic algorithm to efficiently explore this large search space. The core idea is an encoding method that represents each network structure as a fixed-length binary string. The genetic algorithm is initialized by generating a set of randomized individuals. In each generation, we apply standard genetic operations, e.g., selection, mutation and crossover, to generate competitive individuals and eliminate weak ones. The competitiveness of each individual is defined as its recognition accuracy, obtained via a standalone training process on a reference dataset. We run the genetic process on CIFAR10, a small-scale dataset, demonstrating its ability to find high-quality structures that have received little prior study. The learned, powerful structures are also transferable to the ILSVRC2012 dataset for large-scale visual recognition. A minimal sketch of the genetic search loop follows.
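The genetic search itself follows the textbook recipe; below is a small, self-contained Python sketch under our own naming. The fitness function here is a placeholder: in Genetic CNN it would decode the binary string into a network and return the recognition accuracy from a standalone training run, which is far too expensive to inline here.

```python
import random

def random_individual(n_bits):
    """A network structure encoded as a fixed-length binary string."""
    return [random.randint(0, 1) for _ in range(n_bits)]

def mutate(bits, p=0.05):
    return [1 - b if random.random() < p else b for b in bits]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(fitness, n_bits=20, pop_size=8, generations=5):
    population = [random_individual(n_bits) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]          # selection keeps the fit
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

# Placeholder fitness; Genetic CNN instead trains the decoded network
# on a reference dataset and returns its recognition accuracy.
best = evolve(lambda bits: sum(bits))
```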
References:
[1] Yan Wang, Lingxi Xie, Chenxi Liu, Siyuan Qiao, Ya Zhang, Wenjun Zhang, Qi Tian and Alan Yuille, "SORT: Second-Order Response Transform for Visual Recognition", in IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.
[2] Lingxi Xie and Alan Yuille, "Genetic CNN", in IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.
Speaker Bio:
Lingxi Xie obtained his B.E. and Ph.D. degrees from Tsinghua University in 2010 and 2015, respectively. He is currently a post-doctoral researcher at the Johns Hopkins University, having moved there from the University of California, Los Angeles. From 2013 to 2015, he was a research intern at Microsoft Research Asia. He was a visiting researcher at the University of Texas at San Antonio in 2014. Lingxi has been working on computer vision and multimedia information retrieval, especially in the areas of image classification, image retrieval and object detection. He is also interested in the theory and application of deep learning. Lingxi received the best paper award at ICMR 2015.
Speaker 3: Zhuowen Tu (University of California, San Diego)
Time: 21:30, Wednesday, September 27, 2017 (Beijing time)
Title: Introspective Learning for Generative Modeling and Discriminative Classification
Host: Ming-Ming Cheng (Nankai University)
Abstract:
In this talk, we will present our recent work on introspective learning. We start by discussing an unsupervised learning method that obtains generative models using a sequence of discriminative classifiers (boosting), "Learning Generative Models via Discriminative Approaches" (Z. Tu, CVPR 2007). We then present our recent work on introspective classification (NIPS 2017) and introspective generative modeling (ICCV 2017), which attain a single model that is simultaneously a generator and a discriminator. When followed by iterative discriminative learning, desirable properties of modern discriminative classifiers (CNNs) are directly inherited by the generator. On the supervised classification side, our introspective neural networks (INN) show immediate improvements over state-of-the-art CNN architectures (ResNet) on benchmark datasets including MNIST, CIFAR, and SVHN; on the unsupervised learning side, we observe encouraging results on a number of applications, including texture modeling, artistic style transfer, face modeling, and object modeling. Introspective learning points to a new direction with a wide range of applications in machine learning.
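To give a flavor of the mechanism, here is a minimal PyTorch sketch (our paraphrase, not the authors' code) of an introspective loop: the same network alternates between a discriminative phase, separating real data from pseudo-negatives, and a generative phase, updating the pseudo-negatives by gradient ascent on its own score. train_step is a hypothetical placeholder, and details such as the noise term in the sampling step are omitted.

```python
import torch

def train_step(classifier, positives, negatives):
    """Hypothetical placeholder: one round of discriminative training
    that pushes classifier scores up on positives, down on negatives."""

def introspective_loop(classifier, real_data, rounds=10, steps=50, lr=0.1):
    pseudo_neg = torch.randn_like(real_data)     # initial pseudo-negatives
    for _ in range(rounds):
        # Discriminative phase: real data vs. current pseudo-negatives.
        train_step(classifier, real_data, pseudo_neg)
        # Introspective (generative) phase: move the pseudo-negatives
        # toward what the classifier currently scores as "real".
        x = pseudo_neg.clone().requires_grad_(True)
        for _ in range(steps):
            score = classifier(x).sum()          # e.g., log-odds of "real"
            grad, = torch.autograd.grad(score, x)
            x = (x + lr * grad).detach().requires_grad_(True)
        pseudo_neg = x.detach()
    return classifier, pseudo_neg                # model doubles as generator
```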
Speaker Bio:
Zhuowen Tu is an associate professor of Cognitive Science, also affiliated with Computer Science and Engineering, at the University of California, San Diego (UCSD). Before joining UCSD in 2013 as an assistant professor, he was a faculty member at UCLA. Between 2011 and 2013, he took a leave to work at Microsoft Research Asia. He received his Ph.D. in Computer Science from the Ohio State University.
Special thanks to the main organizers of this Webinar:
VOOC Chair: Ming-Ming Cheng (Nankai University)
VOOC Member: Hang Su (Tsinghua University)
VOOC Executive Member: Weiyao Lin (Shanghai Jiao Tong University)
VODB Coordinating Director: Jiwen Lu (Tsinghua University)
How to participate:
1. All VALSE Webinar activities are held online via the "group video" feature of the VALSE QQ groups. During the event the speaker uploads slides or shares a screen; attendees can see the slides, hear the speaker's voice, and interact with the speaker via text or voice.
2. To participate, you must join a VALSE QQ group. Groups A, B, C, D and E are currently full; apart from speakers and other invited guests, new applicants can only join VALSE group F, group number 594312623. When applying, you must provide your name, affiliation and status; all three are required. After joining, please set your real name in the format name-status-affiliation. Status codes: T for faculty and researchers at universities and research institutes; I for industry R&D staff; D for Ph.D. students; M for master's students.
3. To participate, please download and install the latest Windows version of QQ. Group video is not supported on non-Windows systems such as Mac or Linux; mobile QQ can play the audio but cannot show the video slides.
4. About 10 minutes before the event begins, the host will start the group video and send an invitation link to each group; participants simply click the link to join.
5. During the event, please do not send flowers, lollipops or other virtual gifts, and please avoid off-topic messages, so as not to disrupt the event.
6. If you cannot hear the audio or see the video during the event, exiting and rejoining usually solves the problem.
7. Please join from a fast network connection, preferably a wired one.