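Image Recognition: Zhuanzhi Collection
Getting Started
- How to Detect Image Edges? Ruan Yifeng
- CS231n Course Notes (Chinese translation): Image Classification Notes
- An Introduction to Deep Learning and Image Classification, Starting from the VGG16 Convolutional Neural Network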
[http://blog.csdn.net/Errors_In_Life/article/details/65950699]
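- The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3), Chinese translation
- Image Classification Tutorial for the Caffe Deep Learning Framework
- MobileNet Tutorial: Building an Image Classifier That Runs on a Phone with TensorFlow
- Image CAPTCHAs and Large-Scale Image Recognition Technology
- How Convolutional Neural Networks Perform Image Recognition
- Image Recognition and CAPTCHAs
- Image Recognition (Zhihu topic)
[https://www.zhihu.com/topic/19588774/top-answers?page=1]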
Surveys
- A Review of Image Recognition with Deep Convolutional Neural Network
- Review on Image Recognition
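- Research Progress and Prospects of Deep Learning in Image Recognition
- A Survey of Image Object Classification and Detection Algorithms. Kaiqi Huang, Weiqiang Ren, Tieniu Tan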
[http://cjc.ict.ac.cn/online/cre/hkq-2014526115913.pdf]
- Book Chapter - Object Recognition
Advanced Reading
ImageNet results
- Microsoft (Deep Residual Learning) [http://arxiv.org/pdf/1512.03385v1.pdf] [Slides: http://image-net.org/challenges/talks/ilsvrc2015_deep_residual_learning_kaiminghe.pdf]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Deep Residual Learning for Image Recognition, arXiv:1512.03385.
- Microsoft (PReLU/Weight Initialization) [http://arxiv.org/pdf/1502.01852]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, arXiv:1502.01852.
- Batch Normalization [http://arxiv.org/pdf/1502.03167]
Sergey Ioffe, Christian Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, arXiv:1502.03167.
- GoogLeNet [http://arxiv.org/pdf/1409.4842]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, Going Deeper with Convolutions, CVPR, 2015.
- VGG-Net [http://www.robots.ox.ac.uk/~vgg/research/very_deep/] [http://arxiv.org/pdf/1409.1556]
Karen Simonyan and Andrew Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR, 2015.
- AlexNet [http://papers.nips.cc/book/advances-in-neural-information-processing-systems-25-2012]
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS, 2012.
2013
- DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell
2014
- CNN Features off-the-shelf: an Astounding Baseline for Recognition. CVPR 2014
- Deeply learned face representations are sparse, selective, and robust
- Deep Learning Face Representation by Joint Identification-Verification
[https://arxiv.org/abs/1406.4773]
- Deep Learning Face Representation from Predicting 10,000 Classes. CVPR 2014
- Multiple Object Recognition with Visual Attention
2015
- HD-CNN: Hierarchical Deep Convolutional Neural Network for Image Classification. ICCV 2015
- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. ImageNet top-5 error: 4.94%
- Multi-attribute Learning for Pedestrian Attribute Recognition in Surveillance Scenarios
- FaceNet: A Unified Embedding for Face Recognition and Clustering
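The "ImageNet top-5 error" quoted for the Delving Deep into Rectifiers entry above counts a prediction as wrong only when the true label is missing from the five highest-scoring classes. A minimal sketch of that metric in plain Python follows; the function name `top_k_error` and the score values are made up for illustration and do not come from any of the listed papers or their code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 2 samples over 6 classes (illustrative numbers only).
scores = [
    [0.12, 0.03, 0.40, 0.20, 0.15, 0.10],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
]
labels = [3, 5]  # assumed ground-truth class indices

top5 = top_k_error(scores, labels, k=5)
top1 = top_k_error(scores, labels, k=1)
print(f"top-5 error: {top5:.2f}, top-1 error: {top1:.2f}")
```

With `k=1` the same function gives the ordinary top-1 error, so the two numbers can be compared directly.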
2016
- Humans and deep networks largely agree on which kinds of variation make object recognition harder
- FusionNet: 3D Object Classification Using Multiple Data Representations
- Deep FisherNet for Object Classification
- Factorized Bilinear Models for Image Recognition
- Hyperspectral CNN Classification with Limited Training Samples
- The More You Know: Using Knowledge Graphs for Image Classification
- MaxMin Convolutional Neural Networks for Image Classification
- Cost-Effective Active Learning for Deep Image Classification. TCSVT 2016.
- DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment
2017
- Deep Collaborative Learning for Visual Recognition
- Bilinear CNN Models for Fine-grained Visual Recognition
- Multiple Instance Learning Convolutional Neural Networks for Object Recognition
- B-CNN: Branch Convolutional Neural Network for Hierarchical Classification
- Why Do Deep Neural Networks Still Not Recognize These Images?: A Qualitative Analysis on Failure Cases of ImageNet Classification
- Deep Mixture of Diverse Experts for Large-Scale Visual Recognition
- Convolutional Low-Resolution Fine-Grained Classification
Tutorial
- CVPR tutorial: Large-Scale Visual Recognition
- Image Recognition with TensorFlow
- Visual Object Recognition Tutorial by Bastian Leibe & Kristen Grauman
Video Tutorials
- CS231n: Convolutional Neural Networks for Visual Recognition
- Fei-Fei Li: How we're teaching computers to understand pictures
[https://www.youtube.com/watch?v=40riCqvRoMs]
Datasets
- MNIST: handwritten digits (http://yann.lecun.com/exdb/mnist/)
- NIST: similar to MNIST, but larger
- Perturbed NIST: a dataset developed in Yoshua’s class (NIST with tons of deformations)
- CIFAR10 / CIFAR100: 32×32 natural image dataset with 10/100 categories (http://www.cs.utoronto.ca/~kriz/cifar.html)
- Caltech 101: pictures of objects belonging to 101 categories (http://www.vision.caltech.edu/Image_Datasets/Caltech101/)
- Caltech 256: pictures of objects belonging to 256 categories (http://www.vision.caltech.edu/Image_Datasets/Caltech256/)
- Caltech Silhouettes: 28×28 binary images containing silhouettes of the Caltech 101 dataset
- STL-10: an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms, inspired by the CIFAR-10 dataset but with some modifications (http://www.stanford.edu/~acoates//stl10/)
- SVHN: the Street View House Numbers dataset (http://ufldl.stanford.edu/housenumbers/)
- NORB: binocular images of toy figurines under various illumination and pose (http://www.cs.nyu.edu/~ylclab/data/norb-v1.0/)
- ImageNet: image database organized according to the WordNet hierarchy (http://www.image-net.org/)
- Pascal VOC: various object recognition challenges (http://pascallin.ecs.soton.ac.uk/challenges/VOC/)
- LabelMe: a large dataset of annotated images (http://labelme.csail.mit.edu/Release3.0/browserTools/php/dataset.php)
- COIL-20: different objects imaged at every angle in a 360° rotation (http://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php)
- COIL-100: different objects imaged at every angle in a 360° rotation (http://www1.cs.columbia.edu/CAVE/software/softlib/coil-100.php)
Code
- AlexNet
- ZFNet
[https://github.com/rainer85ah/Papers2Code/tree/master/ZFNet]
- VGG
- GoogLeNet
[https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet]
- ResNet
- HD-CNN
- Factorized Bilinear Models for Image Recognition
- MaxMin Convolutional Neural Networks for Image Classification
- Multiple Object Recognition with Visual Attention
- Learning Spatial Regularization with Image-level Supervisions for Multi-label Image Classification
- Deep Learning Face Representation from Predicting 10,000 Classes
- FaceNet: A Unified Embedding for Face Recognition and Clustering
- DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment
Domain Experts
- Yangqing Jia
- Ross Girshick
- Xiaodi Hou
- Kaiming He
- Jian Sun
- Xiaoou Tang
- Shuicheng Yan
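This is a preliminary version and our expertise is limited; suggestions and additions for any errors or gaps are welcome, and the list will be kept up to date. This article is original content from the Zhuanzhi content team and may not be reproduced without permission. To request permission, email fangquanyi@gmail.com or contact the Zhuanzhi assistant on WeChat (Rancho_Fang).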