# Convolutional Neural Networks (CNN) from Beginner to Expert: A Summary from Someone Who Has Been There

## Getting Started

### A Rough Overview

CNN notes: an accessible explanation of convolutional neural networks https://www.2cto.com/kf/201607/522441.html

Chinese notes on Visualizing and Understanding Convolutional Networks http://www.gageet.com/2014/10235.php

### Basic Practice

TensorFlow official site http://www.tensorflow.org/

PyTorch official site http://pytorch.org/

TensorFlow

How do I get started with TensorFlow? https://www.zhihu.com/question/49909565

Introduction to TensorFlow http://hacker.duanshishi.com/?p=1639

PyTorch

Deep Learning with PyTorch: A 60 Minute Blitz (Chinese translation) https://zhuanlan.zhihu.com/p/25572330

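## Intermediate Study

### Deeper Practice

Modules that used to be common, such as dropout and LRN (local response normalization), are used less and less these days, so they are not covered here. For regularization, BatchNorm is recommended (https://www.zhihu.com/question/38102762): the idea is simple and it works well.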
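To make the BatchNorm idea concrete, here is a minimal sketch of the training-time forward pass in plain Python. It assumes a batch given as a list of feature vectors. The function name `batch_norm` and the parameters `gamma`, `beta`, and `eps` are illustrative choices, not any framework's API. Real layers also track running statistics for inference, which this sketch omits.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned in a real network;
    fixed here for illustration).
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [sample[j] for sample in batch]
        mean = sum(column) / n
        # Biased (divide-by-n) variance over the mini-batch.
        var = sum((x - mean) ** 2 for x in column) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = gamma[j] * (column[i] - mean) * inv_std + beta[j]
    return out


# Two features on very different scales end up on a comparable scale.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` set to 1 and `beta` set to 0, each feature column of `normalized` has mean close to 0 and variance close to 1, regardless of the scale of the input.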

## In-Depth Research

### Understanding / Generalization / Transfer

Distilling the knowledge in a neural network (2015), G. Hinton et al. http://arxiv.org/pdf/1503.02531

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images (2015), A. Nguyen et al. http://arxiv.org/pdf/1412.1897

How transferable are features in deep neural networks? (2014), J. Yosinski et al. http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf

CNN features off-the-Shelf: An astounding baseline for recognition (2014), A. Razavian et al. http://www.cv-foundation.org//openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf

Learning and transferring mid-Level image representations using convolutional neural networks (2014), M. Oquab et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Oquab_Learning_and_Transferring_2014_CVPR_paper.pdf

Visualizing and understanding convolutional networks (2014), M. Zeiler and R. Fergus http://arxiv.org/pdf/1311.2901

Decaf: A deep convolutional activation feature for generic visual recognition (2014), J. Donahue et al. http://arxiv.org/pdf/1310.1531

### Optimization / Training Techniques

Training very deep networks (2015), R. Srivastava et al. http://papers.nips.cc/paper/5850-training-very-deep-networks.pdf

Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Ioffe and C. Szegedy http://arxiv.org/pdf/1502.03167

Delving deep into rectifiers: Surpassing human-level performance on imagenet classification (2015), K. He et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf

Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al. http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf

Adam: A method for stochastic optimization (2014), D. Kingma and J. Ba http://arxiv.org/pdf/1412.6980

Improving neural networks by preventing co-adaptation of feature detectors (2012), G. Hinton et al. http://arxiv.org/pdf/1207.0580.pdf

Random search for hyper-parameter optimization (2012), J. Bergstra and Y. Bengio http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a

### Convolutional Neural Network Models

Rethinking the inception architecture for computer vision (2016), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

Inception-v4, inception-resnet and the impact of residual connections on learning (2016), C. Szegedy et al. http://arxiv.org/pdf/1602.07261

Identity Mappings in Deep Residual Networks (2016), K. He et al. https://arxiv.org/pdf/1603.05027v2.pdf

Deep residual learning for image recognition (2016), K. He et al. http://arxiv.org/pdf/1512.03385

Spatial transformer networks (2015), M. Jaderberg et al. http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf

Going deeper with convolutions (2015), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

Very deep convolutional networks for large-scale image recognition (2014), K. Simonyan and A. Zisserman http://arxiv.org/pdf/1409.1556

Return of the devil in the details: delving deep into convolutional nets (2014), K. Chatfield et al. http://arxiv.org/pdf/1405.3531

OverFeat: Integrated recognition, localization and detection using convolutional networks (2013), P. Sermanet et al. http://arxiv.org/pdf/1312.6229

Maxout networks (2013), I. Goodfellow et al. http://arxiv.org/pdf/1302.4389v4

Network in network (2013), M. Lin et al. http://arxiv.org/pdf/1312.4400

ImageNet classification with deep convolutional neural networks (2012), A. Krizhevsky et al. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

### Image: Segmentation / Object Detection

You only look once: Unified, real-time object detection (2016), J. Redmon et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

Fully convolutional networks for semantic segmentation (2015), J. Long et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al. http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

Fast R-CNN (2015), R. Girshick http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

Rich feature hierarchies for accurate object detection and semantic segmentation (2014), R. Girshick et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

Spatial pyramid pooling in deep convolutional networks for visual recognition (2014), K. He et al. http://arxiv.org/pdf/1406.4729

Semantic image segmentation with deep convolutional nets and fully connected CRFs, L. Chen et al. https://arxiv.org/pdf/1412.7062

Learning hierarchical features for scene labeling (2013), C. Farabet et al. https://hal-enpc.archives-ouvertes.fr/docs/00/74/20/77/PDF/farabet-pami-13.pdf

### Image / Video / Etc

Image Super-Resolution Using Deep Convolutional Networks (2016), C. Dong et al. https://arxiv.org/pdf/1501.00092v3.pdf

A neural algorithm of artistic style (2015), L. Gatys et al. https://arxiv.org/pdf/1508.06576

Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Fei http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf

Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al. http://arxiv.org/pdf/1502.03044

Show and tell: A neural image caption generator (2015), O. Vinyals et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf

Long-term recurrent convolutional networks for visual recognition and description (2015), J. Donahue et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.pdf

VQA: Visual question answering (2015), S. Antol et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf

DeepFace: Closing the gap to human-level performance in face verification (2014), Y. Taigman et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf

Large-scale video classification with convolutional neural networks (2014), A. Karpathy et al. http://vision.stanford.edu/pdf/karpathy14.pdf

Two-stream convolutional networks for action recognition in videos (2014), K. Simonyan et al. http://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf

3D convolutional neural networks for human action recognition (2013), S. Ji et al. http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_JiXYY10.pdf

### Latest Content

Partial voluming (PV) is arguably the last crucial unsolved problem in Bayesian segmentation of brain MRI with probabilistic atlases. PV occurs when voxels contain multiple tissue classes, giving rise to image intensities that may not be representative of any one of the underlying classes. PV is particularly problematic for segmentation when there is a large resolution gap between the atlas and the test scan, e.g., when segmenting clinical scans with thick slices, or when using a high-resolution atlas. In this work, we present PV-SynthSeg, a convolutional neural network (CNN) that tackles this problem by directly learning a mapping between (possibly multi-modal) low resolution (LR) scans and underlying high resolution (HR) segmentations. PV-SynthSeg simulates LR images from HR label maps with a generative model of PV, and can be trained to segment scans of any desired target contrast and resolution, even for previously unseen modalities where neither images nor segmentations are available at training. PV-SynthSeg does not require any preprocessing, and runs in seconds. We demonstrate the accuracy and flexibility of the method with extensive experiments on three datasets and 2,680 scans. The code is available at https://github.com/BBillot/SynthSeg.
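The partial-volume effect described in the abstract can be illustrated with a toy 1-D example. This is only a sketch of the phenomenon, not PV-SynthSeg's generative model. The function `simulate_partial_volume`, the label values, and the per-class mean intensities are all hypothetical. It assigns each high-resolution voxel the mean intensity of its label, then averages blocks of voxels to form one low-resolution voxel.

```python
def simulate_partial_volume(hr_labels, label_means, factor):
    """Average blocks of `factor` high-res voxels into one low-res voxel.

    hr_labels: 1-D list of integer tissue labels at high resolution.
    label_means: dict mapping each label to an assumed mean intensity.
    factor: number of high-res voxels covered by one low-res voxel.
    """
    if len(hr_labels) % factor != 0:
        raise ValueError("length of hr_labels must be a multiple of factor")
    hr_intensities = [label_means[label] for label in hr_labels]
    lr_intensities = []
    for start in range(0, len(hr_intensities), factor):
        block = hr_intensities[start:start + factor]
        lr_intensities.append(sum(block) / factor)
    return lr_intensities


# A tissue boundary falls inside the second low-res voxel.
hr_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
label_means = {0: 20.0, 1: 100.0}
lr_scan = simulate_partial_volume(hr_labels, label_means, factor=4)
# lr_scan is [20.0, 80.0, 100.0]
```

The middle low-resolution voxel mixes one voxel of class 0 with three of class 1, so its intensity of 80.0 matches neither class mean. This is the situation the abstract describes as intensities that are not representative of any one underlying class.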
