深度学习是一门实践科学,实验发展远远甩开了理论研究,因此本文的架构采用理论与实践相结合的模式。
首先可以去专知深度学习条目下看看相关文章,
针对卷积神经网络,我们可以通过如下文章了解基本概念
卷积神经网络工作原理直观的解释?https://www.zhihu.com/question/39022858
技术向:一文读懂卷积神经网络CNN http://dataunion.org/11692.html
深度学习元老Yann Lecun详解卷积神经网络https://www.leiphone.com/news/201608/zaB48AcZ1AFm1TaP.html
CNN笔记:通俗理解卷积神经网络https://www.2cto.com/kf/201607/522441.html
了解完基本概念之后,还需要对CNN有一个直观理解,深度学习可视化是一个非常不错的选择
Visualizing and Understanding Convolutional Networks中文笔记http://www.gageet.com/2014/10235.php
英文原文,感兴趣的可以看一下https://arxiv.org/abs/1311.2901
在开始具体的实践之前,可以先去tensorflow的playground尝试一番,地址http://playground.tensorflow.org/,指导http://f.dataguru.cn/article-9324-1.html
之后就可以在自己的电脑上实验了,首先,使用GPU是必须的:
安装cudahttp://blog.csdn.net/u010480194/article/details/54287335
安装cudnnhttp://blog.csdn.net/lucifer_zzq/article/details/76675239
之后就是选择适合自己的框架
现在最火的深度学习框架是什么?https://www.zhihu.com/question/52517062?answer_deleted_redirect=true
深度 | 主流深度学习框架对比:看你最适合哪一款?http://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650719118&idx=2&sn=fad8b7cad70cc6a227f88ae07a89db66#rd
当然,还有一个专门评价框架的github项目,更新比较勤https://github.com/hunkim/DeepLearningStars
如果有选择困难症的话,不负责任地推荐两个框架:tensorflow和pytorch,tensorflow可视化和工程衔接做得很好,pytorch实现比较自由,用起来很舒服
tensorflow官网http://www.tensorflow.org/
pytorch 官网http://pytorch.org/
基本按照官网上的指示一步步地安装就没啥大问题了,如果真遇到问题,可以上一个神奇的网站https://stackoverflow.com/搜索解决方法,基本上都能找到
还需要熟悉一个重要的工具github https://github.com/,不论是自己管理代码还是借鉴别人的代码都很方便,想要教程的话可以参考这篇回答https://www.zhihu.com/question/20070065
当然,要是偷懒不想看的话,可以用IDE来辅助管理,例如pycharmhttp://www.jetbrains.com/pycharm/,教程http://blog.csdn.net/u013088062/article/details/50349833
一个可视化的交互工具也是非常重要的,这里推荐神器jupyter notebook http://python.jobbole.com/87527/?repeat=w3tc
以上准备工作都做好了,就可以开始自己的入门教程了。事实上官网的教程非常不错,但要是嫌弃全英文看着困难的话,也可以看看以下教程
tensorflow
TensorFlow 如何入门?https://www.zhihu.com/question/49909565
TensorFlow入门http://hacker.duanshishi.com/?p=1639
谷歌的官方tutorial其实挺完善的,不想看英文可以看看这个中文翻译http://wiki.jikexueyuan.com/project/tensorflow-zh/
pytorch
PyTorch深度学习:60分钟入门(Translation)https://zhuanlan.zhihu.com/p/25572330
新手如何入门pytorch?https://www.zhihu.com/question/55720139
超简单!pytorch入门教程(一):Tensorhttp://www.jianshu.com/p/5ae644748f21
如果对python不熟悉的话,可以先看看这两个教程python2:http://www.runoob.com/python/python-tutorial.html,python3:http://www.runoob.com/python3/python3-tutorial.html
如果只是玩票性质的,不想在框架上浪费太多时间的话,可以试试keras
Keras入门教程http://www.360doc.com/content/17/0624/12/1489589_666148811.shtml
经过了前面的入门,相信大家已经对卷积神经网络有了一个基本概念了,同时对如何实现CNN也有了基本的了解。而进阶学习的学习同样也是两个方面
首先是反向传播算法,入门时虽然用不着看,因为常用的框架都有自动求导,但是想要进一步一定要弄清楚。教程http://blog.csdn.net/u014313009/article/details/51039334
接着熟悉一下CNN的几个经典模型
文章:ImageNet Classification with Deep Convolutional Neural Networkshttp://ml.informatik.uni-freiburg.de/former/_media/teaching/ws1314/dl/talk_simon_group2.pdf
讲解:http://blog.csdn.net/u014088052/article/details/50898842
代码:tensorflowhttps://github.com/kratzert/finetune_alexnet_with_tensorflow pytorchhttps://github.com/aaron-xichen/pytorch-playground
文章:Deep Residual Learning for Image Recognitionhttps://arxiv.org/abs/1512.03385
讲解:http://blog.csdn.net/wspba/article/details/56019373
代码:tensorflowhttps://github.com/ry/tensorflow-resnet pytorchhttps://github.com/isht7/pytorch-deeplab-resnet
文章:Densely Connected Convolutional Networks https://arxiv.org/pdf/1608.06993.pdf
讲解:http://blog.csdn.net/u014380165/article/details/75142664
代码:原版https://github.com/liuzhuang13/DenseNet tensorflowhttps://github.com/YixuanLi/densenet-tensorflow pytorchhttps://github.com/bamos/densenet.pytorch
推荐先看讲解,然后阅读源码,一方面可以加深对模型的理解,另一方面也可以从别人的源码中学习各种框架新姿势。
当然,我们不能仅仅停留在表面上,这里推荐一本非常有名的书《Deep Learning》,这里是中文版的链接https://github.com/exacity/deeplearningbook-chinese
更为基础的理论研究目前还处于缺失状态
要是有耐心的同学,可以学习一下斯坦福新开的课程https://www.bilibili.com/video/av9156347/
具体到实践中,有非常多需要学习的点。在学习之前,最好先看看调参技巧
深度学习调参有哪些技巧?https://www.zhihu.com/question/25097993
过去有本调参圣经Neural Networks: Tricks of the Trade ,太老了,不推荐看。
dropout,lrn这些过去常用的模块最近已经用得越来越少了,就不赘述了,有关正则化,推荐BatchNorm https://www.zhihu.com/question/38102762, 思想简单,效果好
虽然有了BatchNorm之后训练基本已经非常稳定了,但最好还是学习一下梯度裁剪http://blog.csdn.net/zyf19930610/article/details/71743291
激活函数也是一个非常重要的点,不过在卷积神经网络中基本无脑用ReLuhttp://www.cnblogs.com/neopenx/p/4453161.html就行了,计算快,ReLu+BatchNorm可以说是万金油。当然,像一些具体的任务还是需要具体分析,例如GAN就不适合用这种简单粗暴的激活函数。
结构上基本完善了,接下来就是优化了,优化的算法有很多,最常见的是SGD与Adam。
所有优化算法概览http://www.mamicode.com/info-detail-1931210.html
好的算法可以更快地收敛或者有更好的效果,不过大多数实验中SGD与Adam已经够用了。
大神们的经验也是要看一下的:Yoshua Bengio等大神传授:26条深度学习经验http://www.csdn.net/article/2015-09-16/2825716
前面的这些学完之后,就是具体的研究项目了,大家可以去这个github上找自己感兴趣的论文https://github.com/terryum/awesome-deep-learning-papers,下面列举了一些和卷积神经网络相关的优秀论文。
Distilling the knowledge in a neural network (2015), G. Hinton et al. http://arxiv.org/pdf/1503.02531
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images (2015), A. Nguyen et al. http://arxiv.org/pdf/1412.1897
How transferable are features in deep neural networks? (2014), J. Yosinski et al.http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf
CNN features off-the-Shelf: An astounding baseline for recognition (2014), A. Razavian et al. http://www.cv-foundation.org//openaccess/content_cvpr_workshops_2014/W15/papers/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.pdf
Learning and transferring mid-Level image representations using convolutional neural networks (2014), M. Oquab et al. http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Oquab_Learning_and_Transferring_2014_CVPR_paper.pdf
Visualizing and understanding convolutional networks (2014), M. Zeiler and R. Fergus http://arxiv.org/pdf/1311.2901
Decaf: A deep convolutional activation feature for generic visual recognition (2014), J. Donahue et al. http://arxiv.org/pdf/1310.1531
Training very deep networks (2015), R. Srivastava et al.http://papers.nips.cc/paper/5850-training-very-deep-networks.pdf
Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015), S. Loffe and C. Szegedy http://arxiv.org/pdf/1502.03167
Delving deep into rectifiers: Surpassing human-level performance on imagenet classification (2015), K. He et al. http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf
Dropout: A simple way to prevent neural networks from overfitting (2014), N. Srivastava et al. http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
Adam: A method for stochastic optimization (2014), D. Kingma and J. Bahttp://arxiv.org/pdf/1412.6980
Improving neural networks by preventing co-adaptation of feature detectors (2012), G. Hinton et al. http://arxiv.org/pdf/1207.0580.pdf
Random search for hyper-parameter optimization (2012) J. Bergstra and Y. Bengiohttp://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a
Rethinking the inception architecture for computer vision (2016), C. Szegedy et al. http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf
Inception-v4, inception-resnet and the impact of residual connections on learning (2016), C. Szegedy et al.http://arxiv.org/pdf/1602.07261
Identity Mappings in Deep Residual Networks (2016), K. He et al. https://arxiv.org/pdf/1603.05027v2.pdf
Deep residual learning for image recognition (2016), K. He et al. http://arxiv.org/pdf/1512.03385
Spatial transformer network (2015), M. Jaderberg et al., http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf
Going deeper with convolutions (2015), C. Szegedy et al.http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf
Very deep convolutional networks for large-scale image recognition (2014), K. Simonyan and A. Zisserman http://arxiv.org/pdf/1409.1556
Return of the devil in the details: delving deep into convolutional nets (2014), K. Chatfield et al. http://arxiv.org/pdf/1405.3531
OverFeat: Integrated recognition, localization and detection using convolutional networks (2013), P. Sermanet et al.http://arxiv.org/pdf/1312.6229
Maxout networks (2013), I. Goodfellow et al. http://arxiv.org/pdf/1302.4389v4
Network in network (2013), M. Lin et al. http://arxiv.org/pdf/1312.4400
ImageNet classification with deep convolutional neural networks (2012), A. Krizhevsky et al.http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
You only look once: Unified, real-time object detection (2016), J. Redmon et al.http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf
Fully convolutional networks for semantic segmentation (2015), J. Long et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), S. Ren et al.http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf
Fast R-CNN (2015), R. Girshick http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf
Rich feature hierarchies for accurate object detection and semantic segmentation (2014), R. Girshick et al.http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf
Spatial pyramid pooling in deep convolutional networks for visual recognition (2014), K. He et al. http://arxiv.org/pdf/1406.4729
Semantic image segmentation with deep convolutional nets and fully connected CRFs, L. Chen et al. https://arxiv.org/pdf/1412.7062
Learning hierarchical features for scene labeling (2013), C. Farabet et al. https://hal-enpc.archives-ouvertes.fr/docs/00/74/20/77/PDF/farabet-pami-13.pdf
Image Super-Resolution Using Deep Convolutional Networks (2016), C. Dong et al. https://arxiv.org/pdf/1501.00092v3.pdf
A neural algorithm of artistic style (2015), L. Gatys et al. https://arxiv.org/pdf/1508.06576
Deep visual-semantic alignments for generating image descriptions (2015), A. Karpathy and L. Fei-Feihttp://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.pdf
Show, attend and tell: Neural image caption generation with visual attention (2015), K. Xu et al. http://arxiv.org/pdf/1502.03044
Show and tell: A neural image caption generator (2015), O. Vinyals et al. http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Vinyals_Show_and_Tell_2015_CVPR_paper.pdf
Long-term recurrent convolutional networks for visual recognition and description (2015), J. Donahue et al.http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.pdf
VQA: Visual question answering (2015), S. Antol et al.http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Antol_VQA_Visual_Question_ICCV_2015_paper.pdf
DeepFace: Closing the gap to human-level performance in face verification (2014), Y. Taigman et al.http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Taigman_DeepFace_Closing_the_2014_CVPR_paper.pdf
Large-scale video classification with convolutional neural networks (2014), A. Karpathy et al. http://vision.stanford.edu/pdf/karpathy14.pdf
Two-stream convolutional networks for action recognition in videos (2014), K. Simonyan et al. http://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf
3D convolutional neural networks for human action recognition (2013), S. Ji et al.http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_JiXYY10.pdf
更多的需要可以参考专知的另一篇deeplearning相关的文章http://www.zhuanzhi.ai/topic/2001228999615594/awesome,其中有很多具体细化的领域以及相关文章,这里就不重复了。