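Before training on the full dataset, first train on a very small subset and deliberately overfit it. If the network can fit those few samples, you know it is able to converge. This tip comes from Karpathy.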
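As a rough illustration of this sanity check, the sketch below trains a single sigmoid unit on four samples and confirms that the training loss collapses. It is written in plain Python so it runs without any framework. The four-sample subset, the learning rate, and the epoch count are assumptions chosen for the example, not values from the original tip.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Probability of class 1 for one sample."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_loss(weights, bias, samples, labels):
    """Average binary cross-entropy over the given samples."""
    total = 0.0
    for x, y in zip(samples, labels):
        p = min(max(predict(weights, bias, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)


def train(samples, labels, epochs=2000, lr=1.0):
    """Full-batch gradient descent on a single sigmoid unit."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical tiny subset (4 samples) standing in for a slice of the full data.
tiny_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
tiny_y = [0, 1, 1, 1]

initial_loss = mean_loss([0.0, 0.0], 0.0, tiny_x, tiny_y)
weights, bias = train(tiny_x, tiny_y)
final_loss = mean_loss(weights, bias, tiny_x, tiny_y)
train_accuracy = sum(
    (predict(weights, bias, x) > 0.5) == bool(y) for x, y in zip(tiny_x, tiny_y)
) / len(tiny_x)

print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, accuracy {train_accuracy:.2f}")
```

If the loss refuses to drop on a subset this small, the problem is more likely in the model, the loss, or the data pipeline than in the amount of data.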
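Always use dropout to reduce the risk of overfitting. Apply it after any layer (fully connected or convolutional) whose size is greater than 256. There is a good paper on this: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning [Yarin Gal & Zoubin Ghahramani, 2015].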
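To make the mechanism concrete, here is a minimal sketch of inverted dropout in plain Python. The function name `dropout`, the rate of 0.5, and the 512-unit stand-in layer are assumptions for illustration; in practice you would use the dropout layer your framework provides.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout.

    During training each unit is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate), so the expected activation is
    unchanged. At inference time the input passes through untouched.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0] * 512  # stand-in for the output of a 512-unit layer (> 256)

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
eval_out = dropout(hidden, rate=0.5, training=False, rng=rng)
dropped = train_out.count(0.0)

print(f"{dropped} of {len(hidden)} units dropped during training")
```

Scaling at training time rather than at inference time keeps the inference path a plain pass-through, which is why this variant is the common one.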
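Use batch normalization routinely. See the paper: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [Sergey Ioffe & Christian Szegedy, 2015]. It works well in practice: batch normalization usually gives much faster convergence and can let you get by with a smaller dataset, which saves time and resources.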
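The training-time forward pass of batch normalization can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: it handles a batch of flat feature vectors, uses fixed `gamma` and `beta` instead of learned ones, and leaves out the running statistics a real layer keeps for inference.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned in a real network).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n  # biased batch variance
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = gamma[j] * (column[i] - mean) * inv_std + beta[j]
    return out


# Two features on very different scales.
batch = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0], [4.0, 800.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])

for row in normalized:
    print([round(v, 3) for v in row])
```

After the call both features have roughly zero mean and unit variance across the batch, regardless of their original scale.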
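Use Xavier initialization wherever you can. You can apply it only to the large fully connected layers and skip it on the CNN layers. For an explanation, see the article An Explanation of Xavier Initialization (by Andy Jones).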
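For reference, the uniform variant of Xavier (Glorot) initialization draws each weight from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), which gives a weight variance of 2 / (fan_in + fan_out). The sketch below is a plain-Python illustration; the helper name `xavier_uniform` and the 512 x 256 layer size are assumptions for the example.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


rng = random.Random(0)
fan_in, fan_out = 512, 256  # a "large" fully connected layer

weights = xavier_uniform(fan_in, fan_out, rng)
limit = math.sqrt(6.0 / (fan_in + fan_out))

# Compare the empirical second moment with the variance the scheme targets.
flat = [w for row in weights for w in row]
sample_var = sum(w * w for w in flat) / len(flat)
target_var = 2.0 / (fan_in + fan_out)

print(f"limit={limit:.4f} sample_var={sample_var:.6f} target_var={target_var:.6f}")
```

Most frameworks ship this scheme as a built-in initializer, so a hand-rolled version like this is mainly useful for checking what the built-in one does.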
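If your input data has spatial structure, try an end-to-end CNN. See the paper SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size [Forrest N. Iandola et al., 2016]. It introduces a new approach that performs very well, and you can try applying the tips above to it.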
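Much of the parameter saving in SqueezeNet comes from its Fire module, which first squeezes the channels with 1x1 convolutions and then expands them with a mix of 1x1 and 3x3 filters. The arithmetic below compares the parameter count of such a module with a plain 3x3 convolution producing the same number of output channels. The channel counts (96 in, squeeze 16, expand 64 + 64) are illustrative assumptions, and the helper names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def fire_params(in_channels, squeeze, expand_1x1, expand_3x3):
    """Parameters of a Fire-style module: 1x1 squeeze, then 1x1 and 3x3 expand."""
    return (
        conv_params(in_channels, squeeze, 1)
        + conv_params(squeeze, expand_1x1, 1)
        + conv_params(squeeze, expand_3x3, 3)
    )


# Both layers map 96 input channels to 128 output channels.
plain = conv_params(96, 128, 3)
fire = fire_params(96, squeeze=16, expand_1x1=64, expand_3x3=64)

print(f"plain 3x3 conv: {plain} parameters")
print(f"fire module:    {fire} parameters")
print(f"ratio:          {plain / fire:.1f}x")
```

The 3x3 filters are the expensive part, so feeding them only the 16 squeezed channels instead of all 96 is what drives the count down.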
Should I standardize the input variables (column vectors)?(ftp://ftp.sas.com/pub/neural/FAQ2.html#A_std)
How To Prepare Your Data For Machine Learning in Python with Scikit-Learn(http://machinelearningmastery.com/prepare-data-machine-learning-python-scikit-learn/)
How to Define Your Machine Learning Problem(http://machinelearningmastery.com/how-to-define-your-machine-learning-problem/)
Discover Feature Engineering, How to Engineer Features and How to Get Good at It(http://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/)
A Data-Driven Approach to Machine Learning(http://machinelearningmastery.com/a-data-driven-approach-to-machine-learning/)
Why you should be Spot-Checking Algorithms on your Machine Learning Problems(http://machinelearningmastery.com/why-you-should-be-spot-checking-algorithms-on-your-machine-learning-problems/)
Spot-Check Classification Machine Learning Algorithms in Python with scikit-learn(http://machinelearningmastery.com/spot-check-classification-machine-learning-algorithms-python-scikit-learn/)
Evaluate the Performance Of Deep Learning Models in Keras(http://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/)
Evaluate the Performance of Machine Learning Algorithms in Python using Resampling(http://machinelearningmastery.com/evaluate-performance-machine-learning-algorithms-python-using-resampling/)
How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras(http://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/)
Display Deep Learning Model Training History in Keras(http://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/)
Overfitting and Underfitting With Machine Learning Algorithms(http://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms/)
Using Learning Rate Schedules for Deep Learning Models in Python with Keras(http://machinelearningmastery.com/using-learning-rate-schedules-deep-learning-models-python-keras/)
What learning rate should be used for backprop?(ftp://ftp.sas.com/pub/neural/FAQ2.html#A_learn_rate)
What are batch, incremental, on-line … learning?(ftp://ftp.sas.com/pub/neural/FAQ2.html#A_styles)
Intuitively, how does mini-batch size affect the performance of (stochastic) gradient descent?(https://www.quora.com/Intuitively-how-does-mini-batch-size-affect-the-performance-of-stochastic-gradient-descent)
Ensemble Machine Learning Algorithms in Python with scikit-learn(http://machinelearningmastery.com/ensemble-machine-learning-algorithms-python-scikit-learn/)
How to Improve Machine Learning Results(http://machinelearningmastery.com/how-to-improve-machine-learning-results/)
Must Know Tips/Tricks in Deep Neural Networks(http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html)
How to increase validation accuracy with deep neural net?(http://stackoverflow.com/questions/37020754/how-to-increase-validation-accuracy-with-deep-neural-net)