入门 | 简单实用的DL优化技巧

会员服务 ·

入门 | 简单实用的DL优化技巧

2019 年 1 月 8 日 机器学习算法与Python学习

本文介绍了几个深度学习模型的简单优化技巧，包括迁移学习、dropout、学习率调整等，并展示了如何用 Keras 实现。

以下是我与同事和学生就如何优化深度模型进行的对话、消息和辩论的摘要。如果你发现了有影响力的技巧，请分享。

首先，为什么要改进模型？

像卷积神经网络（CNN）这样的深度学习模型具有大量的参数；实际上，我们可以调用这些超参数，因为它们原本在模型中并没有被优化。你可以网格搜索这些超参数的最优值，但需要大量硬件计算和时间。那么，一个真正的数据科学家能满足于猜测这些基本参数吗？

改进模型的最佳方法之一是基于在你的领域进行过深入研究的专家的设计和体系结构，他们通常拥有强大的硬件可供使用。而且，他们经常慷慨地开源建模架构和原理。

深度学习技术

以下是一些通过预训练模型来改善拟合时间和准确性的方法：

研究理想的预训练体系架构：了解迁移学习的好处，或了解一些功能强大的 CNN 体系架构。考虑那些看起来不太适合但具有潜在共享特性的领域。
使用较小的学习率：由于预训练的权重通常优于随机初始化的权重，因此修改要更为精细！你在此处的选择取决于学习环境和预训练的表现，但请检查各个时期的误差，以了解距离收敛还要多久。
使用 dropout：与回归模型的 Ridge 和 LASSO 正则化一样，没有适用于所有模型的优化 alpha 或 dropout。这是一个超参数，取决于具体问题，必须进行测试。从更大的变化开始——用更大的网格搜索跨越几个数量级，如 np.logspace() 所能提供的那样——然后像上面的学习率一样下降。
限制权重大小：可以限制某些层的权重的最大范数（绝对值），以泛化我们的模型。
不要动前几层：神经网络的前几个隐藏层通常用于捕获通用和可解释的特征，如形状、曲线或跨域的相互作用。我们应该经常把这些放在一边，把重点放在进一步优化元潜在级别的特征上。这可能意味着添加隐藏层，这样我们就不需要匆忙处理了！
修改输出层：使用适合你的领域的新激活函数和输出大小替换模型默认值。不过，不要把自己局限于最明显的解决方案。尽管 MNIST 看起来似乎需要 10 个输出类，但有些数字有共同的变量，允许 12-16 个类可能会更好地解决这些变量，并提高模型性能！与上面提到的提示一样，深度学习模型应该随着我们接近输出而不断修改和定制。

Keras 中的技术

在 Keras 中修改 MNIST 的 dropout 和限制权重大小的方法如下：

# dropout in input and hidden layers
# weight constraint imposed on hidden layers
# ensures the max norm of the weights does not exceed 5
model = Sequential()
model.add(Dropout(0.2, input_shape=(784,))) # dropout on the inputs
# this helps mimic noise or missing data
model.add(Dense(128, input_dim=784, kernel_initializer='normal', activation='relu', kernel_constraint=maxnorm(5)))
model.add(Dropout(0.5))
model.add(Dense(128, kernel_initializer='normal', activation='tanh', kernel_constraint=maxnorm(5)))
model.add(Dropout(0.5))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid'))

dropout 最佳实践

使用 20-50 % 的 dropout，建议输入 20%。太低，影响可以忽略；太高，可能欠拟合。
在输入层和隐藏层上使用 dropout。这已被证明可以提高深度学习的性能。
使用伴有衰减的较大的学习速率，以及较大的动量。
限制权重！较大的学习速率会导致梯度爆炸。通过对网络权值施加约束（如大小为 5 的最大范数正则化）可以改善结果。
使用更大的网络。在较大的网络上使用 dropout 可能会获得更好的性能，从而使模型有更多的机会学习独立的表征。

下面是 Keras 中的最终层修改示例，其中包含 14 个 MNIST 类：

from keras.layers.core import Activation, Dense
model.layers.pop() # defaults to last
model.outputs = [model.layers[-1].output]
model.layers[-1].outbound_nodes = []
model.add(Dense(14, activation='softmax'))

以及如何冻结前五层权重的示例：

for layer in model.layers[:5]:
    layer.trainable = False

或者，我们可以将该层的学习速率设为零，或者使用每个参数的自适应学习算法，如 Adadelta 或 Adam。这有点复杂，在其他平台（如 Caffe）中实现得更好。

预训练网络库

Keras

Kaggle 列表：https://www.kaggle.com/gaborfodor/keras-pretrained-models
Keras 应用：https://keras.io/applications/
OpenCV 示例：https://www.learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/

TensorFlow

VGG16：https://www.learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/
Inceptiom V3：https://github.com/tensorflow/models/blob/master/inception/README.md#how-to-fine-tune-a-pre-trained-model-on-a-new-task
ResNet：https://github.com/tensorflow/models/blob/master/inception/README.md#how-to-fine-tune-a-pre-trained-model-on-a-new-task

Torch

LoadCaffe：https://github.com/szagoruyko/loadcaffe

Caffe

Model Zoo：https://github.com/BVLC/caffe/wiki/Model-Zoo

在 Jupyter 中查看你的 TensorBoard 图

模型的可视化通常很重要。如果你用 Keras 编写模型，它的抽象很好，但不允许你深入到模型的各个部分进行更细致的分析。幸运的是，下面的代码可以让我们直接使用 Python 可视化模型：

# From: http://nbviewer.jupyter.org/github/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
# Helper functions for TF Graph visualization
from IPython.display import clear_output, Image, display, HTML
def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = bytes("<stripped %d bytes>"%size, 'utf-8')
    return strip_def

def rename_nodes(graph_def, rename_func):
    res_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = res_def.node.add() 
        n.MergeFrom(n0)
        n.name = rename_func(n.name)
        for i, s in enumerate(n.input):
            n.input[i] = rename_func(s) if s[0]!='^' else '^'+rename_func(s[1:])
    return res_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:800px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))
# Visualizing the network graph. Be sure expand the "mixed" nodes to see their 
# internal structure. We are going to visualize "Conv2D" nodes.
graph_def = tf.get_default_graph().as_graph_def()
tmp_def = rename_nodes(graph_def, lambda s:"/".join(s.split('_',1)))
show_graph(tmp_def)

使用 Keras 可视化你的模型

这一步将绘制模型的图并将其保存为 png 文件：

from keras.utils.visualize_util import plot
plot(model, to_file='model.png')

plot 采用两个可选参数：

show_shapes（默认为 False）控制输出形状是否显示在图中。
show_layer_names（默认为 True）控制层命名是否显示在图中。

也可以直接获得 pydot.Graph 对象并自己对其进行渲染，如在 iPython notebook 中显示它：

from IPython.display import SVG
from keras.utils.visualize_util import model_to_dot
SVG(model_to_dot(model).create(prog='dot', format='svg'))

授权转自机器之心

推荐阅读

下载 | 512页教程《神经网络与深度学习》，2018最新著作

必备 | AI & DS七大 Python 库

下载 | 954页《数据可视化》手册

知识点 | 全面理解支持向量机

下载 | 866页《计算机视觉：原理，算法，应用，学习》第五版

教程 | 106页《Python进阶》中文版

下载 | 479页《数据科学基础》教程

教程 | Vim 教程【命令-操作-快捷键】

登录查看更多

相关内容

暂退法

关注 0

《Python机器学习项目实战》，135页pdf带你小白入门机器学习

专知会员服务

174+阅读 · 2020年6月6日

【新书册】贝叶斯神经网络，41页pdf

专知会员服务

180+阅读 · 2020年6月3日

【干货书】Python深度学习第二版，Deep Learning with Python, Second Edition

专知会员服务

169+阅读 · 2020年5月9日

【干货书】Python概率图模型实现，284页pdf带你实战学习概率图模型

专知会员服务

237+阅读 · 2020年4月8日

【干货书】深度学习生命科学：基因组学、药物发现，238页pdf

专知会员服务

200+阅读 · 2020年3月18日

Sklearn 与 TensorFlow 机器学习实用指南,385页pdf

专知会员服务

131+阅读 · 2020年3月15日

【2020新书】实用Matlab深度学习 Practical MATLAB Deep Learning，260页pdf

专知会员服务

159+阅读 · 2020年2月13日

【论文】深度学习的最优化:理论和算法（Optimization for deep learning: theory and algorithms）

专知会员服务

148+阅读 · 2019年12月28日

【新书】傻瓜式入门深度学习，371页pdf

专知会员服务

193+阅读 · 2019年12月28日

花书《深度学习》笔记，深度学习规则，帮你抓住精髓！(附下载)

专知会员服务

62+阅读 · 2019年12月25日

介绍高维超参数调整 - 优化ML模型的最佳实践

AI研习社

7+阅读 · 2019年4月17日

7个实用的深度学习技巧

机器学习算法与Python学习

16+阅读 · 2019年3月6日

一位ML工程师构建深度神经网络的实用技巧

AI100

11+阅读 · 2018年9月12日

入门 | 深度学习模型的简单优化技巧

机器之心

10+阅读 · 2018年6月10日

精华 | 深度学习中的【五大正则化技术】与【七大优化策略】

机器学习算法与Python学习

5+阅读 · 2017年12月28日

深度学习基础之LSTM

全球人工智能

28+阅读 · 2017年12月18日

干货 | 深度学习之卷积神经网络(CNN)的模型结构

机器学习算法与Python学习

12+阅读 · 2017年11月1日

[学习] 这些深度学习网络调参技巧，你了解吗？

菜鸟的机器学习

7+阅读 · 2017年7月30日

[学习] 这些深度学习网络训练技巧，你了解吗？

菜鸟的机器学习

7+阅读 · 2017年7月29日

CNN超参数优化和可视化技巧详解

量子位

4+阅读 · 2017年7月15日

Optimization for deep learning: theory and algorithms

Arxiv

106+阅读 · 2019年12月19日

Object detection on aerial imagery using CenterNet

Arxiv

6+阅读 · 2019年8月22日

Fast and Accurate 3D Medical Image Segmentation with Data-swapping Method

Arxiv

5+阅读 · 2018年12月19日

Efficient Road Lane Marking Detection with Deep Learning

Arxiv

5+阅读 · 2018年9月11日

Deep Learning for Generic Object Detection: A Survey

Arxiv

14+阅读 · 2018年9月6日

Deep Adaptive Proposal Network for Object Detection in Optical Remote Sensing Images

Arxiv

6+阅读 · 2018年7月19日

Asymmetric Similarity Loss Function to Balance Precision and Recall in Highly Unbalanced Deep Medical Image Segmentation

Arxiv

5+阅读 · 2018年6月29日

Improving Online Multiple Object tracking with Deep Metric Learning

Arxiv

7+阅读 · 2018年6月20日

Online Deep Metric Learning

Arxiv

8+阅读 · 2018年5月15日

A Study on Overfitting in Deep Reinforcement Learning

Arxiv

7+阅读 · 2018年4月20日

VIP会员