Using Topological Framework for the Design of Activation Function and Model Pruning in Deep Neural Networks

The success of deep neural networks across computer vision, speech recognition, and natural language processing has made it necessary to understand both the dynamics of the training process and the workings of trained models. This paper makes two independent contributions: (1) a novel activation function for faster training convergence, and (2) a systematic method for pruning the filters of trained models, irrespective of the activation function used. We analyze how the topology of the space of training samples changes as it is transformed by each successive layer during training, and how that change depends on the activation function. The impact of changing the activation function on training convergence is reported for binary classification. A novel activation function aimed at faster convergence for classification tasks is proposed. Betti numbers are used to quantify the topological complexity of the data. Results of experiments using MLPs on popular synthetic binary classification datasets with large Betti numbers (>150) are reported. The proposed activation function converges faster, requiring fewer epochs by a factor of 1.5 to 2, because Betti numbers decrease faster across layers with it. The proposed methodology was verified on benchmark image datasets (Fashion-MNIST, CIFAR-10, and cat-vs-dog images) using CNNs. Based on these empirical results, we propose a novel method for pruning a trained model: filters that transform data to a topological space with large Betti numbers are eliminated. All filters with Betti numbers greater than 300 were removed from each layer without a significant reduction in accuracy, which resulted in faster prediction and a smaller memory footprint for the model.
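To make the two quantities in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the simplest case only: the zeroth Betti number (number of connected components) of a point cloud at a single fixed scale `eps`, computed with union-find, whereas the paper presumably also tracks higher-order Betti numbers via a homology toolkit. The names `betti_0` and `filters_to_keep` are hypothetical, and the per-filter Betti numbers passed to `filters_to_keep` are assumed to be already computed. Only the threshold of 300 comes from the abstract.

```python
import math
from itertools import combinations


def betti_0(points, eps):
    """Count connected components (Betti number b0) of the graph that
    links any two points at Euclidean distance <= eps."""
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the root of i, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj  # merge the two components

    # Each remaining root represents one connected component.
    return sum(1 for i in range(len(points)) if find(i) == i)


def filters_to_keep(betti_per_filter, threshold=300):
    """Return indices of filters whose Betti number is at most `threshold`;
    filters above the threshold are the ones to prune."""
    return [k for k, b in enumerate(betti_per_filter) if b <= threshold]


# Toy data (illustrative only): two well-separated clusters.
cloud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
print(betti_0(cloud, eps=0.5))    # 2: the clusters stay separate
print(betti_0(cloud, eps=10.0))   # 1: everything merges at a large scale
print(filters_to_keep([12, 450, 301, 300, 7]))  # [0, 3, 4]
```

In this reading, a layer that simplifies the data lowers the component count of its output, and a filter whose output keeps a large Betti number is a pruning candidate.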


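Related content

Activation function

In an artificial neural network, the activation function of a node defines the output of that node given an input or a set of inputs. A standard integrated circuit can be viewed as a digital network of activation functions that are either on (1) or off (0), depending on the input. This is similar to the behavior of the linear perceptron in neural networks. However, only nonlinear activation functions allow such networks to compute nontrivial problems using a small number of nodes; such activation functions are called nonlinearities.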
