Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons



December 30, 2025


Simon Dufort-Labbé, Pierluca D'Oro, Evgenii Nikishin, Razvan Pascanu, Pierre-Luc Bacon, Aristide Baratin

When training neural networks, dying neurons -- units becoming inactive or saturated -- are traditionally seen as harmful. This paper sheds new light on this phenomenon. By exploring the impact of various hyperparameter configurations on dying neurons during training, we gather insights on how to improve upon sparse training approaches to pruning. We introduce Demon Pruning (DemP), a method that controls the proliferation of dead neurons through a combination of noise injection on active units and a one-cycle schedule regularization strategy, dynamically leading to network sparsity. Experiments on CIFAR-10 and ImageNet datasets demonstrate that DemP outperforms existing dense-to-sparse structured pruning methods, achieving better accuracy-sparsity tradeoffs and accelerating training by up to 3.56$\times$. These findings provide a novel perspective on dying neurons as a resource for efficient model compression and optimization.
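To make the idea of dead neurons as a pruning resource concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give DemP's formulas, so the helper names (`dead_unit_mask`, `one_cycle_coefficient`, `prune_dead_units`), the linear ramp-and-decay shape of the schedule, and the toy two-layer network are all assumptions made for illustration, and the noise-injection step is left out. It shows how ReLU units that are inactive on every input can be detected and removed structurally without changing the network's output on those inputs.

```python
import numpy as np


def dead_unit_mask(pre_activations):
    """Boolean mask over units: True where a ReLU unit is zero on every input."""
    # pre_activations has shape (batch, units). ReLU outputs 0 whenever the
    # pre-activation is <= 0, so a unit is "dead" on this batch if that holds
    # for every row.
    return np.all(pre_activations <= 0.0, axis=0)


def one_cycle_coefficient(step, total_steps, peak, peak_step):
    """Hypothetical one-cycle schedule: linear ramp to `peak`, then linear decay to 0.

    Assumes 0 < peak_step < total_steps.
    """
    if step <= peak_step:
        return peak * step / peak_step
    return peak * max(0, total_steps - step) / (total_steps - peak_step)


def prune_dead_units(w_in, b, w_out, dead):
    """Remove dead hidden units from a two-layer MLP (structured pruning)."""
    keep = ~dead
    return w_in[:, keep], b[keep], w_out[keep, :]


# Toy setup (assumed): inputs in [0, 1), weights in [-1, 1), so |x @ w_in| < 8.
# A bias of -10 therefore forces a unit to be dead; +10 forces it to be active.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(64, 8))
w_in = rng.uniform(-1.0, 1.0, size=(8, 16))
w_out = rng.uniform(-1.0, 1.0, size=(16, 3))
b = np.where(np.arange(16) < 4, -10.0, 10.0)

dead = dead_unit_mask(x @ w_in + b)
w_in_p, b_p, w_out_p = prune_dead_units(w_in, b, w_out, dead)
print(int(dead.sum()), w_in_p.shape, w_out_p.shape)
```

In a training loop, a coefficient such as `one_cycle_coefficient(step, total_steps, peak, peak_step)` could scale a regularization term, and the dead-unit check would be run on a held-out batch before units are removed; both choices are illustrative rather than taken from the paper.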

