Tensorial Convolutional Neural Networks (TCNNs) have attracted much research attention for their power in reducing model parameters or enhancing generalization ability. However, the exploration of TCNNs is hindered even by weight initialization methods. Specifically, general initialization methods, such as Xavier or Kaiming initialization, usually fail to generate appropriate weights for TCNNs. Meanwhile, although there are ad-hoc approaches for specific architectures (e.g., Tensor Ring Nets), they are not applicable to TCNNs built on other tensor decompositions (e.g., CP or Tucker decomposition). To address this problem, we propose a universal weight initialization paradigm that generalizes the Xavier and Kaiming methods and is widely applicable to arbitrary TCNNs. Specifically, we first present the Reproducing Transformation to convert the backward process in TCNNs into an equivalent convolution process. Then, based on the convolution operators in the forward and backward processes, we build a unified paradigm to control the variance of both features and gradients in TCNNs. Thus, we can derive fan-in and fan-out initializations for various TCNNs. We demonstrate that our paradigm can stabilize the training of TCNNs, leading to faster convergence and better results.
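To illustrate the flavor of fan-in initialization for a factorized kernel, the following is a minimal sketch, not the paper's exact derivation or its general paradigm: it only covers a CP-decomposed convolution kernel, and it simply scales each factor so that the reconstructed kernel has a Kaiming-style fan-in variance. The function name, the four-factor CP layout, and the factor names A, B, C, D are assumptions introduced here for illustration.

```python
import torch

def cp_fanin_init(c_out, c_in, kh, kw, rank, gain=2.0):
    """Illustrative sketch: CP factors whose composed kernel has Var ~= gain / fan_in."""
    fan_in = c_in * kh * kw
    target_var = gain / fan_in                 # Kaiming-style fan-in target (gain=2 for ReLU)
    # Composed kernel: W[o,i,h,w] = sum_r A[o,r] * B[i,r] * C[h,r] * D[w,r].
    # With zero-mean i.i.d. factors, Var(W) = rank * var_A * var_B * var_C * var_D,
    # so each of the 4 factors gets variance (target_var / rank) ** (1/4),
    # i.e. standard deviation (target_var / rank) ** (1/8).
    factor_std = (target_var / rank) ** 0.125
    A = torch.randn(c_out, rank) * factor_std
    B = torch.randn(c_in, rank) * factor_std
    C = torch.randn(kh, rank) * factor_std
    D = torch.randn(kw, rank) * factor_std
    return A, B, C, D

# Sanity check: reconstruct the kernel and compare its variance with the target.
A, B, C, D = cp_fanin_init(64, 32, 3, 3, rank=16)
W = torch.einsum('or,ir,hr,wr->oihw', A, B, C, D)
print(W.var().item(), 2.0 / (32 * 3 * 3))
```

The key point the sketch conveys is that naively giving every factor the variance a standard layer would use makes the composed kernel's variance depend on the rank and the number of factors; a TCNN-aware initialization instead budgets the target variance across the factors.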