This paper revisits a special type of neural network known under two names. In the statistics and machine learning community it is known as multi-class logistic regression. In the neural network community, it is simply the soft-max layer. Its importance is underscored by its role in deep learning: it serves as the last layer, whose output is the classification of the input patterns, such as images. Our exposition focuses on a mathematically rigorous derivation of the key equation expressing the gradient. A fringe benefit of our approach is a fully vectorized expression, which is the basis of an efficient implementation. The second result of this paper is the positivity of the second derivative of the cross-entropy loss function as a function of the weights. This result proves that optimization methods based on convexity may be used to train this network. As a corollary, we demonstrate that no $L^2$-regularizer is needed to guarantee convergence of gradient descent.
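For orientation, the key gradient equation alluded to above takes, in the standard setup, the familiar vectorized form shown below. This is a sketch under assumed notation (data matrix $X$, one-hot label matrix $Y$, weight matrix $W$, soft-max applied row-wise), not the paper's own statement of the result:
\[
\nabla_W \mathcal{L}(W) \;=\; X^{\top}\bigl(\operatorname{softmax}(XW) - Y\bigr).
\]
The convexity claim corresponds to the well-known fact that the Hessian of the soft-max cross-entropy loss, viewed as a function of $W$, is positive semi-definite.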