Label Smoothing Improves Neural Source Code Summarization


Label smoothing · Smoothing · Neural models · Code · Natural language descriptions

March 28, 2023


Sakib Haque,Aakash Bansal,Collin McMillan

Label smoothing is a regularization technique for neural networks. Normally neural models are trained to an output distribution that is a vector with a single 1 for the correct prediction, and 0 for all other elements. Label smoothing converts the correct prediction location to something slightly less than 1, then distributes the remainder to the other elements such that they are slightly greater than 0. A conceptual explanation behind label smoothing is that it helps prevent a neural model from becoming "overconfident" by forcing it to consider alternatives, even if only slightly. Label smoothing has been shown to help several areas of language generation, yet typically requires considerable tuning and testing to achieve the optimal results. This tuning and testing has not been reported for neural source code summarization - a growing research area in software engineering that seeks to generate natural language descriptions of source code behavior. In this paper, we demonstrate the effect of label smoothing on several baselines in neural code summarization, and conduct an experiment to find good parameters for label smoothing and make recommendations for its use.
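The target transformation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `smooth_one_hot` and `cross_entropy`, the parameter name `epsilon`, and the default value 0.1 are assumptions made for this example. It follows the abstract's wording by giving the correct position 1 - epsilon and splitting epsilon evenly over the remaining positions; a common variant instead spreads epsilon over all positions, including the correct one.

```python
import math
from typing import List


def smooth_one_hot(correct_index: int, num_classes: int,
                   epsilon: float = 0.1) -> List[float]:
    """Build a smoothed target distribution (illustrative helper).

    The correct position gets 1 - epsilon; the remaining epsilon is
    divided evenly among the other num_classes - 1 positions.
    epsilon = 0.0 reproduces the usual one-hot target.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0 <= correct_index < num_classes:
        raise ValueError("correct_index is out of range")
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    off_value = epsilon / (num_classes - 1)
    target = [off_value] * num_classes
    target[correct_index] = 1.0 - epsilon
    return target


def cross_entropy(target: List[float], predicted: List[float]) -> float:
    """Cross-entropy between a target and a predicted distribution.

    Positions with zero target mass contribute nothing and are skipped.
    """
    return -sum(t * math.log(p)
                for t, p in zip(target, predicted) if t > 0.0)


if __name__ == "__main__":
    hard = smooth_one_hot(2, 5, epsilon=0.0)   # plain one-hot target
    soft = smooth_one_hot(2, 5, epsilon=0.1)   # smoothed target
    # A hypothetical, very confident model output over 5 classes.
    overconfident = [0.001, 0.001, 0.996, 0.001, 0.001]
    print("one-hot target: ", hard)
    print("smoothed target:", soft)
    print("loss vs one-hot: ", cross_entropy(hard, overconfident))
    print("loss vs smoothed:", cross_entropy(soft, overconfident))
```

Under the one-hot target, the very confident prediction is rewarded with a loss near zero. Under the smoothed target, the same prediction incurs a larger loss than a prediction that matches the smoothed distribution, which is one way to read the abstract's remark about discouraging "overconfident" models.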



Label smoothing

In AI, label smoothing usually refers to smoothing labels with a soft-label method in order to limit model overfitting.
