When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario - 专知论文

会员服务 ·

0

黑盒 · Continuity · 优化器 · 最优化 · tuning ·

2023 年 5 月 17 日

When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario

翻译：暂无翻译

Chengcheng Han,Liqing Cui,Renyu Zhu,Jianing Wang,Nuo Chen,Qiushi Sun,Xiang Li,Ming Gao

Large pre-trained language models (PLMs) have garnered significant attention for their versatility and potential for solving a wide spectrum of natural language processing (NLP) tasks. However, the cost of running these PLMs may be prohibitive. Furthermore, PLMs may not be open-sourced due to commercial considerations and potential risks of misuse, such as GPT-3. The parameters and gradients of PLMs are unavailable in this scenario. To solve the issue, black-box tuning has been proposed, which utilizes derivative-free optimization (DFO), instead of gradient descent, for training task-specific continuous prompts. However, these gradient-free methods still exhibit a significant gap compared to gradient-based methods. In this paper, we introduce gradient descent into black-box tuning scenario through knowledge distillation. Furthermore, we propose a novel method GDFO, which integrates gradient descent and derivative-free optimization to optimize task-specific continuous prompts in a harmonized manner. Experimental results show that GDFO can achieve significant performance gains over previous state-of-the-art methods.

翻译：暂无翻译

0

相关内容

在科学，计算和工程学中，黑盒是一种设备，系统或对象，可以根据其输入和输出（或传输特性）对其进行查看，而无需对其内部工作有任何了解。它的实现是“不透明的”（黑色）。几乎任何事物都可以被称为黑盒：晶体管，引擎，算法，人脑，机构或政府。为了使用典型的“黑匣子方法”来分析建模为开放系统的事物，仅考虑刺激/响应的行为，以推断（未知）盒子。该黑匣子系统的通常表示形式是在该方框中居中的数据流程图。黑盒的对立面是一个内部组件或逻辑可用于检查的系统，通常将其称为白盒（有时也称为“透明盒”或“玻璃盒”）。

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

专知会员服务

30+阅读 · 2022年3月8日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

专知

12+阅读 · 2018年6月9日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

lncRNA-SNHG5通过影响Twist/Mi2/NuRD蛋白复合物活性调控胃癌EMT进展的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

NSCs、BMSCs移植治疗锰中毒大鼠多巴胺能神经损伤分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

A2AlTaO7基稀土钽铝酸盐设计及热物理性能调控

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

可压缩Navier-Stokes方程和Boltzmann方程解的渐近行为

国家自然科学基金

0+阅读 · 2013年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

慢病毒介导ChABC基因修饰神经干细胞治疗脊髓损伤的研究

国家自然科学基金

0+阅读 · 2011年12月31日

TfR抗体和CTX修饰纳米载体介导hTERTC27治疗神经胶质瘤

国家自然科学基金

0+阅读 · 2009年12月31日

ScenarioNet: Open-Source Platform for Large-Scale Traffic Scenario Simulation and Modeling

Arxiv

0+阅读 · 2023年7月2日

New intelligent defense systems to reduce the risks of Selfish Mining and Double-Spending attacks using Learning Automata

Arxiv

0+阅读 · 2023年7月2日

A Black-box NLP Classifier Attacker

Arxiv

0+阅读 · 2023年7月1日

Implicit Balancing and Regularization: Generalization and Convergence Guarantees for Overparameterized Asymmetric Matrix Sensing

Arxiv

0+阅读 · 2023年6月30日

Accelerating Inexact HyperGradient Descent for Bilevel Optimization

Arxiv

0+阅读 · 2023年6月30日

Modeling Parallel Programs using Large Language Models

Arxiv

0+阅读 · 2023年6月29日

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Arxiv

10+阅读 · 2021年12月14日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Sequential Scenario-Specific Meta Learner for Online Recommendation

Sequential Scenario-Specific Meta Learner for Online Recommendation

Arxiv

16+阅读 · 2019年6月2日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

专知会员服务

30+阅读 · 2022年3月8日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

Deep Research（深度研究）：系统性综述

《革新战术战场空间能力：反无人机系统》报告

【普林斯顿博士论文】用于语音的生成式通用模型

螺旋式开发作为战略资产：美军启示

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

【论文推荐】最新六篇视觉问答相关论文—深度嵌入学习、句子表征学习、深度特征聚合、3D匹配、细粒度文本摘要

专知

12+阅读 · 2018年6月9日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

ScenarioNet: Open-Source Platform for Large-Scale Traffic Scenario Simulation and Modeling

Arxiv

0+阅读 · 2023年7月2日

New intelligent defense systems to reduce the risks of Selfish Mining and Double-Spending attacks using Learning Automata

Arxiv

0+阅读 · 2023年7月2日

A Black-box NLP Classifier Attacker

Arxiv

0+阅读 · 2023年7月1日

Implicit Balancing and Regularization: Generalization and Convergence Guarantees for Overparameterized Asymmetric Matrix Sensing

Arxiv

0+阅读 · 2023年6月30日

Accelerating Inexact HyperGradient Descent for Bilevel Optimization

Arxiv

0+阅读 · 2023年6月30日

Modeling Parallel Programs using Large Language Models

Arxiv

0+阅读 · 2023年6月29日

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Arxiv

10+阅读 · 2021年12月14日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Sequential Scenario-Specific Meta Learner for Online Recommendation

Sequential Scenario-Specific Meta Learner for Online Recommendation

Arxiv

16+阅读 · 2019年6月2日

相关基金

lncRNA-SNHG5通过影响Twist/Mi2/NuRD蛋白复合物活性调控胃癌EMT进展的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

NSCs、BMSCs移植治疗锰中毒大鼠多巴胺能神经损伤分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

A2AlTaO7基稀土钽铝酸盐设计及热物理性能调控

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

可压缩Navier-Stokes方程和Boltzmann方程解的渐近行为

国家自然科学基金

0+阅读 · 2013年12月31日

高效中红外激光晶体Cr,Er,Re:YSGG（Re＝Eu3+, Tb3+）的生长及性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

慢病毒介导ChABC基因修饰神经干细胞治疗脊髓损伤的研究

国家自然科学基金

0+阅读 · 2011年12月31日

TfR抗体和CTX修饰纳米载体介导hTERTC27治疗神经胶质瘤

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员