Decoding for many NLP tasks requires an effective heuristic algorithm for approximating exact search, since searching the full output space is often intractable or impractical. The default algorithm for this job is beam search -- a pruned version of breadth-first search. Quite surprisingly, beam search often returns better results than exact inference due to a beneficial search bias for NLP tasks. In this work, we show that the standard implementation of beam search can be made up to 10x faster in practice. Our method assumes that the scoring function is monotonic in the sequence length, which allows us to safely prune, early on, hypotheses that cannot be in the final set of hypotheses. We devise effective monotonic approximations to popular nonmonotonic scoring functions, including length normalization and mutual information decoding. Lastly, we propose a memory-reduced variant of best-first beam search, which has a similar beneficial search bias in terms of downstream performance, but runs in a fraction of the time.
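A minimal sketch of the monotonicity argument, under assumptions not spelled out in the abstract (a hypothetical three-token vocabulary with a fixed log-probability table standing in for a neural model, and a per-length pop budget as the beam-style pruning rule): because summed log-probabilities can only decrease as a sequence grows, the first k complete hypotheses popped from a best-first priority queue already score at least as high as anything still on the queue, so search can stop early.

```python
import heapq
import math

EOS = "</s>"
# Toy next-token log-probabilities (hypothetical; a real decoder
# would query a neural language model conditioned on the prefix).
LOGP = {"a": math.log(0.6), "b": math.log(0.3), EOS: math.log(0.1)}

def score(seq):
    """Sum of log-probs: monotonically non-increasing in length."""
    return sum(LOGP[tok] for tok in seq)

def best_first_beam_search(k=2, max_len=4):
    # Max-heap via negated scores; entries are (-score, sequence).
    heap = [(0.0, ())]
    pops_at_len = {}   # beam-style pruning: at most k pops per length
    completed = []
    while heap and len(completed) < k:
        neg, seq = heapq.heappop(heap)
        if seq and seq[-1] == EOS:
            # Monotonicity: no hypothesis left on the heap can score
            # higher, so this one is safely in the final top-k.
            completed.append((-neg, seq))
            continue
        n = pops_at_len.get(len(seq), 0)
        if n >= k or len(seq) >= max_len:
            continue  # pruned: budget at this length exhausted
        pops_at_len[len(seq)] = n + 1
        for tok in LOGP:
            child = seq + (tok,)
            heapq.heappush(heap, (-score(child), child))
    return completed
```

Note that the early-stopping guarantee breaks for nonmonotonic scores such as length-normalized log-probability, which is why the abstract's monotonic approximations are needed there.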