We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation. The first algorithm, p-exact search, locally prunes the next-token distribution and performs an exact search over the remaining space. The second algorithm, dynamic beam search, shrinks and expands the beam size according to the entropy of the candidate's probability distribution. Despite the probabilistic intuition behind nucleus search, experiments on machine translation and summarization benchmarks show that both algorithms reach the same performance levels as standard beam search.
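The pruning step that both algorithms borrow from nucleus sampling can be illustrated with a small sketch. This is not the authors' implementation: the function name `nucleus_prune`, the threshold value, and the toy distributions are assumptions made for illustration. The sketch keeps the smallest set of highest-probability items whose cumulative mass reaches `p`, and shows why the number of survivors tracks the entropy of the distribution: a peaked (low-entropy) distribution keeps few items, a flat (high-entropy) one keeps many.

```python
from typing import List, Sequence


def nucleus_prune(probs: Sequence[float], p: float) -> List[int]:
    """Return indices of the smallest top-probability set with total mass >= p.

    Hypothetical helper for illustration; `probs` is assumed to be a
    normalized distribution (non-negative, summing to 1).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Visit items from most to least probable (ties keep their original order).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    mass = 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:  # the nucleus is complete once the mass threshold is met
            break
    return kept


# Toy distributions (assumed values, not from the paper).
peaked = [0.90, 0.05, 0.03, 0.02]  # low entropy
flat = [0.25, 0.25, 0.25, 0.25]    # high entropy

print(nucleus_prune(peaked, 0.9))  # [0]
print(nucleus_prune(flat, 0.9))    # [0, 1, 2, 3]
```

Applied to the next-token distribution, this kind of pruning corresponds to the local step described for p-exact search. Applied to a normalized distribution over beam candidates, it gives one plausible reading of how a beam could shrink and expand with entropy; the abstract does not spell out the exact rule, so treat that mapping as an assumption.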