Recent advances in large reasoning models have enabled complex, step-by-step reasoning but often introduce significant overthinking, resulting in verbose and redundant outputs that hinder efficiency. In this study, we examine whether explicit self-reflection, signaled by tokens such as "Wait" and "Hmm", is necessary for advanced reasoning. We propose NoWait, a simple yet effective approach that disables explicit self-reflection by suppressing these tokens during inference. Extensive experiments on ten benchmarks across textual, visual, and video reasoning tasks show that NoWait reduces chain-of-thought trajectory length by 27%-51% across five R1-style model series, without compromising model utility. NoWait thus offers a plug-and-play solution for efficient and utility-preserving multimodal reasoning.
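To make the mechanism concrete, here is a minimal sketch of what inference-time suppression of self-reflection tokens could look like with a Hugging Face causal LM. The model name, keyword list, and use of `bad_words_ids` are illustrative assumptions for exposition, not the paper's released implementation.

```python
# Minimal sketch of inference-time token suppression in the spirit of NoWait.
# Assumptions: an R1-style model served via Hugging Face transformers, and
# suppression implemented through generate()'s bad_words_ids mechanism.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # illustrative R1-style model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Collect tokenization variants of the self-reflection keywords (leading
# space, capitalization), since subword vocabularies often map "Wait" and
# " Wait" to different token ids.
keywords = ["Wait", "wait", "Hmm", "hmm"]
bad_words_ids = []
for word in keywords:
    for variant in (word, " " + word):
        ids = tokenizer(variant, add_special_tokens=False).input_ids
        if ids:
            bad_words_ids.append(ids)

prompt = "Solve step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt")

# bad_words_ids makes generate() assign -inf logits to these sequences,
# so the model cannot emit the explicit self-reflection tokens.
output = model.generate(
    **inputs,
    max_new_tokens=512,
    bad_words_ids=bad_words_ids,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because suppression happens purely at the logits level during decoding, no fine-tuning or prompt changes are needed, which is what makes this style of intervention plug-and-play.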