Imbalanced token distributions naturally exist in text documents, leading neural language models to overfit to frequent tokens. Token imbalance may undermine the robustness of radiology report generators, as complex medical terms appear less frequently but carry more medical information. In this study, we demonstrate how current state-of-the-art models fail to generate infrequent tokens on two standard benchmark datasets for radiology report generation (IU X-RAY and MIMIC-CXR). However, no prior study has proposed methods to adapt infrequent tokens for text generators fed with medical images. To address this challenge, we propose the \textbf{T}oken \textbf{Im}balance Adapt\textbf{er} (\textit{TIMER}), which aims to improve generation robustness on infrequent tokens. The model automatically accounts for token imbalance through an unlikelihood loss and dynamically optimizes the generation process to augment infrequent tokens. We compare our approach with multiple state-of-the-art methods on the two benchmarks. Experiments demonstrate the effectiveness of our approach in enhancing model robustness overall and on infrequent tokens. Our ablation analysis shows that our reinforcement learning method plays a major role in adapting to token imbalance for radiology report generation.
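For context, a minimal sketch of a token-level unlikelihood term in the generic form of Welleck et al. (2020) is given below; it illustrates the general mechanism only and is not necessarily the exact formulation used in \textit{TIMER}. The candidate set $\mathcal{C}_t$ and the weight $\alpha$ are illustrative assumptions rather than quantities defined in this work:
\begin{equation*}
\mathcal{L}_{\mathrm{UL}} = -\sum_{t=1}^{|y|} \sum_{c \in \mathcal{C}_t} \log\bigl(1 - p_{\theta}(c \mid y_{<t}, x)\bigr),
\qquad
\mathcal{L} = \mathcal{L}_{\mathrm{MLE}} + \alpha \, \mathcal{L}_{\mathrm{UL}},
\end{equation*}
where $\mathcal{C}_t$ is a set of negative candidate tokens at step $t$ (e.g., tokens that the model over-produces) whose probabilities are pushed down, and $\alpha$ balances the unlikelihood term against the standard maximum-likelihood objective.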