Synthetic data generation in medical research can substitute large-scale curated datasets for privacy- and security-sensitive data, reducing data collection and annotation costs. As part of this effort, we propose UniXGen, a unified chest X-ray and report generation model, with the following contributions. First, we design a unified model for bidirectional chest X-ray and report generation by adopting a vector quantization method to discretize chest X-rays into discrete visual tokens and formulating both tasks as sequence generation tasks. Second, we introduce several view-specific special tokens to generate chest X-rays of specific views, which is useful when the desired views are unavailable. Furthermore, UniXGen can flexibly accept inputs ranging from a single view to multiple views, taking advantage of the additional findings available in other X-ray views. We adopt an efficient transformer for computational and memory efficiency when handling the long-range input sequences formed by high-resolution multi-view chest X-rays and long paragraph-length reports. In extensive experiments, we show that our unified model has a synergistic effect on both generation tasks compared with training task-specific models in isolation. We also find that the view-specific special tokens distinguish between different views and properly generate requested views even when they do not exist in the dataset, and that utilizing multi-view chest X-rays faithfully captures abnormal findings visible in the additional views. The source code is publicly available at: https://github.com/ttumyche/UniXGen.
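To make the sequence-generation formulation concrete, the sketch below illustrates one plausible way to pack multiple view-tagged sets of discrete visual tokens and a report into a single token sequence. The special-token names, token IDs, and the `ViewImage`/`pack_sequence` helpers are illustrative assumptions for exposition, not the actual UniXGen implementation.

```python
# Minimal sketch: packing multi-view chest X-ray tokens and report tokens
# into one sequence for bidirectional generation. All identifiers and IDs
# below are hypothetical.

from dataclasses import dataclass
from typing import List

# Hypothetical special tokens marking each view segment and the report segment.
SPECIAL_TOKENS = {
    "[SOS]": 0, "[EOS]": 1,
    "[PA]": 2, "[AP]": 3, "[LATERAL]": 4,   # view-specific tokens
    "[REPORT]": 5,
}

@dataclass
class ViewImage:
    view: str                 # e.g. "PA", "AP", "LATERAL"
    visual_tokens: List[int]  # discrete codes from a VQ image tokenizer


def pack_sequence(images: List[ViewImage], report_tokens: List[int]) -> List[int]:
    """Concatenate view-tagged visual tokens and report tokens into one sequence.

    During training, either the report segment or one view segment can be
    treated as the generation target, so a single model covers both report
    generation and view-specific X-ray generation.
    """
    seq = [SPECIAL_TOKENS["[SOS]"]]
    for img in images:
        seq.append(SPECIAL_TOKENS[f"[{img.view}]"])  # marks which view follows
        seq.extend(img.visual_tokens)                # VQ codes for that view
    seq.append(SPECIAL_TOKENS["[REPORT]"])
    seq.extend(report_tokens)
    seq.append(SPECIAL_TOKENS["[EOS]"])
    return seq


if __name__ == "__main__":
    pa = ViewImage("PA", visual_tokens=[101, 257, 942])   # toy VQ codes
    lateral = ViewImage("LATERAL", visual_tokens=[88, 630])
    report = [2001, 2002, 2003]                            # toy text-token IDs
    print(pack_sequence([pa, lateral], report))
```

Because every view is introduced by its own special token, the same packing scheme supports both conditioning on available views and requesting a missing view as the generation target.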