Accurate survival prediction in radiotherapy (RT) is critical for optimizing treatment decisions. This study developed and validated the RT-Surv framework, which uses general-domain, open-source large language models (LLMs) to convert unstructured electronic health records into structured features and combines them with existing structured clinical data. Using data from 34,276 patients and an external cohort of 852, the framework successfully transformed unstructured clinical information into structured formats. Incorporating LLM-structured clinical features improved the concordance index from 0.779 to 0.842 during external validation, a marked improvement in discriminative performance. Key LLM-structured features, such as disease extent, general condition, and RT purpose, showed high predictive importance and aligned closely with the statistically significant predictors identified through conventional statistical analyses, thereby improving model interpretability. The LLM-structured clinical features also enhanced risk stratification, enabling more distinct separation of low-, intermediate-, and high-risk groups (p < 0.001). These findings highlight the potential of LLMs to convert unstructured data into actionable insights, with the prospect of improving predictive modeling and patient outcomes in clinical practice.
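For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C for right-censored survival data. It is an illustration only, not the RT-Surv implementation: the function name `concordance_index`, its argument names, and the toy data are assumptions, and tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times:       observed follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here as a simplification.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death was assigned higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical data): five comparable pairs, four ranked correctly.
c = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.7, 0.8, 0.1],
)
print(round(c, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported change from 0.779 to 0.842 means a larger share of comparable patient pairs were ordered correctly by predicted risk.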