We investigate different natural language processing (NLP) approaches based on contextualised word representations for the early prediction of lung cancer from the free-text medical notes of Dutch primary care physicians. Because lung cancer has a low prevalence in primary care, we also address the problem of classification under highly imbalanced classes. Specifically, we use large Transformer-based pretrained language models (PLMs) and investigate: 1) how \textit{soft prompt-tuning}, an NLP technique for adapting PLMs with small amounts of training data, compares to standard model fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more robust than PLMs in highly imbalanced settings; and 3) how the models fare when trained on notes from a small number of patients. We find that 1) soft prompt-tuning is an efficient alternative to standard model fine-tuning; 2) PLMs show better discrimination but worse calibration than simpler static WEMs as the classification problem becomes more imbalanced; and 3) results when training on a small number of patients are mixed, with no clear differences between PLMs and WEMs. All our code is available open source at \url{https://bitbucket.org/aumc-kik/prompt_tuning_cancer_prediction/}.
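For readers unfamiliar with the technique, the following is a minimal sketch of soft prompt-tuning for binary note classification, assuming a PyTorch/Hugging Face setup. The model name, prompt length, learning rate, and classification head here are illustrative assumptions, not the configuration used in our experiments; see the repository above for the actual implementation.

\begin{verbatim}
# Minimal soft prompt-tuning sketch (illustrative assumptions, not the
# experimental setup). The PLM is kept frozen; only a small matrix of
# "soft prompt" embeddings and a linear head are trained.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL = "GroNLP/bert-base-dutch-cased"  # assumed Dutch PLM; any encoder works
tokenizer = AutoTokenizer.from_pretrained(MODEL)
plm = AutoModel.from_pretrained(MODEL)
for p in plm.parameters():
    p.requires_grad = False  # freeze all PLM weights

N_PROMPT = 20  # number of soft prompt tokens (assumed hyperparameter)
hidden = plm.config.hidden_size
soft_prompt = nn.Parameter(0.02 * torch.randn(N_PROMPT, hidden))
head = nn.Linear(hidden, 2)  # binary head: lung cancer vs. no lung cancer

def logits_for(texts):
    enc = tokenizer(texts, return_tensors="pt", padding=True,
                    truncation=True, max_length=512 - N_PROMPT)
    tok = plm.get_input_embeddings()(enc["input_ids"])       # (B, T, H)
    prompt = soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
    embeds = torch.cat([prompt, tok], dim=1)                 # prepend prompt
    mask = torch.cat([torch.ones(tok.size(0), N_PROMPT,
                                 dtype=enc["attention_mask"].dtype),
                      enc["attention_mask"]], dim=1)
    out = plm(inputs_embeds=embeds, attention_mask=mask)
    cls = out.last_hidden_state[:, N_PROMPT]  # [CLS] sits after the prompt
    return head(cls)

# Only the soft prompt and the head receive gradients.
optimizer = torch.optim.AdamW(
    [soft_prompt] + list(head.parameters()), lr=1e-3)
\end{verbatim}

In this setup the number of trainable parameters is only N_PROMPT x hidden plus the head, orders of magnitude fewer than in full fine-tuning, which is what makes soft prompt-tuning attractive when labelled training data are scarce.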