Medical SANSformers: Training self-supervised transformers without attention for Electronic Medical Records

We leverage deep sequential models to predict healthcare utilization for patients, which could help governments allocate resources for future healthcare needs more effectively. Specifically, we study the problem of \textit{divergent subgroups}, wherein the outcome distribution in a smaller subset of the population deviates considerably from that of the general population. The traditional approach of building specialized models for divergent subgroups can be problematic when the subgroup is very small (for example, patients with rare diseases). To address this challenge, we first develop a novel attention-free sequential model, the SANSformer, instilled with inductive biases suited for modeling clinical codes in electronic medical records. We then design a task-specific self-supervision objective and demonstrate its effectiveness, particularly in scarce-data settings, by pre-training each model on the entire health registry (close to one million patients) before fine-tuning for downstream tasks on the divergent subgroups. We compare the SANSformer architecture with LSTM and Transformer models using two data sources and a multi-task learning objective that aids healthcare utilization prediction. Empirically, the attention-free SANSformer models perform consistently well across experiments, outperforming the baselines in most cases by roughly $10$\% or more. Furthermore, self-supervised pre-training boosts performance significantly throughout, for example by more than roughly $50$\% (and by as much as $800$\%) on the $R^2$ score when predicting the number of hospital visits.
