Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems, identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning from observational data. In practice, successful RL relies on informative latent states derived from sequential observations to develop optimal treatment strategies. To date, how best to construct such states in a healthcare setting remains an open question. In this paper, we perform an empirical study of several information encoding architectures, using data from septic patients in the MIMIC-III dataset to form representations of patient state. We evaluate the effect of representation dimension, the correlation of the learned representations with established acuity scores, and the treatment policies derived from these representations. We find that sequentially formed state representations facilitate effective policy learning in batch settings, validating a more thoughtful approach to representation learning that remains faithful to the sequential and partial nature of healthcare data.