Automatic self-diagnosis provides low-cost and accessible healthcare via an agent that queries the patient and makes predictions about possible diseases. From a machine learning perspective, symptom-based self-diagnosis can be viewed as a sequential feature selection and classification problem. Reinforcement learning methods have shown good performance on this task but often suffer from large search spaces and costly training. To address these problems, we propose a competitive framework, called FIT, which uses an information-theoretic reward to determine what data to collect next. FIT improves over previous information-based approaches by using a multimodal variational autoencoder (MVAE) model and a two-step sampling strategy for disease prediction. Furthermore, we propose novel methods to substantially reduce the computational cost of FIT to a level that is acceptable for practical online self-diagnosis. Our results on two simulated datasets show that FIT can effectively deal with large search spaces, outperforming existing baselines. Moreover, using two medical datasets, we show that FIT is a competitive alternative in real-world settings.
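As an illustration of what an information-theoretic reward for choosing the next question can look like, the following is a minimal sketch and not FIT itself. It assumes a hypothetical toy model with three diseases and three binary symptoms whose probabilities are invented for the example, and it computes the posterior exactly with a naive Bayes assumption, whereas the abstract states that FIT relies on an MVAE and a two-step sampling strategy. All names (`PRIOR`, `LIKELIHOOD`, `expected_information_gain`, `next_question`) are illustrative assumptions.

```python
import math

# Hypothetical toy model; every number below is invented for illustration.
PRIOR = {"flu": 0.5, "cold": 0.3, "allergy": 0.2}
# P(symptom present | disease), symptoms assumed conditionally independent.
LIKELIHOOD = {
    "fever": {"flu": 0.9, "cold": 0.2, "allergy": 0.05},
    "sneezing": {"flu": 0.3, "cold": 0.8, "allergy": 0.9},
    "itchy_eyes": {"flu": 0.05, "cold": 0.1, "allergy": 0.8},
}


def entropy(dist):
    """Shannon entropy (in bits) of a distribution over diseases."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior(belief, symptom, present):
    """Bayes update of the disease belief after one yes/no answer.

    Returns the updated belief and the probability of that answer.
    """
    unnorm = {}
    for disease, p in belief.items():
        p_yes = LIKELIHOOD[symptom][disease]
        unnorm[disease] = p * (p_yes if present else 1.0 - p_yes)
    p_answer = sum(unnorm.values())
    if p_answer == 0.0:
        # The answer is impossible under the current belief; keep the belief.
        return dict(belief), 0.0
    return {d: v / p_answer for d, v in unnorm.items()}, p_answer


def expected_information_gain(belief, symptom):
    """Expected reduction in disease entropy from asking about one symptom."""
    h_before = entropy(belief)
    gain = 0.0
    for present in (True, False):
        post, p_answer = posterior(belief, symptom, present)
        gain += p_answer * (h_before - entropy(post))
    return gain


def next_question(belief, asked):
    """Pick the not-yet-asked symptom with the largest expected gain."""
    candidates = [s for s in LIKELIHOOD if s not in asked]
    return max(candidates, key=lambda s: expected_information_gain(belief, s))


if __name__ == "__main__":
    asked = set()
    symptom = next_question(PRIOR, asked)
    print(symptom, round(expected_information_gain(PRIOR, symptom), 3))
```

In this sketch the reward for a question is the mutual information between the disease and the answer under the current belief, so the agent greedily asks the symptom expected to reduce its uncertainty the most. A learned generative model would replace the hand-written `LIKELIHOOD` table when the exact posterior is not available.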