Deep brain stimulation (DBS) has shown great promise in treating the motor symptoms of Parkinson's disease (PD) by delivering electrical pulses to the basal ganglia (BG) region of the brain. However, DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude; this energy-inefficient operation reduces the device's battery lifetime, cannot adapt treatment dynamically to activity, and may cause significant side effects (e.g., gait impairment). In this work, we introduce an offline reinforcement learning (RL) framework that uses past clinical data to train an RL policy to adjust the stimulation amplitude in real time, with the goal of reducing energy use while maintaining the same level of treatment (i.e., control) efficacy as cDBS. Moreover, clinical protocols require the safety and performance of such RL controllers to be demonstrated ahead of deployment in patients. Thus, we also introduce an offline policy evaluation (OPE) method to estimate the performance of RL policies using historical data before deploying them on patients. We evaluated our framework on four PD patients equipped with the RC+S DBS system, employing the RL controllers during monthly clinical visits, with the overall control efficacy evaluated by severity of symptoms (i.e., bradykinesia and tremor), changes in PD biomarkers (i.e., local field potentials), and patient ratings. The results from clinical experiments show that our RL-based controller maintains the same level of control efficacy as cDBS, but with significantly reduced stimulation energy. Further, the OPE method is shown to be effective in accurately estimating and ranking the expected returns of RL controllers.