Clinically significant prostate cancer is more likely to be sampled during ultrasound-guided biopsy procedures if suspected lesions found in pre-operative magnetic resonance (MR) images are used as targets. However, the diagnostic accuracy of the biopsy procedure is limited by the operator's skill and experience in sampling the targets, a sequential decision-making process that involves navigating an ultrasound probe and placing a series of sampling needles for potentially multiple targets. This work aims to learn a reinforcement learning (RL) policy that optimises the continuous positioning of 2D ultrasound views and biopsy needles with respect to a guiding template, such that the MR targets can be sampled efficiently and sufficiently. We first formulate the task as a Markov decision process (MDP) and construct an environment that allows the targeting actions to be performed virtually for individual patients, based on their anatomy and lesions derived from MR images. A patient-specific policy can thus be optimised, before each biopsy procedure, by rewarding positive sampling in the MDP environment. Experimental results from fifty-four prostate cancer patients show that the proposed RL-learned policies achieved a mean hit rate of 93% and an average cancer core length of 11 mm. Both compare favourably with two human-designed baseline strategies, and were achieved without hand-engineered rewards that directly maximise these clinically relevant metrics. Perhaps more interestingly, the RL agents were found to learn strategies that adapted to lesion size, prioritising the spread of the needles for smaller lesions. Such a strategy has not been previously reported or commonly adopted in clinical practice, but it led to overall superior targeting performance when compared with intuitively designed strategies.
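To make the two reported metrics concrete, the following is a minimal sketch of how hit rate and cancer core length could be computed for a set of virtual needle placements. It is not the authors' implementation, and everything in it is an illustrative assumption: the lesion is modelled as a sphere, each needle as a straight segment along the depth axis at a template position, and hit rate is taken per needle. The abstract does not say whether hit rate is defined per needle or per lesion. The function names `cancer_core_length` and `hit_rate` are hypothetical.

```python
import math


def cancer_core_length(needle_xy, z_start, z_end, centre, radius):
    """Length (mm) of a needle segment that lies inside a spherical lesion.

    Assumptions (not from the source): the needle runs parallel to the
    z-axis at template position needle_xy, from depth z_start to z_end,
    and the lesion is a sphere with the given centre (x, y, z) and radius.
    """
    dx = needle_xy[0] - centre[0]
    dy = needle_xy[1] - centre[1]
    lateral_sq = dx * dx + dy * dy
    if lateral_sq >= radius * radius:
        return 0.0  # the needle axis misses the lesion entirely
    # Half-length of the chord the needle axis cuts through the sphere.
    half_chord = math.sqrt(radius * radius - lateral_sq)
    # Clip the chord to the depth range actually sampled by the needle.
    lo = max(z_start, centre[2] - half_chord)
    hi = min(z_end, centre[2] + half_chord)
    return max(0.0, hi - lo)


def hit_rate(core_lengths):
    """Fraction of needles with a non-zero cancer core length."""
    if not core_lengths:
        return 0.0
    return sum(1 for c in core_lengths if c > 0) / len(core_lengths)


# Toy example: three needles aimed at a 5 mm radius lesion centred at depth 10 mm.
lesion_centre, lesion_radius = (0.0, 0.0, 10.0), 5.0
needles = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
cores = [
    cancer_core_length(xy, 0.0, 20.0, lesion_centre, lesion_radius)
    for xy in needles
]
print(cores, hit_rate(cores))
```

In this toy setting the central needle yields the longest core, the off-centre needle a shorter one, and the third misses. A reward for "positive sampling" in the MDP could be built from quantities of this kind, although the abstract does not give the exact reward definition.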