Because the mapping between intra-interventional 2D X-ray images and pre-interventional 3D computed tomography (CT) volumes is not known in advance, auxiliary positioning devices or body markers, such as medical implants, are commonly used to determine it. However, such approaches cannot be widely used in clinical practice because of the complexity of real-world conditions. To determine the mapping and obtain an initial pose estimate of the human body without auxiliary equipment or markers, a cross-modal matching transformer network is proposed that matches 2D X-ray and 3D CT images directly. The proposed approach first uses deep learning to extract skeletal features from the 2D X-ray and 3D CT images. The features are then converted into 1D X-ray and CT representation vectors, which are combined using a multi-modal transformer. As a result, the trained network can directly predict the spatial correspondence between an arbitrary 2D X-ray and a 3D CT. The experimental results show that, when our approach is combined with a conventional approach, the achieved accuracy and speed meet basic clinical intervention needs, and the method provides a new direction for intra-interventional registration.
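The data flow described in the abstract (per-modality features, conversion to 1D token vectors, fusion by a transformer, direct prediction) can be illustrated with a minimal NumPy sketch. It is not the authors' network, and everything in it is an assumption made for illustration: linear patch embeddings stand in for the learned skeletal feature extractors, a single self-attention layer over the concatenated tokens stands in for the multi-modal transformer, positional encodings are omitted, the output is taken to be a 6-parameter rigid pose (three rotations, three translations), and the weights are random and untrained. All function and parameter names are hypothetical.

```python
import numpy as np


def patchify_2d(img, p):
    """Split an (H, W) image into flattened, non-overlapping p x p patches."""
    h, w = img.shape
    x = img.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3)
    return x.reshape(-1, p * p)


def patchify_3d(vol, p):
    """Split a (D, H, W) volume into flattened, non-overlapping p^3 patches."""
    d, h, w = vol.shape
    x = vol.reshape(d // p, p, h // p, p, w // p, p).transpose(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, p ** 3)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention with a residual connection."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return tokens + weights @ v, weights


def init_params(rng, p2=8, p3=8, dim=32):
    """Random, untrained weights (hypothetical layout, illustration only)."""
    s = 0.02
    return {
        "embed_xray": rng.normal(0, s, (p2 * p2, dim)),  # stand-in 2D feature extractor
        "embed_ct": rng.normal(0, s, (p3 ** 3, dim)),    # stand-in 3D feature extractor
        "modality": rng.normal(0, s, (2, dim)),          # marks X-ray vs. CT tokens
        "wq": rng.normal(0, s, (dim, dim)),
        "wk": rng.normal(0, s, (dim, dim)),
        "wv": rng.normal(0, s, (dim, dim)),
        "head": rng.normal(0, s, (dim, 6)),              # assumed 6-parameter pose output
    }


def predict_pose(xray, ct, params, p2=8, p3=8):
    # 1. Per-modality features, flattened into sequences of 1D token vectors.
    xray_tokens = patchify_2d(xray, p2) @ params["embed_xray"] + params["modality"][0]
    ct_tokens = patchify_3d(ct, p3) @ params["embed_ct"] + params["modality"][1]
    # 2. Fuse both modalities: every token can attend to X-ray and CT tokens alike.
    tokens = np.concatenate([xray_tokens, ct_tokens], axis=0)
    fused, weights = self_attention(tokens, params["wq"], params["wk"], params["wv"])
    # 3. Pool the fused tokens and regress the pose parameters directly.
    return fused.mean(axis=0) @ params["head"], weights


rng = np.random.default_rng(0)
params = init_params(rng)
xray = rng.random((32, 32))    # toy 2D X-ray image
ct = rng.random((16, 32, 32))  # toy 3D CT volume
pose, attn = predict_pose(xray, ct, params)
print(pose.shape, attn.shape)
```

With these toy sizes the X-ray contributes 16 tokens and the CT volume 32, so the attention matrix covers 48 tokens in total. In a registration pipeline of the kind the abstract describes, a direct prediction like this would serve as the initial estimate that a conventional method then refines.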