Ultrasound (US) is widely used for its real-time imaging, absence of ionizing radiation, and portability. In clinical practice, analysis and diagnosis often rely on US sequences rather than a single image to obtain dynamic anatomical information. This makes the modality challenging for novices to learn, because practicing on an adequate number of patient videos is clinically impractical. In this paper, we propose a novel framework to synthesize high-fidelity US videos. Specifically, the videos are synthesized by animating source content images according to the motion of given driving videos. Our contributions are three-fold. First, leveraging the advantages of both self- and fully-supervised learning, our system is trained for keypoint detection in a weakly-supervised manner. These keypoints then provide vital information for handling the complex, highly dynamic motions in US videos. Second, we decouple content and texture learning using dual decoders, effectively reducing the difficulty of model learning. Last, we adopt an adversarial training strategy with GAN losses to further improve the sharpness of the generated videos, narrowing the gap between real and synthesized videos. We validate our method on a large in-house pelvic dataset with highly dynamic motion. Extensive evaluation metrics and a user study demonstrate the effectiveness of the proposed method.
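To make the core idea concrete, the following is a toy NumPy sketch of keypoint-driven image animation: given matching keypoints detected in a source image and a driving frame, each keypoint induces a local translation, and a Gaussian-weighted blend of these translations warps the source so its keypoints follow the driving motion. All function names, the single-Gaussian translation model, and nearest-neighbour sampling are illustrative simplifications, not the paper's actual dense-motion network or dual-decoder generator.

```python
import numpy as np

def gaussian_weights(grid, kp, sigma):
    """Gaussian influence of one keypoint over every grid location."""
    d2 = ((grid - kp) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def warp_with_keypoints(source, kp_src, kp_drv, sigma=0.15):
    """Warp a grayscale `source` (H, W) so that keypoints kp_src land on kp_drv.

    Keypoints are (K, 2) arrays of (y, x) in normalized [0, 1] coordinates.
    This is an illustrative translation-only motion model; the real system
    learns the motion field and refines the result with neural decoders.
    """
    H, W = source.shape
    ys, xs = np.meshgrid(np.linspace(0, 1, H), np.linspace(0, 1, W), indexing="ij")
    grid = np.stack([ys, xs], -1)                     # driving-frame coordinates
    wsum = np.zeros((H, W))
    num = np.zeros((H, W, 2))
    for k in range(len(kp_drv)):
        # backward flow: sample the source at grid + (kp_src - kp_drv)
        w = gaussian_weights(grid, kp_drv[k], sigma)
        num += w[..., None] * (grid + (kp_src[k] - kp_drv[k]))
        wsum += w
    # blend keypoint-local translations with the identity map far from keypoints
    blend = np.clip(wsum, 0.0, 1.0)[..., None]
    flow = blend * (num / np.maximum(wsum, 1e-8)[..., None]) + (1.0 - blend) * grid
    # nearest-neighbour sampling of the source image
    yi = np.clip((flow[..., 0] * (H - 1)).round().astype(int), 0, H - 1)
    xi = np.clip((flow[..., 1] * (W - 1)).round().astype(int), 0, W - 1)
    return source[yi, xi]

# usage: move a bright spot from (8, 8) to (20, 20) via one keypoint pair
src = np.zeros((32, 32))
src[8, 8] = 1.0
out = warp_with_keypoints(src,
                          kp_src=np.array([[8 / 31, 8 / 31]]),
                          kp_drv=np.array([[20 / 31, 20 / 31]]))
```

In the full framework these keypoints come from the weakly-supervised detector, the warped features are refined by the dual content/texture decoders, and the GAN losses sharpen the final frames.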