Breast cancer is one of the leading causes of cancer deaths in women. As the primary output of breast screening, breast ultrasound (US) video contains dynamic information unique to video that is valuable for cancer diagnosis. However, training models for video analysis is non-trivial because it requires a large dataset, which is expensive to annotate. Furthermore, the diagnosis of breast lesions faces particular challenges such as inter-class similarity and intra-class variation. In this paper, we propose a pioneering approach that directly uses US videos for computer-aided breast cancer diagnosis. It leverages masked video modeling as pretraining to reduce reliance on dataset size and detailed annotations. Moreover, a correlation-aware contrastive loss is developed to help identify the internal and external relationships between benign and malignant lesions. Experimental results show that the proposed approach achieves promising classification performance and outperforms other state-of-the-art methods.