Annotating and labeling images is one of the biggest challenges in applying deep learning to medical data. Current processes are time- and cost-intensive and are therefore a limiting factor for the wide adoption of the technology. Additionally, validating that measured performance improvements are statistically significant is important for selecting the best model. In this paper, we demonstrate a method for creating segmentations, a necessary part of data cleaning for ultrasound imaging machine learning pipelines. We propose a four-step method that leverages automatically generated training data and fast human visual checks to improve model accuracy while keeping time, effort, and cost low. We also showcase running experiments multiple times to enable statistical analysis. Poor-quality automated ground truth data and quick visual inspections are used to efficiently train an initial base model, which is then refined using a small set of more expensive human-generated ground truth data. The method is demonstrated on a cardiac ultrasound segmentation task that removes background data, including static protected health information (PHI). Significance is shown by running the experiments multiple times and applying Student's t-test to the resulting performance distributions. Segmentation accuracy improved from 92%, achieved by a simple thresholding algorithm, to 98%. The performance of models trained with complicated algorithms can be matched or beaten by pre-training with the poorer-performing algorithms and a small quantity of high-quality data. Introducing statistical significance analysis for deep learning models helps to validate the measured performance improvements. The method offers a cost-effective and fast approach to achieving high-accuracy models while minimizing the cost and effort of acquiring high-quality training data.
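To make the significance check concrete, the following is a minimal sketch of a two-sample Student's t-test applied to accuracies collected from repeated training runs. It is an illustration under stated assumptions, not the authors' implementation: the function names (`student_t_test`, `t_pdf`, `two_sided_p`) and the accuracy values are hypothetical, the pooled-variance form of the test is assumed, and the p-value is approximated by numerical integration so that only the Python standard library is needed.

```python
import math
import statistics


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value: 1 - 2 * integral of the pdf over [0, |t|].

    Uses Simpson's rule (steps must be even), so very small p-values
    are only resolved down to the integration error.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)


def student_t_test(a, b):
    """Two-sample Student's t-test with pooled variance (equal variances assumed).

    Returns (t statistic, degrees of freedom, approximate two-sided p-value).
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance of the two groups.
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, df, two_sided_p(t, df)


# Hypothetical accuracies from five repeated runs of each model (not real results).
base_runs = [0.921, 0.918, 0.925, 0.919, 0.923]
refined_runs = [0.979, 0.981, 0.978, 0.982, 0.980]

t_stat, dof, p_value = student_t_test(refined_runs, base_runs)
print(f"t = {t_stat:.2f}, df = {dof}, p ~ {p_value:.3g}")
```

In practice a library routine such as `scipy.stats.ttest_ind` would normally replace the hand-written integration; the sketch only shows what is being compared, namely two distributions of per-run scores rather than two single numbers.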
Title: Efficient Human-in-the-Loop Deep Learning Model Training with Iterative Refinement and Statistical Result Validation