Purpose: We propose a formal framework for modeling and segmenting minimally invasive surgical tasks using a unified set of motion primitives (MPs), to enable more objective labeling and the aggregation of different datasets. Methods: We model dry-lab surgical tasks as finite state machines that represent how executing MPs, the basic surgical actions, changes the surgical context, which characterizes the physical interactions among tools and objects in the surgical environment. We develop methods for labeling surgical context from video data and for automatically translating context labels into MP labels. We then use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), which includes six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA), with kinematic and video data as well as context and MP labels. Results: Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and labels from expert surgeons. Segmenting the tasks into MPs yields the COMPASS dataset, which nearly triples the amount of data available for modeling and analysis and enables the generation of separate transcripts for the left and right tools. Conclusion: The proposed framework produces high-quality labels for surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets and the separate analysis of the left and right hands for bimanual coordination assessment. Our formal framework and aggregate dataset can support the development of explainable and multi-granularity models for improved surgical process analysis, skill assessment, error detection, and autonomy.
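The context-to-MP translation described in Methods can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the `Context` fields (one "hold" and one "contact" variable per tool, with 0 meaning "nothing"), the MP names `Touch`, `Untouch`, `Grasp`, and `Release`, and the helper names `tool_mp` and `translate`. The idea shown is only that a change between consecutive context labels implies an MP, and that emitting MPs per tool gives separate left and right transcripts.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


# Hypothetical context encoding (an assumption, not the paper's definition):
# for each tool, the ID of the object it holds and the object it touches.
# 0 means "nothing".
@dataclass(frozen=True)
class Context:
    left_hold: int
    left_contact: int
    right_hold: int
    right_contact: int


def tool_mp(tool: str, old_hold: int, old_contact: int,
            new_hold: int, new_contact: int) -> Optional[str]:
    """Map one tool's state change to an illustrative MP label, or None."""
    if old_hold == 0 and new_hold != 0:
        return f"Grasp({tool}, {new_hold})"
    if old_hold != 0 and new_hold == 0:
        return f"Release({tool}, {old_hold})"
    if old_contact == 0 and new_contact != 0:
        return f"Touch({tool}, {new_contact})"
    if old_contact != 0 and new_contact == 0:
        return f"Untouch({tool}, {old_contact})"
    return None  # no change for this tool


def translate(
    contexts: List[Tuple[int, Context]],
) -> Tuple[List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Turn (frame, context) labels into separate left and right MP transcripts."""
    left: List[Tuple[int, str]] = []
    right: List[Tuple[int, str]] = []
    # Compare each context label with the one that follows it.
    for (_, prev), (frame, cur) in zip(contexts, contexts[1:]):
        mp_left = tool_mp("L", prev.left_hold, prev.left_contact,
                          cur.left_hold, cur.left_contact)
        mp_right = tool_mp("R", prev.right_hold, prev.right_contact,
                           cur.right_hold, cur.right_contact)
        if mp_left is not None:
            left.append((frame, mp_left))
        if mp_right is not None:
            right.append((frame, mp_right))
    return left, right


if __name__ == "__main__":
    # Made-up labels: the left tool touches, grasps, and hands object 2 to the right tool.
    labels = [
        (0, Context(0, 0, 0, 0)),
        (10, Context(0, 2, 0, 0)),
        (25, Context(2, 0, 0, 0)),
        (40, Context(2, 0, 0, 2)),
        (60, Context(0, 0, 2, 0)),
    ]
    left_transcript, right_transcript = translate(labels)
    print(left_transcript)
    print(right_transcript)
```

Keeping the per-tool rule in its own function is what makes the two transcripts independent: the same rule runs once on the left-tool variables and once on the right-tool variables, so a single context change can produce an MP for each hand.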