While existing approaches excel at recognising current surgical phases, they offer limited foresight into future procedural steps and therefore limited intraoperative guidance. Similarly, current anticipation methods are constrained to predicting short-term, singular events, neglecting the dense and sequential nature of surgical workflows. To address these limitations, we propose SWAG (Surgical Workflow Anticipative Generation), a framework that combines phase recognition and anticipation using a generative approach for surgical workflow guidance. We investigate two distinct decoding methods, single-pass (SP) and auto-regressive (AR), to generate sequences of future surgical phases at minute intervals over long horizons of up to 60 minutes. We propose a novel embedding approach that uses prior knowledge to enhance the accuracy of phase anticipation. Additionally, our anticipative framework supports remaining-time regression, and we propose a regression-to-classification (R2C) method. We evaluate SWAG on the Cholec80 and AutoLaparo21 datasets. Our single-pass model with prior knowledge embeddings (SP*) achieves 49.8% mean accuracy over 18-minute anticipation on AutoLaparo21, while the plain SP model with the R2C extension reaches 56.6% mean accuracy over the same horizon on Cholec80. Moreover, our approach outperforms existing methods on the phase remaining-time regression task, achieving weighted mean absolute errors of 0.32 and 0.48 minutes for 2- and 3-minute horizons, respectively. SWAG demonstrates versatility across classification and regression tasks and creates temporal continuity between surgical workflow recognition and anticipation. While further studies are required to understand the intraoperative impact of generative anticipation, our method takes a step in this direction.