AI-driven Action Quality Assessment (AQA) of sports videos can mimic Olympic judges to score performances, serving as a second opinion or as a training aid. However, these AI methods are uninterpretable and do not justify their scores, even though such justification is important for algorithmic accountability. Indeed, to account for their decisions, sports judges do not score subjectively; they apply a consistent set of criteria (a rubric) to the multiple actions in each performance sequence. Therefore, we propose IRIS, which performs Interpretable Rubric-Informed Segmentation on action sequences for AQA. We investigated IRIS for scoring videos of figure skating performances. IRIS predicts (1) action segments, (2) the technical element score difference of each segment relative to its base score, (3) multiple program component scores, and (4) the summed final score. In a modeling study, we found that IRIS performs better than non-interpretable, state-of-the-art models. In a formative user study, practicing figure skaters agreed with the rubric-informed explanations, found them useful, and trusted the AI judgments more. This work highlights the importance of using judgment rubrics to account for AI decisions.
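To make the relationship between the four predicted quantities concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the final score is simply the sum of each segment's base score plus its predicted score difference, plus the program component scores; any weighting factors or deductions used in real judging are omitted. All class, field, and function names are hypothetical, and the numbers are illustrative rather than taken from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One predicted action segment (hypothetical structure)."""
    start_frame: int
    end_frame: int
    element: str        # label of the technical element in this segment
    base_score: float   # rubric base score for the element
    score_diff: float   # predicted difference relative to the base score


def technical_score(segments: List[Segment]) -> float:
    """Sum of (base score + predicted difference) over all segments."""
    return sum(s.base_score + s.score_diff for s in segments)


def final_score(segments: List[Segment], component_scores: List[float]) -> float:
    """Assumed aggregation: technical score plus program component scores."""
    return technical_score(segments) + sum(component_scores)


# Illustrative values only.
segments = [
    Segment(0, 120, "jump", base_score=5.0, score_diff=0.5),
    Segment(121, 300, "spin", base_score=3.0, score_diff=-0.25),
]
component_scores = [8.0, 7.5]
total = final_score(segments, component_scores)
```

Because every term in the sum is tied to a segment or a named component, each contribution to the total can be reported alongside the final score, which is what makes a rubric-based breakdown usable as an explanation.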