Video recording is a widely used method for documenting infant and child behaviours in research and clinical practice. Video data have rarely been shared because of ethical concerns about confidentiality, although the need for large shared datasets keeps growing. This demand is even more pressing when data-driven, computer-based approaches are involved, such as screening tools that complement clinical assessments. To share data while abiding by privacy protection rules, a critical question must be answered: do efforts at data de-identification reduce data utility? We addressed this question using Prechtl's general movements assessment (GMA) as a showcase. The GMA is an established and globally practised video-based diagnostic tool used in early infancy to detect neurological deficits, such as cerebral palsy. To date, no large, shared, expert-annotated data repositories for infant movement analysis exist. Such datasets would greatly benefit the training and recalibration of human assessors and the development of computer-based approaches. In the current study, sequences from a prospective longitudinal infant cohort with a total of 19451 available general movements video snippets were randomly selected for human clinical reasoning and computer-based analysis. We demonstrated for the first time that pseudonymisation of video recordings by face-blurring is a viable approach. The video redaction did not affect classification accuracy for either human assessors or computer vision methods, suggesting an adequate and easy-to-apply solution for sharing movement video data. We call for further exploration of efficient, privacy-rule-conforming approaches to de-identifying video data in scientific and clinical fields beyond movement assessment. Such approaches would enable stand-alone video datasets to be shared and merged into large data pools to advance science and public health.