This chapter addresses important steps in the quality assurance and control of real-world data (RWD), with particular emphasis on the identification and handling of missing values. We provide a gentle introduction to common statistical and machine learning methods for imputation, discuss the main strengths and weaknesses of each method, and compare their performance in a literature review. We explain why the imputation of RWD may require additional effort to avoid bias, and highlight recent advances that account for informative missingness and repeated observations. Finally, we introduce alternative methods that address incomplete data without the need for imputation.