The ongoing technological revolution in measurement systems enables the acquisition of high-resolution samples in fields such as engineering, biology, and medicine. However, these observations are often contaminated by errors from the measurement devices. Motivated by this challenge, we propose a denoising framework that employs diffusion models to generate denoised data whose distribution closely approximates that of the unobservable, error-free data, thereby permitting standard data analysis on the denoised data. The key element of our framework is a novel Reproducing Kernel Hilbert Space-based method that trains the diffusion model using only error-contaminated data, admits a closed-form solution, and achieves a fast convergence rate in terms of estimation error. Furthermore, we support the method theoretically by deriving an upper bound on the Kullback--Leibler divergence between the distribution of the generated denoised data and that of the error-free data. A series of simulations also demonstrates the promising empirical performance of the proposed method relative to other state-of-the-art methods. To further illustrate the potential of this denoising framework in a real-world application, we apply it in a digital health context, showing how measurement error in continuous glucose monitors can influence conclusions drawn from a clinical trial on diabetes mellitus.