Identifying disease-associated changes in DNA methylation can improve our understanding of disease etiology. Bisulfite sequencing allows methylation profiles to be generated at single-base resolution. We previously developed a method for estimating smooth covariate effects and identifying differentially methylated regions (DMRs) from bisulfite sequencing data that accounts for experimental errors and variable read depths; this method uses the binomial distribution to characterize the variability in the methylated counts. However, bisulfite sequencing data frequently consist of low counts and can exhibit over- or underdispersion relative to the binomial distribution. We present a substantial improvement to our previous work by proposing a quasi-likelihood-based regional testing approach that accounts for both multiplicative and additive sources of dispersion. We establish the theoretical properties of the resulting tests, as well as their marginal and conditional interpretations. Simulations show that the proposed method provides correct inference for smooth covariate effects and captures the major methylation patterns with excellent power.
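To make the notion of dispersion relative to the binomial concrete, the following is a minimal sketch of the standard Pearson estimate of a multiplicative dispersion parameter, under the assumed variance form Var(Y) = phi * n * p * (1 - p). It is not the estimator proposed in this work, and it does not cover the additive source of dispersion; the function name `pearson_dispersion` and its arguments are hypothetical and chosen only for illustration.

```python
def pearson_dispersion(methylated, depth, fitted_prob, n_params=1):
    """Pearson estimate of a multiplicative dispersion parameter phi.

    Assumes Var(Y_i) = phi * n_i * p_i * (1 - p_i), where Y_i is the
    methylated count, n_i the read depth, and p_i the fitted methylation
    probability at site i. This is a textbook quasi-binomial estimate,
    not the method described in the abstract above.
    """
    if not (len(methylated) == len(depth) == len(fitted_prob)):
        raise ValueError("all inputs must have the same length")
    n_obs = len(methylated)
    if n_obs <= n_params:
        raise ValueError("need more observations than fitted parameters")

    chi2 = 0.0
    for y, n, p in zip(methylated, depth, fitted_prob):
        if n <= 0 or not 0.0 < p < 1.0:
            raise ValueError("depth must be positive and 0 < p < 1")
        # Squared Pearson residual under the binomial variance n*p*(1-p).
        chi2 += (y - n * p) ** 2 / (n * p * (1.0 - p))

    # Divide by residual degrees of freedom.
    return chi2 / (n_obs - n_params)


if __name__ == "__main__":
    # Toy data (made up for illustration): low counts at six CpG sites.
    methylated = [2, 0, 5, 1, 4, 0]
    depth = [5, 4, 6, 5, 5, 4]
    pooled = sum(methylated) / sum(depth)  # one fitted parameter
    phi = pearson_dispersion(methylated, depth, [pooled] * len(depth), 1)
    print(f"estimated phi = {phi:.3f}")
```

Values of phi above 1 suggest overdispersion and values below 1 suggest underdispersion relative to the binomial, although with low read depths such an estimate can be unstable.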