Latent factor models are widely used to discover and adjust for hidden variation in modern applications. However, most methods do not fully account for uncertainty in the latent factors, which can lead to miscalibrated inferences such as overconfident p-values. In this article, we develop a fast and accurate method of uncertainty quantification in generalized bilinear models (GBMs), a flexible extension of generalized linear models that includes latent factors as well as row covariates, column covariates, and interactions. In particular, we introduce delta propagation, a general technique for propagating uncertainty among model components using the delta method. Further, we provide a rapidly converging algorithm for maximum a posteriori GBM estimation that extends earlier methods by also estimating row and column dispersions. In simulation studies, we find that our method provides approximately correct frequentist coverage for most parameters of interest. We demonstrate the method on RNA-seq gene expression analysis and copy ratio estimation in cancer genomics.
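As a reminder of the building block named above, the following is a minimal sketch of the textbook first-order delta method in one dimension: if an estimate has standard error se, then a smooth function g of that estimate has standard error of approximately |g'(estimate)| times se. This is not the delta propagation procedure of the article, which passes uncertainty between the components of a GBM. The function name `delta_method_se`, the finite-difference derivative, and the numbers used are all assumptions chosen for illustration.

```python
import math


def delta_method_se(g, estimate, se, eps=1e-6):
    """Approximate the standard error of g(estimate) by the first-order delta method.

    Hypothetical helper for illustration only: se[g(x)] is approximated
    by |g'(x)| * se[x], with g' taken by a central finite difference.
    """
    # Scale the step to the magnitude of the estimate for numerical stability.
    h = eps * max(1.0, abs(estimate))
    derivative = (g(estimate + h) - g(estimate - h)) / (2.0 * h)
    return abs(derivative) * se


# Illustrative (made-up) numbers: an estimate of 2.0 with standard error 0.1.
theta_hat, theta_se = 2.0, 0.1

# Standard error on the log scale; analytically this is theta_se / theta_hat.
log_se = delta_method_se(math.log, theta_hat, theta_se)
```

With these inputs the analytic answer is 0.1 / 2.0 = 0.05, which the finite-difference version reproduces to within numerical error. The approximation is only as good as the local linearity of g around the estimate.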