The analysis of large-scale datasets, especially in biomedical contexts, frequently involves the principled screening of multiple hypotheses. The celebrated two-group model describes the distribution of the test statistics as a mixture of two competing densities, the null and the alternative. We investigate the use of weighted densities and, in particular, non-local densities as working alternative distributions, to enforce separation from the null and thus refine the screening procedure. We show that, for a fixed mixture proportion, these weighted alternatives improve various operating characteristics of the resulting tests, such as the Bayesian false discovery rate, relative to a local, unweighted likelihood approach. We propose parametric and nonparametric model specifications, along with efficient samplers for posterior inference. In a simulation study, our model outperforms both well-established and state-of-the-art alternatives in terms of various operating characteristics. Finally, to illustrate the versatility of our method, we conduct three differential expression analyses on publicly available datasets from genomic studies of heterogeneous nature.
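To make the two-group idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard normal null, a zero-mean normal with scale `tau` as the local alternative, and a first-order moment density (weight w(z) = z^2) as one possible non-local alternative. The function names, the value of `tau`, and the posterior cutoff are all hypothetical choices made for illustration.

```python
import numpy as np


def normal_pdf(z, sd=1.0):
    """Density of a zero-mean normal with standard deviation sd."""
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def local_alt(z, tau=2.0):
    """Local (unweighted) alternative: positive density at the null value z = 0."""
    return normal_pdf(z, tau)


def nonlocal_alt(z, tau=2.0):
    """Weighted alternative with w(z) = z^2 (assumed for illustration).

    Dividing by tau^2 = E[z^2] under N(0, tau^2) normalises the density,
    which vanishes at z = 0 and so is separated from the null.
    """
    return (z ** 2 / tau ** 2) * normal_pdf(z, tau)


def posterior_alt_prob(z, pi1, alt_pdf):
    """Posterior probability of the alternative under the two-group mixture
    f(z) = (1 - pi1) * f0(z) + pi1 * f1(z), for a fixed proportion pi1."""
    num = pi1 * alt_pdf(z)
    return num / ((1.0 - pi1) * normal_pdf(z) + num)


def bayesian_fdr(post_alt, cutoff=0.5):
    """Average posterior null probability among hypotheses flagged at the cutoff."""
    post_alt = np.asarray(post_alt, dtype=float)
    reject = post_alt > cutoff
    if not reject.any():
        return 0.0
    return float(np.mean(1.0 - post_alt[reject]))


if __name__ == "__main__":
    z = np.array([0.0, 0.5, 2.0, 4.0])
    for name, alt in [("local", local_alt), ("non-local", nonlocal_alt)]:
        p = posterior_alt_prob(z, pi1=0.1, alt_pdf=alt)
        print(name, np.round(p, 3), "BFDR:", round(bayesian_fdr(p), 3))
```

Under these assumptions, the non-local alternative assigns zero posterior probability to the alternative at z = 0, whereas the local alternative does not; this is the separation from the null that the weighting is meant to enforce.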