The analysis of large-scale datasets, especially in biomedical contexts, frequently involves a principled screening of multiple hypotheses. The celebrated two-group model describes the distribution of the test statistics as a mixture of two competing densities: the null and the alternative distributions. We investigate the use of weighted densities and, in particular, non-local densities as working alternative distributions, to enforce separation from the null and thus refine the screening procedure. We show how, for a fixed mixture proportion, these weighted alternatives improve various operating characteristics of the resulting tests, such as the Bayesian false discovery rate, relative to a local, unweighted likelihood approach. Parametric and nonparametric model specifications are proposed, along with efficient samplers for posterior inference. By means of a simulation study, we show how our model compares with both well-established and state-of-the-art alternatives in terms of various operating characteristics. Finally, to illustrate the versatility of our method, we conduct three differential expression analyses with publicly available datasets from genomic studies of a heterogeneous nature.
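To make the two-group idea concrete, the following is a minimal sketch and not the model specification or sampler proposed in the paper. It assumes a standard normal null, a fixed mixture proportion `pi1`, and a first-order moment density `(z/tau)^2 * N(z; 0, tau^2)` as one illustrative non-local alternative. The function names, the values of `pi1` and `tau`, and the rejection threshold are all hypothetical choices made for illustration.

```python
import math


def normal_pdf(z, sd=1.0):
    """Density of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def moment_alt_pdf(z, tau=2.0):
    """Illustrative non-local alternative (an assumption, not the paper's choice).

    The weight (z / tau)^2 multiplies a N(0, tau^2) density. Because
    E[z^2] = tau^2 under that normal, the product integrates to one,
    and it vanishes at z = 0, which separates it from the null.
    """
    return (z / tau) ** 2 * normal_pdf(z, sd=tau)


def posterior_null_prob(z, pi1=0.1, tau=2.0):
    """Posterior probability that z comes from the null component."""
    f0 = (1.0 - pi1) * normal_pdf(z)
    f1 = pi1 * moment_alt_pdf(z, tau)
    return f0 / (f0 + f1)


def bayesian_fdr(z_values, threshold=0.2, pi1=0.1, tau=2.0):
    """Reject when the posterior null probability is at most threshold.

    Returns the average posterior null probability over the rejected
    statistics (a Bayesian false discovery rate estimate) and the
    number of rejections.
    """
    probs = [posterior_null_prob(z, pi1, tau) for z in z_values]
    rejected = [p for p in probs if p <= threshold]
    if not rejected:
        return 0.0, 0
    return sum(rejected) / len(rejected), len(rejected)


if __name__ == "__main__":
    stats = [0.0, 1.0, 4.0, -3.5]
    for z in stats:
        print(z, posterior_null_prob(z))
    print(bayesian_fdr(stats))
```

Because the weighted alternative is zero at the origin, a statistic of exactly zero is assigned to the null with posterior probability one in this sketch, whereas an unweighted normal alternative would leave some alternative mass there.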