Scientists often must simultaneously discover signals and localize them as precisely as possible. For instance, in genetic fine-mapping, high correlations between nearby genetic variants make it hard to identify the exact locations of causal variants. So the statistical task is to output as many disjoint regions containing a signal as possible, each as small as possible, while controlling false positives. Similar problems arise in any application where signals cannot be perfectly localized, such as locating stars in astronomical surveys and change point detection in time series data. Our first contribution is to propose a notion of resolution-adjusted power for such problems. Second, we introduce Bayesian Linear Programming (BLiP), a Bayesian method for jointly detecting and localizing signals. BLiP overcomes an extremely high-dimensional and non-convex problem to verifiably nearly maximize expected power while provably controlling false positives. BLiP is very computationally efficient and can wrap around nearly any Bayesian model and algorithm. Applying BLiP to existing state-of-the-art analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased resolution-adjusted power by 30-120% in just a few minutes of computation. BLiP is implemented in the new packages pyblip (Python) and blipr (R).
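To make the selection task concrete, the following is a minimal sketch of the objective on a toy problem, not the BLiP algorithm itself. It assumes that each candidate region comes with a posterior probability of containing a signal, that a discovery of region G is weighted by 1/|G| (one possible resolution adjustment, chosen here only for illustration), and that false positives are controlled by bounding the posterior expected fraction of false discoveries at a level q. The function name `select_regions` and the toy numbers are hypothetical, and the exhaustive search stands in for the linear program only because the example is tiny.

```python
from itertools import combinations


def select_regions(candidates, q=0.1):
    """Pick disjoint regions by exhaustive search (toy-sized inputs only).

    candidates: dict mapping frozenset of locations -> assumed posterior
                probability that the region contains at least one signal.
    q:          bound on the posterior expected fraction of false discoveries.
    """
    regions = list(candidates)
    best, best_score = (), 0.0
    for k in range(1, len(regions) + 1):
        for combo in combinations(regions, k):
            # Disjointness: the union is as large as the sizes added up.
            union = frozenset().union(*combo)
            if len(union) != sum(len(g) for g in combo):
                continue
            # Expected number of false discoveries under the posterior.
            expected_false = sum(1.0 - candidates[g] for g in combo)
            if expected_false / k > q:
                continue
            # Assumed resolution adjustment: smaller regions count for more.
            score = sum(candidates[g] / len(g) for g in combo)
            if score > best_score:
                best, best_score = combo, score
    return set(best), best_score


# Hypothetical posterior probabilities for four locations 0..3.
toy = {
    frozenset({0}): 0.97,     # well-localized signal
    frozenset({1}): 0.50,     # locations 1 and 2 are hard to tell apart...
    frozenset({2}): 0.45,
    frozenset({1, 2}): 0.95,  # ...but one of them very likely holds a signal
    frozenset({3}): 0.20,
}
selected, score = select_regions(toy, q=0.1)
print(sorted(sorted(g) for g in selected), round(score, 3))
```

On this toy input the search keeps the single location {0} and the pair {1, 2}: neither location 1 nor location 2 can be reported alone at q = 0.1, but the region containing both can, at half the weight of a perfectly localized discovery.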