Scientists often must simultaneously discover signals and localize them as precisely as possible. For instance, in genetic fine-mapping, high correlations between nearby genetic variants make it hard to identify the exact locations of causal variants. So the statistical task is to output as many disjoint regions containing a signal as possible, each as small as possible, while controlling false positives. Similar problems arise in any application where signals cannot be perfectly localized, such as locating stars in astronomical surveys and change point detection in time series data. With this motivation, we introduce Bayesian Linear Programming (BLiP), a Bayesian method for jointly detecting and localizing signals. BLiP overcomes an extremely high-dimensional and non-convex problem to verifiably nearly maximize expected power while provably controlling false positives. BLiP is very computationally efficient and can wrap around nearly any Bayesian model and algorithm. Applying BLiP to existing state-of-the-art analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased power by 30-120% in just a few minutes of computation. BLiP is implemented in the new packages pyblip (Python) and blipr (R).