It is well known that the Lasso can be interpreted as the Bayesian posterior mode estimate under a Laplacian prior. Obtaining samples from the full posterior distribution, the Bayesian Lasso, confers major performance advantages over having only the Lasso point estimate. Traditionally, the Bayesian Lasso is implemented via Gibbs sampling methods, which suffer from poor scalability, unknown convergence rates, and samples that are necessarily correlated. We provide a measure transport approach that generates i.i.d. samples from the posterior by constructing a transport map that transforms a sample from the Laplacian prior into a sample from the posterior. We show how the construction of this transport map can be parallelized into modules that iteratively solve Lasso problems and perform closed-form linear algebra updates. With this posterior sampling method, we perform maximum likelihood estimation of the Lasso regularization parameter via the EM algorithm. We provide comparisons to traditional Gibbs samplers using the diabetes dataset of Efron et al. Lastly, we give an example implementation on a parallel computing system, a graphics processing unit, whose execution time depends much less on dimension than that of a standard implementation.
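The posterior-mode interpretation in the first sentence can be written out as a short derivation. This is a sketch under assumptions that the abstract does not state: a linear model with Gaussian noise of known variance, and independent Laplace priors on the coefficients. The symbols $y$, $X$, $\beta$, $\sigma^2$, $\tau$ and $\lambda$ are introduced here for illustration only.

```latex
% Assumed model (not specified in the abstract):
%   y = X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I),
%   p(\beta) = \prod_{j=1}^{d} \frac{\tau}{2} \exp(-\tau |\beta_j|).
\begin{align*}
-\log p(\beta \mid y)
  &= -\log p(y \mid \beta) - \log p(\beta) + \text{const} \\
  &= \frac{1}{2\sigma^2} \lVert y - X\beta \rVert_2^2
     + \tau \lVert \beta \rVert_1 + \text{const}, \\
\hat{\beta}_{\text{mode}}
  &= \arg\min_{\beta} \; \frac{1}{2} \lVert y - X\beta \rVert_2^2
     + \lambda \lVert \beta \rVert_1,
  \qquad \lambda = \sigma^2 \tau .
\end{align*}
```

Under these assumptions, the Lasso estimate is the mode of the posterior from which the Bayesian Lasso draws samples, and the regularization parameter $\lambda$ is determined by the noise variance and the prior scale.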