Quantile regression is a powerful tool for inferring how covariates affect specific percentiles of the response distribution. Existing methods either estimate conditional quantiles separately for each quantile of interest or estimate the entire conditional distribution using semi- or non-parametric models. The former often produce inadequate models for real data and do not share information across quantiles, while the latter are characterized by complex and constrained models that can be difficult to interpret and computationally inefficient. Neither approach is well-suited for quantile-specific subset selection. Instead, we pose the fundamental problems of linear quantile estimation, uncertainty quantification, and subset selection from a Bayesian decision analysis perspective. For any Bayesian regression model -- including, but not limited to existing Bayesian quantile regression models -- we derive optimal point estimates, interpretable uncertainty quantification, and scalable subset selection techniques for all model-based conditional quantiles. Our approach introduces a quantile-focused squared error loss that enables efficient, closed-form computing and maintains a close relationship with Wasserstein-based density estimation. In an extensive simulation study, our methods demonstrate substantial gains in quantile estimation accuracy, inference, and variable selection over frequentist and Bayesian competitors. We use these tools to identify and quantify the heterogeneous impacts of multiple social stressors and environmental exposures on educational outcomes across the full spectrum of low-, medium-, and high-achieving students in North Carolina.
翻译:暂无翻译