Discrete data are abundant and often arise as counts or rounded data. These data commonly exhibit complex distributional features such as zero-inflation, over-/under-dispersion, boundedness, and heaping, which render many parametric models inadequate. Yet even for parametric regression models, approximations such as MCMC typically are needed for posterior inference. This paper introduces a Bayesian modeling and algorithmic framework that enables semiparametric regression analysis for discrete data with Monte Carlo (not MCMC) sampling. The proposed approach pairs a nonparametric marginal model with a latent linear regression model to encourage both flexibility and interpretability, and delivers posterior consistency even under model misspecification. For a parametric or large-sample approximation of this model, we identify a class of conjugate priors with (pseudo) closed-form posteriors. All posterior and predictive distributions are available analytically or via direct Monte Carlo sampling. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and variable selection with discrete data. Simulation studies demonstrate significant advantages in computing, prediction, estimation, and selection relative to existing alternatives. This novel approach is applied successfully to self-reported mental health data that exhibit zero-inflation, overdispersion, boundedness, and heaping.