Missing values in covariates due to censoring by signal interference or lack of sensitivity in the measuring devices are common in industrial problems. We propose a fully Bayesian solution to the prediction problem with an efficient Markov chain Monte Carlo (MCMC) algorithm that updates all the censored covariate values jointly in a random-scan Gibbs sampler. We show that joint updating of the missing covariate values can be at least two orders of magnitude more efficient than univariate updating. This increased efficiency is shown to be crucial for quickly learning the missing covariate values and their uncertainty in a real-time decision-making context, in particular when there is substantial correlation in the posterior for the missing values. The approach is evaluated on simulated data and on data from the telecom sector. Our results show that the proposed Bayesian imputation gives substantially more accurate predictions than naïve imputation, and that the use of auxiliary variables in the imputation gives additional predictive power.
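The efficiency gap between joint and univariate updating under strong posterior correlation can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a hypothetical two-dimensional standard normal target with correlation `rho`, standing in for the posterior of two missing covariate values, and it ignores censoring and the prediction model entirely. The function names and the choice `rho = 0.99` are illustrative assumptions. It compares the lag-1 autocorrelation of a chain that updates one coordinate at a time with one that draws both coordinates jointly.

```python
import numpy as np


def univariate_gibbs(rho, n_iter, rng):
    """Update each coordinate in turn from its full conditional."""
    x = np.zeros(2)
    draws = np.empty((n_iter, 2))
    cond_sd = np.sqrt(1.0 - rho**2)  # sd of x_i given x_j for a unit-variance pair
    for t in range(n_iter):
        x[0] = rng.normal(rho * x[1], cond_sd)
        x[1] = rng.normal(rho * x[0], cond_sd)
        draws[t] = x
    return draws


def joint_update(rho, n_iter, rng):
    """Draw both coordinates at once from the joint (toy) posterior."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal(np.zeros(2), cov, size=n_iter)


def lag1_autocorr(chain):
    """Sample lag-1 autocorrelation of a one-dimensional chain."""
    centered = chain - chain.mean()
    return float(np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered))


rng = np.random.default_rng(0)
rho = 0.99  # assumed strong posterior correlation, chosen for illustration
n_iter = 5000

uni = univariate_gibbs(rho, n_iter, rng)
joint = joint_update(rho, n_iter, rng)

acf_uni = lag1_autocorr(uni[:, 0])
acf_joint = lag1_autocorr(joint[:, 0])

print(f"lag-1 autocorrelation, univariate updates: {acf_uni:.3f}")
print(f"lag-1 autocorrelation, joint updates:      {acf_joint:.3f}")
```

In this toy setting the univariate chain is an AR(1) process with coefficient `rho**2`, so successive draws are nearly identical when `rho` is close to one, while the joint draws are independent. That is the mechanism behind the efficiency difference described above, although the magnitude reported in the abstract refers to the authors' own experiments.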