High-dimensional data reduction techniques are provided by using partial least squares (PLS) within deep learning. Our framework provides a nonlinear extension of PLS together with a disciplined approach to feature selection and architecture design in deep learning. This leads to a statistical interpretation of deep learning that is tailor-made for predictive problems. We can use the tools of PLS, such as scree plots and bi-plots, to provide model diagnostics. Posterior predictive uncertainty is available using MCMC methods at the last layer. Thus we achieve the best of both worlds: scalability and fast predictive rule construction together with uncertainty quantification. Our key construct is to employ deep learning within PLS by predicting the output scores as a deep learner of the input scores. As with PLS, our X-scores are constructed using an SVD and are fast and scalable; they apply to both regression and classification problems. Following Frank and Friedman (1993), we provide a Bayesian shrinkage interpretation of our nonlinear predictor. We introduce a variety of new partial least squares models: PLS-ReLU, PLS-Autoencoder, PLS-Trees and PLS-GP. To illustrate our methodology, we use simulated examples, the analysis of orange juice preference data, and the prediction of wine quality as a function of input characteristics. We also illustrate Brillinger's estimation procedure for feature selection and data dimension reduction. Finally, we conclude with directions for future research.
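The key construct above can be sketched in a few lines of numpy. This is our own illustrative toy, not the paper's code or data: the simulated low-rank design, the number of components `K`, and the single-hidden-layer ReLU network standing in for the "deep learner" are all assumptions. X-scores are taken from the SVD of the centered design, and a small network is then fit from input scores to the response by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data with low-rank structure (hypothetical example):
# K latent factors drive both X and a nonlinear response y.
n, p, K = 200, 50, 5
F = rng.normal(size=(n, K))                  # latent factors
L = rng.normal(size=(K, p))                  # loadings
X = F @ L + 0.1 * rng.normal(size=(n, p))
y = (np.sin(F[:, 0]) + 0.5 * F[:, 1] ** 2
     + 0.1 * rng.normal(size=n)).reshape(-1, 1)

# Step 1: PLS-style X-scores from the SVD of the centered design.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = Xc @ Vt[:K].T                            # n x K input scores
T = T / T.std(axis=0)                        # standardize for training

# Step 2: predict the output as a deep learner of the input scores --
# here one hidden ReLU layer, fit by full-batch gradient descent.
h, lr = 16, 1e-2
W1 = 0.3 * rng.normal(size=(K, h)); b1 = np.zeros(h)
W2 = 0.3 * rng.normal(size=(h, 1)); b2 = np.zeros(1)
for _ in range(3000):
    Z = np.maximum(T @ W1 + b1, 0.0)         # ReLU hidden layer
    pred = Z @ W2 + b2
    err = pred - y                           # squared-error residual
    gW2, gb2 = Z.T @ err / n, err.mean(axis=0)
    dZ = (err @ W2.T) * (Z > 0)              # backprop through ReLU
    gW1, gb1 = T.T @ dZ / n, dZ.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((pred - y) ** 2))
print(f"in-sample MSE {mse:.3f} vs Var(y) {float(np.var(y)):.3f}")
```

Because the scores are a low-dimensional summary of X, the nonlinear fit happens in K dimensions rather than p, which is what makes the approach fast and scalable; the full method additionally handles the output scores, shrinkage, and last-layer uncertainty, which this sketch omits.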