A general class of models is proposed that can estimate the whole predictive distribution of a dependent variable $Y$ given a vector of explanatory variables $\xb$. The models exploit the fact that the strength with which explanatory variables distinguish between low and high values of the dependent variable may vary across the thresholds used to define low and high. Simple linear versions of the models generalize classical linear regression models as well as widely used ordinal regression models. They allow the effects of explanatory variables to be visualized in the form of parameter functions. More general models are based on efficient nonparametric approaches such as random forests, which are more flexible and are strong prediction tools. A general estimation method is given that can use all the estimation tools that have been proposed for binary regression, including selection methods such as the lasso or elastic net. For linearly structured models, maximum likelihood estimates are derived. The usefulness of the models is illustrated by simulations and several real data sets.
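The threshold idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single explanatory variable, a logistic link, and a separate binary fit of the indicator $Y > \theta$ at each threshold $\theta$. The function names (`fit_logistic`, `fit_threshold_models`, `survival_curve`), the simulated data, and the choice of thresholds are all hypothetical. The slope fitted at each threshold plays the role of a parameter function evaluated at that threshold, and the fitted probabilities trace out an estimate of $P(Y > \theta \mid x)$.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(xs, zs, steps=500, lr=1.0):
    # Fit P(z = 1 | x) = sigmoid(b0 + b1 * x) by plain gradient ascent
    # on the mean log-likelihood (assumed optimizer, chosen for brevity).
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, z in zip(xs, zs):
            r = z - sigmoid(b0 + b1 * x)
            g0 += r
            g1 += r * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def fit_threshold_models(xs, ys, thresholds):
    # One binary regression per threshold: the response is the indicator Y > theta.
    fits = {}
    for theta in thresholds:
        zs = [1.0 if y > theta else 0.0 for y in ys]
        fits[theta] = fit_logistic(xs, zs)
    return fits


def survival_curve(fits, x):
    # Estimated P(Y > theta | x) for each fitted threshold, in increasing theta.
    return [(theta, sigmoid(b0 + b1 * x)) for theta, (b0, b1) in sorted(fits.items())]


# Simulated data (an assumption for illustration): y = 1 + 2x + Gaussian noise.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(300)]
ys = [1.0 + 2.0 * x + random.gauss(0, 1) for x in xs]

# Thresholds at inner empirical quantiles, so both classes are well represented.
ys_sorted = sorted(ys)
thresholds = [ys_sorted[int(q * len(ys))] for q in (0.2, 0.35, 0.5, 0.65, 0.8)]

fits = fit_threshold_models(xs, ys, thresholds)
for theta, p in survival_curve(fits, 0.0):
    b0, b1 = fits[theta]
    print(f"theta={theta:6.2f}  slope={b1:5.2f}  P(Y > theta | x=0)={p:.3f}")
```

Because each threshold is fitted separately in this sketch, nothing forces the estimated probabilities to decrease in $\theta$ for every $x$; a full treatment would have to address that. Any other binary regression tool, for example a penalized fit or a random forest classifier, could take the place of `fit_logistic`.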