This study addresses a fundamental yet overlooked gap between standard theory and empirical modelling practice in the OLS regression model $\boldsymbol{y}=\boldsymbol{X\beta}+\boldsymbol{u}$ with collinearity. In practice, an estimated model is expected to have stable and efficient individual OLS estimates, but $\boldsymbol{y}$ itself has no capacity to identify or control the collinearity in $\boldsymbol{X}$. Hence no theory, including any model selection process (MSP), can fill this gap unless $\boldsymbol{X}$ is controlled from the viewpoint of sampling theory. In this paper, we first introduce the new concept of "empirically effective modelling" (EEM) and then propose our EEM methodology (EEM-M) as an integrated process of two MSPs for given data $(\boldsymbol{y^o,X})$. The first MSP, called the XMSP, uses $\boldsymbol{X}$ only and pre-selects a class $\scr{D}$ of models whose individual OLS estimates are inefficiency-controlled and collinearity-controlled, where the two corresponding controlling variables are chosen from the predictive standard error of each estimate. Next, an inefficiency-collinearity risk index is defined for each model, and a partial ordering is introduced on the set of models so that they can be compared without using $\boldsymbol{y^o}$; the better-ness and admissibility of models are discussed in this setting. The second MSP is a commonly used MSP that uses $(\boldsymbol{y^o,X})$ and evaluates overall model performance by criteria such as AIC and BIC to select an optimal model from $\scr{D}$. Finally, to implement the XMSP, two algorithms are proposed.
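
To make the idea of an $\boldsymbol{X}$-only pre-selection concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It assumes a model with an intercept and relies on the textbook identity $\mathrm{Var}(\hat\beta_j)=\sigma^2\,\mathrm{VIF}_j/\mathrm{SST}_j$, where $\mathrm{SST}_j$ is the centred sum of squares of the $j$-th regressor and $\mathrm{VIF}_j$ its variance inflation factor. The function names `se_factors` and `preselect`, the use of the VIF as the collinearity measure, and the cap `vif_max=10.0` are all illustrative assumptions; the paper's own controlling variables and risk index may be defined differently.

```python
from itertools import combinations

import numpy as np


def se_factors(X):
    """Return (sst, vif) for each column of X (X has no intercept column).

    Assumption: the model includes an intercept, so regressors are centred.
    Then the j-th diagonal entry of inv(Xc' Xc) equals vif[j] / sst[j].
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    k = Xc.shape[1]
    sst = (Xc ** 2).sum(axis=0)
    vif = np.ones(k)
    for j in range(k):
        others = np.delete(Xc, j, axis=1)
        if others.shape[1] == 0:
            continue  # a single regressor has no collinearity: VIF = 1
        # Auxiliary regression of column j on the remaining columns.
        coef, *_ = np.linalg.lstsq(others, Xc[:, j], rcond=None)
        resid = Xc[:, j] - others @ coef
        rss = resid @ resid
        vif[j] = sst[j] / rss if rss > 0 else np.inf
    return sst, vif


def preselect(X, vif_max=10.0):
    """Hypothetical X-only screen: keep column subsets whose VIFs all stay
    at or below vif_max. The response y is never used."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    kept = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            _, vif = se_factors(X[:, list(cols)])
            if np.all(vif <= vif_max):
                kept.append(cols)
    return kept
```

A conventional criterion such as AIC or BIC could then be evaluated, using $\boldsymbol{y^o}$, only over the subsets returned by `preselect`, which mirrors the two-stage structure described above. The exhaustive enumeration is exponential in the number of regressors and is meant only for small illustrative designs.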