For regression model selection under the maximum likelihood framework, we study the likelihood ratio confidence region for the regression parameter vector of a full regression model. We show that, when the confidence level increases with the sample size at a certain rate, the confidence region contains, with probability tending to one, only vectors representing models that include all active variables, among them the parameter vector of the true model. This result leads to a consistent model selection criterion with a sparse maximum likelihood interpretation and certain advantages over popular information criteria. It also provides a large-sample characterization of the maximum likelihood models at each model size, which shows that, for selection consistency, it suffices to consider only this small set of models.
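
To make the object of study concrete, the following is a sketch in notation that the abstract itself does not fix, so every symbol here is an assumption. Let $\ell_n(\beta)$ denote the log-likelihood of the full model, $\hat\beta$ its maximum likelihood estimator, and $c_n$ a threshold that grows with the sample size $n$ (for instance, a chi-square quantile whose level $1-\alpha_n$ tends to one). A likelihood ratio confidence region then has the standard form

$$
\mathcal{C}_n = \bigl\{ \beta : 2\,[\ell_n(\hat\beta) - \ell_n(\beta)] \le c_n \bigr\}.
$$

One plausible reading of the sparse maximum likelihood interpretation, stated here as an assumption rather than as the exact criterion of the paper, is to select the sparsest vector in this region:

$$
\tilde\beta \in \arg\min_{\beta \in \mathcal{C}_n} \|\beta\|_0 .
$$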