In this paper, we revisit McFadden's (1978) correction factor for sampling of alternatives in multinomial logit (MNL) and mixed multinomial logit (MMNL) models. McFadden (1978) proved that consistent parameter estimates are obtained when estimating MNL models using a sampled subset of alternatives, including the chosen alternative, in combination with a correction factor. We decompose this correction factor into i) a correction for overestimating the MNL choice probability due to using a smaller subset of alternatives, and ii) a correction for which subset of alternatives is contrasted through utility differences, and thereby for the extent to which we learn about the parameters of interest in MNL. Keane and Wasi (2016) proved that the overall expected positive information divergence between the true and sampled likelihood, comprising the above two elements, is minimised when applying a sampling protocol satisfying uniform conditioning. We generalise their result to the case of positive conditioning and show that whilst McFadden's (1978) correction factor may not minimise the overall expected information divergence, it does minimise the expected information loss with respect to the parameters of interest. We apply this result in the context of Bayesian analysis and show that McFadden's (1978) correction factor minimises the expected information loss regarding the parameters of interest across the entire posterior density, irrespective of sample size. In other words, McFadden's (1978) correction factor has desirable small and large sample properties. We also show that our results for Bayesian MNL models transfer to MMNL, and that McFadden's (1978) correction factor alone is sufficient to minimise the expected information loss in the parameters of interest. Monte Carlo simulations illustrate the successful application of sampling of alternatives in Bayesian MMNL models.
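
For orientation, the sampled MNL probability with McFadden's (1978) correction factor is usually written in the form below. The notation is an assumption made here for illustration and is not taken from the paper: $V_{jn}$ is the systematic utility of alternative $j$ for individual $n$, $C$ is the full choice set, $D \subseteq C$ is the sampled subset containing the chosen alternative $i$, and $\pi(D \mid j)$ is the probability that the sampling protocol draws $D$ given that $j$ is the chosen alternative.

```latex
% Full-choice-set MNL probability (notation assumed for illustration)
P_n(i \mid C) = \frac{\exp(V_{in})}{\sum_{j \in C} \exp(V_{jn})}

% Sampled MNL probability with McFadden's (1978) correction factor
P_n(i \mid D) = \frac{\exp\bigl(V_{in} + \ln \pi(D \mid i)\bigr)}
                     {\sum_{j \in D} \exp\bigl(V_{jn} + \ln \pi(D \mid j)\bigr)}

% Positive conditioning: \pi(D \mid j) > 0 for all j \in D
% Uniform conditioning:  \pi(D \mid j) = \pi(D \mid i) for all i, j \in D
```

Under uniform conditioning the terms $\ln \pi(D \mid j)$ are identical across $j \in D$ and cancel from the ratio, so the sampled model reduces to a standard MNL over $D$. Under positive conditioning alone they generally do not cancel and must be retained.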