This paper discusses the capabilities that models must have when applied in policy analysis settings, and the limitations of applying off-the-shelf machine learning methodologies directly to such settings. Traditional econometric methodologies for building discrete choice models for policy analysis combine data with modeling assumptions guided by subject-matter considerations. Such considerations are typically most useful in specifying the systematic component of random utility discrete choice models and of limited aid in determining the form of the random component. We identify an area where machine learning paradigms can be leveraged: specifying, and systematically selecting the best specification of, the random component of the utility equations. We review two recent, novel applications in which mixed-integer optimization and cross-validation are used to algorithmically select optimal specifications for the random utility components of nested logit and logit mixture models, subject to interpretability constraints.
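
For concreteness, the distinction between the systematic and random components can be written out. The following is a minimal sketch in standard random utility notation; the symbols $U_{in}$, $V_{in}$, $\varepsilon_{in}$, $x_{in}$, $\beta$, and $C_n$ are assumed here for illustration and do not appear in the text above.

```latex
% Utility of alternative i for decision-maker n:
% systematic component V plus random component \varepsilon
U_{in} = V_{in}(x_{in}; \beta) + \varepsilon_{in}, \qquad i \in C_n

% Choice probability under utility maximization over the choice set C_n
P_n(i) = \Pr\left( U_{in} \ge U_{jn} \ \text{for all } j \in C_n \right)

% Special case: if the \varepsilon_{in} are i.i.d. extreme value (Gumbel)
% with unit scale, the choice probability is the multinomial logit
P_n(i) = \frac{\exp(V_{in})}{\sum_{j \in C_n} \exp(V_{jn})}
```

Nested logit and logit mixture models relax the i.i.d. assumption on $\varepsilon_{in}$: the former through a nesting structure that allows the errors of alternatives in the same nest to be correlated, the latter through randomly distributed coefficients or error components. Choosing that structure is the specification problem for the random component.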