Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice

We study the binary choice problem in a data-rich environment with asymmetric loss functions. In contrast to asymmetric regression problems, binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have long studied nonparametric binary choice problems, but that literature does not offer computationally attractive solutions in data-rich environments. The machine learning literature, by contrast, offers many algorithms that underlie much of the automation implemented in practice, but it focuses mostly on loss functions that are independent of individual characteristics. We show that theoretically valid predictions of binary outcomes under a generic loss function can be achieved via a very simple reweighting of logistic regression or of state-of-the-art machine learning techniques such as LASSO, boosting, or deep learning. We apply our analysis to racial justice in pretrial detention.
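To make the reweighting idea concrete, here is a minimal Python sketch of cost-weighted logistic regression. It is an illustration under stated assumptions, not the authors' exact procedure: the abstract does not give the weighting formula, so the sketch uses the standard cost-sensitive scheme in which each observation is weighted by the cost of misclassifying it. The cost functions `cost_false_negative` and `cost_false_positive`, the simulated data, and the gradient-descent fitter are all hypothetical. The costs are written as functions of the covariates to reflect losses that may depend on individual characteristics, although the demo keeps them constant.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_index(beta, x):
    # beta[0] is the intercept, beta[1:] are the slopes.
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def fit_weighted_logit(X, y, w, lr=1.0, n_iter=300):
    """Minimise the weighted logistic loss by plain gradient descent."""
    beta = [0.0] * (len(X[0]) + 1)
    total_w = sum(w)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, w):
            resid = wi * (sigmoid(linear_index(beta, xi)) - yi)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b - lr * g / total_w for b, g in zip(beta, grad)]
    return beta

# Hypothetical costs (assumptions for illustration only). They take the
# covariates x as input so they could vary across individuals.
def cost_false_negative(x):
    return 4.0  # cost of predicting 0 when the outcome is 1

def cost_false_positive(x):
    return 1.0  # cost of predicting 1 when the outcome is 0

# Simulated data: one covariate, with P(y = 1 | x) = sigmoid(x).
random.seed(0)
X = [[random.uniform(-3.0, 3.0)] for _ in range(300)]
y = [1 if random.random() < sigmoid(x[0]) else 0 for x in X]

# Weight each observation by the cost of misclassifying it.
weights = [cost_false_negative(x) if yi == 1 else cost_false_positive(x)
           for x, yi in zip(X, y)]

beta_plain = fit_weighted_logit(X, y, [1.0] * len(y))
beta_weighted = fit_weighted_logit(X, y, weights)

# Decision rule: predict 1 when the fitted index is positive, i.e. when
# the fitted (weighted) probability exceeds one half.
n_pos_plain = sum(linear_index(beta_plain, x) > 0 for x in X)
n_pos_weighted = sum(linear_index(beta_weighted, x) > 0 for x in X)

print("plain intercept:   ", round(beta_plain[0], 3))
print("weighted intercept:", round(beta_weighted[0], 3))
print("predicted positives (plain, weighted):", n_pos_plain, n_pos_weighted)
```

With constant costs, thresholding the weighted fit at one half corresponds, in large samples and under a correctly specified model, to thresholding the true probability at `cost_false_positive / (cost_false_positive + cost_false_negative)`, which is 0.2 here. The weighted model therefore predicts the costly-to-miss outcome more often than the unweighted one.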
