This paper investigates how best to compare algorithms that predict chronic homelessness in order to identify good candidates for housing programs. Predictive methods can rapidly refer potentially chronic shelter users to housing, but they sometimes flag individuals who will not become chronic (false positives). We use shelter access histories to demonstrate that these false positives are often still good candidates for housing. Using this approach, we compare a simple threshold method for predicting chronic homelessness with the more complex logistic regression and neural network algorithms. While traditional binary classification performance metrics show that the machine learning algorithms perform better than the threshold technique, an examination of the shelter access histories of the cohorts identified by the three algorithms shows that they select groups with very similar characteristics. This has important implications for resource-constrained not-for-profit organizations, since the threshold technique can be implemented using much simpler information technology infrastructure than the machine learning algorithms.
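As a rough illustration of the kind of comparison described above, the following minimal sketch applies a threshold rule to a handful of made-up clients and computes the usual binary classification counts. The abstract does not specify the threshold rule, so the feature (`stays`, a count of shelter stays), the cutoff (`min_stays=30`), the `Client` record, and the toy data are all assumptions chosen for illustration, not the paper's method or data.

```python
from dataclasses import dataclass


@dataclass
class Client:
    client_id: str
    stays: int        # hypothetical feature: shelter stays in an observation window
    is_chronic: bool  # hypothetical ground-truth label


def threshold_flag(client, min_stays=30):
    """Flag a client as potentially chronic when stays reach the assumed cutoff."""
    return client.stays >= min_stays


def confusion_counts(clients, predict):
    """Return (tp, fp, fn, tn) for a predicate applied to labelled clients."""
    tp = fp = fn = tn = 0
    for c in clients:
        flagged = predict(c)
        if flagged and c.is_chronic:
            tp += 1
        elif flagged and not c.is_chronic:
            fp += 1
        elif not flagged and c.is_chronic:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Compute precision and recall, returning 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, invented for this sketch only.
clients = [
    Client("a", 45, True),
    Client("b", 32, False),  # false positive: heavy shelter use, not labelled chronic
    Client("c", 10, False),
    Client("d", 28, True),   # false negative: labelled chronic, below the cutoff
    Client("e", 60, True),
]

counts = confusion_counts(clients, threshold_flag)
precision, recall = precision_recall(*counts[:3])

# Inspect the shelter use of the false positives rather than only counting them.
false_positive_stays = [
    c.stays for c in clients if threshold_flag(c) and not c.is_chronic
]
```

In this toy example the single false positive has 32 stays, which is the sort of case the abstract argues may still be a reasonable housing referral even though it counts against precision.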