Youth in the American foster care system are significantly more likely than their peers to face a range of negative life outcomes, from homelessness to incarceration. Administrative data on these youth have the potential to provide insights that can help identify ways to improve their path towards a better life. However, such data also suffer from a variety of biases, from missing data to reflections of systemic inequality. The present work proposes a novel, prescriptive approach to using these data to learn about both the biases in the data and the systems and youth they track. Specifically, we develop a categorical clustering and cluster summarization methodology that lets us uncover subtle biases in existing data on foster youth and indicates where further (often qualitative) research is needed to identify potential ways of assisting youth.