Many applications affecting human lives rely on models that have come to be known under the umbrella of machine learning and artificial intelligence. These AI models are usually complicated mathematical functions that map an input space to an output space. Stakeholders want to know the rationales behind models' decisions and functional behavior. We study this functional behavior in relation to the data used to create the models. On this topic, scholars have often assumed that models do not extrapolate, i.e., that they learn from their training samples and process new inputs by interpolation. This assumption is questionable: we show that models extrapolate frequently, and that the extent of extrapolation varies and can be socially consequential. We demonstrate that extrapolation happens for a substantial portion of datasets, more than one would consider reasonable. How can we trust models if we do not know whether they are extrapolating? Given a model trained to recommend clinical procedures for patients, can we trust the recommendation when the model considers a patient older or younger than all the samples in the training set? If the training set consists mostly of White patients, to what extent can we trust its recommendations for Black and Hispanic patients? Along which dimensions (race, gender, or age) does extrapolation happen? Even if a model is trained on people of all races, it may still extrapolate in significant ways related to race. The leading question is: to what extent can we trust AI models when they process inputs that fall outside their training set? This paper investigates several social applications of AI, showing how models extrapolate without notice. We also examine different sub-spaces of extrapolation for specific individuals subject to AI models and report how these extrapolations can be interpreted, not mathematically, but from a humanistic point of view.
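As a concrete illustration of that last distinction (not taken from the paper itself), the following minimal Python sketch contrasts a per-feature range check with a convex-hull membership test: a query point can lie within every individual feature's training range and still fall outside the convex hull of the training data, in which case a model must extrapolate to score it. The function names and the SciPy linear-programming check are our own illustrative choices, not the paper's method.

# Minimal sketch: per-feature range check vs. convex-hull membership.
# A point can pass the range check on every axis yet still require extrapolation.
import numpy as np
from scipy.optimize import linprog

def outside_feature_ranges(X_train, x):
    """True if any coordinate of x falls outside the training min/max."""
    return bool(np.any(x < X_train.min(axis=0)) or np.any(x > X_train.max(axis=0)))

def outside_convex_hull(X_train, x):
    """True if x cannot be written as a convex combination of training rows.

    Feasibility LP: find w >= 0 with sum(w) = 1 and X_train.T @ w = x.
    If no such w exists, x lies outside the convex hull of the training data.
    """
    n = X_train.shape[0]
    A_eq = np.vstack([X_train.T, np.ones((1, n))])
    b_eq = np.append(x, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return not res.success

# Four training points at the corners of a diamond; the query (0.8, 0.8)
# is inside both marginal ranges [-1, 1] but outside the diamond-shaped hull.
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
x_query = np.array([0.8, 0.8])
print(outside_feature_ranges(X_train, x_query))  # False: within range on each axis
print(outside_convex_hull(X_train, x_query))     # True: scoring it requires extrapolation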