Although researchers increasingly adopt machine learning to model travel behavior, they predominantly focus on prediction accuracy and overlook the ethical challenges embedded in machine learning algorithms. This study introduces an important missing dimension - computational fairness - to travel behavior analysis. We first operationalize computational fairness as equality of opportunity, then differentiate between the bias inherent in the data and the bias introduced by modeling. We then demonstrate the prediction disparities in travel behavior modeling using the 2017 National Household Travel Survey (NHTS) and the 2018-2019 My Daily Travel Survey in Chicago. Empirically, deep neural networks (DNN) and discrete choice models (DCM) reveal consistent prediction disparities across multiple social groups: both yield higher false negative rates of frequent driving for ethnic minorities, low-income populations and disabled populations, and both predict a higher travel burden for socially disadvantaged groups and rural populations than is actually observed. Comparing DNN with DCM, we find that DNN can outperform DCM in terms of prediction disparities because of DNN's smaller misspecification error. To mitigate prediction disparities, this study introduces an absolute correlation regularization method, which is evaluated with synthetic and real-world data. The results demonstrate the prevalence of prediction disparities in travel behavior modeling, and the disparities persist across a variety of model specifics such as the number of DNN layers, batch size and weight initialization. Since these prediction disparities can exacerbate social inequity if prediction results are used for transportation policy making without fairness adjustment, we advocate careful consideration of the fairness problem in travel behavior modeling and the use of bias mitigation algorithms for fair transport decisions.
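The two technical ideas named in the abstract, equality of opportunity measured through false negative rates and an absolute correlation regularizer, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy data, and the choice to compute the correlation only over samples whose true label is 1 are assumptions made for illustration. The abstract does not specify the exact form of the penalty or how it is weighted in the training loss.

```python
import math

def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), computed over samples whose true label is 1."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        raise ValueError("no positive samples in this group")
    misses = sum(1 for p in preds_for_positives if p == 0)
    return misses / len(preds_for_positives)

def fnr_gap(y_true, y_pred, group):
    """FNR of group 1 minus FNR of group 0; zero means equal opportunity."""
    def subset(g):
        idx = [i for i, z in enumerate(group) if z == g]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]
    return false_negative_rate(*subset(1)) - false_negative_rate(*subset(0))

def abs_correlation_penalty(probs, group, y_true):
    """|Pearson correlation| between predicted probability and group
    membership, restricted to true positives (an assumption of this sketch)."""
    pairs = [(p, z) for p, z, t in zip(probs, group, y_true) if t == 1]
    n = len(pairs)
    mean_p = sum(p for p, _ in pairs) / n
    mean_z = sum(z for _, z in pairs) / n
    cov = sum((p - mean_p) * (z - mean_z) for p, z in pairs)
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p, _ in pairs))
    sd_z = math.sqrt(sum((z - mean_z) ** 2 for _, z in pairs))
    if sd_p == 0 or sd_z == 0:
        return 0.0  # correlation is undefined; treat as no penalty
    return abs(cov / (sd_p * sd_z))

# Hypothetical toy data: every traveler is a true frequent driver (label 1).
y_true = [1, 1, 1, 1, 1, 1, 1, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]   # 1 = hypothetical disadvantaged group
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]  # hard predictions from some model

gap = fnr_gap(y_true, y_pred, group)  # 0.75 - 0.25 = 0.5
```

In training, a penalty of this kind would typically be added to the prediction loss with a tunable weight, so that the model is discouraged from producing predicted probabilities that track group membership among true positives. The weight and the exact loss form are left open here because the abstract does not state them.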