This paper studies the model fairness and process fairness of classifiers trained on a Russian demographic dataset, using four prediction tasks: divorce in the first marriage, religiosity, first employment, and completion of education. Our goal was to make classifiers more equitable by reducing their reliance on sensitive features while increasing, or at least maintaining, their accuracy. Taking inspiration from "dropout" techniques in neural-based approaches, we proposed a model that uses "feature dropout" to address process fairness. To evaluate a classifier's fairness and decide which sensitive features to eliminate, we used LIME explanations. Feature dropout yields a pool of classifiers whose ensemble has been shown to be less reliant on sensitive features, with improved or unchanged accuracy. Our empirical study covered four families of classifiers (Logistic Regression, Random Forest, Bagging, and AdaBoost) and was carried out on a real-life dataset (Russian demographic data derived from the Generations and Gender Survey). It showed that all of the models became less dependent on sensitive features (such as gender, breakup of the first partnership, first partnership, etc.), with improved or unchanged accuracy.
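The feature-dropout ensemble described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and that the sensitive feature indices have already been identified (in the paper this is done with LIME explanations, which are not reproduced here). The class name `FeatureDropoutEnsemble`, the composition of the pool (one classifier per dropped sensitive feature, plus one with all sensitive features dropped), and the averaging of predicted probabilities are assumptions made for illustration; the abstract does not confirm these details.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression


class FeatureDropoutEnsemble:
    """Hypothetical sketch: a pool of classifiers, each trained with some
    sensitive features removed, combined by averaging probabilities."""

    def __init__(self, base_estimator, sensitive_idx):
        self.base_estimator = base_estimator
        self.sensitive_idx = list(sensitive_idx)

    def fit(self, X, y):
        X = np.asarray(X)
        n_features = X.shape[1]
        # Assumed pool: drop each sensitive feature alone, then all together.
        drops = [[i] for i in self.sensitive_idx]
        if len(self.sensitive_idx) > 1:
            drops.append(self.sensitive_idx)
        self.members_ = []
        for drop in drops:
            dropped = set(drop)
            keep = [j for j in range(n_features) if j not in dropped]
            model = clone(self.base_estimator).fit(X[:, keep], y)
            self.members_.append((keep, model))
        return self

    def predict_proba(self, X):
        X = np.asarray(X)
        # Each member only sees the columns it was trained on.
        probs = [model.predict_proba(X[:, keep]) for keep, model in self.members_]
        return np.mean(probs, axis=0)

    def predict(self, X):
        # All members are fitted on the same labels, so they share classes_.
        classes = self.members_[0][1].classes_
        return classes[np.argmax(self.predict_proba(X), axis=1)]


# Synthetic stand-in data; columns 3 and 4 play the role of sensitive features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = (X_demo[:, 0] + X_demo[:, 3] > 0).astype(int)

ensemble = FeatureDropoutEnsemble(LogisticRegression(max_iter=1000), [3, 4])
ensemble.fit(X_demo, y_demo)
demo_proba = ensemble.predict_proba(X_demo)
```

Because `base_estimator` is cloned for each pool member, the same wrapper could be tried with any scikit-learn classifier that exposes `predict_proba`, for example `RandomForestClassifier`, `BaggingClassifier`, or `AdaBoostClassifier`. No member of the pool sees every sensitive feature, which is the sense in which the ensemble relies on them less.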