In this research, we focus on the use of adversarial sampling to test for fairness in the predictions of a deep neural network model across the different classes of images in a given dataset. Although several frameworks have been proposed to ensure the robustness of machine learning models against adversarial attacks, including adversarial training algorithms, a pitfall remains: adversarial training tends to cause disparities in accuracy and robustness among different groups. Our research is therefore aimed at using adversarial sampling to test for fairness in the predictions of a deep neural network model across the different classes or categories of images in a dataset. We successfully demonstrate a new method for ensuring fairness across the various groups of inputs to a deep neural network classifier. We trained our neural network model only on the original images, without training it on the perturbed or attacked images. When we fed the adversarial samples to the model, it was able to predict the original category or class of the image each adversarial sample belonged to. We also introduced the separation-of-concerns concept from software engineering, in which an additional standalone filter layer removes most of the noise or attack from a perturbed image before automatically passing it to the network for classification; with this filter layer we achieved an accuracy of 93.3%. The CIFAR-10 dataset has ten categories, so, in order to account for fairness, we applied our hypothesis to each category and obtained consistent results and accuracy.
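The pipeline described above can be pictured as a filter stage placed in front of a classifier that was trained only on clean images. The following is a minimal sketch of that separation-of-concerns arrangement; the class names (DenoisingFilter, FilteredClassifier), the smoothing-based denoiser, and the usage lines are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch: standalone filter layer followed by a classifier trained on clean CIFAR-10 images.
# The specific denoising operation and class names are assumptions for illustration only.
import torch
import torch.nn as nn


class DenoisingFilter(nn.Module):
    """Standalone filter stage that suppresses adversarial noise before classification.
    A simple local averaging is used here as a placeholder denoiser."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.smooth = nn.AvgPool2d(kernel_size, stride=1, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Smooth each channel to attenuate high-frequency perturbations.
        return self.smooth(x)


class FilteredClassifier(nn.Module):
    """Separation of concerns: perturbed image -> filter layer -> unmodified classifier."""

    def __init__(self, classifier: nn.Module):
        super().__init__()
        self.filter = DenoisingFilter()
        self.classifier = classifier

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.filter(x))


# Hypothetical usage: wrap an existing CIFAR-10 classifier, then measure accuracy
# on adversarial samples separately for each of the ten classes to check fairness.
# pipeline = FilteredClassifier(trained_cifar10_model)
# logits = pipeline(adversarial_batch)  # adversarial_batch: (N, 3, 32, 32) tensor
# per_class_accuracy[c] = (logits.argmax(1) == labels)[labels == c].float().mean()
```

Evaluating the per-class accuracy, as hinted in the final commented line, is what allows the consistency of results across the ten CIFAR-10 categories to be checked.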