A major concern with Machine Learning (ML) models is their opacity. They are deployed in an increasing number of applications, where they often operate as black boxes that provide no explanations for their predictions. The potential harms associated with this lack of insight into the models' rationales include, among others, privacy violations, adversarial manipulations, and unfair discrimination. As a result, the accountability and transparency of ML models have been posed as critical desiderata by works in policy and law, philosophy, and computer science. In computer science, the decision-making process of ML models has been studied by developing accountability and transparency methods. Accountability methods, such as adversarial attacks and diagnostic datasets, expose vulnerabilities of ML models that could lead to malicious manipulations or systematic faults in their predictions. Transparency methods explain the rationales behind models' predictions, thereby gaining the trust of relevant stakeholders and potentially uncovering mistakes and unfairness in models' decisions. To this end, transparency methods have to meet accountability requirements as well, e.g., being robust and faithful to the underlying rationales of a model. This thesis presents my research, which expands our collective knowledge in the areas of accountability and transparency of ML models developed for complex reasoning tasks over text.