Clinical machine learning is increasingly multimodal, with data collected in both structured tabular formats and unstructured forms such as free text. We propose a novel task of exploring fairness on a multimodal clinical dataset, adopting equalized odds for the downstream medical prediction tasks. To this end, we investigate a modality-agnostic fairness algorithm, equalized odds post-processing, and compare it to a text-specific fairness algorithm: debiased clinical word embeddings. Although debiased word embeddings do not explicitly target equalized odds for protected groups, we show that a text-specific approach to fairness may nonetheless achieve a good balance between performance and classical notions of fairness. We hope that our paper inspires future contributions at the critical intersection of clinical NLP and fairness. The full source code is available here: https://github.com/johntiger1/multimodal_fairness
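As a point of reference for the modality-agnostic baseline described above, the following is a minimal sketch of equalized odds post-processing, illustrated with fairlearn's ThresholdOptimizer on synthetic data; the data, classifier choice, and variable names are assumptions for illustration and are not taken from the paper's pipeline.

```python
# Sketch of equalized-odds post-processing (fairlearn's ThresholdOptimizer used
# as a generic stand-in). All data below is synthetic; column/variable names
# are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import equalized_odds_difference

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                 # synthetic tabular clinical features
group = rng.integers(0, 2, size=n)          # synthetic protected attribute
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0
)

# Any downstream classifier works here; the post-processing step only needs
# its scores, which is what makes the approach modality-agnostic.
base = LogisticRegression().fit(X_tr, y_tr)

# Learn group-specific decision thresholds so that true- and false-positive
# rates are (approximately) equalized across protected groups.
postproc = ThresholdOptimizer(
    estimator=base, constraints="equalized_odds", prefit=True
)
postproc.fit(X_tr, y_tr, sensitive_features=g_tr)
y_hat = postproc.predict(X_te, sensitive_features=g_te, random_state=0)

print("equalized-odds gap:",
      equalized_odds_difference(y_te, y_hat, sensitive_features=g_te))
```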