Deep learning models for natural language processing (NLP) are increasingly adopted and deployed by analysts without formal training in NLP or machine learning (ML). However, the documentation intended to convey a model's details and appropriate use is tailored primarily to individuals with ML or NLP expertise. To address this gap, we conduct a design inquiry into interactive model cards, which augment traditionally static model cards with affordances for exploring model documentation and interacting with the models themselves. Our investigation consists of an initial conceptual study with experts in ML, NLP, and AI Ethics, followed by a separate evaluative study with non-expert analysts who use ML models in their work. Using a semi-structured interview format coupled with a think-aloud protocol, we collected feedback from a total of 30 participants who engaged with different versions of standard and interactive model cards. Through a thematic analysis of the collected data, we identified several conceptual dimensions that summarize the strengths and limitations of standard and interactive model cards, including: stakeholders; design; guidance; understandability & interpretability; sensemaking & skepticism; and trust & safety. Our findings demonstrate the importance of carefully considered design and interactivity for orienting and supporting non-expert analysts using deep learning models, along with a need for consideration of broader sociotechnical contexts and organizational dynamics. We have also identified design elements, such as language, visual cues, and warnings, that support interactivity and make non-interactive content accessible. We summarize our findings as design guidelines and discuss their implications for a human-centered approach towards AI/ML documentation.