We introduce SparcAssist, a general-purpose risk assessment tool for machine learning models trained on language tasks. It evaluates a model's risk by inspecting its behavior on counterfactuals, namely out-of-distribution instances generated from a given data instance. The counterfactuals are generated by replacing tokens in the rationale subsequences identified by ExPred, with the replacement tokens retrieved using HotFlip or Masked-Language-Model-based algorithms. The main purpose of our system is to help human annotators assess a model's deployment risk. The counterfactual instances generated during the assessment are a by-product and can be used to train more robust NLP models in the future.
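The counterfactual generation step described above can be illustrated with a minimal sketch. It is not SparcAssist's implementation: the rationale mask stands in for ExPred's output, the `candidates` dictionary stands in for substitutes retrieved by HotFlip or a masked language model, and the names `generate_counterfactuals`, `flip_rate`, and `toy_predict` are hypothetical. The flip rate is only an assumed summary statistic; in the system described here, the assessment is made by human annotators.

```python
from typing import Callable, Dict, List, Sequence


def generate_counterfactuals(
    tokens: Sequence[str],
    rationale_mask: Sequence[int],
    candidates: Dict[str, List[str]],
) -> List[List[str]]:
    """Replace one rationale token at a time with each candidate substitute.

    rationale_mask: 1 marks a rationale token (assumed to come from a
    rationale extractor such as ExPred), 0 marks any other token.
    candidates: hypothetical lookup of substitutes per token (assumed to come
    from HotFlip or a masked language model).
    """
    counterfactuals = []
    for i, (tok, is_rationale) in enumerate(zip(tokens, rationale_mask)):
        if not is_rationale:
            continue  # tokens outside the rationale are left untouched
        for sub in candidates.get(tok, []):
            if sub == tok:
                continue  # an identical token is not a counterfactual
            cf = list(tokens)
            cf[i] = sub
            counterfactuals.append(cf)
    return counterfactuals


def flip_rate(
    predict: Callable[[Sequence[str]], str],
    tokens: Sequence[str],
    counterfactuals: List[List[str]],
) -> float:
    """Fraction of counterfactuals whose prediction differs from the original."""
    if not counterfactuals:
        return 0.0
    original = predict(tokens)
    flips = sum(predict(cf) != original for cf in counterfactuals)
    return flips / len(counterfactuals)


def toy_predict(tokens: Sequence[str]) -> str:
    """Toy stand-in for a trained classifier (keyword lookup only)."""
    return "pos" if {"great", "fine"} & set(tokens) else "neg"


tokens = ["the", "movie", "was", "great"]
mask = [0, 0, 0, 1]  # only "great" is treated as rationale
candidates = {"great": ["terrible", "fine"]}

cfs = generate_counterfactuals(tokens, mask, candidates)
rate = flip_rate(toy_predict, tokens, cfs)
print(cfs)
print(rate)
```

Replacing a single token per counterfactual keeps each instance close to the original, so a changed prediction can be traced to one substitution. Replacing several rationale tokens at once would be an equally valid design under the description above.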