Programmatic Weak Supervision (PWS) aggregates the votes of multiple weak supervision sources into probabilistic training labels, which are in turn used to train an end model. With its increasing popularity, it is critical to have tools that help users understand the influence of each component of the pipeline (e.g., a source vote or a training example) and interpret the end model's behavior. To achieve this, we build on the Influence Function (IF) and propose source-aware IF, which leverages the generation process of the probabilistic labels to decompose the end model's training objective and then computes the influence associated with each (data, source, class) tuple. These primitive influence scores can then be used to estimate the influence of individual components of PWS, such as a source vote, a supervision source, or a training example. On datasets from diverse domains, we demonstrate multiple use cases: (1) interpreting incorrect predictions from multiple angles, which reveals insights for debugging the PWS pipeline; (2) identifying mislabelings of sources with a gain of 9%-37% over baselines; and (3) improving the end model's generalization performance by removing harmful components from the training objective (13%-24% better than ordinary IF).
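For concreteness, the following is a minimal sketch of the decomposition the abstract refers to, assuming the end model is trained with a cross-entropy-style loss on the probabilistic labels and that each label factorizes as a sum of per-source contributions; the notation ($a_{ijc}$, $\ell_c$) is ours for illustration, and the exact decomposition in the paper depends on the label model used.

% Illustrative decomposition (assumed factorization p_{ic} = \sum_j a_{ijc});
% the second line is the standard influence function (Koh & Liang, 2017)
% applied to a single (data, source, class) term of the objective.
\begin{align*}
  L(\theta)
    &= \tfrac{1}{n} \sum_{i=1}^{n} \sum_{c} p_{ic}\, \ell_c(x_i, \theta)
     = \tfrac{1}{n} \sum_{i=1}^{n} \sum_{j} \sum_{c} a_{ijc}\, \ell_c(x_i, \theta), \\
  \mathcal{I}(i, j, c; z_{\text{test}})
    &= -\nabla_\theta \ell(z_{\text{test}}, \hat\theta)^{\top}\,
       H_{\hat\theta}^{-1}\,
       \nabla_\theta \bigl[ a_{ijc}\, \ell_c(x_i, \hat\theta) \bigr],
  \qquad H_{\hat\theta} = \nabla_\theta^{2} L(\hat\theta).
\end{align*}

Under this reading, summing the primitive scores over classes yields the influence of a source vote, summing over sources and classes yields the influence of a training example, and summing over examples and classes yields the influence of a supervision source.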