Interpretable entity representations (IERs) are sparse embeddings that are "human-readable" in that dimensions correspond to fine-grained entity types and values are predicted probabilities that a given entity is of the corresponding type. These methods perform well in zero-shot and low-supervision settings. Compared to standard dense neural embeddings, such interpretable representations may permit analysis and debugging. However, while fine-tuning sparse, interpretable representations improves accuracy on downstream tasks, it destroys the semantics of the dimensions which were enforced in pre-training. Can we maintain the interpretable semantics afforded by IERs while improving predictive performance on downstream tasks? Toward this end, we propose Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL). ItsIRL realizes improved performance over prior IERs on biomedical tasks, while maintaining "interpretability" generally and their ability to support model debugging specifically. The latter is enabled in part by the ability to perform "counterfactual" fine-grained entity-type manipulation, which we explore in this work. Finally, we propose a method to construct entity-type-based class prototypes for revealing global semantic properties of classes learned by our model.
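To make the notion of "human-readable" dimensions concrete, the following minimal Python sketch (not the authors' implementation; the type inventory, scores, and the helper `top_types` are illustrative assumptions) shows how an IER can be inspected by naming each dimension with a fine-grained entity type and reading off the most probable types for an entity.

```python
# Minimal sketch (illustrative only): an interpretable entity representation
# as a vector of fine-grained type probabilities, one named dimension per type.
import numpy as np

# Hypothetical fine-grained type inventory; the real inventory is much larger.
FINE_GRAINED_TYPES = ["protein", "enzyme", "disease", "chemical", "cell_line"]

def top_types(ier: np.ndarray, k: int = 3):
    """Return the k most probable fine-grained types for an entity.
    Interpretability comes from each dimension having a named type."""
    order = np.argsort(ier)[::-1][:k]
    return [(FINE_GRAINED_TYPES[i], float(ier[i])) for i in order]

# Example IER for one entity: each value is a predicted probability that
# the entity is of the corresponding type.
ier = np.array([0.92, 0.85, 0.03, 0.10, 0.01])
print(top_types(ier))
# [('protein', 0.92), ('enzyme', 0.85), ('chemical', 0.10)]
```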