Recent causal probing literature reveals when language models and syntactic probes use similar representations. Such techniques may yield "false negative" causality results: models may use representations of syntax, but probes may have learned to use redundant encodings of the same syntactic information. We demonstrate that models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic information present in embeddings. Using these probes, we find evidence for the use of syntax in models where prior methods did not, allowing us to boost model performance by injecting syntactic information into representations.
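To make the "false negative" scenario concrete, the following is a minimal toy sketch, not the probe design introduced in this work. It assumes a hypothetical two-dimensional embedding that stores the same binary syntactic label in both dimensions, a hypothetical downstream model that reads only dimension 0, and two hand-set linear probes: `narrow_probe` reads only the redundant copy in dimension 1, while `full_probe` reads both copies. All names and the reflection-based intervention are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def model_output(embedding):
    # Hypothetical downstream model: reads syntax from dimension 0 only.
    return 1 if embedding[0] > 0 else -1

def probe_prediction(weights, embedding):
    # Linear probe: the sign of the projection onto the probe weights.
    return 1 if dot(weights, embedding) > 0 else -1

def counterfactual(weights, embedding):
    # Reflect the embedding across the probe's decision boundary so that
    # the probe now predicts the opposite syntactic label.
    scale = 2 * dot(weights, embedding) / dot(weights, weights)
    return [e - scale * w for e, w in zip(embedding, weights)]

def intervention_flips_model(weights, syntax_label):
    # Assumption: the label (+1 or -1) is encoded redundantly in both dimensions.
    embedding = [float(syntax_label), float(syntax_label)]
    edited = counterfactual(weights, embedding)
    # Sanity check: the intervention did change the probe's own prediction.
    assert probe_prediction(weights, edited) == -syntax_label
    return model_output(edited) != model_output(embedding)

narrow_probe = [0.0, 1.0]  # uses only the redundant copy the model ignores
full_probe = [1.0, 1.0]    # uses every copy of the syntactic information

for label in (1, -1):
    print(label,
          intervention_flips_model(narrow_probe, label),
          intervention_flips_model(full_probe, label))
```

In this toy setting, intervening through `narrow_probe` changes the probe's prediction but leaves the model's output untouched, which would wrongly suggest the model does not use syntax. Intervening through `full_probe` changes the model's output, because the edit reaches the copy the model actually reads.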