Characterizing neural networks in terms of better-understood formal systems has the potential to yield new insights into the power and limitations of these networks. Doing so for transformers remains an active area of research. Bhattamishra and others have shown that transformer encoders are at least as expressive as a certain kind of counter machine, while Merrill and Sabharwal have shown that fixed-precision transformer encoders recognize only languages in uniform $TC^0$. We connect and strengthen these results by identifying a variant of first-order logic with counting quantifiers that is simultaneously an upper bound for fixed-precision transformer encoders and a lower bound for transformer encoders. This brings us much closer than before to an exact characterization of the languages that transformer encoders recognize.
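As a small illustration (not part of the abstract itself), a counting quantifier such as $\exists^{\geq k} x.\,\varphi(x)$ asserts that at least $k$ positions of the input satisfy $\varphi$. The classic example of a language definable with counting quantifiers but not in plain first-order logic is MAJORITY (more $a$'s than $b$'s). A minimal Python sketch of evaluating such a formula over a string, with hypothetical helper names:

```python
def count_exists(word, k, phi):
    """Evaluate the counting quantifier ∃^{≥k} x. phi(x):
    true iff at least k positions of `word` satisfy phi."""
    return sum(1 for i in range(len(word)) if phi(word, i)) >= k

def majority_a(word):
    """MAJORITY over {a,b}: more a's than b's.
    Over a word of length n, this holds iff at least
    floor(n/2) + 1 positions carry an 'a'."""
    k = len(word) // 2 + 1
    return count_exists(word, k, lambda w, i: w[i] == "a")

print(majority_a("aab"))  # two a's vs. one b
print(majority_a("abb"))  # one a vs. two b's
```

This is only meant to convey what a counting quantifier expresses; the logics in the results above are interpreted over positions of the input string, not implemented as Python predicates.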