How can Transformers model and learn enumerative geometry? What is a robust procedure for using Transformers in abductive knowledge discovery within a mathematician-machine collaboration? In this work, we introduce a new paradigm in computational enumerative geometry in analyzing the $\psi$-class intersection numbers on the moduli space of curves. By formulating the enumerative problem as a continuous optimization task, we develop a Transformer-based model for computing $\psi$-class intersection numbers based on the underlying quantum Airy structure. For a finite range of genera, our model is capable of regressing intersection numbers that span an extremely wide range of values, from $10^{-45}$ to $10^{45}$. To provide a proper inductive bias for capturing the recursive behavior of intersection numbers, we propose a new activation function, Dynamic Range Activator (DRA). Moreover, given the severe heteroscedasticity of $\psi$-class intersections and the required precision, we quantify the uncertainty of the predictions using Conformal Prediction with a dynamic sliding window that is aware of the number of marked points. Next, we go beyond merely computing intersection numbers and explore the enumerative "world-model" of the Transformers. Through a series of causal inference and correlational interpretability analyses, we demonstrate that Transformers are actually modeling Virasoro constraints in a purely data-driven manner. Additionally, we provide evidence for the comprehension of several values appearing in the large genus asymptotic of $\psi$-class intersection numbers through abductive hypothesis testing.
翻译:暂无翻译