Building on work of Charton, we train small transformer models to calculate the M\"obius function $\mu(n)$ and the squarefree indicator function $\mu^2(n)$. The models attain nontrivial predictive power. We then iteratively train additional models to understand how the model functions, ultimately finding a theoretical explanation.
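For reference, the two target functions can be computed directly by trial division. The sketch below is only an illustration of the standard definitions: $\mu(n) = 0$ if a squared prime divides $n$, and $\mu(n) = (-1)^k$ if $n$ is a product of $k$ distinct primes. It is not the method the transformer models use, and the function names `mobius` and `squarefree_indicator` are hypothetical choices made for this example.

```python
def mobius(n: int) -> int:
    """Return mu(n) for a positive integer n, using trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p^2 divides the original n, so mu(n) = 0.
                return 0
            # p divides n exactly once: flip the sign.
            result = -result
        p += 1
    if n > 1:
        # A single prime factor larger than sqrt(n) remains.
        result = -result
    return result


def squarefree_indicator(n: int) -> int:
    """Return mu(n)^2: 1 if n is squarefree, 0 otherwise."""
    return mobius(n) ** 2
```

For instance, `mobius(30)` evaluates to `-1` because $30 = 2 \cdot 3 \cdot 5$ has three distinct prime factors, while `squarefree_indicator(12)` evaluates to `0` because $4 \mid 12$.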