Machine Learning for Modular Forms: GNNs, Spectral Methods, and L-Function Phenomenology — Comprehensive Preprint

Published on Zenodo: 10.5281/zenodo.20479512

This preprint synthesises the complete Riemann Project — 11 experiment tracks, 7 GNN architectures, 200,000 weight-2 newforms from the LMFDB, and a formal theoretical analysis of why message-passing GNNs fail on algebraic graphs.

What the Preprint Covers

The 12-page preprint (32 references) consolidates every experiment track and theoretical contribution:

Negative Results (Thread A — Cayley Graphs of SL(2,Fₚ))

GNNs systematically fail on vertex-transitive algebraic graphs. Across GCN, GAT, GIN, ChebConv, and GraphSAGE, at both subgraph and full-graph scales, no architecture achieves ΔR² > +0.042 over a trivial log(N) baseline. The root cause is vertex-transitivity: every node is structurally identical, which collapses the local neighborhood information that message passing depends on. The argument is formalised via the Weisfeiler-Leman hierarchy: MPNNs on vertex-transitive graphs are limited to functions of diameter and degree.
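The collapse described above can be seen in a few lines. The following is a minimal sketch of 1-WL colour refinement, assuming every node starts with the same feature; it is an illustration, not the preprint's proof or code. The helper names (wl_colors, cycle_graph, path_graph) are hypothetical. A cycle (a Cayley graph of Z/8, hence vertex-transitive) is contrasted with a path, which is not vertex-transitive.

```python
def wl_colors(adj, rounds=3):
    """1-WL colour refinement, starting from identical colours on every node."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        # A node's signature is its own colour plus the sorted multiset
        # of its neighbours' colours.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Relabel the distinct signatures with small integers.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors


def cycle_graph(n):
    # Cayley graph of Z/n with generators +1 and -1 (vertex-transitive).
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def path_graph(n):
    # Not vertex-transitive: endpoints differ from interior nodes.
    return {v: [u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)}


cycle_classes = len(set(wl_colors(cycle_graph(8)).values()))
path_classes = len(set(wl_colors(path_graph(8)).values()))
print(cycle_classes, path_classes)
```

On the cycle, refinement never separates any two nodes, so a message-passing network with uniform inputs computes the same embedding everywhere. On the path, the same procedure distinguishes nodes by their distance to an endpoint.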

Positive Results (Thread B — LMFDB-Scale Learning)

When the data has structure, ML succeeds:

Hecke trace regression: R² = 0.987 on 53,000 forms using 100 Hecke traces (scikit-learn gradient boosting regressor, GBR). Data quantity, not model architecture, was the bottleneck.
Analytic rank classification: F₁ = 0.970 (MLP/RF/GB ensemble on 100 Hecke traces).
L-function zero prediction: R² = 0.731 (GAT on 1000-node trace-index graphs).
CM form detection: F₁ = 0.919 (Sato-Tate moment features, M₄/M₂ ratio primary).
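To see why a fourth-to-second moment ratio can separate CM from non-CM forms, the sketch below computes the limiting ratios under standard assumptions that are not spelled out on this page: traces are normalised as x_p = a_p / √p = 2 cos θ_p, non-CM angles follow the Sato-Tate density (2/π) sin² θ, and for CM forms half of the primes give a_p = 0 while the rest have uniformly distributed angles. The preprint's exact feature definition may differ; function names here are hypothetical.

```python
import math


def moment(k, density, steps=20000):
    """Midpoint-rule estimate of E[(2 cos t)^k] for an angle density on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (2.0 * math.cos(t)) ** k * density(t) * h
    return total


def sato_tate_density(t):
    # Assumed non-CM angle distribution.
    return (2.0 / math.pi) * math.sin(t) ** 2


def uniform_density(t):
    # Assumed angle distribution for the non-vanishing half of CM traces.
    return 1.0 / math.pi


def non_cm_moment(k):
    return moment(k, sato_tate_density)


def cm_moment(k):
    # Half of the primes contribute a_p = 0, so every positive moment is halved.
    return 0.5 * moment(k, uniform_density)


non_cm_ratio = non_cm_moment(4) / non_cm_moment(2)
cm_ratio = cm_moment(4) / cm_moment(2)
print(round(non_cm_ratio, 6), round(cm_ratio, 6))
```

Under these assumptions the limiting M₄/M₂ ratio is 2 for non-CM forms and 3 for CM forms, so an empirical ratio computed from finitely many traces has a clear gap to exploit.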

Structural Discoveries (Thread C — CvS, GUE, Sato-Tate)

Connes CvS operator extracts ζ zeros to 10⁻¹⁶ machine precision at N=100.
Two-population GUE statistics: dim=1 → GUE, dim≥2 → GOE across 63,844 forms (Cohen's d = 8.808).
Galois correlation: ρ₂ = −0.607 ± 0.012, dilution law ρ_d ∼ d^.
Sato-Tate moment diagrams: clear convergence gaps between CM and non-CM families.
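For readers unfamiliar with the GUE/GOE distinction above, the sketch below evaluates the standard Wigner surmise approximations to the nearest-neighbour spacing density P(s) for both ensembles and checks that each is normalised with unit mean spacing. This is textbook random matrix theory offered as context; it does not reproduce the preprint's statistical test, and the integration helper is a hypothetical convenience.

```python
import math


def wigner_goe(s):
    # Wigner surmise for GOE: linear level repulsion near s = 0.
    return (math.pi / 2.0) * s * math.exp(-math.pi * s * s / 4.0)


def wigner_gue(s):
    # Wigner surmise for GUE: quadratic level repulsion near s = 0.
    return (32.0 / math.pi ** 2) * s * s * math.exp(-4.0 * s * s / math.pi)


def integrate(f, upper=10.0, steps=100000):
    """Midpoint rule on [0, upper]; both densities are negligible beyond s = 10."""
    h = upper / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h


goe_norm = integrate(wigner_goe)
gue_norm = integrate(wigner_gue)
goe_mean = integrate(lambda s: s * wigner_goe(s))
gue_mean = integrate(lambda s: s * wigner_gue(s))
print(goe_norm, gue_norm, goe_mean, gue_mean)
```

The two curves differ most at small spacings, where GUE suppresses near-degenerate levels more strongly than GOE; that is the regime where a two-population split would be most visible.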

New in the Preprint (Not Previously Published)

GAT attention analysis (Section 4.10): 56M+ edge observations across 2000 test graphs. Prime bias is 6.1% (ratio 1.061) with Cohen's d = 0.035, a negligible effect: GAT does not learn the Ramanujan–Petersson bound. Attention in layer 2 is focused (entropy 7.15, sparsity 0.62), while layers 0 and 1 are diffuse.
FunSearch specifications (Thread K): Two convergent search targets for arithmetic function discovery — multiplicative Hecke trace models and spectral gap formulas on LPS graphs.
Spectral rigidity analysis: GUE eigenvalue spacing statistics (Δ₃, P(s)) across dimensions, with the dim=1/≥2 transition.
Expanded theoretical framework: Weisfeiler-Leman proof of MPNN limitations on vertex-transitive graphs formalised as Theorem 4.2.
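Cohen's d is quoted twice on this page (8.808 for the dim=1 versus dim≥2 split, 0.035 for the prime bias in attention), so a short reminder of how it is computed may help. The sketch below uses the common pooled-standard-deviation form; whether the preprint uses this exact variant is an assumption, and the two groups are toy numbers, not data from the experiments.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d with pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Toy data for illustration only.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 3.0, 5.0]
d_example = cohens_d(group_a, group_b)
print(d_example)
```

Because d divides the mean difference by the spread, a 6.1% shift in mean attention can still correspond to d = 0.035 when the per-edge variation is large, which is why the prime bias is reported as negligible.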

Key Experimental Constants

Experiment	Sample	Best Result
Cayley spectral gap (GNN)	27 primes, 1M+ node graphs	ΔR² = +0.042 (chance-level)
Hecke trace regression	53,000 forms	R² = 0.987 (sklearn GBR)
Analytic rank	46,347 forms	F₁ = 0.970
L-zero prediction (GAT)	46,347 forms, 1000-node graphs	R² = 0.731
CM classification	46,347 forms	F₁ = 0.919
GAT attention (prime bias)	2000 graphs, 56M edges	d = 0.035 (negligible)
ζ zeros via CvS	N = 100	10⁻¹⁶ machine precision

How to Cite

Weiss, T. (2026). Machine Learning for Modular Forms: Graph Neural Networks, Spectral Methods, and L-Function Phenomenology. Zenodo. https://doi.org/10.5281/zenodo.20479512

Repository

All code, data pipelines, and experimental scripts are available in the Riemann Project repository (Dockerised, with Makefile targets for reproduction). The repository includes the full knowledge graph (Cypher/Neo4j), 7 GNN implementations, eigenvalue computation via sparse Lanczos, and Sato-Tate moment analysis.
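As a small-scale illustration of the spectral computation mentioned above, the sketch below builds a Cayley graph of SL(2,F₅) and estimates its second-largest adjacency eigenvalue. Several things here are assumptions rather than the repository's code: the generating set (the two elementary matrices and their inverses), the definition of the gap as degree minus second eigenvalue, and the use of plain power iteration in pure Python as a stand-in for a sparse Lanczos solver (for large graphs a Lanczos-type routine such as scipy.sparse.linalg.eigsh would be the usual choice).

```python
import math
from itertools import product


def sl2_elements(p):
    # All 2x2 matrices (a, b, c, d) over F_p with determinant 1.
    return [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
            if (a * d - b * c) % p == 1]


def multiply(g, h, p):
    a, b, c, d = g
    e, f, r, s = h
    return ((a * e + b * r) % p, (a * f + b * s) % p,
            (c * e + d * r) % p, (c * f + d * s) % p)


def cayley_graph(p):
    # Assumed generators: elementary matrices and their inverses (symmetric set).
    gens = [(1, 1, 0, 1), (1, p - 1, 0, 1), (1, 0, 1, 1), (1, 0, p - 1, 1)]
    elements = sl2_elements(p)
    index = {g: i for i, g in enumerate(elements)}
    return [[index[multiply(g, s, p)] for s in gens] for g in elements]


def second_eigenvalue(neighbors, iterations=2000):
    """Power iteration on A + d*I, restricted to vectors orthogonal to constants."""
    n, degree = len(neighbors), len(neighbors[0])
    x = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    lam = 0.0
    for _ in range(iterations):
        m = sum(x) / n
        x = [v - m for v in x]  # remove the trivial (constant) eigenvector
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        ax = [sum(x[j] for j in nbrs) for nbrs in neighbors]
        lam = sum(u * v for u, v in zip(x, ax))  # Rayleigh quotient for A
        x = [u + degree * v for u, v in zip(ax, x)]  # shift keeps the operator PSD
    return lam


graph = cayley_graph(5)
lam2 = second_eigenvalue(graph)
spectral_gap = len(graph[0]) - lam2
print(len(graph), lam2, spectral_gap)
```

The group has p(p² − 1) = 120 elements for p = 5, so this runs in well under a second; the million-node graphs in the experiments table are where a sparse iterative solver becomes necessary.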
