Predicting L-Function Properties from Trace-Index Graphs using Graph Neural Networks
AI & Machine Learning | Full paper: PDF, 12 pages
This paper is the positive counterpart to our negative-result study: When Graph Neural Networks Meet the Riemann Hypothesis: A Systematic Negative Study
Abstract
Can machine learning predict arithmetic properties of modular forms by operating on graph-structured representations of Fourier coefficient data? We investigate this question using Graph Neural Networks on trace-index graphs: 1000-node graph representations of individual newforms, where each node corresponds to a Fourier index and edges encode sequential adjacency, primality structure, and k-nearest-neighbor similarity in coefficient space.
On 46,347 weight-2 newforms from the LMFDB, a 3-layer Chebyshev spectral filter network (K=5) predicts:
- First L-function zero: R² = 0.625
- Analytic rank: 94.16% accuracy, macro F₁ = 89.22%
- CM status: 100% accuracy
Spectral filters consistently outperform plain GCN, with the largest gains on rare-class detection (+38.87 pp in class-2 F₁). Cross-level generalization shows regression degrades modestly (-14% in R²) while classification suffers more severely. Our Sato-Tate moment analysis across 53,779 forms confirms the empirical trace distribution matches SU(2) theory, with CM forms clearly distinguished by their U(1) moments.
Key Results
| Target | Metric | GCN Baseline | ChebConv K=5 |
|---|---|---|---|
| z₁ (first L-function zero) | R² | 0.559 | 0.625 |
| Analytic rank (3-class) | Accuracy | 91.27% | 94.16% |
| Analytic rank | F₁ macro | 74.61% | 89.22% |
| Analytic rank (class ≥2) | F₁ | 40.00% | 78.87% |
| CM status (binary) | Accuracy | 99.96% | 100.00% |
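The gap between accuracy and macro F₁ in the table is what one would expect when one class is rare: accuracy is dominated by the frequent ranks, while macro F₁ weights every class equally. A minimal sketch with a made-up confusion matrix (the counts are purely illustrative and are not taken from the paper):

```python
def per_class_f1(cm):
    """cm[t][p] = number of samples with true class t predicted as class p."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[t][c] for t in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def accuracy(cm):
    total = sum(sum(row) for row in cm)
    return sum(cm[c][c] for c in range(len(cm))) / total


# Hypothetical 3-class counts (rank 0, rank 1, rank >= 2); NOT the paper's data.
cm = [
    [50, 0, 0],
    [0, 45, 0],
    [0, 4, 1],   # the rare class is mostly confused with rank 1
]

f1 = per_class_f1(cm)
macro_f1 = sum(f1) / len(f1)
print(f"accuracy = {accuracy(cm):.3f}, macro F1 = {macro_f1:.3f}, rare-class F1 = {f1[2]:.3f}")
```

In this toy case a handful of missed rare-class samples barely moves accuracy but pulls macro F₁ down sharply, which is why the rare-class row is the most sensitive comparison between architectures.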
Why This Is Not Trivial
This work follows a systematic negative result: GNNs on Cayley graphs of SL(2,Fₚ) cannot predict spectral properties because vertex-transitivity forces every node to be structurally identical. A message-passing GNN receives the same information from every node, making graph-level prediction impossible. Across seven experiment tracks, every GNN configuration failed or provided only marginal improvement over baselines.
The solution: instead of constructing graphs from algebraic structure (the group SL(2,Fₚ)), construct them from Fourier coefficient data itself. This gives the GNN:
- Heterogeneous node features — each Fourier index carries a unique coefficient value
- Data-dependent topology — kNN edges connect indices with similar trace magnitudes
- Fixed graph size — all 46,347 samples have exactly 1000 nodes
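The vertex-transitivity obstruction can be illustrated on a toy example. The sketch below uses color refinement (1-WL) as a proxy for what a message-passing GNN can distinguish when every node starts with the same feature. The 8-cycle, which is the Cayley graph of Z/8 with generators ±1, is only a small stand-in for the SL(2,Fₚ) Cayley graphs, and the path graph is an arbitrary non-transitive contrast; neither is taken from the paper.

```python
def wl_colors(adj, rounds=3):
    """Color refinement: each round, a node's new color is determined by its
    current color and the multiset of its neighbors' colors."""
    colors = [0] * len(adj)  # identical initial features on every node
    for _ in range(rounds):
        sigs = [
            (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in range(len(adj))
        ]
        palette = {s: i for i, s in enumerate(sorted(set(sigs)))}
        colors = [palette[s] for s in sigs]
    return colors


n = 8
cycle = [[(v - 1) % n, (v + 1) % n] for v in range(n)]            # vertex-transitive
path = [[u for u in (v - 1, v + 1) if 0 <= u < n] for v in range(n)]  # not transitive

print("cycle colors:", wl_colors(cycle))
print("path colors: ", wl_colors(path))
```

On the cycle every node keeps the same color no matter how many rounds are run, so node embeddings carry no information beyond graph size and degree. On the path, distance from the endpoints already separates nodes. Trace-index graphs avoid the problem from the start because node features differ.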
The Sato-Tate Connection
The paper includes a statistical analysis of normalized Hecke traces xₚ = aₚ(f) / (2√p) across 53,779 forms:
- Non-CM forms follow the SU(2) distribution (Sato-Tate semicircle); with the normalization xₚ ∈ [-1, 1], the even moments are the Catalan numbers rescaled by powers of 4, E[xₚ^(2k)] = C_k / 4^k
- CM forms follow the U(1) distribution with characteristic peaks near ±1
- The finite-sample fluctuations from these limiting distributions are precisely what the GNN learns to exploit for zero prediction
This connects two central themes in the analytic theory of modular forms: the equidistribution of Hecke eigenvalues and L-function zero statistics.
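As a sanity check on the limiting law, the following sketch draws half-traces of Haar-random SU(2) matrices and compares their empirical moments with the semicircle values C_k / 4^k that apply under the normalization xₚ = aₚ(f) / (2√p). It samples from the limiting distribution only; it does not use LMFDB data, and the sample size and seed are arbitrary choices.

```python
import math
import random


def sample_half_trace(rng):
    """x = tr(g)/2 for Haar-random g in SU(2).

    SU(2) is the unit 3-sphere in R^4, and tr(g)/2 is the first coordinate,
    so we normalize a standard Gaussian 4-vector and keep its first entry.
    """
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in q))
    return q[0] / norm


def empirical_moment(xs, k):
    return sum(x ** k for x in xs) / len(xs)


def semicircle_moment(k):
    """k-th moment of the semicircle law on [-1, 1]: C_m / 4^m for k = 2m, else 0."""
    if k % 2:
        return 0.0
    m = k // 2
    catalan = math.comb(2 * m, m) // (m + 1)
    return catalan / 4 ** m


rng = random.Random(0)
xs = [sample_half_trace(rng) for _ in range(100_000)]
for k in (1, 2, 4, 6):
    print(f"M_{k}: empirical {empirical_moment(xs, k):+.4f}, theory {semicircle_moment(k):.4f}")
```

With far fewer samples per form, as in real coefficient data, the empirical moments fluctuate around these limits, and those fluctuations are the signal discussed above.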
Approach
Trace-Index Graph Construction
For each modular form f, we construct a graph G_f with:
- 1000 nodes — one per index n = 1, ..., 1000, each carrying 5-dimensional features
- ~9,500 edges from three sources:
  - Sequential: consecutive indices are connected (i to i+1)
  - Prime: two nodes are connected when both indices are prime (the 168 prime-indexed nodes form a clique)
  - kNN: i is connected to j when j is among the k=3 nearest indices to i in coefficient-value space
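The three edge sources can be sketched as follows. This is one literal reading of the description above, not the paper's code: the coefficient values are random placeholders, distance in "coefficient-value space" is assumed to be the absolute difference of scalar coefficients, and edges are stored undirected and deduplicated. The 5-dimensional node features are not reconstructed here, and the resulting edge count need not match the ~9,500 quoted above, since the exact prime-edge and deduplication rules are not spelled out on this page.

```python
import math
import random
from itertools import combinations


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def build_edges(coeffs, k=3):
    """coeffs[i - 1] is the coefficient at index i (1-based).

    Returns a set of undirected edges (u, v) with u < v.
    """
    n = len(coeffs)
    edges = set()

    # Sequential edges: i -- i+1
    for i in range(1, n):
        edges.add((i, i + 1))

    # Prime edges: every pair of prime indices (a clique)
    primes = [i for i in range(1, n + 1) if is_prime(i)]
    edges.update(combinations(primes, 2))

    # kNN edges: each index links to its k closest indices by coefficient value
    for i in range(1, n + 1):
        others = sorted(
            (j for j in range(1, n + 1) if j != i),
            key=lambda j: abs(coeffs[j - 1] - coeffs[i - 1]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))

    return edges


# Hypothetical coefficients for a 30-node toy graph (the paper uses 1000 nodes).
rng = random.Random(0)
coeffs = [rng.uniform(-1.0, 1.0) for _ in range(30)]
edges = build_edges(coeffs, k=3)
print(len(edges), "undirected edges on", len(coeffs), "nodes")
```

Only the kNN part depends on the form, so two forms always share the sequential and prime edges and differ in their similarity edges and node features.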
Architecture
- GCN: 3-layer graph convolution, hidden 128, BatchNorm, mean+max readout (baseline)
- ChebConv: 3-layer Chebyshev spectral filter, hidden 128, polynomial order K ∈
All models are trained with AdamW, a CosineAnnealingLR schedule, early stopping, and a stratified 80/10/10 train/validation/test split.
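To make the difference from a plain GCN concrete, here is a minimal sketch of a single Chebyshev filter acting on a scalar node signal. It uses the standard recurrence T₀x = x, T₁x = L̃x, T_k x = 2L̃T_{k-1}x - T_{k-2}x, with the rescaled Laplacian taken as L̃ = -D^(-1/2) A D^(-1/2), which assumes λ_max ≈ 2. A trained layer would use learned weight matrices on 128-dimensional features rather than the scalar coefficients `theta` used here, and the toy path graph is an arbitrary choice.

```python
import math


def apply_scaled_laplacian(adj, x):
    """Apply L~ = -D^{-1/2} A D^{-1/2} to x (assumes lambda_max = 2)."""
    deg = [len(nbrs) for nbrs in adj]
    return [
        -sum(x[j] / math.sqrt(deg[i] * deg[j]) for j in nbrs)
        for i, nbrs in enumerate(adj)
    ]


def cheb_filter(adj, x, theta):
    """Return sum_k theta[k] * T_k(L~) x using the Chebyshev recurrence."""
    t_prev = list(x)                              # T_0 x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_cur = apply_scaled_laplacian(adj, x)        # T_1 x
    out = [o + theta[1] * v for o, v in zip(out, t_cur)]
    for k in range(2, len(theta)):
        lt = apply_scaled_laplacian(adj, t_cur)
        t_next = [2 * a - b for a, b in zip(lt, t_prev)]  # T_k x
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_cur = t_cur, t_next
    return out


# Path graph 0-1-2-3-4 with a unit impulse at node 0.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(cheb_filter(path, impulse, theta=[1.0, 1.0, 1.0]))
```

With three coefficients the impulse reaches nodes up to two hops away and no further, so a single layer with more Chebyshev terms has a wider receptive field than a one-hop GCN layer.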
Cross-Level Generalization
Training on conductors ≤ 3000 (low level) and testing on conductors > 4000 (high level) reveals an asymmetry:
- Regression generalizes well: z₁ R² drops only 14% (0.625 → 0.538)
- Classification degrades: Rank accuracy drops from 94.16% to 87.58%
- Rare class collapses: Class-2 F₁ drops from 78.87% to 25.66%
This suggests the GNN learns conductor-independent patterns for zero prediction but conductor-dependent patterns for rare-class rank classification.
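The split and the reported drops can be reproduced in a few lines. The record layout (a dict with a `conductor` key) and the sample conductors are hypothetical; the thresholds and metric values are the ones quoted above. Forms with conductor between 3001 and 4000 fall in neither set under this reading.

```python
def cross_level_split(forms, train_max=3000, test_min=4000):
    """Train on low-level forms, test on high-level forms; the band in between is unused."""
    train = [f for f in forms if f["conductor"] <= train_max]
    test = [f for f in forms if f["conductor"] > test_min]
    return train, test


def relative_drop(in_level, cross_level):
    """Fraction of the in-distribution score lost under cross-level testing."""
    return (in_level - cross_level) / in_level


# Hypothetical records, only to exercise the split.
forms = [{"conductor": c} for c in (11, 2999, 3000, 3500, 4000, 4001, 5077)]
train, test = cross_level_split(forms)
print(len(train), "train forms,", len(test), "test forms")

# Metric values quoted in this section.
drops = {
    "z1 R2": relative_drop(0.625, 0.538),
    "rank accuracy": relative_drop(94.16, 87.58),
    "class-2 F1": relative_drop(78.87, 25.66),
}
for name, d in drops.items():
    print(f"{name}: {100 * d:.1f}% relative drop")
```

Expressed this way, the rare-class F₁ loses roughly two thirds of its value while R² loses about 14%, which makes the asymmetry between regression and rare-class classification explicit.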
Limitations
- Regression below scikit-learn baselines: R² = 0.625 is lower than tree ensembles trained on raw Fourier coefficients (R² 0.73-0.96 in Experiment 11)
- Weight-2 only: Generalization to other weights is untested
- Rare-class sensitivity: Class-2 F₁ collapses on unseen conductors
- No causal claims: The GNN learns statistical patterns, not proofs of arithmetic theorems
Latest Update: CM Classification with Sato-Tate Moments (June 2026)
Building on the trace-index graph work, a new study achieves F₁ = 0.900 for CM detection in weight-2 newforms using Gradient Boosting Machines trained on prime-indexed Fourier coefficients combined with 11 Sato-Tate moments:
- Top predictor: M₄/M₂ ratio (importance 0.157)
- Dataset: 53,779 forms (213 CM, 53,566 non-CM)
- Method: 36-dimensional feature vector (25 traces + 11 moments)
- Error rate: 8 misclassifications out of 10,756 (0.074%)
Key insight: The M₄/M₂ dimensional dilation ratio captures discriminative information about CM structure, while individual trace coefficients at p = 23, 41, and 7 provide complementary signal. This shows that CM status is learnable from low-dimensional feature sets without expensive elliptic curve analysis.
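One plausible reading of why a fourth-to-second moment ratio is informative comes from the limiting distributions themselves. The sketch below assumes M_k denotes the plain power moment E[xₚ^k] of the normalized traces, which may differ from the study's exact definition. For non-CM forms it uses the semicircle moments; for CM forms it uses the standard limiting law in which half of the primes have aₚ = 0 and the rest follow the arcsine law on [-1, 1].

```python
from fractions import Fraction
from math import comb


def su2_moment(k):
    """Semicircle law on [-1, 1]: E[x^(2m)] = C_m / 4^m, odd moments vanish."""
    if k % 2:
        return Fraction(0)
    m = k // 2
    return Fraction(comb(2 * m, m), (m + 1) * 4 ** m)


def cm_moment(k):
    """CM limiting law for k >= 1: weight 1/2 at x = 0, weight 1/2 on the arcsine law.

    Arcsine law on [-1, 1]: E[x^(2m)] = binom(2m, m) / 4^m.
    """
    if k % 2:
        return Fraction(0)
    m = k // 2
    return Fraction(1, 2) * Fraction(comb(2 * m, m), 4 ** m)


for name, moment in (("non-CM (SU(2))", su2_moment), ("CM", cm_moment)):
    m2, m4 = moment(2), moment(4)
    print(f"{name}: M2 = {m2}, M4 = {m4}, M4/M2 = {m4 / m2}")
```

Under these assumptions the second moment is 1/4 in both cases, so M₂ alone cannot separate the classes, while M₄/M₂ is 1/2 for non-CM forms and 3/4 for CM forms in the limit. With only 25 traces per form the empirical ratio fluctuates around these values, which is consistent with individual trace coefficients adding complementary signal.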
Published: Data-Driven Detection of Complex Multiplication in Weight 2 Cusp Forms (Zenodo 10.5281/zenodo.20555502)