Predicting L-Function Properties from Trace-Index Graphs using Graph Neural Networks
Abstract
We investigate whether graph neural networks can predict arithmetic properties of modular forms by operating on graphs constructed from Fourier coefficient data. For each modular form $f$, we build a trace-index graph: 1000 nodes (one per index $n \le 1000$, corresponding to Fourier coefficients $a_n$), connected by sequential, primality, and $k$-nearest-neighbor edges, with node features encoding the coefficient values and their normalizations. On 46,347 weight-2 newforms from the LMFDB, a 3-layer Chebyshev convolution network (ChebConv) predicts the first $L$-function zero with $R^2 = 0.625$, analytic rank with 94.16% accuracy, and CM status with 100% accuracy.
Key Results
| Target | Metric | GCN Baseline | ChebConv |
|---|---|---|---|
| First $L$-function zero | $R^2$ | 0.559 | 0.625 |
| Analytic rank (3-class) | Accuracy | 91.27% | 94.16% |
| Analytic rank | Macro $F_1$ | 74.61% | 89.22% |
| Analytic rank (class ≥ 2) | $F_1$ | 40.00% | 78.87% |
| CM status (binary) | Accuracy | 99.96% | 100.00% |
Approach
Trace-Index Graph Construction
For each modular form $f$, we construct a graph with:
- 1000 nodes — one per index $n \in \{1, \dots, 1000\}$, each carrying 5-dimensional features encoding the coefficient $a_n$ and its normalizations
- ~9,500 edges from three sources:
  - Sequential: edges $(n, n+1)$ between consecutive indices
  - Prime: edges between indices that are both prime (168 prime-indexed nodes)
  - $k$-NN: an edge $(m, n)$ when $n$ is among the $k$ nearest indices to $m$ in coefficient-value space
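The construction above can be sketched directly. This is a minimal sketch, not the authors' code: the exact $k$, the consecutive-prime chaining, and the array `a` standing in for $a_1, \dots, a_{1000}$ are all assumptions; the text specifies only the three edge families.

```python
import numpy as np

def trace_index_edges(a, k=8):
    """Build the three edge sets of a trace-index graph.

    a : length-1000 array of Fourier coefficients a_1..a_1000
        (node i represents index n = i + 1).
    k : number of nearest neighbors (assumed value; not given in the text).
    """
    n = len(a)
    # Sequential edges: consecutive indices (n, n+1).
    sequential = [(i, i + 1) for i in range(n - 1)]

    # Sieve of Eratosthenes over indices 1..n to find prime-indexed nodes.
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False
    for p in range(2, int(n**0.5) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = False
    primes = np.flatnonzero(is_prime)
    # Prime edges: chain consecutive primes (assumed wiring; the text
    # says only that edges join pairs of prime indices).
    prime_edges = [(p - 1, q - 1) for p, q in zip(primes, primes[1:])]

    # k-NN edges: node m links to the k nodes whose coefficient value
    # is closest to a[m] (brute-force search in coefficient-value space).
    dist = np.abs(a[:, None] - a[None, :])
    np.fill_diagonal(dist, np.inf)
    knn = {(min(m, j), max(m, j))
           for m in range(n)
           for j in np.argsort(dist[m])[:k]}
    return sequential, prime_edges, sorted(knn)
```

With 1000 nodes this yields 999 sequential edges and, since there are 168 primes below 1000, 167 prime-chain edges under the chaining assumption; the $k$-NN family supplies the bulk of the ~9,500 total.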
This is fundamentally different from prior work using Cayley graphs of finite groups, which are vertex-transitive and give GNNs no local diversity to exploit.
Why This Works (and Cayley Graphs Don't)
| Property | Cayley | Trace-Index |
|---|---|---|
| Vertex-transitive | Yes | No |
| Node features | Identical (structural) | Unique (Fourier coefficients) |
| Graph topology | Algebraic (group) | Data-driven ($k$-NN) |
| Best $R^2$ (test) | (all experiments) | 0.625 |
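For reference, a ChebConv layer aggregates $K$-hop neighborhoods via Chebyshev polynomials $T_k$ of the rescaled graph Laplacian, $x' = \sum_{k} T_k(\tilde{L})\, X\, \Theta_k$. A minimal numpy sketch of this standard propagation rule (the weights here are random placeholders, not the trained model):

```python
import numpy as np

def cheb_conv(X, A, thetas):
    """One ChebConv layer: sum_k T_k(L~) @ X @ thetas[k].

    X      : (N, F) node feature matrix
    A      : (N, N) symmetric adjacency matrix
    thetas : list of K weight matrices of shape (F, F_out)
    L~ = 2 L / lambda_max - I, with L the symmetric normalized Laplacian.
    """
    N = A.shape[0]
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    L = np.eye(N) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(L).max()
    L_tilde = (2.0 / lam_max) * L - np.eye(N)

    # Chebyshev recurrence: T_0 = I, T_1 = L~, T_k = 2 L~ T_{k-1} - T_{k-2}
    T_prev, T_curr = np.eye(N), L_tilde
    out = T_prev @ X @ thetas[0]
    for k in range(1, len(thetas)):
        out += T_curr @ X @ thetas[k]
        T_prev, T_curr = T_curr, 2 * L_tilde @ T_curr - T_prev
    return out
```

Because $T_k(\tilde{L})$ is a degree-$k$ polynomial in the Laplacian, each layer mixes information from up to $K-1$ hops away; on a vertex-transitive graph with identical node features this mixing produces the same output at every node, which is one way to see why Cayley-graph inputs carry no usable signal.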
Cross-Level Generalization
Training on conductors ≤ 3000 and testing on conductors > 4000 reveals an interesting asymmetry:
- Regression generalizes well: $R^2$ drops only 14% (0.625 → 0.538)
- Classification degrades: Rank accuracy drops from 94.16% to 87.58%, and rare class 2 collapses from 78.87% to 25.66%
This suggests the GNN learns conductor-independent patterns for regression but conductor-dependent patterns for classification.
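As a quick sanity check on the figures above (treating the 14% as a relative drop is an assumption about how it was derived):

```python
# Cross-level generalization gap, values from the section above.
r2_in, r2_out = 0.625, 0.538      # R^2: train conductors <= 3000, test > 4000
acc_in, acc_out = 0.9416, 0.8758  # 3-class rank accuracy, same split

r2_drop = (r2_in - r2_out) / r2_in  # relative R^2 drop
acc_drop = acc_in - acc_out         # absolute accuracy drop

print(f"relative R^2 drop: {r2_drop:.1%}")       # ~13.9%, i.e. "only 14%"
print(f"accuracy drop: {acc_drop:.2%} points")
```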
Limitations
- Below-sklearn regression: the ChebConv $R^2$ of 0.625 is lower than tree ensembles on raw Fourier coefficients ($R^2$ 0.73–0.96)
- Weight-2 only: Generalization to other weights is untested
- Rare-class sensitivity: Class 2 collapses on unseen conductors
- No causal claims: The GNN learns statistical patterns, not proofs of arithmetic theorems
Data source: The L-Functions and Modular Forms Database (LMFDB)