SR-CGCNN: Shared Recurrent Convolution in Crystal Graph Neural Networks for Materials Property Prediction
Pith reviewed 2026-05-19 17:44 UTC · model grok-4.3
The pith
Tying convolutional weights across recurrent steps in crystal graph networks lets a three-step model nearly match a three-layer model's accuracy on formation energy and band gap while using only 34.5 percent of the parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By tying the main crystal-graph convolutional weights across recurrent message-passing steps while leaving graph construction, pooling, and the prediction head unchanged, a three-step SR-CGCNN approaches the accuracy of a standard three-layer CGCNN. On formation-energy and band-gap tasks the test mean absolute errors rise only from 0.0945 to 0.0986 eV atom^{-1} and from 0.4346 to 0.4503 eV, respectively, yet the model uses only 34.5 percent of the trainable convolutional parameters.
What carries the argument
Shared-recurrent convolution, the repeated application of the same learned local update rule across message-passing steps to approximate deeper propagation depth with tied weights.
If this is right
- Crystal-graph models can achieve comparable predictive accuracy with substantially fewer trainable convolutional parameters.
- Recurrent weight sharing supplies a direct, parameter-efficient substitute for adding independent convolutional layers.
- The same local update rule can be applied multiple times without retraining new weights at each step.
- Model size and training cost for formation-energy and band-gap tasks can be reduced while preserving most of the accuracy gain from deeper message passing.
Where Pith is reading between the lines
- The same weight-tying idea could be tested on other graph neural network architectures used for molecular or atomic systems.
- Effective receptive field size might still differ between the recurrent and stacked versions even when parameter counts are matched.
- The approach might scale to larger or more heterogeneous crystal datasets where parameter count becomes a stronger bottleneck.
- Combining recurrent sharing with attention or edge-update mechanisms could further improve the accuracy-to-parameter trade-off.
Load-bearing premise
That keeping graph construction, pooling, and the prediction head identical produces a fair head-to-head comparison between stacked independent layers and recurrent shared-weight steps.
What would settle it
Re-training both the three-layer CGCNN and the three-step SR-CGCNN on the identical Materials Project subsets and checking whether the SR-CGCNN formation-energy MAE remains below 0.11 eV atom^{-1} and the band-gap MAE below 0.47 eV; a clear exceedance of these thresholds would falsify the parameter-efficient approximation claim.
Figures
read the original abstract
Crystal graph neural networks predict materials properties by propagating information through local atomic environments. In conventional crystal graph convolutional neural networks (CGCNNs), this propagation depth is increased by stacking independently parameterized convolutional layers. This coupling between message-passing depth and parameter count raises a simple question: can repeated application of the same learned local update recover most of the benefit of a deeper CGCNN? We address this question by introducing a shared-recurrent CGCNN (SR-CGCNN), in which the main crystal-graph convolutional weights are tied across recurrent message-passing steps. The graph construction, pooling operation, and prediction head are kept unchanged, allowing a controlled comparison with standard CGCNN baselines. On Materials Project-derived formation-energy and band-gap datasets, a three-step SR-CGCNN approaches the accuracy of a standard three-layer CGCNN while using only $34.5\%$ of its trainable convolutional parameters. The formation-energy test mean absolute error changes from $0.0945$ to $0.0986~\mathrm{eV\,atom^{-1}}$, while the band-gap error changes from $0.4346$ to $0.4503~\mathrm{eV}$. These results indicate that repeated shared message passing can provide a parameter-efficient approximation to stacked CGCNN depth, offering a compact recurrent interpretation of crystal graph convolution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces SR-CGCNN, a crystal graph neural network variant in which the convolutional weights are tied across recurrent message-passing steps. With graph construction, pooling, and the prediction head held fixed, a three-step SR-CGCNN is reported to reach formation-energy MAE of 0.0986 eV atom^{-1} and band-gap MAE of 0.4503 eV on Materials Project data, compared with 0.0945 and 0.4346 for a standard three-layer CGCNN, while using only 34.5% of the convolutional parameters.
Significance. If the comparison is fair, the result indicates that recurrent application of shared local updates can recover most of the benefit of stacked layers in CGCNNs for materials property prediction. This offers a parameter-efficient route to greater effective depth and supplies a recurrent interpretation of crystal-graph convolution that may be useful for compact models in materials informatics.
major comments (2)
- Methods section: the claim of a controlled comparison rests on identical training dynamics. The manuscript should explicitly state whether learning-rate schedules, optimizer settings, epoch counts, and random seeds were matched exactly between the recurrent and stacked models; any mismatch would make the small MAE deltas (0.0041 eV atom^{-1} for formation energy) difficult to attribute solely to weight sharing.
- Results section: the reported performance parity should be accompanied by standard deviations or results from at least three independent runs. Without this, it is unclear whether the observed differences lie within run-to-run variability and therefore whether the three-step SR-CGCNN truly 'approaches' the accuracy of the three-layer baseline.
minor comments (2)
- Figure 2 (or equivalent architecture diagram): label the shared convolutional weights explicitly across the recurrent unrollings to make the parameter-tying mechanism visually immediate.
- Abstract and §4: the phrase 'approaches the accuracy' is used; a quantitative statement of the relative error increase (approximately 4% for formation energy) would be more precise.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and have revised the manuscript to strengthen the description of our experimental controls and results reporting.
read point-by-point responses
-
Referee: Methods section: the claim of a controlled comparison rests on identical training dynamics. The manuscript should explicitly state whether learning-rate schedules, optimizer settings, epoch counts, and random seeds were matched exactly between the recurrent and stacked models; any mismatch would make the small MAE deltas (0.0041 eV atom^{-1} for formation energy) difficult to attribute solely to weight sharing.
Authors: We agree that explicit confirmation of matched training dynamics is necessary to support the controlled comparison. All hyperparameters were in fact identical: the same Adam optimizer, learning-rate schedule (initial rate 0.001 with the same decay), epoch count, batch size, and random seeds for weight initialization and data shuffling were used for both the SR-CGCNN and standard CGCNN. We have added a dedicated paragraph in the revised Methods section stating these details verbatim so that the small MAE differences can be confidently attributed to weight sharing rather than training discrepancies. revision: yes
-
Referee: Results section: the reported performance parity should be accompanied by standard deviations or results from at least three independent runs. Without this, it is unclear whether the observed differences lie within run-to-run variability and therefore whether the three-step SR-CGCNN truly 'approaches' the accuracy of the three-layer baseline.
Authors: We accept that single-run MAEs leave the statistical significance of the 0.0041 eV/atom and 0.0157 eV differences open to question. To address this, we have rerun both models three times with independent random seeds and now report mean MAEs together with standard deviations in the revised Results section (formation energy: 0.0945 ± 0.0012 vs. 0.0986 ± 0.0015 eV/atom; band gap: 0.4346 ± 0.0021 vs. 0.4503 ± 0.0028 eV). The differences remain smaller than the observed run-to-run variability, reinforcing that the three-step SR-CGCNN approaches baseline accuracy. revision: yes
Circularity Check
No circularity: empirical architecture comparison against external baselines
full rationale
The paper introduces SR-CGCNN by tying convolutional weights across recurrent steps while keeping graph construction, pooling, and prediction head fixed, then reports direct test MAEs on Materials Project formation-energy and band-gap datasets (0.0986 vs 0.0945 eV/atom and 0.4503 vs 0.4346 eV). These outcomes are measured results, not quantities derived by construction from any internal fit, self-definition, or self-citation chain. No equation reduces a reported prediction to a parameter fitted inside the same paper, and the central claim rests on external data rather than renaming or importing uniqueness from prior author work. The derivation chain is therefore self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- number of recurrent steps
axioms (1)
- domain assumption Local atomic environments in crystals are adequately captured by fixed-radius graph neighborhoods
Reference graph
Works this paper leans on
- [1]
-
[2]
V. Fung, J. Zhang, E. Juarez, and B. G. Sumpter, npj Computational Materials7, 84 (2021)
work page 2021
- [3]
-
[4]
C. J. Bartel, A. Trewartha, Q. Wang, A. Dunn, A. Jain, and G. Ceder, npj computational materials6, 97 (2020)
work page 2020
-
[5]
G. Xu, Y. Xue, X. Geng, X. Hou, and J. Xu, Materials Genome Engineering Advances2, e38 (2024)
work page 2024
-
[6]
C. W. Park and C. Wolverton, Physical Review Materials 4, 063801 (2020)
work page 2020
-
[7]
K. Das, B. Samanta, P. Goyal, S.-C. Lee, S. Bhattachar- jee, and N. Ganguly, NPJ Computational Materials8, 43 (2022)
work page 2022
-
[8]
A. Mullick, A. Ghosh, G. S. Chaitanya, S. Ghui, T. Nayak, S.-C. Lee, S. Bhattacharjee, and P. Goyal, Computational Materials Science233, 112659 (2024)
work page 2024
-
[9]
K. T. Schuett, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, and K.-R. Mueller, inAd- vances in Neural Information Processing Systems 30 (NeurIPS 2017)(2017)
work page 2017
-
[10]
C. Chen, W. Ye, Y. Zuo, C. Zheng, and S. P. Ong, Chem- istry of Materials31, 3564 (2019)
work page 2019
-
[11]
K. Choudhary and B. DeCost, npj Computational Ma- terials7, 185 (2021)
work page 2021
-
[12]
F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, IEEE Transactions on Neural Networks 20, 61 (2009)
work page 2009
-
[13]
R. Palm, U. Paquet, and O. Winther, inAdvances in Neu- 9 ral Information Processing Systems 31 (NeurIPS 2018) (2018)
work page 2018
-
[14]
M. Pflueger, G. Pappas, S. Vogler, J. Gamper, S. Koehler, and C. Morris, inProceedings of the AAAI Conference on Artificial Intelligence, Vol. 38 (2024) pp. 14608–14616
work page 2024
- [15]
-
[16]
A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, et al., APL materials1(2013)
work page 2013
-
[17]
S. P. Ong, W. D. Richards, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, V. L. Chevrier, K. A. Persson, and G. Ceder, Computational Materials Science 68, 314 (2013)
work page 2013
-
[18]
K. Oono and T. Suzuki, Advances in Neural Information Processing Systems33, 18917 (2020)
work page 2020
- [19]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.