A Scalable Translationally Invariant Variational Theory of Ab Initio Polarons
Pith reviewed 2026-05-08 08:55 UTC · model grok-4.3
The pith
Momentum-projected variational wavefunctions enable scalable ab initio polaron calculations in the thermodynamic limit.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that their momentum-projected Toyozawa-type variational wavefunction, paired with low-rank factorization of the electron-phonon kernel, yields a translationally invariant and scalable approach to computing ab initio polaron properties. This produces reliable binding energies, thermodynamic-limit band structures, and real-space polaron sizes for materials including LiF and TiO2, and highlights a potential bias in diagrammatic Monte Carlo for the strong-coupling hole polaron in LiF.
What carries the argument
momentum-projected Toyozawa-type wavefunction with low-rank factorization of the electron-phonon kernel, which enforces translational invariance and near-linear scaling with k-points
If this is right
- Polaron binding energies and band structures become available for real materials directly in the thermodynamic limit.
- Both weak- and strong-coupling regimes are handled within the same variational framework without separate approximations.
- Real-space measures of polaron extent provide direct, interpretable quantities for comparison with experiment.
- The approach serves as a benchmark that can identify biases in other techniques such as diagrammatic Monte Carlo.
- Systematic improvement follows from enlarging the variational space or refining the kernel factorization.
Where Pith is reading between the lines
- The near-linear scaling suggests the method can be applied to larger or lower-symmetry unit cells that remain inaccessible to supercell techniques.
- The noted discrepancy with diagrammatic Monte Carlo implies variational methods may become the preferred route for strong-coupling polarons where stochastic sampling faces bias issues.
- Transparent real-space polaron sizes could be used to predict signatures in angle-resolved photoemission or optical spectra.
- The same wavefunction form offers a natural starting point for computing polaron mobilities or finite-temperature effects in the same materials.
Load-bearing premise
The low-rank factorization of the electron-phonon kernel combined with the Toyozawa-type ansatz remains accurate and unbiased in the strong-coupling regime.
What would settle it
A converged reference calculation of the LiF hole-polaron binding energy via diagrammatic Monte Carlo or exact methods on extrapolatable clusters that differs substantially from the variational result would falsify the claim of unbiased accuracy in strong coupling.
Figures
read the original abstract
We introduce a scalable, translationally invariant variational theory for ab initio polarons that remains applicable across coupling regimes without resorting to supercells. Our approach combines a momentum-projected Toyozawa-type wavefunction with a low-rank factorization of the electron-phonon kernel, enabling near-linear scaling with the number of $\mathbf{k}$-points while capturing both delocalized and self-trapped carriers. Benchmarks for the Fr\"ohlich model, LiF, and anatase and rutile TiO$_2$ yield accurate polaron binding energies, thermodynamic-limit band structures, and transparent real-space measures of polaron extent. For LiF, comparison with first-principles diagrammatic Monte Carlo (DiagMC) reveals close agreement for the weak-coupling electron-polaron ground state and band structure. However, in the hole-polaron of LiF, which is in the strong-coupling regime, we found a significant bias in DiagMC results. These results establish momentum-projected variational wavefunctions as a systematically improvable route to thermodynamic limit studies of polarons in real materials.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a scalable, translationally invariant variational method for ab initio polarons that combines a momentum-projected Toyozawa-type wavefunction with low-rank factorization of the electron-phonon kernel. This enables near-linear scaling with k-points and applicability across coupling strengths without supercells. Benchmarks on the Fröhlich model, LiF (electron and hole polarons), and TiO2 (anatase/rutile) report polaron binding energies, thermodynamic-limit band structures, and real-space extents, with close agreement to DiagMC for the weak-coupling LiF electron polaron but a claimed significant DiagMC bias for the strong-coupling hole polaron.
Significance. If the ansatz completeness and low-rank factorization accuracy hold, the approach offers a systematically improvable route to thermodynamic-limit polaron studies in real materials, addressing a key limitation of supercell-based methods. The variational character and explicit scaling claims are strengths, as is the provision of transparent real-space diagnostics.
major comments (2)
- [LiF hole polaron results section] § on LiF hole polaron results (near abstract claim of 'significant bias in DiagMC'): The attribution of the discrepancy solely to DiagMC bias is load-bearing for the claim of applicability 'across coupling regimes,' yet no explicit convergence data with respect to factorization rank, number of variational parameters, or additional correlation channels is provided to rule out ansatz incompleteness. Because the method is variational, any missing channels produce a systematic upper-bound error whose magnitude remains uncontrolled precisely where the discrepancy appears.
- [Method / low-rank factorization subsection] Low-rank factorization of the e-ph kernel (method section): The paper asserts that the factorization remains accurate and unbiased in the strong-coupling regime, but no independent validation (e.g., comparison to known strong-coupling analytic limits, larger-rank calculations, or energy lowering upon rank increase) is shown for the LiF hole case. This leaves open whether the reported energies are converged or limited by the rank truncation.
minor comments (2)
- [Wavefunction ansatz definition] The original Toyozawa ansatz reference is cited but the precise momentum-projection implementation and its relation to the low-rank kernel could be clarified with an explicit equation for the projected wavefunction.
- [Figures on real-space measures] Figure captions for real-space polaron extent plots should include the specific k-point sampling density used to reach the thermodynamic limit.
Simulated Author's Rebuttal
We thank the referee for their detailed and insightful comments on our manuscript. We appreciate the opportunity to clarify and strengthen our presentation of the results, particularly regarding the convergence and validation of the method in the strong-coupling regime. Below, we provide point-by-point responses to the major comments.
read point-by-point responses
-
Referee: [LiF hole polaron results section] § on LiF hole polaron results (near abstract claim of 'significant bias in DiagMC'): The attribution of the discrepancy solely to DiagMC bias is load-bearing for the claim of applicability 'across coupling regimes,' yet no explicit convergence data with respect to factorization rank, number of variational parameters, or additional correlation channels is provided to rule out ansatz incompleteness. Because the method is variational, any missing channels produce a systematic upper-bound error whose magnitude remains uncontrolled precisely where the discrepancy appears.
Authors: We agree that explicit convergence studies are essential to robustly attribute the observed discrepancy to a bias in the DiagMC results rather than incompleteness in our variational ansatz. In the revised version of the manuscript, we will include additional data demonstrating the convergence of the LiF hole polaron binding energy with respect to the factorization rank and the number of variational parameters. We will also address the potential for additional correlation channels and their impact on the energy. These additions will provide better control over the variational upper bound and support our claim of applicability across coupling regimes, especially given the excellent agreement with DiagMC in the weak-coupling electron polaron case. revision: yes
-
Referee: [Method / low-rank factorization subsection] Low-rank factorization of the e-ph kernel (method section): The paper asserts that the factorization remains accurate and unbiased in the strong-coupling regime, but no independent validation (e.g., comparison to known strong-coupling analytic limits, larger-rank calculations, or energy lowering upon rank increase) is shown for the LiF hole case. This leaves open whether the reported energies are converged or limited by the rank truncation.
Authors: We acknowledge the need for more direct validation of the low-rank factorization in the strong-coupling regime for the LiF hole polaron. While analytic strong-coupling limits are not straightforward to apply to ab initio systems like LiF, we will add to the revised manuscript results from calculations with increased factorization ranks for the LiF hole polaron. These will show the energy lowering upon rank increase and demonstrate convergence, thereby confirming that the reported energies are not limited by the rank truncation and that the factorization remains accurate. revision: yes
Circularity Check
No circularity in variational ansatz or benchmarks
full rationale
The paper introduces an explicitly variational ansatz (momentum-projected Toyozawa-type wavefunction plus low-rank kernel factorization) whose energy is obtained by direct minimization; this construction does not reduce to its own inputs by definition, nor does it rename fitted parameters as predictions. Benchmarks are performed against independent external methods (DiagMC, Fröhlich model limits) rather than self-referential data. No load-bearing self-citation, uniqueness theorem imported from the authors' prior work, or ansatz smuggled via citation appears in the provided text; the discrepancy with DiagMC is presented as an empirical observation whose interpretation does not alter the internal derivation of the variational energies or band structures.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
L. D. Landau, Phys. Z. Sowjet.3, 664 (1933)
work page 1933
- [2]
-
[3]
J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev.108, 1175 (1957)
work page 1957
- [4]
-
[5]
H. Y. Fan, Phys. Rev.82, 900 (1951)
work page 1951
-
[6]
A. B. Migdal, Sov. Phys. JETP7, 996 (1958)
work page 1958
-
[7]
P. B. Allen and V. Heine, J. Phys. C: Solid State Phys. 9, 2305 (1976)
work page 1976
-
[8]
P. B. Allen and M. Cardona, Phys. Rev. B23, 1495 (1981)
work page 1981
-
[9]
S. M. Story, J. J. Kas, F. D. Vila, M. J. Verstraete, and J. J. Rehr, Phys. Rev. B90, 195135 (2014)
work page 2014
- [10]
-
[11]
W. H. Sio, C. Verdi, S. Poncé, and F. Giustino, Phys. Rev. Lett.122, 246403 (2019)
work page 2019
-
[12]
W. H. Sio, C. Verdi, S. Poncé, and F. Giustino, Phys. Rev. B99, 235139 (2019)
work page 2019
-
[13]
Y. Luo, B. K. Chang, and M. Bernardi, Phys. Rev. B 105, 155132 (2022)
work page 2022
-
[14]
A.Mahajan, P.J.Robinson, J.Lee, andD.R.Reichman, Phys. Rev. B112, 184310 (2025)
work page 2025
-
[15]
P.J.Robinson, J.Lee, A.Mahajan, andD.R.Reichman, Phys. Rev. B111, 054304 (2025)
work page 2025
-
[16]
J. Lafuente-Bartolome, C. Lian, W. H. Sio, I. G. Gur- tubay, A. Eiguren, and F. Giustino, Phys. Rev. B106, 075119 (2022)
work page 2022
-
[17]
J. Lafuente-Bartolome, C. Lian, W. H. Sio, I. G. Gur- tubay, A. Eiguren, and F. Giustino, Phys. Rev. Lett. 129, 076402 (2022)
work page 2022
-
[18]
Y. Luo, J. Park, and M. Bernardi, Nat. Phys.21, 1275 (2025)
work page 2025
- [19]
- [20]
-
[21]
Y. Zhao, J. Chem. Phys.158, 080901 (2023)
work page 2023
- [22]
-
[23]
Y. Luo, D. Desai, B. K. Chang, J. Park, and M. Bernardi, Phys. Rev. X14, 021023 (2024)
work page 2024
-
[24]
C. Franchini, M. Reticcioli, M. Setvin, and U. Diebold, Nat. Rev. Mater.6, 560 (2021)
work page 2021
-
[25]
L. D. Landau and S. I. Pekar, Zh. Eksp. Teor. Fiz.18, 419 (1948)
work page 1948
-
[26]
R. E. Peierls and J. Yoccoz, Proc. Phys. Soc. A70, 381 (1957)
work page 1957
-
[27]
T. Hahn, S. Klimin, J. Tempere, J. T. Devreese, and C. Franchini, Phys. Rev. B97, 134305 (2018)
work page 2018
- [28]
-
[29]
V. M. Buimistrov and S. I. Pekar, Sov. Phys. JETP6, 977 (1958)
work page 1958
-
[30]
A. S. Davydov, J. Theor. Biol.38, 559 (1973)
work page 1973
-
[31]
Y. Zhao, D. W. Brown, and K. Lindenberg, J. Chem. Phys.107, 3159 (1997)
work page 1997
- [32]
- [33]
- [34]
- [35]
-
[36]
J. Park, J.-J. Zhou, V. A. Jhalani, C. E. Dreyer, and M. Bernardi, Phys. Rev. B102, 125203 (2020)
work page 2020
- [37]
-
[38]
J. Lee, S. Zhang, and D. R. Reichman, Phys. Rev. B 103, 115123 (2021)
work page 2021
- [39]
-
[40]
P. O. J. Scherer and S. F. Fischer, Chem. Phys.86, 269 (1984)
work page 1984
-
[41]
A. H. Romero, D. W. Brown, and K. Lindenberg, Phys. Lett. A254, 287 (1999)
work page 1999
- [42]
-
[43]
F. Liu, Z. Yang, Y. Luo, S. Guo, C. Zhang, S. Choo, X. Xu, S. G. Jeong, J. S. Kumar, X. Wang, A. Mkhoyan, M. Bernardi, and B. Jalan, Rep. Prog. Phys.89, 028003 (2026)
work page 2026
-
[44]
E. Epifanovsky, A. T. B. Gilbert, X. Feng, J. Lee, Y. Mao, N. Mardirossian, P. Pokhilko, A. F. White, M. P. Coons, A. L. Dempwolff, Z. Gan, D. Hait, P. R. Horn, L. D. Jacobson, I. Kaliman, J. Kussmann, A. W. Lange, K. U. Lao, D. S. Levine, J. Liu, S. C. McKen- zie, A. F. Morrison, K. D. Nanda, F. Plasser, D. R. Rehn, M. L. Vidal, Z.-Q. You, Y. Zhu, B. Ala...
work page 2021
- [45]
-
[46]
P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococ- cioni, I. Dabo, A. Dal Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello, L. Paulatto, C. Sbraccia, S. S...
work page 2009
-
[47]
J.-J. Zhou, J. Park, I.-T. Lu, I. Maliyov, X. Tong, and M. Bernardi, Comput. Phys. Commun.264, 107970 (2021)
work page 2021
- [48]
-
[49]
J. Sjakste, N. Vast, M. Calandra, and F. Mauri, Phys. Rev. B92, 054307 (2015)
work page 2015
- [50]
-
[51]
D. R. Hamann, Phys. Rev. B88, 085117 (2013)
work page 2013
-
[52]
G. Pizzi, V. Vitale, R. Arita, S. Blügel, F. Freimuth, G. Géranton, M. Gibertini, D. Gresch, C. Johnson, T. Koretsune, J. Ibañez-Azpiroz, H. Lee, J.-M. Lihm, D. Marchand, A. Marrazzo, Y. Mokrousov, J. I. Mustafa, Y. Nohara, Y. Nomura, L. Paulatto, S. Poncé, T. Pon- weiser, J. Qiao, F. Thöle, S. S. Tsirkin, M. Wierzbowska, N. Marzari, D. Vanderbilt, I. Sou...
work page 2020
-
[53]
The Broken Symmetry of Strong Coupling In the strong-coupling regime (α≫1), the standard variational approach is the Landau–Pekar (LP) product ansatz: |ΨLP⟩=|ψ e⟩ ⊗ |f⟩(A4) where|ψ e⟩is a localized electron state (often modeled as a Gaussian,ψe(r)∝e − 1 2 µr2 ), and|f⟩= exp( P q fqa† q − f ∗ qaq)|0⟩is a coherent state representing the lattice deformation....
-
[54]
The Peierls–Yoccoz Projection To restore translational symmetry, we employ the Peierls–Yoccoz (PY) projection operator. Since the Hamiltonian is translationally invariant, states localized at different positionsRmust be degenerate. We construct a momentum eigenstate by integrating over all continuous spatial translations of the localized LP wavepacket: |Ψ...
-
[55]
Variation After Projection (VAP) To correctly capture the crossover regime (α≈6), one must use the Variation After Projection (VAP) method, as opposed to the projection after variation method. In VAP, the exact expectation value of the Hamiltonian is evaluated using the fully projected state: EPY = ⟨ΨPY(0)|H|ΨPY(0)⟩ ⟨ΨPY(0)|ΨPY(0)⟩ (A7) 9 This yields a mu...
-
[56]
Analytic Matrix Elements The localized electronic trial state is constructed as a linear combination of Gaussians,ϕ(r) =P3 i=1 cie−µir2/2. Because the generator coordinate shiftRacts as a simple translation, the overlap and kinetic energy matrix elements between the localized state and its shifted counterpartϕR(r) =ϕ(r−R)can be evaluated completely analyt...
-
[57]
Numerical Quadrature for Phonon and Interaction Matrix Elements The localized trial state assumes the lattice relaxes into a coherent state characterized by the displacement ampli- tudesf q. This displacement is phenomenologically tied to the exact electronic density via a single momentum-space low-pass filter: fq =− Vq ℏωq ρq 1 1 +βq 2 ≡ − Vq ℏωq ρqgq,(A...
-
[58]
As the polaron localizes,µ→ ∞, causing the physical radius to shrink drastically
Scale-Invariant Quadrature The final PY energy requires integrating over the generator coordinateR: EPY = R dR R2 exp(G(R)) [Tel(R) +Iel(R)Uph(R) +Vint(R)]R dR R2 exp(G(R))Iel(R) .(A22) 11 A naive integration overRandqfails in the strong-coupling limit (α≫1). As the polaron localizes,µ→ ∞, causing the physical radius to shrink drastically. This causes the...
-
[59]
For a non-magnetic crystal, dynamical matrices (and therefore phonon eigenmodes) obeyD(q) =D∗(−q)
Alignment of Degenerate Modes We found that in order to converge our variational calculations efficiently and produce a reliable finite-size ex- trapolation degenerate modes need to be aligned betweenqand−qwhen generating the electron–phonon coupling elements. For a non-magnetic crystal, dynamical matrices (and therefore phonon eigenmodes) obeyD(q) =D∗(−q...
-
[60]
Gradient We derive the analytical gradients for the dD2 wavefunction below. The corresponding D2 expressions are obtained immediately by settingD k = 1. These derivatives assume a Hermitian electron–phonon coupling tensor. If the full tensorg mnν(k,q)is stored explicitly, hermiticity can be enforced directly. Otherwise, for example when working with singu...
-
[61]
Diagonal Elements of Hessian For approximate second-order methods such as the GDM, the diagonal elements of the Hessian are useful for preconditioning. In the followings, we list necessary equations for implementing the diagonal elements of the Hessian. ∂2EdD2 ∂Re{A nk}2 = 2⟨Ψ|Ψ⟩−1 " εnk −E dD2(A, B)−Re{A nk} ∂EdD2(A, B) ∂Re{A nk} ! Dk + X νq ωνq|Bνq|2Dk+...
-
[62]
Phonon Occupations We compute the average number of phonons⟨Nph⟩in the ground state as well as the second central moment of the corresponding probability distribution: Nph =⟨Ψ|Ψ⟩ −1 X n eiK·Rne P νq |Bνq |2e−iq·Rn X jk |Ajk|2e−ik·Rn X ν′q′ |Bν′q′|2e−iq′·Rn (H1) Nph − Nph⟩ 2 =⟨Ψ|Ψ⟩ −1 X n eiK·Rne P νq |Bνq |2e−iq·Rn X jk |Ajk|2...
-
[63]
Electronic densities are shown in Fig
Carrier Occupations To further characterize the ground states obtained, we compute the momentum-resolved electronic densities, ⟨c† nkcnk⟩=⟨Ψ|Ψ⟩ −1 |Ank|2Dk.(H4) Since ourk-mesh is uniformly discretized, we use cubic splines to subsample along the high-symmetry points within the unit cell. Electronic densities are shown in Fig. H2. 21 0 5 10 150.00 0.05 0....
-
[64]
Polaron Extent In order to make a prediction on the polaron size of a system we investigate the density-displacement correlation dn,κα(Rp) = *X Re ˆnnRe ˆuκα Re+Rp + ,(H5) where ˆnnRe =c † nRe cnRe = 1 Nk X k1k2j1j2 e−iRe·(k1−k2)U ∗ j1n(k1)Uj2n(k2)c† j1k1 cj2k2 (H6) ˆuκα Re+Rp = X νq s ℏ 2Mκωνq eν κα(q)eiq·(Re+Rp) √Nk bνq +b † ν−q .(H7) We can sum outRe, ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.