Recognition: 2 theorem links
· Lean TheoremFast Generation of Pipek-Mezey Wannier Functions via the Co-Iterative Augmented Hessian Method
Pith reviewed 2026-05-16 05:11 UTC · model grok-4.3
The pith
k-CIAH extends second-order CIAH optimization to k-point Pipek-Mezey Wannier functions while achieving O(N_k² n³) scaling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By extending the CIAH procedure to k-space and evaluating the Hessian-vector product efficiently, k-CIAH performs second-order optimization of the Pipek-Mezey localization functional at O(N_k² n³) cost in both time and memory. This scaling matches that of existing first-order k-space methods yet delivers superior convergence behavior, producing speedups of roughly 2-3 fold for systems with 1000-5000 orbitals and orders-of-magnitude gains relative to Γ-point CIAH. The optimized PMWFs are shown to yield accurate band structures when used for Wannier interpolation on a range of solids.
What carries the argument
The k-CIAH optimizer, which minimizes the Pipek-Mezey functional via repeated, low-cost Hessian-vector multiplications that avoid the cubic scaling in k-points of the naive second-order approach.
If this is right
- Localization of 1000-5000 orbitals becomes 2-3 times faster than first-order k-space methods while preserving second-order convergence.
- Computational cost drops by orders of magnitude compared with Γ-point CIAH for large orbital counts.
- Generated PMWFs produce accurate electronic band structures via Wannier interpolation.
- Robust convergence holds for insulators, semiconductors, metals, and surfaces.
Where Pith is reading between the lines
- The same efficient Hessian-vector technique could accelerate other second-order localization criteria beyond Pipek-Mezey.
- The reduced scaling may enable routine use of denser k-meshes or larger unit cells in solid-state simulations that rely on localized bases.
- High-quality PMWFs from this route could improve downstream calculations of transport or electron-phonon properties that depend on accurate localized representations.
Load-bearing premise
The efficient Hessian-vector product evaluation maintains numerical stability and fast convergence without hidden costs or extra adjustments across all classes of solids.
What would settle it
A dense k-point sampling of a metallic surface where k-CIAH either diverges or exhibits scaling worse than O(N_k² n³) due to instability in the product evaluation.
Figures
read the original abstract
We report a $k$-point extension of the second-order co-iterative augmented Hessian (CIAH) algorithm, termed $k$-CIAH, for Pipek-Mezey (PM) localization of Wannier functions (WFs). By exploiting an efficient evaluation of the Hessian-vector product, $k$-CIAH achieves $O(N_k^2 n^3)$ scaling in both CPU time and memory, matching that of previously reported first-order $k$-space approaches while improving upon the $O(N_k^3 n^3)$ scaling of $\Gamma$-point CIAH, where $N_k$ denotes the number of $k$-points sampling the first Brillouin zone and $n$ characterizes the unit-cell size. Benchmark calculations on a diverse set of solids -- including insulators, semiconductors, metals, and surfaces -- demonstrate the fast and robust convergence of $k$-CIAH-based PMWF optimization, which yields an overall computational efficiency approximately 2-3--fold higher than first-order $k$-space methods and orders of magnitude higher than $\Gamma$-point CIAH for localizing 1000-5000 orbitals. The quality of the resulting PMWFs is further validated by accurate electronic band structures obtained via PMWF-based Wannier interpolation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a k-point extension (k-CIAH) of the co-iterative augmented Hessian algorithm for Pipek-Mezey localization of Wannier functions. By deriving an efficient Hessian-vector product, the method achieves O(N_k² n³) scaling in CPU time and memory, matching first-order k-space approaches while improving on the O(N_k³ n³) scaling of the Γ-point CIAH. Benchmarks on insulators, semiconductors, metals, and surfaces demonstrate fast convergence and 2-3× overall efficiency gains for 1000-5000 orbitals, with quality validated through PMWF-based Wannier interpolation of electronic band structures.
Significance. If the reported scaling and convergence hold, the work supplies a practical second-order optimizer for periodic PMWF generation that reduces the cost of dense k-point sampling without sacrificing localization quality. The explicit Hessian-vector product derivation, reproducible benchmark timings across chemically diverse solids, and direct band-structure validation constitute concrete strengths that would make the algorithm immediately useful for large-scale solid-state calculations.
minor comments (3)
- [Abstract] Abstract: the symbol n is introduced only as 'characterizes the unit-cell size'; a parenthetical definition (e.g., 'n = number of orbitals per unit cell') would remove ambiguity for readers scanning the scaling claims.
- The manuscript would benefit from a single summary table (e.g., Table X) that lists, for each benchmark system, the material class, N_k, n, wall-time ratios versus first-order k-space and Γ-point CIAH, and the final spread functional value; this would make the '2-3-fold' and 'orders of magnitude' statements directly verifiable.
- Section on numerical stability: while the text states robust convergence for metals, a short paragraph or supplementary note quantifying the condition number of the Hessian-vector product or the number of line-search steps required for metallic cases would strengthen the claim that no hidden costs affect the reported scaling.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the k-CIAH method and for recommending minor revision. We are pleased that the reported scaling, convergence behavior, and practical utility for dense k-point sampling in solids are recognized as strengths.
Circularity Check
No significant circularity detected
full rationale
The derivation chain relies on standard second-order optimization (augmented Hessian) extended to k-space with explicit algebraic derivations for the Hessian-vector product, yielding the stated O(N_k^2 n^3) scaling analytically rather than by fitting or redefinition. Benchmarks on insulators, semiconductors, metals and surfaces, plus band-structure validation via Wannier interpolation, supply independent empirical support. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear; the central efficiency claims rest on algorithmic analysis and timing data external to any circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The co-iterative augmented Hessian method converges robustly for the Pipek-Mezey localization functional when extended to k-points
Reference graph
Works this paper leans on
-
[1]
to a single com- plex, translationally invariant real-space unitary acting on the supercell WFs. The special case in which this real-space unitary is constrained to be real-valued is discussed in sec- tion II G. D. Optimization of PMWFs using the co-iterative augmented Hessian (CIAH) method In this work, we determine the unitary rotations that trans- form...
-
[2]
using the second-order co-iterative augmented Hes- sian (CIAH) algorithm. 43 CIAH is a modified trust-region Newton method46 that has been successfully applied to orbital localization in molecules 43,44 and in periodic solids with Γ - point Brillouin zone sampling. 22– 24,45 Near a local minimum, CIAH exhibits quadratic convergence while maintaining suf- fi...
-
[3]
(The Hessian diagonals are used to precondition the Davidson update
with the Davidson algorithm are derived in the Supporting Informa- tion. (The Hessian diagonals are used to precondition the Davidson update
-
[4]
We summarize the working equations be- low. We first define two types of matrix elements of the atomic projection operators, (PTTT A,kkkkkk′)i j = 1 Nk ⟨φ kkki|ˆPTTT A|φ kkk′j⟩ = 1 Nk ∑ µ ∈ A O∗ TTT µ ,kkki OTTT µ ,kkk′j, (PTTT A,kkk000)i j = 1√ Nk ⟨φ kkk j|ˆPTTT A|w000i⟩ = 1√ Nk ∑ µ ∈ A O∗ TTT µ ,kkki OTTT µ ,000 j, (20) where OTTT µ ,kkki = ⟨χ TTT µ |φ kk...
-
[5]
also enables PMWF optimization using gradient-based methods, such as the BFGS quasi-Newton algorithm, 39– 42 as explored in previ- ous studies. 35,38 In section IV, we compare the performance and computational efficiency of k-CIAH- and k-BFGS-based PMWF optimization. E. Cost of k-CIAH-based PMWF optimization The computational cost of k-CIAH-based PMWF opti...
-
[6]
( 23) when solving the augmented Hessian eigenvalue problem in eq
and the Hessian–vector product in eq. ( 23) when solving the augmented Hessian eigenvalue problem in eq. (
-
[7]
with the Davidson algorithm. The cost of the gra- dient evaluation—and therefore of gradient-based methods such as k-BFGS 35,38—is controlled by the construction of the projection matrix (PTTT A,kkk000)i j via eq. ( 20), which requires O(N2 k nprojn2 orb) ∼ O(N2 k n3) CPU time and the same order of memory to store the overlap matrix OTTT µ ,kkki in eq. (2...
-
[8]
The connected sym- metric contribution in eq
Connected symmetric term. The connected sym- metric contribution in eq. ( 25) can be assembled at O(N2 k natomn2 orb) CPU cost once the projection–vector product (Pv)TTT A,kkki j = Nk ∑ kkk′ norb ∑ m (PTTT A,kkkkkk′)im vkkk′m j (29) is available. Equation (
-
[9]
can itself be evaluated at O(N2 k nprojn2 orb) by exploiting the factorized form of the projection operators in eqs. (
-
[10]
and ( 7). For orthonor- mal projectors, (Pv)TTT A,kkki j = ∑ µ ∈ A O∗ TTT µ ,kkki ( Nk ∑ kkk′ norb ∑ m OTTT µ ,kkk′m vkkk′m j ) , (30) with analogous expressions for biorthogonalized pro- jectors. We emphasize that evaluating (Pv)TTT A,kkki j via eq. (
-
[11]
avoids the unfavorable O(N3 k n3) memory cost associated with explicitly storing the full projection ma- trix (PTTT A,kkkkkk′)i j, and thus constitutes a key performance optimization of the present implementation
-
[12]
The disconnected contribution in eq
Disconnected term. The disconnected contribution in eq. (
-
[13]
( 30), Nk ∑ kkk′ ℜ [ (v† kkk′PTTT A,kkk′000) j j ] = Nk ∑ kkk (Pv)TTT A,kkk j j
can also be constructed at O(N2 k natomn2 orb) CPU cost once the second term is evaluated efficiently using the precomputed projection–vector product in eq. ( 30), Nk ∑ kkk′ ℜ [ (v† kkk′PTTT A,kkk′000) j j ] = Nk ∑ kkk (Pv)TTT A,kkk j j. (31)
-
[14]
The connected asym- metric part in eq
Connected asymmetric term. The connected asym- metric part in eq. (
-
[15]
can be evaluated at O(Nkn3 orb) cost as ˜ σ c-asymm kkki j = − p norb ∑ m ( Akkkim v∗ kkk jm + v∗ kkkmi Akkkm j ) , (32) with intermediates Akkki j = Natom ∑ TTT A QTTT A,000 j (PTTT A,kkk000)i j, (33) which can be formed at O(N2 k natomn2 orb) CPU cost. As in the gradient evaluation, the memory footprint of the Hessian–vector product is dominated by stor...
-
[16]
can be chosen to be real-valued by imposing time-reversal symmetry (TRS) on the k-space generators, κ kkk = κ ∗ − kkk. (37) Throughout this section, − kkk is understood as − kkk mod GGG for an appropriate reciprocal lattice vector GGG that maps − kkk back into the first Brillouin zone. The TRS constraint in eq. ( 37) reduces the total number of independent...
-
[17]
(40) The Hessian diagonal elements are modified analogously from eqs
and (23) by symmetrizing with respect to kkk ↔ − kkk, Gkkki j = Gkkki j + (1 − δ kkk,− kkk) G∗ − kkki j, (39) ( Hv)kkki j = (Hv)kkki j + (1 − δ kkk,− kkk) (Hv)∗ − kkki j. (40) The Hessian diagonal elements are modified analogously from eqs. (
-
[18]
Specifically, PTTT A,kkk000 in eq
and ( 28). Specifically, PTTT A,kkk000 in eq. ( 28) is replaced by its TRS-packed form, PTTT A,kkk000 = PTTT A,kkk000 + (1 − δ kkk,− kkk) P∗ TTT A,− kkk000, (41) and PTTT A,kkkkkk is replaced by PTTT A,kkkkkk = PTTT A,kkkkkk +(1− δ kkk,− kkk) [ PTTT A,− kkk(− kkk) +P∗ TTT A,− kkkkkk +P∗ TTT A,kkk(− kkk) ] . (42) In practice, for all systems tested we found...
-
[19]
with the following diagonal approximation, PTTT A,kkkkkk ≈ PTTT A,kkkkkk + (1 − δ kkk,− kkk) PTTT A,− kkk(− kkk), (43) does not measurably affect the convergence rate of k-CIAH- based PMWF optimization. This is expected because the Hes- sian diagonals serve only as a preconditioner for solving the augmented Hessian eigenvalue problem in eq. ( 18). All nu-...
-
[20]
Connected symmetric term 6
- [21]
-
[22]
Connected asymmetric term 8
-
[23]
Connected symmetric term 8 III. Jacobi sweep 9 A. General consideration 9 B. Working equations for p = 2 9 C. Extension for p > 2 10 D. Efficient implementation and computational cost 11 2 I. STRUCTURE FILES The structure files for all solid-state systems studied in this work are available in the following GitHub repository: https://github.com/hongzhouye/su...
work page 2026
-
[24]
Disconnected term Σ X kkkmn = − 1 2 p(p − 1) ∑ TTT Ai Qp− 2 TTT A,000i ∂ QTTT A,000i ∂ Xkkkmn ∑ kkk′pq ( ∂ QTTT A,000i ∂ Xkkk′pq vX kkk′pq + ∂ QTTT A,000i ∂ Ykkk′pq vY kkk′pq ) , Σ Y mn = − 1 2 p(p − 1) ∑ TTT Ai Qp− 2 TTT A,000i ∂ QTTT A,000i ∂ Ykkkmn ∑ kkk′pq ( ∂ QTTT A,000i ∂ Xkkk′pq vX kkk′pq + ∂ QTTT A,000i ∂ Ykkk′pq vY kkk′pq ) (14) where the term in...
-
[25]
Connected symmetric term Σ X kkkmn = − p ∑ TTT A,i Qp− 1 TTT A,000iℜ ∂ U † kkk ∂ Xkkkmn PTTT A,kkkkkk′ ∑ kkk′pq ( ∂ Ukkk′ ∂ Xkkk′pq vX kkk′pq + ∂ Ukkk′ ∂ Ykkk′pq vY kkk′pq ) ii , Σ Y kkkmn = − p ∑ TTT A,i Qp− 1 TTT A,000iℜ ∂ U † kkk ∂ Ykkkmn PTTT A,kkkkkk′ ∑ kkk′pq ( ∂ Ukkk′ ∂ Xkkk′pq vX kkk′pq + ∂ Ukkk′ ∂ Ykkk′pq vY kkk′pq ) ii , (17) where...
-
[26]
Connected asymmetric term Σ X kkkmn = − 1 2 p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk { ∂ Ukkk ∂ Xkkkmn ,∑ pq ∂ Ukkk ∂ Xkkkpq vX kkkpq + ∂ Ukkk ∂ Ykkkpq vY kkkpq }] ii Σ Y kkkmn = − 1 2 p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk { ∂ Ukkk ∂ Ykkkmn ,∑ pq ∂ Ukkk ∂ Xkkkpq vX kkkpq + ∂ Ukkk ∂ Ykkkpq vY kkkpq }] ii (20) Using eq. (18) to simplify terms i...
-
[27]
Disconnected term DX kkkmn = − p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i ( ∂ QTTT A,000i ∂ Xkkkmn )2 = − 4p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i [ ℜ (PTTT A,000kkk)mn ]2 δ mi + (m ↔ n) = − 4p(p − 1)∑ TTT A Qp− 2 TTT A,000m [ ℜ (PTTT A,000kkk)mn ]2 + (m ↔ n) (24) DY kkkmn = − p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i (∂ QTTT A,000i ∂ Ykkkmn )2 = − 4p(p − 1) ∑ TTT A,i Qp−...
-
[28]
Connected asymmetric term DX kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkk ∂ 2Ukkk ∂ X 2 kkkmn ) ii = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk(Emn − Enm)2] ii = 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkkEmm ) ii + (m ↔ n) (27) DY kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkk ∂ 2Ukkk ∂ Y 2 kkkmn ) ii = 2p ∑ TTT A,i Qp− 1 T...
-
[29]
Connected symmetric term DX kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( ∂ U † kkk ∂ Xkkkmn PTTT A,kkkkkk ∂ Ukkk ∂ Xkkkmn ) ii = 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ (Emn − Enm)PTTT A,kkkkkk(Emn − Enm) ] ii = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( EmnPTTT A,kkkkkkEnm ) ii + (m ↔ n) (30) DY kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( ∂ U † kkk ∂ Ykkkmn PTTT A,kkkkkk ...
-
[30]
yields the corresponding k-space formulation, [ ˜φ kkki ˜φ kkk j ] = [ φ kkki φ kkk j ] cos θ eikkk·RRR sin θ − e− ikkk·RRR sin θ cos θ , (34) which corresponds to a k-dependent 2 × 2 rotation between Bloch orbitals i and j. B. Working equations for p = 2 To determine the optimal rotation angle θ , consider the PM objective for a pair of WFs (000i...
-
[31]
( 40), implying that the right-hand side vanishes
Since the input orbitals are already at a stationary point of the PM functional, θ = 0 is a trivial solution of eq. ( 40), implying that the right-hand side vanishes. The non-trivial solutions therefore satisfy tan 4θ = 0, i.e., 4θ = nπ with n ̸= 0
-
[32]
We therefore conclude that the only distinct non-trivial solutions are θ = π 4 , π 2 , 3π 4
Furthermore, it is sufficient to restrict θ to the interval [0,π ), because a rotation by θ = π corresponds to a simultaneous sign change of the two orbitals, which leaves the PM objective invariant, and any θ > π can be mapped back into [0,π ). We therefore conclude that the only distinct non-trivial solutions are θ = π 4 , π 2 , 3π 4 . (41) In practice, ...
-
[33]
for determining θ except that the two intermediates are defined as (A000RRR)i j = 4∑ TTT A [ (PTTT A,000RRR)2 ii − (PTTT A,000RRR)2 j j ] ℜ (PTTT A,000RRR)i j, (B000RRR)i j = ∑ TTT A [ (PTTT A,000RRR)ii + (PTTT A,000RRR) j j ]{[ (PTTT A,000RRR)ii − (PTTT A,000RRR) j j ]2 − [ 2ℜ (PTTT A,000RRR)i j ]2} , (42) For general p > 3, an analytic derivation becomes...
-
[34]
Let NR denote the number of lattice vectors RRR satisfying eq
to satisfy |RRR|< Rmax, (44) for a chosen cutoff Rmax, which provides a controllable reduction in computational cost. Let NR denote the number of lattice vectors RRR satisfying eq. ( 44). Under this restriction, the cost of the Jacobi stability check becomes O(Nk NR nproj n2 orb), (45) which is linear in Nk when NR is independent of Nk. We choose Rmax = 1...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.