arxiv: 2602.12382 · v2 · submitted 2026-02-12 · ⚛️ physics.chem-ph · cond-mat.mtrl-sci

Recognition: 2 theorem links

· Lean Theorem

Fast Generation of Pipek-Mezey Wannier Functions via the Co-Iterative Augmented Hessian Method

Gengzhi Yang , Hong-Zhou Ye

Authors on Pith no claims yet

Pith reviewed 2026-05-16 05:11 UTC · model grok-4.3

classification ⚛️ physics.chem-ph cond-mat.mtrl-sci

keywords Pipek-Mezey localizationWannier functionsk-point samplingsecond-order optimizationHessian-vector productWannier interpolationcomputational scalingsolid-state electronic structure

0 comments

The pith

k-CIAH extends second-order CIAH optimization to k-point Pipek-Mezey Wannier functions while achieving O(N_k² n³) scaling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces k-CIAH as a k-point extension of the co-iterative augmented Hessian algorithm for optimizing Pipek-Mezey localized Wannier functions. It uses an efficient Hessian-vector product to reach quadratic scaling in the number of k-points while retaining cubic scaling in orbital count. Benchmarks across insulators, semiconductors, metals, and surfaces show rapid convergence and 2-3 times higher overall efficiency than first-order k-space methods, with much larger gains over the original Γ-point version. The resulting functions support accurate electronic band structures through Wannier interpolation. This combination of second-order convergence and improved scaling addresses a practical bottleneck in generating localized bases for periodic systems.

Core claim

By extending the CIAH procedure to k-space and evaluating the Hessian-vector product efficiently, k-CIAH performs second-order optimization of the Pipek-Mezey localization functional at O(N_k² n³) cost in both time and memory. This scaling matches that of existing first-order k-space methods yet delivers superior convergence behavior, producing speedups of roughly 2-3 fold for systems with 1000-5000 orbitals and orders-of-magnitude gains relative to Γ-point CIAH. The optimized PMWFs are shown to yield accurate band structures when used for Wannier interpolation on a range of solids.

What carries the argument

The k-CIAH optimizer, which minimizes the Pipek-Mezey functional via repeated, low-cost Hessian-vector multiplications that avoid the cubic scaling in k-points of the naive second-order approach.

If this is right

Localization of 1000-5000 orbitals becomes 2-3 times faster than first-order k-space methods while preserving second-order convergence.
Computational cost drops by orders of magnitude compared with Γ-point CIAH for large orbital counts.
Generated PMWFs produce accurate electronic band structures via Wannier interpolation.
Robust convergence holds for insulators, semiconductors, metals, and surfaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same efficient Hessian-vector technique could accelerate other second-order localization criteria beyond Pipek-Mezey.
The reduced scaling may enable routine use of denser k-meshes or larger unit cells in solid-state simulations that rely on localized bases.
High-quality PMWFs from this route could improve downstream calculations of transport or electron-phonon properties that depend on accurate localized representations.

Load-bearing premise

The efficient Hessian-vector product evaluation maintains numerical stability and fast convergence without hidden costs or extra adjustments across all classes of solids.

What would settle it

A dense k-point sampling of a metallic surface where k-CIAH either diverges or exhibits scaling worse than O(N_k² n³) due to instability in the product evaluation.

Figures

Figures reproduced from arXiv: 2602.12382 by Gengzhi Yang, Hong-Zhou Ye.

**Figure 2.** Figure 2: FIG. 2. CPU time (seconds) for PMWF optimization using [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: FIG. 3. (A–D) Electronic band structure of h-BN obtained from reference non-SCF calculations (gray) and from Wannier interpolation using [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

read the original abstract

We report a $k$-point extension of the second-order co-iterative augmented Hessian (CIAH) algorithm, termed $k$-CIAH, for Pipek-Mezey (PM) localization of Wannier functions (WFs). By exploiting an efficient evaluation of the Hessian-vector product, $k$-CIAH achieves $O(N_k^2 n^3)$ scaling in both CPU time and memory, matching that of previously reported first-order $k$-space approaches while improving upon the $O(N_k^3 n^3)$ scaling of $\Gamma$-point CIAH, where $N_k$ denotes the number of $k$-points sampling the first Brillouin zone and $n$ characterizes the unit-cell size. Benchmark calculations on a diverse set of solids -- including insulators, semiconductors, metals, and surfaces -- demonstrate the fast and robust convergence of $k$-CIAH-based PMWF optimization, which yields an overall computational efficiency approximately 2-3--fold higher than first-order $k$-space methods and orders of magnitude higher than $\Gamma$-point CIAH for localizing 1000-5000 orbitals. The quality of the resulting PMWFs is further validated by accurate electronic band structures obtained via PMWF-based Wannier interpolation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

k-CIAH extends second-order CIAH to k-point PM Wannier localization with O(N_k^2 n^3) scaling and 2-3x speedups backed by benchmarks.

read the letter

The main thing to know is that this paper gives a k-point version of the co-iterative augmented Hessian optimizer for Pipek-Mezey Wannier functions. It uses an efficient Hessian-vector product to reach O(N_k^2 n^3) scaling in time and memory, which beats the Gamma-point CIAH scaling while matching first-order k-space methods but with faster convergence. Benchmarks on insulators, semiconductors, metals, and surfaces show 2-3 fold overall gains for 1000-5000 orbitals, and the localized functions produce accurate band structures via Wannier interpolation.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces a k-point extension (k-CIAH) of the co-iterative augmented Hessian algorithm for Pipek-Mezey localization of Wannier functions. By deriving an efficient Hessian-vector product, the method achieves O(N_k² n³) scaling in CPU time and memory, matching first-order k-space approaches while improving on the O(N_k³ n³) scaling of the Γ-point CIAH. Benchmarks on insulators, semiconductors, metals, and surfaces demonstrate fast convergence and 2-3× overall efficiency gains for 1000-5000 orbitals, with quality validated through PMWF-based Wannier interpolation of electronic band structures.

Significance. If the reported scaling and convergence hold, the work supplies a practical second-order optimizer for periodic PMWF generation that reduces the cost of dense k-point sampling without sacrificing localization quality. The explicit Hessian-vector product derivation, reproducible benchmark timings across chemically diverse solids, and direct band-structure validation constitute concrete strengths that would make the algorithm immediately useful for large-scale solid-state calculations.

minor comments (3)

[Abstract] Abstract: the symbol n is introduced only as 'characterizes the unit-cell size'; a parenthetical definition (e.g., 'n = number of orbitals per unit cell') would remove ambiguity for readers scanning the scaling claims.
The manuscript would benefit from a single summary table (e.g., Table X) that lists, for each benchmark system, the material class, N_k, n, wall-time ratios versus first-order k-space and Γ-point CIAH, and the final spread functional value; this would make the '2-3-fold' and 'orders of magnitude' statements directly verifiable.
Section on numerical stability: while the text states robust convergence for metals, a short paragraph or supplementary note quantifying the condition number of the Hessian-vector product or the number of line-search steps required for metallic cases would strengthen the claim that no hidden costs affect the reported scaling.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the k-CIAH method and for recommending minor revision. We are pleased that the reported scaling, convergence behavior, and practical utility for dense k-point sampling in solids are recognized as strengths.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivation chain relies on standard second-order optimization (augmented Hessian) extended to k-space with explicit algebraic derivations for the Hessian-vector product, yielding the stated O(N_k^2 n^3) scaling analytically rather than by fitting or redefinition. Benchmarks on insulators, semiconductors, metals and surfaces, plus band-structure validation via Wannier interpolation, supply independent empirical support. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear; the central efficiency claims rest on algorithmic analysis and timing data external to any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard mathematical assumptions of second-order optimization convergence applied to the Pipek-Mezey functional; no free parameters, new entities, or ad-hoc axioms are introduced beyond domain-standard techniques.

axioms (1)

domain assumption The co-iterative augmented Hessian method converges robustly for the Pipek-Mezey localization functional when extended to k-points
Invoked implicitly in the description of fast and robust convergence for diverse solids.

pith-pipeline@v0.9.0 · 5536 in / 1257 out tokens · 57104 ms · 2026-05-16T05:11:33.636827+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

[1]

The special case in which this real-space unitary is constrained to be real-valued is discussed in sec- tion II G

to a single com- plex, translationally invariant real-space unitary acting on the supercell WFs. The special case in which this real-space unitary is constrained to be real-valued is discussed in sec- tion II G. D. Optimization of PMWFs using the co-iterative augmented Hessian (CIAH) method In this work, we determine the unitary rotations that trans- form...

work page
[2]

using the second-order co-iterative augmented Hes- sian (CIAH) algorithm. 43 CIAH is a modiﬁed trust-region Newton method46 that has been successfully applied to orbital localization in molecules 43,44 and in periodic solids with Γ - point Brillouin zone sampling. 22– 24,45 Near a local minimum, CIAH exhibits quadratic convergence while maintaining suf- ﬁ...

work page
[3]

(The Hessian diagonals are used to precondition the Davidson update

with the Davidson algorithm are derived in the Supporting Informa- tion. (The Hessian diagonals are used to precondition the Davidson update

work page
[4]

We summarize the working equations be- low. We ﬁrst deﬁne two types of matrix elements of the atomic projection operators, (PTTT A,kkkkkk′)i j = 1 Nk ⟨φ kkki|ˆPTTT A|φ kkk′j⟩ = 1 Nk ∑ µ ∈ A O∗ TTT µ ,kkki OTTT µ ,kkk′j, (PTTT A,kkk000)i j = 1√ Nk ⟨φ kkk j|ˆPTTT A|w000i⟩ = 1√ Nk ∑ µ ∈ A O∗ TTT µ ,kkki OTTT µ ,000 j, (20) where OTTT µ ,kkki = ⟨χ TTT µ |φ kk...

work page
[5]

35,38 In section IV, we compare the performance and computational efﬁciency of k-CIAH- and k-BFGS-based PMWF optimization

also enables PMWF optimization using gradient-based methods, such as the BFGS quasi-Newton algorithm, 39– 42 as explored in previ- ous studies. 35,38 In section IV, we compare the performance and computational efﬁciency of k-CIAH- and k-BFGS-based PMWF optimization. E. Cost of k-CIAH-based PMWF optimization The computational cost of k-CIAH-based PMWF opti...

work page
[6]

( 23) when solving the augmented Hessian eigenvalue problem in eq

and the Hessian–vector product in eq. ( 23) when solving the augmented Hessian eigenvalue problem in eq. (

work page
[7]

with the Davidson algorithm. The cost of the gra- dient evaluation—and therefore of gradient-based methods such as k-BFGS 35,38—is controlled by the construction of the projection matrix (PTTT A,kkk000)i j via eq. ( 20), which requires O(N2 k nprojn2 orb) ∼ O(N2 k n3) CPU time and the same order of memory to store the overlap matrix OTTT µ ,kkki in eq. (2...

work page
[8]

The connected sym- metric contribution in eq

Connected symmetric term. The connected sym- metric contribution in eq. ( 25) can be assembled at O(N2 k natomn2 orb) CPU cost once the projection–vector product (Pv)TTT A,kkki j = Nk ∑ kkk′ norb ∑ m (PTTT A,kkkkkk′)im vkkk′m j (29) is available. Equation (

work page
[9]

can itself be evaluated at O(N2 k nprojn2 orb) by exploiting the factorized form of the projection operators in eqs. (

work page
[10]

and ( 7). For orthonor- mal projectors, (Pv)TTT A,kkki j = ∑ µ ∈ A O∗ TTT µ ,kkki ( Nk ∑ kkk′ norb ∑ m OTTT µ ,kkk′m vkkk′m j ) , (30) with analogous expressions for biorthogonalized pro- jectors. We emphasize that evaluating (Pv)TTT A,kkki j via eq. (

work page
[11]

avoids the unfavorable O(N3 k n3) memory cost associated with explicitly storing the full projection ma- trix (PTTT A,kkkkkk′)i j, and thus constitutes a key performance optimization of the present implementation

work page
[12]

The disconnected contribution in eq

Disconnected term. The disconnected contribution in eq. (

work page
[13]

( 30), Nk ∑ kkk′ ℜ [ (v† kkk′PTTT A,kkk′000) j j ] = Nk ∑ kkk (Pv)TTT A,kkk j j

can also be constructed at O(N2 k natomn2 orb) CPU cost once the second term is evaluated efﬁciently using the precomputed projection–vector product in eq. ( 30), Nk ∑ kkk′ ℜ [ (v† kkk′PTTT A,kkk′000) j j ] = Nk ∑ kkk (Pv)TTT A,kkk j j. (31)

work page
[14]

The connected asym- metric part in eq

Connected asymmetric term. The connected asym- metric part in eq. (

work page
[15]

can be evaluated at O(Nkn3 orb) cost as ˜ σ c-asymm kkki j = − p norb ∑ m ( Akkkim v∗ kkk jm + v∗ kkkmi Akkkm j ) , (32) with intermediates Akkki j = Natom ∑ TTT A QTTT A,000 j (PTTT A,kkk000)i j, (33) which can be formed at O(N2 k natomn2 orb) CPU cost. As in the gradient evaluation, the memory footprint of the Hessian–vector product is dominated by stor...

work page
[16]

(37) Throughout this section, − kkk is understood as − kkk mod GGG for an appropriate reciprocal lattice vector GGG that maps − kkk back into the ﬁrst Brillouin zone

can be chosen to be real-valued by imposing time-reversal symmetry (TRS) on the k-space generators, κ kkk = κ ∗ − kkk. (37) Throughout this section, − kkk is understood as − kkk mod GGG for an appropriate reciprocal lattice vector GGG that maps − kkk back into the ﬁrst Brillouin zone. The TRS constraint in eq. ( 37) reduces the total number of independent...

work page
[17]

(40) The Hessian diagonal elements are modiﬁed analogously from eqs

and (23) by symmetrizing with respect to kkk ↔ − kkk, Gkkki j = Gkkki j + (1 − δ kkk,− kkk) G∗ − kkki j, (39) ( Hv)kkki j = (Hv)kkki j + (1 − δ kkk,− kkk) (Hv)∗ − kkki j. (40) The Hessian diagonal elements are modiﬁed analogously from eqs. (

work page
[18]

Speciﬁcally, PTTT A,kkk000 in eq

and ( 28). Speciﬁcally, PTTT A,kkk000 in eq. ( 28) is replaced by its TRS-packed form, PTTT A,kkk000 = PTTT A,kkk000 + (1 − δ kkk,− kkk) P∗ TTT A,− kkk000, (41) and PTTT A,kkkkkk is replaced by PTTT A,kkkkkk = PTTT A,kkkkkk +(1− δ kkk,− kkk) [ PTTT A,− kkk(− kkk) +P∗ TTT A,− kkkkkk +P∗ TTT A,kkk(− kkk) ] . (42) In practice, for all systems tested we found...

work page
[19]

RMWCsbGQApzsSkiEKUfcBenVu4A=

with the following diagonal approximation, PTTT A,kkkkkk ≈ PTTT A,kkkkkk + (1 − δ kkk,− kkk) PTTT A,− kkk(− kkk), (43) does not measurably affect the convergence rate of k-CIAH- based PMWF optimization. This is expected because the Hes- sian diagonals serve only as a preconditioner for solving the augmented Hessian eigenvalue problem in eq. ( 18). All nu-...

work page arXiv 2000
[20]

Connected symmetric term 6

work page
[21]

Hessian diagonal 7

Connected asymmetric term 6 F. Hessian diagonal 7

work page
[22]

Connected asymmetric term 8

work page
[23]

Jacobi sweep 9 A

Connected symmetric term 8 III. Jacobi sweep 9 A. General consideration 9 B. Working equations for p = 2 9 C. Extension for p > 2 10 D. Efﬁcient implementation and computational cost 11 2 I. STRUCTURE FILES The structure ﬁles for all solid-state systems studied in this work are available in the following GitHub repository: https://github.com/hongzhouye/su...

work page 2026
[24]

(8)] gives Σ kkk = ˜Σ kkk − ˜Σ kkk, ˜Σ kkkmn = 4p(p − 1)∑ TTT A Qp− 2 TTT A,000m ∑ kkk′ ℜ (PTTT A,000kkk′vkkk′)mmPTTT A,000m,kkkn (16) 5

Disconnected term Σ X kkkmn = − 1 2 p(p − 1) ∑ TTT Ai Qp− 2 TTT A,000i ∂ QTTT A,000i ∂ Xkkkmn ∑ kkk′pq ( ∂ QTTT A,000i ∂ Xkkk′pq vX kkk′pq + ∂ QTTT A,000i ∂ Ykkk′pq vY kkk′pq ) , Σ Y mn = − 1 2 p(p − 1) ∑ TTT Ai Qp− 2 TTT A,000i ∂ QTTT A,000i ∂ Ykkkmn ∑ kkk′pq ( ∂ QTTT A,000i ∂ Xkkk′pq vX kkk′pq + ∂ QTTT A,000i ∂ Ykkk′pq vY kkk′pq ) (14) where the term in...

work page
[25]

(1)] gives Σ kkk = ˜Σ kkk − ˜Σ † kkk, ˜Σ kkkmn = − 2p∑ TTT A Qp− 1 TTT A,000n ∑ kkk′ ( PTTT A,kkkkkk′vkkk′ ) mn (19)

Connected symmetric term Σ X kkkmn = − p ∑ TTT A,i Qp− 1 TTT A,000iℜ   ∂ U † kkk ∂ Xkkkmn PTTT A,kkkkkk′ ∑ kkk′pq ( ∂ Ukkk′ ∂ Xkkk′pq vX kkk′pq + ∂ Ukkk′ ∂ Ykkk′pq vY kkk′pq )  ii , Σ Y kkkmn = − p ∑ TTT A,i Qp− 1 TTT A,000iℜ   ∂ U † kkk ∂ Ykkkmn PTTT A,kkkkkk′ ∑ kkk′pq ( ∂ Ukkk′ ∂ Xkkk′pq vX kkk′pq + ∂ Ukkk′ ∂ Ykkk′pq vY kkk′pq )  ii , (17) where...

work page
[26]

Connected asymmetric term Σ X kkkmn = − 1 2 p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk { ∂ Ukkk ∂ Xkkkmn ,∑ pq ∂ Ukkk ∂ Xkkkpq vX kkkpq + ∂ Ukkk ∂ Ykkkpq vY kkkpq }] ii Σ Y kkkmn = − 1 2 p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk { ∂ Ukkk ∂ Ykkkmn ,∑ pq ∂ Ukkk ∂ Xkkkpq vX kkkpq + ∂ Ukkk ∂ Ykkkpq vY kkkpq }] ii (20) Using eq. (18) to simplify terms i...

work page
[27]

Disconnected term DX kkkmn = − p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i ( ∂ QTTT A,000i ∂ Xkkkmn )2 = − 4p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i [ ℜ (PTTT A,000kkk)mn ]2 δ mi + (m ↔ n) = − 4p(p − 1)∑ TTT A Qp− 2 TTT A,000m [ ℜ (PTTT A,000kkk)mn ]2 + (m ↔ n) (24) DY kkkmn = − p(p − 1) ∑ TTT A,i Qp− 2 TTT A,000i (∂ QTTT A,000i ∂ Ykkkmn )2 = − 4p(p − 1) ∑ TTT A,i Qp−...

work page
[28]

Connected asymmetric term DX kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkk ∂ 2Ukkk ∂ X 2 kkkmn ) ii = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ PTTT A,000kkk(Emn − Enm)2] ii = 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkkEmm ) ii + (m ↔ n) (27) DY kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( PTTT A,000kkk ∂ 2Ukkk ∂ Y 2 kkkmn ) ii = 2p ∑ TTT A,i Qp− 1 T...

work page
[29]

JACOBI SWEEP A

Connected symmetric term DX kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( ∂ U † kkk ∂ Xkkkmn PTTT A,kkkkkk ∂ Ukkk ∂ Xkkkmn ) ii = 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ [ (Emn − Enm)PTTT A,kkkkkk(Emn − Enm) ] ii = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( EmnPTTT A,kkkkkkEnm ) ii + (m ↔ n) (30) DY kkkmn = − 2p ∑ TTT A,i Qp− 1 TTT A,000iℜ ( ∂ U † kkk ∂ Ykkkmn PTTT A,kkkkkk ...

work page
[30]

yields the corresponding k-space formulation, [ ˜φ kkki ˜φ kkk j ] = [ φ kkki φ kkk j ]   cos θ eikkk·RRR sin θ − e− ikkk·RRR sin θ cos θ   , (34) which corresponds to a k-dependent 2 × 2 rotation between Bloch orbitals i and j. B. Working equations for p = 2 To determine the optimal rotation angle θ , consider the PM objective for a pair of WFs (000i...

work page
[31]

( 40), implying that the right-hand side vanishes

Since the input orbitals are already at a stationary point of the PM functional, θ = 0 is a trivial solution of eq. ( 40), implying that the right-hand side vanishes. The non-trivial solutions therefore satisfy tan 4θ = 0, i.e., 4θ = nπ with n ̸= 0

work page
[32]

We therefore conclude that the only distinct non-trivial solutions are θ = π 4 , π 2 , 3π 4

Furthermore, it is sufﬁcient to restrict θ to the interval [0,π ), because a rotation by θ = π corresponds to a simultaneous sign change of the two orbitals, which leaves the PM objective invariant, and any θ > π can be mapped back into [0,π ). We therefore conclude that the only distinct non-trivial solutions are θ = π 4 , π 2 , 3π 4 . (41) In practice, ...

work page
[33]

for determining θ except that the two intermediates are deﬁned as (A000RRR)i j = 4∑ TTT A [ (PTTT A,000RRR)2 ii − (PTTT A,000RRR)2 j j ] ℜ (PTTT A,000RRR)i j, (B000RRR)i j = ∑ TTT A [ (PTTT A,000RRR)ii + (PTTT A,000RRR) j j ]{[ (PTTT A,000RRR)ii − (PTTT A,000RRR) j j ]2 − [ 2ℜ (PTTT A,000RRR)i j ]2} , (42) For general p > 3, an analytic derivation becomes...

work page
[34]

Let NR denote the number of lattice vectors RRR satisfying eq

to satisfy |RRR|< Rmax, (44) for a chosen cutoff Rmax, which provides a controllable reduction in computational cost. Let NR denote the number of lattice vectors RRR satisfying eq. ( 44). Under this restriction, the cost of the Jacobi stability check becomes O(Nk NR nproj n2 orb), (45) which is linear in Nk when NR is independent of Nk. We choose Rmax = 1...

work page