4-component Relativistic Calculations in a Multiwavelet Basis with Improved Convergence

Christian Tantardini; Jacopo Masotti; Luca Frediani; Roberto Di Remigio Eik{\aa}s

arxiv: 2311.03290 · v5 · submitted 2023-11-06 · ⚛️ physics.chem-ph

4-component Relativistic Calculations in a Multiwavelet Basis with Improved Convergence

Jacopo Masotti , Roberto Di Remigio Eik{\aa}s , Christian Tantardini , Luca Frediani This is my paper

Pith reviewed 2026-05-06 19:30 UTC · model claude-opus-4-7

classification ⚛️ physics.chem-ph PACS 31.15.-p31.30.J-03.65.Pm

keywords Dirac equationsquared Dirac operatormultiwaveletsfour-component relativisticvariational collapseelectronic structureadaptive basisKutzelnigg

0 comments

The pith

Squaring the Dirac operator turns the four-component problem into a convex one that multiwavelets can solve to controlled precision.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper revisits an old idea — replacing the Dirac operator with its square — and argues that this is exactly the move that makes a multiwavelet treatment of relativistic electrons practical. Squaring removes the saddle-point character of the Dirac eigenproblem: the negative-energy branch is folded onto the positive side, leaving a bounded-below problem amenable to ordinary minimization. The resulting equation looks structurally non-relativistic but lives in a four-component space, so the adaptive grid, integral operator, and Krylov tools previously built for non-relativistic multiwavelet calculations carry over with only minor modification. The authors implement this and test it on one- and two-electron systems at increasing Z, comparing to analytic hydrogenic energies where available and to an established four-component atomic code otherwise. The point a sympathetic reader should take away is that variational collapse — the standard headache of four-component basis-set methods — is reframed as a problem the basis itself can absorb when the basis is locally adaptive enough to resolve what squaring introduces near the nucleus.

Core claim

By squaring the Dirac operator, the four-component eigenvalue problem becomes convex and second-order, with the negative-energy continuum folded onto the positive side. The authors argue this turns relativistic electronic-structure calculation into something formally analogous to the non-relativistic case, so that the adaptive multiwavelet machinery already proven for non-relativistic problems can be transplanted with only minor changes. They implement the scheme and check it against closed-form hydrogenic results and against an established atomic-structure code for one- and two-electron systems across a range of nuclear charges.

What carries the argument

The squared Dirac operator D̂², which converts the indefinite first-order Dirac eigenvalue problem into a convex second-order one whose lowest eigenvalues correspond to physical bound states, paired with a multiwavelet basis whose local adaptive refinement is what makes the squared operator (with its sharper near-nucleus structure) numerically tractable.

If this is right

Adaptive multiwavelet codes built for non-relativistic SCF can be extended to four-component relativistic calculations without re-engineering the underlying solver.
Variational collapse and the need for kinetic-balance-type basis constraints become less central when the basis is locally adaptive and the operator is squared.
Precision in relativistic atomic energies can be controlled by a single wavelet threshold parameter, in the same way as in the non-relativistic case.
The approach provides an independent numerical reference for benchmarking Gaussian-basis four-component codes on small systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Squaring the Dirac operator generates derivative-of-potential terms that are delta-like at point nuclei; whether the reported precision survives a finite-nucleus model versus a point nucleus is a clean test of how much of the accuracy comes from the basis and how much from the nuclear model.
The natural next target is many-electron self-consistent field with a Dirac–Coulomb or Dirac–Coulomb–Breit interaction; whether D̂² remains convex once the two-electron operator is added non-trivially is the real scaling question.
If the folded negative-energy states sit far above the physical spectrum at the wavelet resolutions used, response and excited-state extensions should be straightforward; if they sit close, time-dependent or propagator extensions will need explicit projection.
The method should be especially competitive for benchmarking, where a parameter-controlled error bar on a four-component energy is more valuable than raw speed.

Load-bearing premise

That folding the negative-energy spectrum by squaring really removes variational collapse in practice — rather than just hiding it at higher energies where it could still contaminate excited states or electron correlation — and that the multiwavelet grid can resolve the sharper near-nucleus features the squared operator introduces without quietly absorbing the error into the answer.

What would settle it

Run the same scheme on a heavier or more strongly correlated system (for example, a high-Z two-electron ion or a system with low-lying excited states) and compare total and excitation energies against high-accuracy four-component reference calculations: if discrepancies grow with Z faster than the claimed precision, or if spurious states appear among the low-lying solutions, the convexification claim fails on the systems that actually motivate it.

read the original abstract

We revive an approach to solve the Dirac equation originally proposed by Kutzelnigg which makes use of the squared Dirac operator $\hat{\mathfrak{D}}^{2}$. This approach holds the promise to avoid the negative energy solution because the negative energy spectrum is now ``folded" on the positive energy side and at the same time provides a convex equation, which is amenable to a minimization process and increased precision in the final result. The $\hat{\mathfrak{D}}^{2}$ yields an equation similar to the non-relativistic one, yet in a four-component framework, where Multiwavelet tools and algorithms developed for the non-relativistic case can be employed with minor modifications. On the other hand, the use of Multiwavelets is here essential to achieve the full potential of the approach. We implemented and validated this approach for one- and two-electron systems with increasing nuclear charge. Numerical tests were performed to gauge the actual precision of the approach with respect to either analytical reference values when possible or numerical results obtained with the \textit{GRASP} code otherwise.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Legitimate methods paper combining Kutzelnigg's squared-Dirac trick with multiwavelets; the abstract leaves the one technically load-bearing question (how the Darwin-like δ at point nuclei is handled) unanswered, so judgment really has to wait for the body.

read the letter

Quick read since only the abstract is in front of me.

What's actually here: Kutzelnigg's old idea of solving D̂²ψ = E²ψ instead of D̂ψ = Eψ, ported into the MRChem-style multiwavelet (MW) framework that this group has been building for years. The pitch is that squaring folds the negative continuum onto the positive side, gives you a convex functional you can minimize like the nonrelativistic problem, and lets you reuse most of the MW machinery with small changes. That combination — squared Dirac in an adaptive MW basis, validated against analytic hydrogenic results and GRASP for two-electron systems with increasing Z — is, as far as I can tell, genuinely new. The reader's novelty score of 6 is about right.

Credit where due: this is the right group to do this. Their nonrelativistic MW work is solid, and benchmarking against GRASP rather than against themselves is the correct move. If the numbers really do track GRASP up through high Z, that's a real result, not a self-consistency check.

Soft spot, and it's the one the stress-test note flags: D̂² generates a Darwin-like ∇·E piece that, for a bare point Coulomb potential, is a 4πZ δ³(r) contribution, plus a σ·(∇V×p) spin-orbit piece. In the linear Dirac problem you only pay a first-derivative cusp cost; squared, the variational functional ‖D̂ψ‖² is genuinely more singular at the nucleus, and adaptive MW refinement against a near-singular integrand can stabilize at a value set by the refinement threshold ε rather than the true eigenvalue. The clean way to handle this is a finite-nucleus model (Gaussian or uniform sphere), which smears the δ and makes everything well-defined — but then the GRASP comparison has to use the same nuclear model, and the framing "MW + D̂² handles point Coulomb" softens. The abstract is silent on which nuclear model is used and on ε-scaling at fixed Z. That's the first thing I'd want to see in the body. It's not a fatal concern — almost certainly they've done one of the sensible things — but it's where the paper either lands cleanly or doesn't.

Who it's for: relativistic quantum chemistry methods people, and the small MW community. Not a broad-interest paper.

Recommendation: send it to referees. The combination is new enough and the validation strategy is sound enough to deserve referee time. A competent relativistic-methods referee will press exactly on the nuclear-model and ε-convergence questions, which is what should happen.

Referee Report

4 major / 4 minor

Summary. The authors revive Kutzelnigg's squared-Dirac-operator (D̂²) formulation of the relativistic eigenvalue problem and implement it in a multiwavelet (MW) basis. Working with D̂² rather than D̂ produces an operator that is bounded below: the negative-energy continuum is "folded" onto the positive-energy side, the variational principle becomes a genuine minimization, and the resulting equation has the structural form of a non-relativistic problem in a four-component spinor framework, allowing reuse of MW machinery developed for non-relativistic SCF. The authors argue that MW adaptive refinement is essential (not merely convenient) for this scheme. They report implementation and validation on one- and two-electron systems across a range of nuclear charges Z, benchmarking against analytic Dirac eigenvalues where available and against GRASP otherwise.

Significance. If the central claim holds, the paper offers a practically useful route to four-component relativistic structure calculations that (i) sidesteps variational collapse without resorting to kinetic balance prescriptions, (ii) inherits the resolution-controlled error model of multiwavelets, and (iii) reuses non-relativistic MW infrastructure with minor modifications. The combination of a convex variational functional with adaptive, basis-set-independent error control is genuinely attractive: it would allow relativistic energies to be quoted with controllable precision in a way Gaussian-basis four-component codes cannot easily match. Validation against analytic hydrogenic eigenvalues and against GRASP for two-electron systems is the right benchmark choice. The work also revisits an old proposal that has been largely set aside, and the MW basis is a credible reason it may now be tractable; this in itself is a contribution.

major comments (4)

[Squared operator near the nucleus] D̂² generates a Darwin-like term ∝ ∇²V which, for a point Coulomb potential V = −Z/r, contributes 4πZ δ³(r), and a spin–orbit-like term ∝ σ·(∇V×p) that is singular as 1/r². The variational functional ⟨ψ|D̂²|ψ⟩ = ‖D̂ψ‖² therefore penalizes the wavefunction at the cusp at second-derivative order, in contrast to the linear Dirac functional. The manuscript should state explicitly (a) whether a point nucleus or a finite-nucleus model (Gaussian, uniformly charged sphere, etc.) is used in the reported benchmarks, and (b) if finite, how the comparison to the analytic point-nucleus Dirac spectrum is reconciled. This is load-bearing: if a finite nucleus regularizes the δ³(r), the favorable convergence may be attributable to the regularization rather than to D̂² + MW per se.
[Convergence with MW threshold ε] Multiwavelet adaptive refinement against an integrand with a near-singular feature at r=0 can behave as a regularization sequence: the computed eigenvalue stabilizes to a value that depends on the projection threshold ε. The abstract reports agreement 'with increasing Z' but the more probative test is fixed-Z, varying ε: does the energy converge to the analytic Dirac eigenvalue as ε → 0, and at what rate? A table or figure of E(ε) − E_exact for a representative high-Z hydrogenic case would establish whether the scheme converges in the MW sense or merely stabilizes at a Z- and ε-dependent plateau. This bears directly on the claim of 'increased precision in the final result.'
[Spectrum folding and excited states] Folding the negative-energy continuum onto the positive side via D̂² in principle eliminates collapse for the ground state of a given symmetry, but it also embeds the (now positive) image of the negative continuum into the same spectrum as physical excited bound states. The manuscript should clarify how excited bound states are identified and whether any state-mixing or contamination by folded continuum images is observed. For the two-electron benchmarks against GRASP this is also relevant: which two-electron states are computed, and is the projection onto positive-energy solutions handled by the operator structure alone or by an additional projector?
[Two-electron treatment] The squared-operator framework is natural at the one-body level but the two-electron interaction (Coulomb, Coulomb–Breit, or Coulomb–Gaunt) does not square along with D̂. The paper should make explicit which two-electron operator is used and how it is inserted into the D̂² framework — in particular whether the no-pair approximation is enforced and against which projector — since the comparison to GRASP is otherwise ambiguous about what Hamiltonian is being matched.

minor comments (4)

[Abstract] 'Multiwavelets are here essential' is asserted but not quantified in the abstract. A one-line statement of why a Gaussian or finite-element basis would not suffice (e.g., resolution requirements at the cusp, or scaling with ε) would help the reader.
[Notation] Please define D̂² explicitly on first appearance, including whether the standard Dirac operator is taken in the (cα·p + βmc² + V) form and which metric/sign conventions are used; results sensitive to the rest-mass shift should be stated with that convention fixed.
[Benchmarks] Quoting accuracies as digits agreeing with the Dirac analytic formula (and with GRASP) is more informative than relative errors alone; a small table with the analytic value, the MW result, and GRASP side-by-side at several Z would make the validation immediately legible.
[Reproducibility] If the implementation is part of an open MW code (e.g., MRChem/VAMPyR), please state the version and provide input files for the reported benchmarks so independent verification of the ε-convergence is possible.

Simulated Author's Rebuttal

4 responses · 1 unresolved

We thank the referee for an unusually careful and constructive report. The four major comments identify exactly the load-bearing issues in the manuscript: (i) the regularization status of the nucleus, which is essential for ⟨ψ|D̂²|ψ⟩ to be well defined; (ii) the need for fixed-Z, varying-ε convergence data rather than only Z-scan agreement; (iii) the interpretation of the folded negative-energy continuum within the bound spectrum; and (iv) the precise specification of the two-electron operator and the no-pair status of the GRASP comparison. We accept all four points and will revise accordingly. In particular, we will state explicitly that all benchmarks use a finite (Gaussian) nucleus — without which the squared functional is not finite for a Coulomb singularity — and we will separate the finite-nucleus shift from the MW truncation error in the convergence tables. We will add fixed-Z, ε-resolved convergence data for representative high-Z hydrogenic cases. We will document how bound and folded states are distinguished in practice and acknowledge that the two-electron validation is limited to ground states of helium-like ions with the Dirac–Coulomb Hamiltonian, with no explicit no-pair projector and with GRASP run in matched configuration. None of these revisions alter the central claim of the paper, but they make the scope and the error model of the method considerably more precise, in line with the referee's recommendation.

read point-by-point responses

Referee: Squared operator near the nucleus: D̂² generates a Darwin term ∝∇²V (a 4πZδ³(r) for a point Coulomb) and a spin–orbit term singular as 1/r². State explicitly whether benchmarks use a point or finite nucleus, and if finite, how comparison to the analytic point-nucleus Dirac spectrum is reconciled. The favorable convergence may otherwise be attributable to the regularization itself.

Authors: The referee identifies correctly the central technical issue. In the squared-operator formulation, ⟨ψ|D̂²|ψ⟩ = ‖D̂ψ‖² is finite only if D̂ψ is square-integrable, which for a point Coulomb potential it is not (the Darwin δ³(r) and the σ·(∇V×p) ~ 1/r² terms make the integrand non-L². This is precisely why Kutzelnigg's original proposal stalled, and why a multiwavelet adaptive grid alone is not sufficient: a finite-nucleus model is required for the functional to be well defined. All reported benchmarks therefore use a finite-nucleus model (Gaussian charge distribution); we will state this explicitly in the revision and add a paragraph in the methods making the regularization requirement and its physical/numerical motivation explicit. Reconciliation with the analytic point-nucleus Dirac spectrum is handled by varying the nuclear-model parameters and reporting the residual finite-nucleus shift separately from the multiwavelet truncation error, so that the two error sources are not conflated. We agree this distinction is load-bearing and will revise the text to make it transparent that the favorable behavior of the scheme follows from the combination D̂² + finite nucleus + MW, not from MW adaptivity alone. revision: yes
Referee: Convergence with MW threshold ε at fixed Z: the abstract reports agreement 'with increasing Z' but the more probative test is fixed-Z, varying ε. Provide E(ε) − E_exact for a representative high-Z hydrogenic case to show convergence rather than mere stabilization at a Z- and ε-dependent plateau.

Authors: We agree this is the correct convergence test and it is the one the multiwavelet framework is designed to support. Convergence data of E(ε) for fixed Z were generated during validation but were not given the prominence the referee requests in the present text. In the revision we will add (i) a table of E(ε) − E_exact at fixed Z (we propose Z = 80 and Z = 92, hydrogenic) for a sequence of MW projection thresholds spanning typically ε ∈ [10⁻³, 10⁻⁷], and (ii) the corresponding observed convergence rate. Because the functional is bounded below and convex, the eigenvalue approaches the (finite-nucleus) reference monotonically from above; the residual at the tightest ε quantifies the genuine MW truncation error and is reported separately from the finite-nucleus shift discussed in the previous point. This directly addresses the 'increased precision in the final result' claim of the abstract. revision: yes
Referee: Spectrum folding and excited states: D̂² folds the negative continuum onto the positive side, eliminating ground-state collapse, but embeds the folded image into the same spectrum as physical bound excited states. Clarify how excited bound states are identified and whether contamination by folded continuum images is observed; for the two-electron GRASP benchmarks, specify which states and whether positive-energy projection is handled by the operator alone or by an additional projector.

Authors: The point is well taken and we will expand the discussion. After squaring, the bound positive-energy spectrum and the folded image of the negative continuum coexist, but they remain distinguishable: bound positive-energy solutions of D̂ are exact eigenfunctions of D̂² with eigenvalue E², whereas folded negative-energy images are scattering-like and remain delocalized. In our SCF, states are identified by (a) their large/small-component ratio, which is O(1) for bound positive-energy states and inverted for folded images, and (b) their spatial localization on the adaptive MW grid. For the hydrogenic excited-state benchmarks (1s, 2s, 2p_{1/2}, 2p_{3/2}) we did not observe mixing with folded continuum images at the tolerances used; this will be documented explicitly. For the two-electron benchmarks we restricted ourselves to the ground state of helium-like ions, where this question is least pressing; the operator structure alone provides separation in those cases and no auxiliary projector is applied. We will state this scope limitation clearly rather than implying that excited two-electron states have been validated. revision: yes
Referee: Two-electron treatment: D̂² is natural at the one-body level but the two-electron interaction does not square. State which two-electron operator is used (Coulomb, Coulomb–Breit, Coulomb–Gaunt), how it is inserted into the D̂² framework, whether the no-pair approximation is enforced, and against which projector — otherwise the GRASP comparison is ambiguous about which Hamiltonian is being matched.

Authors: The referee is correct that the two-electron sector requires explicit specification, which the current text does not provide in sufficient detail. In the present implementation the two-electron interaction is the instantaneous Coulomb interaction only (no Breit, no Gaunt), inserted at the level of the mean-field Fock-like operator built from the four-component spinors that diagonalize the one-body D̂² problem. Because we work with the squared operator, no explicit no-pair projector is applied: positive-energy character is enforced operationally by the bounded-below structure of the one-body problem and by the initial-guess and SCF convergence path, as discussed in the response to the previous point. The corresponding GRASP calculations were configured to use the Coulomb-only two-electron interaction (Dirac–Coulomb Hamiltonian, no Breit, no QED) so that the comparison is between matched Hamiltonians. We will add a methods subsection making the two-electron operator, the absence of an explicit projector, and the matched GRASP settings unambiguous, and we will flag the extension to Breit/Gaunt as future work rather than leaving it implicit. revision: yes

standing simulated objections not resolved

We cannot, within the scope of the present manuscript, provide validation of excited two-electron states or of the Breit/Gaunt-augmented two-electron interaction; these are flagged as future work rather than addressed by revision.

Circularity Check

0 steps flagged

No significant circularity: a numerical method paper benchmarked against external references (analytic hydrogenic solutions and GRASP).

full rationale

This is a methods/implementation paper that revives Kutzelnigg's squared-Dirac-operator approach in a multiwavelet basis. The derivation chain is: (1) adopt D̂² formulation (cited to Kutzelnigg, an external author), (2) implement in multiwavelets, (3) compare numerical eigenvalues to analytic Dirac hydrogenic energies and to GRASP results for two-electron systems. None of these steps involve fitting a parameter and then re-presenting it as a prediction, nor do they rest on a self-citation that imports the conclusion. The validation targets (analytic Dirac energies, GRASP) are external and falsifiable independently of the present authors. The skeptic's concern — whether multiwavelet refinement actually controls the singular σ·∇V / Darwin-like terms at a point nucleus, or whether convergence is a regularization-threshold artifact — is a correctness/numerical-analysis concern, not a circularity concern. It would only become circular if, e.g., the basis refinement criterion were itself tuned against the analytic energy and the agreement then reported as a derivation; the abstract does not indicate this. Score 1 reflects only the routine load-bearing reliance on Kutzelnigg's prior squared-operator construction, which is an external citation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

[{"axiom": "Kutzelnigg's squared Dirac operator yields a spectrum equivalent (on the positive-energy sector) to the original Dirac operator's positive-energy solutions.", "kind": "domain_assumption", "rationale": "Foundational prior result; the entire 'folding the negative spectrum' claim rests on this."}, {"axiom": "Multiwavelet bases converge systematically for the squared-Dirac operator, including potential-derivative terms localized at nuclei.", "kind": "domain_assumption", "rationale": "Standard multiwavelet convergence applies to smooth + Coulomb operators; squaring introduces additional structure assumed tractable."}]

pith-pipeline@v0.9.0 · 9780 in / 5159 out tokens · 74338 ms · 2026-05-06T19:30:39.571123+00:00 · methodology

4-component Relativistic Calculations in a Multiwavelet Basis with Improved Convergence

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)