
arxiv: 2511.02153 · v3 · submitted 2025-11-04 · 🧮 math.NA · cs.NA · math.OC

A Joint Variational Framework for Multimodal X-ray Ptychography and Fluorescence Reconstruction

Pith reviewed 2026-05-18 01:58 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA · math.OC
keywords X-ray ptychography · X-ray fluorescence · joint variational framework · multimodal reconstruction · nonlinear least-squares · inverse problems · computational imaging

The pith

A joint nonlinear least-squares problem with shared spatial variables couples X-ray ptychography and fluorescence to enforce cross-modal consistency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formulates ptychography and fluorescence reconstruction as a single optimization task rather than two independent inverse problems. Shared spatial variables link the phase and absorption maps from diffraction data to the elemental maps from fluorescence, so that structural and compositional estimates must remain consistent with each other. This coupling is intended to improve the conditioning of the overall inverse problem and to stabilize the nonlinear solver. Tests on simulated measurements show the joint solve reaching lower error in fewer iterations than separate reconstructions of each modality. The result illustrates how variational fusion of complementary X-ray contrasts can reduce the ill-posedness typical of high-resolution imaging.

Core claim

By placing both modalities inside one nonlinear least-squares objective that shares the underlying spatial variables, the framework enforces consistency between the complex transmission function recovered by ptychography and the quantitative elemental distributions recovered by fluorescence, thereby improving conditioning, convergence speed, and reconstruction accuracy relative to independent inversions.

What carries the argument

A single nonlinear least-squares problem whose objective couples ptychography diffraction terms and fluorescence emission terms through shared spatial variables.
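
The summary does not reproduce the paper's objective, so the following is only a generic sketch of what a shared-variable nonlinear least-squares coupling can look like. The symbols θ (shared spatial variables), F_pty and F_xrf (forward operators for the two modalities), d_pty and d_xrf (measured data), and the balancing weight β are all notational assumptions made here, not the authors' notation.

```latex
\min_{\theta}\; \Phi(\theta)
  = \tfrac{1}{2}\,\bigl\| F_{\mathrm{pty}}(\theta) - d_{\mathrm{pty}} \bigr\|_2^2
  + \tfrac{\beta}{2}\,\bigl\| F_{\mathrm{xrf}}(\theta) - d_{\mathrm{xrf}} \bigr\|_2^2
```

In this form, separate inversions would minimize each term on its own with its own copy of the unknowns, whereas the joint problem requires a single θ to explain both data sets.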

If this is right

  • Joint optimization converges faster than separate ptychography and fluorescence inversions.
  • Reconstructions become sharper and more quantitatively accurate when cross-modal consistency is enforced.
  • Relative error drops compared with independent processing of each data set.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same shared-variable strategy could be applied to other pairs of X-ray contrasts, such as coherent diffraction and absorption tomography.
  • Real experimental data may reveal whether the enforced consistency preserves modality-specific noise characteristics or introduces systematic offsets.
  • Extension to time-resolved or three-dimensional measurements would test whether the conditioning benefit scales with problem size.

Load-bearing premise

Enforcing consistency by sharing spatial variables between the two modalities will improve problem conditioning without introducing new inconsistencies or biases in the structural and compositional estimates.
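
A toy linear sketch can make the conditioning part of this premise concrete. It is not the paper's forward model: the matrices `A` and `B` below are hypothetical stand-ins for the two modalities, each blind to part of the unknown vector, and the sizes are chosen only for illustration. Stacking their residuals on one shared vector removes the null space that each operator has on its own.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # number of shared unknowns (toy size, chosen for illustration)

# Hypothetical linear stand-ins for the two forward models.
# Modality A is sensitive only to unknowns 0-3, modality B only to unknowns 2-5,
# so each operator on its own has a nontrivial null space.
A = np.zeros((8, n))
A[:, :4] = rng.standard_normal((8, 4))
B = np.zeros((8, n))
B[:, 2:] = rng.standard_normal((8, 4))

# Joint least squares: stack both residuals so they act on the same vector x.
J = np.vstack([A, B])

rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
rank_J = np.linalg.matrix_rank(J)

# With noiseless data, only the joint system determines x uniquely.
x_true = rng.standard_normal(n)
x_joint, *_ = np.linalg.lstsq(J, J @ x_true, rcond=None)
x_A_only, *_ = np.linalg.lstsq(A, A @ x_true, rcond=None)  # minimum-norm solution

print("ranks (A, B, joint):", rank_A, rank_B, rank_J)
print("joint error:", np.linalg.norm(x_joint - x_true))
print("A-only error:", np.linalg.norm(x_A_only - x_true))
```

In a nonlinear setting `A` and `B` would play the role of Jacobians, and the Gauss-Newton matrix of the joint problem would be the sum `A.T @ A + B.T @ B`. The sketch says nothing about the second half of the premise: if the two data sets were mutually inconsistent, the same stacking would pull the shared estimate toward a compromise.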

What would settle it

A side-by-side comparison of relative reconstruction error and convergence rate on matched experimental (not simulated) ptychography-plus-fluorescence datasets would show whether the joint formulation retains its reported gains.
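
Such a comparison depends on how relative reconstruction error is measured. The summary does not state the paper's definition. One common convention for the complex ptychographic object, assumed here, removes a global phase factor before comparing, because a constant phase leaves diffraction intensities unchanged. The function name `relative_error` is hypothetical.

```python
import numpy as np

def relative_error(x_rec, x_true):
    """Relative l2 error after removing the best-fitting global phase factor."""
    x_rec = np.asarray(x_rec, dtype=complex).ravel()
    x_true = np.asarray(x_true, dtype=complex).ravel()
    # vdot conjugates its first argument: inner = sum(conj(x_rec) * x_true).
    inner = np.vdot(x_rec, x_true)
    # The unit-modulus c minimizing ||c * x_rec - x_true|| is inner / |inner|.
    phase = inner / abs(inner) if abs(inner) > 0 else 1.0
    return np.linalg.norm(phase * x_rec - x_true) / np.linalg.norm(x_true)

# Usage: a reconstruction that differs from the truth only by a global phase
# should score (numerically) zero error under this convention.
rng = np.random.default_rng(0)
obj = rng.standard_normal(16) + 1j * rng.standard_normal(16)
print(relative_error(np.exp(0.7j) * obj, obj))
```

For real, non-negative fluorescence maps there is no phase ambiguity, and the alignment step reduces to a sign factor that equals 1 whenever the reconstruction correlates positively with the truth.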

Figures

Figures reproduced from arXiv: 2511.02153 by Chengru Eric Zou, Elle Buser, Yuanzhe Xi, Zichao Wendy Di.

Figure 1: An illustration of a joint ptychographic and fluorescence experiment.
Figure 2: Normalized eigenvalue distribution comparison in reconstruction.
Figure 3: True point gradient norm perturbation log-scale plot with illustration.
Figure 4: Loss surface around θ⋆ along two sharp directions (eigenvectors for the 200th and 201st largest eigenvalues). Comparison between ptychography (left) and joint (right). The joint landscape exhibits a steeper, more decisive basin. Probe and scanning setup. The probe is simulated as a Fresnel zone plate following [22]. Scanning positions form a regular grid with N = 64, corresponding to a 50% overlap ratio b…
Figure 5: Loss surface around θ⋆ along two flat directions (eigenvectors for the 4000th and 4001st largest eigenvalues). Comparison between ptychography (left) and joint (right). The joint landscape is smoother and more convex-like.
Figure 6: Magnitude (left) and phase (right) of the simulated probe matrix used in numerical
Figure 7: Reconstruction results for ptychography only, fluorescence only and the joint method
Figure 8: Reconstruction results for ptychography and the joint method on a larger synthetic
Figure 9: Log–log plots of loss (left) and reconstruction error (right) for ptychography and the
Figure 10: Log–log plots of loss (left) and reconstruction error (right) for ptychography
Figure 11: Log–log plots of objective loss (top) and reconstruction error (bottom) for fluores
Figure 12: Reconstruction results for ptychography and joint method applied to a synthetic
Figure 13: Log–log plots of loss (top) and reconstruction error (bottom) for ptychography and
Figure 14: Log–log plots of loss (top) and reconstruction error (bottom) for fluorescence and
Figure 15: Reconstruction results for synthetic objects of different sizes with Noise Level = 3%.
Figure 16: Log–log plots of objective loss (top) and reconstruction error (bottom) for ptychography and the joint method under Noise Level = 3% noise.
Figure 17: Log–log plots of objective loss (top) and reconstruction error (bottom) for fluores
Figure 18: Joint and ptychography reconstructions running 200 iterations on synthetic objects
Figure 19: Reconstruction results for the Cameraman–Baboon dataset under Noise Level =
Figure 20: Log–log plots of objective loss (top row) and reconstruction error (bottom row) for
Figure 21: Log–log plots of objective loss (top row) and reconstruction error (bottom row) for

Original abstract

Recovering high-resolution structural and compositional information from coherent X-ray measurements involves solving coupled, nonlinear, and ill-posed inverse problems. Ptychography reconstructs a complex transmission function from overlapping diffraction patterns, while X-ray fluorescence provides quantitative, element-specific contrast at lower spatial resolution. We formulate a joint variational framework that integrates these two modalities into a single nonlinear least-squares problem with shared spatial variables. This formulation enforces cross-modal consistency between structural and compositional estimates, improving conditioning and promoting stable convergence. The resulting optimization couples complementary contrast mechanisms (i.e., phase and absorption from ptychography, elemental composition from fluorescence) within a unified inverse model. Numerical experiments on simulated data demonstrate that the joint reconstruction achieves faster convergence, sharper and more quantitative reconstructions, and lower relative error compared with separate inversions. The proposed approach illustrates how multimodal variational formulations can enhance stability, resolution, and interpretability in computational X-ray imaging.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a joint variational framework for multimodal X-ray ptychography and fluorescence reconstruction. It formulates the inverse problem as a single nonlinear least-squares optimization with shared spatial variables to enforce cross-modal consistency between the complex transmission function (ptychography) and element-specific compositional contrast (fluorescence). Numerical experiments on simulated data are reported to show faster convergence, sharper and more quantitative reconstructions, and lower relative error relative to separate inversions of each modality.

Significance. If the numerical improvements hold under more rigorous validation, the work would contribute a practical variational coupling strategy for multimodal X-ray imaging that leverages complementary contrast mechanisms. The shared-variable formulation is a direct and standard way to promote consistency without introducing additional regularizers, and the reported gains in conditioning and convergence align with expectations for joint inverse problems. The absence of real-data experiments and detailed simulation protocols currently limits the strength of the claims.

major comments (2)
  1. [Numerical Experiments] Numerical Experiments section: the reported improvements in convergence speed and relative error are presented without error bars, details on the forward-model simulation parameters (e.g., noise levels, overlap ratios, or photon counts), or explicit exclusion criteria for the test cases. This leaves the quantitative comparison only moderately supported and makes it difficult to assess robustness of the central claim that the joint formulation improves stability.
  2. [Formulation and Numerical Experiments] The formulation assumes that shared spatial variables will improve conditioning without introducing new biases between structural and compositional estimates, yet no sensitivity analysis or consistency check (e.g., comparison of recovered absorption vs. fluorescence-derived elemental maps on the same phantom) is provided to verify this assumption holds in the reported experiments.
minor comments (2)
  1. [Abstract and Introduction] The abstract and introduction would benefit from a brief statement of the precise optimization variables and the form of the joint objective function to make the contribution clearer to readers unfamiliar with the specific modalities.
  2. [Methods] Notation for the shared spatial variables and the individual forward operators should be introduced consistently in the methods section to avoid ambiguity when comparing the joint versus separate formulations.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and describe the revisions planned for the next version.

Point-by-point responses
  1. Referee: [Numerical Experiments] Numerical Experiments section: the reported improvements in convergence speed and relative error are presented without error bars, details on the forward-model simulation parameters (e.g., noise levels, overlap ratios, or photon counts), or explicit exclusion criteria for the test cases. This leaves the quantitative comparison only moderately supported and makes it difficult to assess robustness of the central claim that the joint formulation improves stability.

    Authors: We agree that more details on the simulation setup are needed to support the quantitative claims. In the revised manuscript we will add error bars computed over multiple independent noise realizations, specify the exact noise levels, overlap ratios, and photon counts used in the forward model, and describe the selection criteria for the test phantoms. These changes will allow readers to better evaluate the robustness of the reported improvements in convergence and relative error. revision: yes

  2. Referee: [Formulation and Numerical Experiments] The formulation assumes that shared spatial variables will improve conditioning without introducing new biases between structural and compositional estimates, yet no sensitivity analysis or consistency check (e.g., comparison of recovered absorption vs. fluorescence-derived elemental maps on the same phantom) is provided to verify this assumption holds in the reported experiments.

    Authors: The shared-variable formulation directly couples the modalities through the common spatial support, which is expected to improve conditioning while preserving consistency because fluorescence supplies independent elemental contrast that aligns with ptychographic absorption. Although an explicit sensitivity analysis was not included in the original submission, the lower relative errors of the joint reconstructions versus separate inversions already indicate that no substantial new biases are introduced. We will add a direct consistency check comparing the recovered absorption map from the joint reconstruction against the fluorescence-derived elemental maps on the same phantom. revision: partial
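
As an editorial illustration of the error bars promised in response 1, the sketch below averages relative error over independent noise realizations. Everything in it is an assumption: the function name `error_statistics`, the additive Gaussian noise scaled to a fraction of the clean data norm, and the small linear stand-in problem used in place of the paper's ptychography and fluorescence models.

```python
import numpy as np

def error_statistics(solve, forward, x_true, noise_level, n_trials=20, seed=0):
    """Mean and sample standard deviation of relative error over noise draws."""
    rng = np.random.default_rng(seed)
    clean = forward(x_true)
    errors = []
    for _ in range(n_trials):
        # Additive Gaussian noise rescaled so that ||noise|| = noise_level * ||clean||
        # (an assumed noise model, not taken from the paper).
        noise = rng.standard_normal(clean.shape)
        noise *= noise_level * np.linalg.norm(clean) / np.linalg.norm(noise)
        x_rec = solve(clean + noise)
        errors.append(np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
    errors = np.array(errors)
    return errors.mean(), errors.std(ddof=1)

# Stand-in linear problem so the sketch runs end to end.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 10))
x_true = rng.standard_normal(10)

mean_err, std_err = error_statistics(
    solve=lambda d: np.linalg.lstsq(A, d, rcond=None)[0],
    forward=lambda x: A @ x,
    x_true=x_true,
    noise_level=0.03,
)
print(f"relative error: {mean_err:.4f} +/- {std_err:.4f}")
```

Running the same routine once with a joint solver and once with each separate solver, on identical noise seeds, would give the paired statistics the referee asks for.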

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes a new joint variational framework formulated as a single nonlinear least-squares problem with shared spatial variables to couple ptychography and fluorescence modalities. This is presented as a standard variational coupling of complementary contrast mechanisms, with performance claims (faster convergence, lower relative error) supported by numerical experiments on simulated data rather than by algebraic reduction to inputs. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain; the formulation and empirical validation remain independent of the target claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on standard assumptions from variational inverse problems in imaging. The joint formulation itself is the primary modeling choice; no explicit free parameters, invented entities, or additional axioms are detailed in the abstract.

axioms (1)
  • Domain assumption: Cross-modal consistency between structural and compositional estimates can be effectively enforced by sharing spatial variables in the nonlinear least-squares objective.
    This modeling premise underpins the claimed improvement in conditioning and convergence.


Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

  [1] Ilona Ambartsumyan, Wajih Boukaram, Tan Bui-Thanh, Omar Ghattas, David Keyes, Georg Stadler, George Turkiyyah, and Stefano Zampini. Hierarchical matrix approximations of Hessians arising in inverse problems governed by PDEs, 2020.

  [2] Burkhard Beckhoff, Birgit Kanngießer, Norbert Langhoff, Reiner Wedell, and Helmut Wolff. Handbook of practical X-ray fluorescence analysis. Springer Science & Business Media, 2007.

  [3] Kevin Bui and Zichao Wendy Di. A stochastic ADMM algorithm for large-scale ptychography with weighted difference of anisotropic and isotropic total variation. Inverse Problems, 40(5):055006, 2024.

  [4] Huibin Chang, Pablo Enfedaque, and Stefano Marchesini. Blind ptychographic phase retrieval via convergent alternating direction method of multipliers. SIAM Journal on Imaging Sciences, 12(1):153–185, 2019.

  [5] Huibin Chang, Roland Glowinski, Stefano Marchesini, Xue-Cheng Tai, Yang Wang, and Tieyong Zeng. Overlapping domain decomposition methods for ptychographic imaging, 2021.

  [6] Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. CoRR, abs/1711.02257, 2017.

  [7] Junjing Deng, David J. Vine, Si Chen, Qiaoling Jin, Youssef S. G. Nashed, Tom Peterka, Stefan Vogt, and Chris Jacobsen. X-ray ptychographic and fluorescence microscopy of frozen-hydrated cells using continuous scanning. Scientific Reports, 7(1):445, March 2017.

  [8] Zichao Di, Sven Leyffer, and Stefan M Wild. Optimization-based approach for joint X-ray fluorescence and transmission tomographic inversion. SIAM Journal on Imaging Sciences, 9(1):1–23, 2016.

  [9] Zichao Wendy Di, Si Chen, Young Pyo Hong, Chris Jacobsen, Sven Leyffer, and Stefan M. Wild. Joint reconstruction of X-ray fluorescence and transmission tomography. Opt. Express, 25(12):13107–13124, Jun 2017.

  [10] Veit Elser. Phase retrieval by iterated projections. Journal of the Optical Society of America A, 20(1):40–55, 2003.

  [11] Samy Wu Fung and Zichao Wendy Di. Multigrid optimization for large-scale ptychographic phase retrieval. SIAM Journal on Imaging Sciences, 13(1):214–233, 2020.

  [12] Per Christian Hansen. Discrete Inverse Problems: Insight and Algorithms. Society for Industrial and Applied Mathematics, USA, 2010.

  [13] Robert Hesse, D Russell Luke, Shoham Sabach, and Matthew K Tam. Proximal heterogeneous block ...

  [14] Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the loss landscape of neural nets, 2018.

  [15] Zhenyu Liao and Michael W. Mahoney. Hessian eigenspectra of more realistic nonlinear models, 2021.

  [16] Andrew Maiden, Daniel Johnson, and Peng Li. Further improvements to the ptychographical iterative engine. Optica, 4(7):736–745, 2017.

  [17] Andrew M. Maiden, Martin J. Humphry, Fucai Zhang, and John M. Rodenburg. Superresolution imaging via ptychography. J. Opt. Soc. Am. A, 28(4):604–612, Apr 2011.

  [18] Stephen G Nash. A survey of truncated-Newton methods. Journal of Computational and Applied Mathematics, 124(1-2):45–59, 2000.

  [19] Michal Odstrčil, Andreas Menzel, and Manuel Guizar-Sicairos. Iterative least-squares solver for generalized maximum-likelihood ptychography. Optics Express, 26(3):3108–3123, 2018.

  [20] Franz Pfeiffer. X-ray ptychography. Nature Photonics, 12(1):9–17, 2018.

  [21] John M Rodenburg and Helen ML Faulkner. A phase retrieval algorithm for shifting illumination. Applied Physics Letters, 85(20):4795–4797, 2004.

  [22] John Marius Rodenburg. Ptychography and related diffractive imaging methods. Advances in Imaging and Electron Physics, 150:87–184, 2008.

  [23] Levent Sagun, Leon Bottou, and Yann LeCun. Eigenvalues of the Hessian in deep learning: Singularity and beyond, 2017.

  [24] Jonathan Schwartz, Zichao Wendy Di, Yi Jiang, Alyssa J Fielitz, Don-Hyung Ha, Sanjaya D Perera, Ismail El Baggari, Richard D Robinson, Jeffrey A Fessler, Colin Ophus, et al. Imaging atomic-scale chemistry from fused multi-modal electron microscopy. npj Computational Materials, 8(1):16, 2022.

  [25] Jonathan Schwartz, Zichao Wendy Di, Yi Jiang, Jason Manassa, Jacob Pietryga, Yiwen Qian, Min Gee Cho, Jonathan L Rowell, Huihuo Zheng, Richard D Robinson, et al. Imaging 3D chemistry at 1 nm resolution with fused multi-modal electron tomography. Nature Communications, 15(1):3555, 2024.

  [26] Jacob Seifert, Yifeng Shao, and Allard P. Mosk. Noise-robust latent vector reconstruction in ptychography using deep generative models. Opt. Express, 32(1):1020–1033, Jan 2024.

  [27] Tianxiao Sun, Gang Sun, Fuda Yu, Yongzhi Mao, Renzhong Tai, Xiangzhi Zhang, Guangjie Shao, Zhenbo ...

  [28] David J Vine, Daniele Pelliccia, Christian Holzner, Stephen B Baines, Andrew Berry, Ian McNulty, Stefan Vogt, Andrew G Peele, and Keith A Nugent. Simultaneous X-ray fluorescence and ptychographic microscopy of Cyclotella meneghiniana. Optics Express, 20(16):18287–18296, 2012.

  [29] Liqi Zhou, Jingdong Song, Judy S Kim, Xudong Pei, Chen Huang, Mark Boyce, Luiza Mendonça, Daniel Clare, Alistair Siebert, Christopher S Allen, et al. Low-dose phase retrieval of biological specimens ...