
arxiv: 2604.21810 · v1 · submitted 2026-04-23 · 💻 cs.CV · cs.GR

Multiscale Super Resolution without Image Priors

Pith reviewed 2026-05-09 22:33 UTC · model grok-4.3

classification 💻 cs.CV cs.GR
keywords: super-resolution, multiscale imaging, coprime pixel sizes, Fourier reconstruction, least squares, CCD binning, image priors

The pith

Low-resolution images at pairwise coprime pixel sizes yield a stable inverse for super-resolution without priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the super-resolution problem under translation becomes well-posed when low-resolution images are captured at multiple scales with pairwise coprime pixel sizes. This setup produces a linear system that admits a stable inverse, so high-resolution images can be recovered directly through Fourier-domain operations or iterative least-squares minimization. The analysis supplies an explicit formula for the expected reconstruction error under i.i.d. noise, which makes the noise-resolution tradeoff quantitative. Experiments that use CCD binning to realize a wide range of effective pixel sizes confirm that the method works in both one and two dimensions and outperforms single-scale approaches on real targets.

Core claim

Images acquired with pairwise coprime pixel sizes lead to a system with a stable inverse, and super-resolution images can be reconstructed efficiently using Fourier domain techniques or iterative least squares methods. The mathematical analysis provides an expression for the expected error of the least squares reconstruction for large signals assuming i.i.d. noise.
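
As a minimal numerical sketch of this claim (not the authors' code), the Python below models each scale as a cyclic box average over a one-dimensional signal, stacks the matrices for two pixel sizes, and checks that the stacked system has full rank even though each scale alone does not. The signal length `n = 35`, the pixel sizes 5 and 7, the 1/k normalization, and the helper names are all assumptions chosen for illustration.

```python
import numpy as np

def cyclic_box_matrix(n, k):
    """n x n matrix averaging k consecutive samples with cyclic wrap-around."""
    T = np.zeros((n, n))
    for i in range(n):
        for m in range(k):
            T[i, (i + m) % n] = 1.0 / k  # assumed normalization: mean, not sum
    return T

def stacked_system(n, sizes):
    """Stack the per-scale imaging matrices into one tall linear system."""
    return np.vstack([cyclic_box_matrix(n, k) for k in sizes])

n = 35
rng = np.random.default_rng(0)
x = rng.random(n)                          # hypothetical high-resolution signal

T = stacked_system(n, (5, 7))              # pairwise coprime pixel sizes
y = T @ x                                  # noiseless multiscale observations
x_hat, *_ = np.linalg.lstsq(T, y, rcond=None)

rank_single = np.linalg.matrix_rank(cyclic_box_matrix(n, 5))
rank_stacked = np.linalg.matrix_rank(T)
max_err = np.max(np.abs(x_hat - x))
print(rank_single, rank_stacked, max_err)
```

In this toy setting a single scale is rank deficient because the box filter annihilates some frequencies, while the two coprime scales together determine the signal.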

What carries the argument

The linear multiscale imaging model whose frequency-domain matrix becomes invertible when the pixel sizes are pairwise coprime.
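
One way to picture the frequency-domain argument is a per-frequency least-squares solve. The sketch below assumes cyclic convolution with normalized boxes, so each scale multiplies the signal's DFT by the box's DFT `B_k`; the estimate is then `sum(conj(B_k) * Y_k) / sum(|B_k|^2)`. This is an illustrative reading under those assumptions, not the paper's implementation, and `box_dft`, `fourier_reconstruct`, `n = 63`, and the sizes 7 and 9 are hypothetical choices.

```python
import numpy as np

def box_dft(n, k):
    """DFT of a normalized width-k box embedded in a length-n cyclic signal."""
    b = np.zeros(n)
    b[:k] = 1.0 / k
    return np.fft.fft(b)

def fourier_reconstruct(observations, sizes):
    """Least squares solved independently at every frequency."""
    n = len(observations[0])
    num = np.zeros(n, dtype=complex)
    den = np.zeros(n)
    for y, k in zip(observations, sizes):
        B = box_dft(n, k)
        num += np.conj(B) * np.fft.fft(y)
        den += np.abs(B) ** 2          # zero only where every box DFT vanishes
    return np.real(np.fft.ifft(num / den)), den

n, sizes = 63, (7, 9)                   # coprime sizes, both dividing n
rng = np.random.default_rng(1)
x = rng.random(n)                       # hypothetical high-resolution signal

# Noiseless cyclic blur at each scale, computed through the DFT.
obs = [np.real(np.fft.ifft(np.fft.fft(x) * box_dft(n, k))) for k in sizes]
x_hat, den = fourier_reconstruct(obs, sizes)
print(den.min(), np.max(np.abs(x_hat - x)))
```

Each box DFT has zeros of its own, but for coprime widths the zeros never coincide at a non-zero frequency, so the denominator stays positive everywhere.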

If this is right

  • Super-resolution becomes possible without image priors or regularization terms.
  • Reconstruction can be performed directly in the Fourier domain or via standard least-squares solvers.
  • An explicit formula quantifies how noise limits achievable resolution.
  • Hardware binning experiments demonstrate practical gains on one- and two-dimensional targets.
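
For orientation, the standard least-squares identity behind a noise formula of this kind can be written as follows. It assumes a full-column-rank imaging matrix T and zero-mean i.i.d. noise with variance σ²; the symbols x, y, ε, and σ are notation introduced here, and the paper's own large-signal expression is not reproduced.

```latex
% Assumed model: y = T x + \varepsilon, with E[\varepsilon] = 0 and Cov(\varepsilon) = \sigma^2 I
\begin{align}
  \hat{x} &= (T^{\top} T)^{-1} T^{\top} y, \\
  \hat{x} - x &= (T^{\top} T)^{-1} T^{\top} \varepsilon, \\
  \mathbb{E}\,\lVert \hat{x} - x \rVert^{2}
    &= \sigma^{2} \operatorname{tr}\!\big( (T^{\top} T)^{-1} \big), \\
  \text{RMS error per sample}
    &= \sigma \sqrt{ \operatorname{tr}\!\big( (T^{\top} T)^{-1} \big) / n } .
\end{align}
```

The last line is σ times the quantity (tr((T′T)^{-1})/n)^{1/2} that the Figure 7 caption describes as an RMS-like measure.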

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Existing multi-sensor cameras could adopt coprime pixel sizes to improve resolution without added computational cost.
  • The same linear-inversion principle may apply to other modalities where scale can be varied controllably.
  • Small deviations from perfect registration or exact coprimeness would be expected to degrade the conditioning of the inverse.

Load-bearing premise

The different-scale captures are exactly registered up to known translation, the pixel sizes are precisely known and coprime, and the imaging process is linear with additive i.i.d. noise.

What would settle it

Reconstruction of a known high-resolution test pattern from non-coprime pixel-size captures that shows persistent ambiguity or large error growth even as noise approaches zero.
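
A toy version of such a test, under an assumed cyclic box-average model, is sketched below. With pixel sizes 4 and 6 (common factor 2), an alternating +/- pattern averages to zero in every pixel at both scales, so adding it to the signal leaves the observations unchanged; with sizes 3 and 4 the same pattern is visible. The function names and the signal are hypothetical.

```python
import numpy as np

def cyclic_box_average(x, k):
    """Average each run of k consecutive samples, wrapping around cyclically."""
    n = len(x)
    return np.array([sum(x[(i + m) % n] for m in range(k)) / k for i in range(n)])

def observe(signal, sizes):
    """Concatenate the noiseless observations from every pixel size."""
    return np.concatenate([cyclic_box_average(signal, k) for k in sizes])

n = 12
x = np.linspace(0.0, 1.0, n)               # hypothetical test pattern
ghost = 0.5 * (-1.0) ** np.arange(n)       # alternating pattern, period 2

# Non-coprime sizes: the ghost cancels inside every even-width pixel.
same_noncoprime = np.allclose(observe(x, (4, 6)), observe(x + ghost, (4, 6)))
# Coprime sizes: the odd-width pixels see the ghost.
same_coprime = np.allclose(observe(x, (3, 4)), observe(x + ghost, (3, 4)))
print(same_noncoprime, same_coprime)
```

Because the ambiguity exists with zero noise, no amount of averaging removes it in the non-coprime case.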

Figures

Figures reproduced from arXiv: 2604.21810 by Daniel Fu, Gabby Litterio, Pedro Felzenszwalb, Rashid Zia.

Figure 1: Multiscale super-resolution demo using 200+ …
Figure 2: Capturing (I ⊗ b_k) with a low-resolution sensor involves interlacing multiple low-resolution images. … simplifies the analysis, and it also represents a robust model for CCD cameras in which photoelectrons accumulated within individual pixels can be binned into a contiguous rectangular form prior to readout. Such CCD hardware binning is regularly used to improve noise statistics for low-signal measurement, a…
Figure 4: Examples showing how single-pixel values can …
Figure 5: A 2D example showing how single-pixel values can …
Figure 6: The magnitude of the periodic sinc f_7. The discrete Fourier transform (DFT) of a box of width 7 is obtained as a discrete sampling of this function. The maximum of |f_k| occurs at ω = 0 with value equal to 1, and the zeros occur at non-zero multiples of 2π/k.
Figure 7: Two plots of (tr((T′T)^{-1})/n)^{1/2} as a function of n for one-dimensional signals with box sizes 9 and 11. We plot the square root of the normalized trace because it better corresponds to an RMS value for a reconstruction with least squares. We consider maps defined via cyclic or valid convolution separately. The trace in each case can be computed numerically via explicit construction of an imaging…
Figure 8: Experimental measurements of 1D linearly increasing …
Figure 9: Example 1D reconstructions using adjacent pairs of bin sizes 12/13 (left/right) and 15/16 (top/down). When the bin …
Figure 10: Comparison of (a) predicted reconstruction error …
Figure 11: Three experimental demonstrations of multiscale super-resolution using CCD binning at three image scales.
Figure 12: Experimental demonstration of how using multiple …
Original abstract

We address the ambiguities in the super-resolution problem under translation. We demonstrate that combinations of low-resolution images at different scales can be used to make the super-resolution problem well posed. Such differences in scale can be achieved using sensors with different pixel sizes (as demonstrated here) or by varying the effective pixel size through changes in optical magnification (e.g., using a zoom lens). We show that images acquired with pairwise coprime pixel sizes lead to a system with a stable inverse, and furthermore, that super-resolution images can be reconstructed efficiently using Fourier domain techniques or iterative least squares methods. Our mathematical analysis provides an expression for the expected error of the least squares reconstruction for large signals assuming i.i.d. noise that elucidates the noise-resolution tradeoff. These results are validated through both one- and two-dimensional experiments that leverage charge-coupled device (CCD) hardware binning to explore reconstructions over a large range of effective pixel sizes. Finally, two-dimensional reconstructions for a series of targets are used to demonstrate the advantages of multiscale super-resolution, and implications of these results for common imaging systems are discussed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain


The paper derives the stable inverse property for pairwise coprime pixel sizes and the expected error expression directly from linear algebra on the multiscale forward model combined with Fourier analysis of the sampling process. These results follow from the stated assumptions on the imaging model (linear, i.i.d. noise, exact registration up to known translation) without any parameter fitting to target data, self-referential definitions, or load-bearing self-citations. Experiments validate the math rather than supplying inputs to it. The derivation chain is self-contained against external benchmarks of linear systems theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on a linear imaging model, the existence of a stable inverse when pixel sizes are pairwise coprime, and the i.i.d. noise assumption used to derive the expected least-squares error; no free parameters or new entities are introduced.

axioms (2)
  • domain assumption Noise is independent and identically distributed
    Invoked to obtain the closed-form expression for expected reconstruction error of the least-squares solution.
  • domain assumption Imaging process is linear and shift-invariant within each scale
    Required for the Fourier-domain analysis and the claim of a stable inverse.
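
To see how the two assumptions combine, the following Monte Carlo sketch compares the empirical squared error of a least-squares reconstruction against the textbook prediction σ² tr((T′T)^{-1}). It assumes a cyclic box-average model with hypothetical sizes 3 and 4 on a length-36 signal and Gaussian noise; it is a sanity check of the general identity, not a reproduction of the paper's experiments.

```python
import numpy as np

def cyclic_box_matrix(n, k):
    """n x n matrix averaging k consecutive samples with cyclic wrap-around."""
    T = np.zeros((n, n))
    for i in range(n):
        for m in range(k):
            T[i, (i + m) % n] = 1.0 / k
    return T

n, sizes, sigma, trials = 36, (3, 4), 0.1, 5000
T = np.vstack([cyclic_box_matrix(n, k) for k in sizes])   # linear model

# Prediction under i.i.d. noise: E||x_hat - x||^2 = sigma^2 tr((T'T)^{-1}).
predicted = sigma ** 2 * np.trace(np.linalg.inv(T.T @ T))

# For full-column-rank T the least-squares error is pinv(T) @ noise,
# independent of the true signal, so only the noise needs to be simulated.
rng = np.random.default_rng(2)
noise = sigma * rng.standard_normal((trials, T.shape[0]))
errors = noise @ np.linalg.pinv(T).T                       # shape (trials, n)
empirical = np.mean(np.sum(errors ** 2, axis=1))

print(predicted, empirical)
```

If the noise were correlated or the model nonlinear, the empirical value would generally drift away from the trace formula, which is why both axioms are listed.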


