Multiscale Super Resolution without Image Priors

Daniel Fu; Gabby Litterio; Pedro Felzenszwalb; Rashid Zia

arxiv: 2604.21810 · v1 · submitted 2026-04-23 · 💻 cs.CV · cs.GR


Pith reviewed 2026-05-09 22:33 UTC · model grok-4.3


keywords super-resolution · multiscale imaging · coprime pixel sizes · Fourier reconstruction · least squares · CCD binning · image priors


The pith

Low-resolution images at pairwise coprime pixel sizes yield a stable inverse for super-resolution without priors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the super-resolution problem under translation becomes well-posed when low-resolution images are captured at multiple scales with pairwise coprime pixel sizes. This setup produces a linear system that admits a stable inverse, so high-resolution images can be recovered directly through Fourier-domain operations or iterative least-squares minimization. The analysis supplies an explicit formula for the expected reconstruction error under i.i.d. noise, which makes the noise-resolution tradeoff quantitative. Experiments that use CCD binning to realize a wide range of effective pixel sizes confirm that the method works in both one and two dimensions and outperforms single-scale approaches on real targets.

Core claim

Images acquired with pairwise coprime pixel sizes lead to a system with a stable inverse, and super-resolution images can be reconstructed efficiently using Fourier domain techniques or iterative least squares methods. The mathematical analysis provides an expression for the expected error of the least squares reconstruction for large signals assuming i.i.d. noise.
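The Fourier-domain route can be illustrated with a minimal sketch. It assumes a one-dimensional cyclic signal, unnormalized box sums available at every shift (an interlaced capture), and no noise; the function names `box_capture`, `dft`, and `fourier_reconstruct` are hypothetical and are not taken from the paper. Each frequency is solved independently by least squares across the two scales.

```python
import cmath

def box_capture(x, k):
    """Cyclic sum of k consecutive samples at every shift (assumed capture model)."""
    n = len(x)
    return [sum(x[(i + j) % n] for j in range(k)) for i in range(n)]

def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        s = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * m / n) for j in range(n))
        out.append(s / n if inverse else s)
    return out

def fourier_reconstruct(captures, sizes):
    """Per-frequency least squares: X[m] = sum_k conj(H_k) Y_k / sum_k |H_k|^2."""
    n = len(captures[0])
    spectra = [dft(y) for y in captures]
    X = []
    for m in range(n):
        # Transfer function of each box: H_k[m] = sum_{j<k} exp(+2*pi*i*j*m/n).
        H = [sum(cmath.exp(2j * cmath.pi * j * m / n) for j in range(k)) for k in sizes]
        num = sum(h.conjugate() * Y[m] for h, Y in zip(H, spectra))
        den = sum(abs(h) ** 2 for h in H)  # nonzero at every m when sizes are coprime
        X.append(num / den)
    return [v.real for v in dft(X, inverse=True)]

signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
sizes = (3, 4)  # coprime box widths
captures = [box_capture(signal, k) for k in sizes]
recovered = fourier_reconstruct(captures, sizes)
max_error = max(abs(a - b) for a, b in zip(signal, recovered))
```

With length 12 and widths 3 and 4, each box alone has zeros in its transfer function, yet the denominator never vanishes because the two zero sets do not overlap.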

What carries the argument

The linear multiscale imaging model whose frequency-domain matrix becomes invertible when the pixel sizes are pairwise coprime.
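One way to see why coprimeness matters is to compare where the box transfer functions vanish. The sketch below assumes, as in the periodic sinc description in the figure list, that a box of width k has zeros at the nonzero multiples of 2π/k; frequencies are stored as exact fractions of 2π. The helper names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def box_zeros(k):
    """Zeros of a width-k box transfer function, as fractions of 2*pi in (0, 1)."""
    return {Fraction(j, k) for j in range(1, k)}

def shared_zeros(k1, k2):
    """Frequencies where both boxes vanish, so neither capture carries information."""
    return box_zeros(k1) & box_zeros(k2)

coprime_case = shared_zeros(9, 11)      # no common zeros
non_coprime_case = shared_zeros(9, 12)  # common zeros at 1/3 and 2/3 of 2*pi
# For two sizes, the number of shared zeros is gcd - 1.
count_matches_gcd = len(shared_zeros(12, 18)) == gcd(12, 18) - 1
```

When the shared set is empty, every frequency is observed by at least one scale, which is what makes the per-frequency system solvable.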

If this is right

Super-resolution becomes possible without image priors or regularization terms.
Reconstruction can be performed directly in the Fourier domain or via standard least-squares solvers.
An explicit formula quantifies how noise limits achievable resolution.
Hardware binning experiments demonstrate practical gains on one- and two-dimensional targets.
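The iterative least-squares route can be sketched with conjugate gradients on the normal equations (CGLS). This is a stand-in chosen for brevity; the reference list includes LSQR, and this sketch does not claim to match the authors' solver. It assumes the same cyclic, noise-free, unnormalized box model at every shift, and all function names are hypothetical.

```python
def forward(x, sizes):
    """Stacked captures: one cyclic box-sum image per pixel size."""
    n = len(x)
    return [[sum(x[(i + j) % n] for j in range(k)) for i in range(n)] for k in sizes]

def adjoint(captures, sizes, n):
    """Transpose of `forward`: spread each capture value back over its box."""
    out = [0.0] * n
    for y, k in zip(captures, sizes):
        for i in range(n):
            for j in range(k):
                out[(i + j) % n] += y[i]
    return out

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def cgls(captures, sizes, n, iters=100, tol=1e-20):
    """Minimize ||T x - y||^2 without forming T'T explicitly."""
    x = [0.0] * n
    r = [[float(v) for v in y] for y in captures]  # residual y - T x, with x = 0
    s = adjoint(r, sizes, n)
    p = s[:]
    gamma = dot(s, s)
    for _ in range(iters):
        if gamma <= tol:  # gradient of the normal equations is numerically zero
            break
        q = forward(p, sizes)
        alpha = gamma / sum(dot(qk, qk) for qk in q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [[ri - alpha * qi for ri, qi in zip(rk, qk)] for rk, qk in zip(r, q)]
        s = adjoint(r, sizes, n)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x

signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
sizes = (3, 4)
estimate = cgls(forward(signal, sizes), sizes, len(signal))
max_error = max(abs(a - b) for a, b in zip(signal, estimate))
```

No regularization term appears anywhere in the objective; the solve relies only on the stacked system having full column rank, which holds here because the sizes are coprime.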

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Existing multi-sensor cameras could adopt coprime pixel sizes to improve resolution without added computational cost.
The same linear-inversion principle may apply to other modalities where scale can be varied controllably.
Small deviations from perfect registration or exact coprimeness would be expected to degrade the conditioning of the inverse.

Load-bearing premise

The different-scale captures are exactly registered up to known translation, the pixel sizes are precisely known and coprime, and the imaging process is linear with additive i.i.d. noise.

What would settle it

Reconstruction of a known high-resolution test pattern from non-coprime pixel-size captures that shows persistent ambiguity or large error growth even as noise approaches zero.
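The ambiguity that such a test would look for can be constructed explicitly under a simplified model. The sketch assumes cyclic box sums at every shift and a signal length that is a multiple of the common factor; `invisible_signal` is a hypothetical helper that returns a nonzero pattern producing all-zero captures at both pixel sizes.

```python
from math import gcd

def box_capture(x, k):
    """Cyclic sum of k consecutive samples at every shift (assumed capture model)."""
    n = len(x)
    return [sum(x[(i + j) % n] for j in range(k)) for i in range(n)]

def invisible_signal(k1, k2, n):
    """Nonzero signal that both box sizes map to zero, or None if the sizes are coprime."""
    g = gcd(k1, k2)
    if g == 1:
        return None
    if n % g != 0:
        raise ValueError("n must be a multiple of gcd(k1, k2) for the cyclic model")
    # A period-g pattern with zero sum: any window whose length is a multiple of g sums to 0.
    pattern = [1, -1] + [0] * (g - 2)
    return pattern * (n // g)

z = invisible_signal(6, 9, 18)          # gcd is 3, so a null-space signal exists
seen_by_6 = box_capture(z, 6)           # all zeros
seen_by_9 = box_capture(z, 9)           # all zeros
seen_by_4 = box_capture(z, 4)           # not all zeros: size 4 shares no factor with 3
coprime_result = invisible_signal(4, 9, 36)
```

Adding any multiple of `z` to an image leaves both the size-6 and size-9 captures unchanged, so the error cannot shrink as noise goes to zero.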

Figures

Figures reproduced from arXiv: 2604.21810 by Daniel Fu, Gabby Litterio, Pedro Felzenszwalb, Rashid Zia.

**Figure 2.** Capturing (I ⊗ b_k) with a low-resolution sensor involves interlacing multiple low-resolution images. […] simplifies the analysis, and it also represents a robust model for CCD cameras in which photoelectrons accumulated within individual pixels can be binned into a contiguous rectangular form prior to readout. Such CCD hardware binning is regularly used to improve noise statistics for low-signal measurement, a…

**Figure 4.** Examples showing how single-pixel values can…

**Figure 5.** A 2D example showing how single-pixel values can…

**Figure 6.** The magnitude of the periodic sinc f_7. The discrete Fourier transform (DFT) of a box of width 7 is obtained as a discrete sampling of this function. The maximum of |f_k| occurs at ω = 0 with value equal to 1, and the zeros occur at non-zero multiples of 2π/k.
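The sampling relationship described in this caption can be checked numerically. The sketch assumes that f_k is the normalized periodic sinc sin(kω/2) / (k sin(ω/2)) up to a phase factor, which matches the stated maximum and zeros but is not confirmed against the paper's definition. It compares the magnitude against the DFT of a unit-sum box of width 7 on a length-28 grid.

```python
import cmath
import math

def periodic_sinc_mag(k, w):
    """|sin(k w / 2) / (k sin(w / 2))|, with the limit value 1 at multiples of 2*pi."""
    s = math.sin(w / 2)
    if abs(s) < 1e-12:
        return 1.0
    return abs(math.sin(k * w / 2) / (k * s))

def box_dft_mag(k, n):
    """Magnitude of the length-n DFT of a box of width k whose entries sum to 1."""
    return [abs(sum(cmath.exp(-2j * cmath.pi * j * m / n) for j in range(k)) / k)
            for m in range(n)]

k, n = 7, 28
mags = box_dft_mag(k, n)
samples = [periodic_sinc_mag(k, 2 * math.pi * m / n) for m in range(n)]
# Bins where the DFT vanishes; these should sit at multiples of n / k.
zero_bins = [m for m in range(n) if mags[m] < 1e-9]
```

On this grid the zeros land on bins 4, 8, ..., 24, which correspond to ω = 2πj/7 for j = 1, ..., 6.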

**Figure 7.** Figure 7 shows two plots of $\left(\operatorname{tr}((T'T)^{-1})/n\right)^{1/2}$ as a function of n for one-dimensional signals with box sizes 9 and 11. We plot the square root of the normalized trace because it better corresponds to an RMS value for a reconstruction with least squares. We consider maps defined via cyclic or valid convolution separately. The trace in each case can be computed numerically via explicit construction of an imaging…
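For the cyclic case, the normalized trace can be evaluated without building the matrix, because T'T is then circulant and its eigenvalues are sums of squared box transfer functions. The sketch below is an assumption-laden shortcut rather than the paper's procedure: it assumes unit-sum boxes and captures at every cyclic shift, so the absolute values need not match the published plot. The names `sinc_power` and `rms_gain` are hypothetical.

```python
import math

def sinc_power(k, m, n):
    """|f_k(2*pi*m/n)|^2 for a unit-sum box of width k on a cyclic signal of length n."""
    s = math.sin(math.pi * m / n)
    if abs(s) < 1e-12:
        return 1.0
    return (math.sin(k * math.pi * m / n) / (k * s)) ** 2

def rms_gain(n, sizes):
    """sqrt(tr((T'T)^-1) / n) for cyclic convolution with the given box sizes."""
    total = 0.0
    for m in range(n):
        eig = sum(sinc_power(k, m, n) for k in sizes)  # eigenvalue of T'T at frequency m
        total += 1.0 / eig
    return math.sqrt(total / n)

# Signal lengths chosen as multiples of 9 * 11 so that both boxes hit exact zeros.
curve = {n: rms_gain(n, (9, 11)) for n in (99, 198, 396)}
identity_check = rms_gain(10, (1,))  # a width-1 box is the identity map
```

Each eigenvalue stays strictly positive for sizes 9 and 11 because the two boxes never vanish at the same nonzero frequency; with a shared factor, some eigenvalue would be zero and the trace would be undefined.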

**Figure 8.** Experimental measurements of 1D linearly increasing…

**Figure 9.** Example 1D reconstructions using adjacent pairs of bin sizes 12/13 (left/right) and 15/16 (top/down). When the bin…

**Figure 10.** Comparison of (a) predicted reconstruction error…

**Figure 11.** Three experimental demonstrations of multiscale super-resolution using CCD binning at three image scales.

**Figure 12.** Experimental demonstration of how using multiple…

Original abstract

We address the ambiguities in the super-resolution problem under translation. We demonstrate that combinations of low-resolution images at different scales can be used to make the super-resolution problem well posed. Such differences in scale can be achieved using sensors with different pixel sizes (as demonstrated here) or by varying the effective pixel size through changes in optical magnification (e.g., using a zoom lens). We show that images acquired with pairwise coprime pixel sizes lead to a system with a stable inverse, and furthermore, that super-resolution images can be reconstructed efficiently using Fourier domain techniques or iterative least squares methods. Our mathematical analysis provides an expression for the expected error of the least squares reconstruction for large signals assuming i.i.d. noise that elucidates the noise-resolution tradeoff. These results are validated through both one- and two-dimensional experiments that leverage charge-coupled device (CCD) hardware binning to explore reconstructions over a large range of effective pixel sizes. Finally, two-dimensional reconstructions for a series of targets are used to demonstrate the advantages of multiscale super-resolution, and implications of these results for common imaging systems are discussed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper shows coprime pixel sizes can make multiscale super-resolution well-posed without priors, but the stable inverse rests on exact registration and a linear i.i.d. noise model.


The main new piece here is the claim that pairwise coprime pixel sizes turn the multiscale super-resolution problem into one with a stable inverse, allowing reconstruction via Fourier methods or plain least squares and giving an explicit expected-error formula under i.i.d. noise. That is not just a routine extension of single-scale work; the coprimeness argument plus the noise-resolution tradeoff expression is the concrete contribution. The hardware experiments using CCD binning to sweep effective pixel sizes are a practical plus, since they test the idea over a range of scales without relying on simulated data alone.

The abstract presents this as internally consistent linear algebra and Fourier analysis, and the 1D/2D validation is reported as supporting the math. That is worth crediting: they actually ran the hardware checks rather than stopping at theory.

The soft spot is exactly the one the stress-test flags. Everything depends on the captures being exactly registered up to known translation and the forward model being known perfectly. With no image priors, there is no regularization to absorb small alignment errors or deviations from linearity, and those errors would directly inflate the condition number and break the derived error bound. The abstract states the assumption plainly, but the experiments would need to show that the binning setup keeps translations stable enough for the claimed stability to hold in practice. If the full derivations and quantitative tables confirm the error formula matches the observed residuals, the math side is solid; otherwise the central claim stays partly unverified.

This is for readers working on sensor hardware or multiscale imaging systems who want a prior-free route. It is not a broad theory paper, but the idea is specific enough and the experiments concrete enough that it deserves a serious referee who can check the derivations and the registration sensitivity in the data. I would send it to review rather than desk reject.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain


The paper derives the stable inverse property for pairwise coprime pixel sizes and the expected error expression directly from linear algebra on the multiscale forward model combined with Fourier analysis of the sampling process. These results follow from the stated assumptions on the imaging model (linear, i.i.d. noise, exact registration up to known translation) without any parameter fitting to target data, self-referential definitions, or load-bearing self-citations. Experiments validate the math rather than supplying inputs to it. The derivation chain is self-contained against external benchmarks of linear systems theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on a linear imaging model, the existence of a stable inverse when pixel sizes are pairwise coprime, and the i.i.d. noise assumption used to derive the expected least-squares error; no free parameters or new entities are introduced.

axioms (2)

Domain assumption: noise is independent and identically distributed. Invoked to obtain the closed-form expression for expected reconstruction error of the least-squares solution.
Domain assumption: the imaging process is linear and shift-invariant within each scale. Required for the Fourier-domain analysis and the claim of a stable inverse.
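How the two assumptions combine into an error expression can be seen from the textbook least-squares identity below. It is a standard result stated under the assumptions that T has full column rank and that the noise ε has zero mean and variance σ² per measurement; the symbols y, x, ε, and σ are introduced here for illustration, and the paper's own large-signal expression may take a different form.

```latex
\begin{aligned}
y &= T x + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0, \quad \mathbb{E}[\varepsilon \varepsilon'] = \sigma^2 I, \\
\hat{x} &= (T'T)^{-1} T' y \;\Longrightarrow\; \hat{x} - x = (T'T)^{-1} T' \varepsilon, \\
\mathbb{E}\,\lVert \hat{x} - x \rVert^2
  &= \operatorname{tr}\!\big( (T'T)^{-1} T' \, \mathbb{E}[\varepsilon \varepsilon'] \, T (T'T)^{-1} \big)
   = \sigma^2 \operatorname{tr}\!\big( (T'T)^{-1} \big), \\
\text{RMS error per sample} &= \sigma \left( \operatorname{tr}\!\big( (T'T)^{-1} \big) / n \right)^{1/2}.
\end{aligned}
```

The last line is the quantity plotted in Figure 7 with σ = 1, which is why the normalized trace reads directly as a noise amplification factor.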



Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

[1] S. Baker and T. Kanade, “Limits on super-resolution and how to break them,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1167–1183, Sep. 2002.

[2] Z. Lin and H.-Y. Shum, “Fundamental limits of reconstruction-based superresolution algorithms under local translation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 83–97, 2004.

[3] G. Litterio, J.-D. Lizarazo-Ferro, P. Felzenszwalb, and R. Zia, “Super-resolution with structured motion,” in IEEE International Conference on Computational Photography, Jul. 2025.

[4] L. Sun and J. Hays, “Super-resolution from internet-scale scene matching,” in IEEE International Conference on Computational Photography, 2012, pp. 1–12.

[5] C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2015.

[6] D. Glasner, S. Bagon, and M. Irani, “Super-resolution from a single image,” in IEEE International Conference on Computer Vision. IEEE, 2009, pp. 349–356.

[7] Y. Xie, T. Takikawa, S. Saito, O. Litany, S. Yan, N. Khan, F. Tombari, J. Tompkin, V. Sitzmann, and S. Sridhar, “Neural fields in visual computing and beyond,” in Computer Graphics Forum, vol. 41, no. 2. Wiley Online Library, 2022, pp. 641–676.

[8] J. R. Janesick, Photon Transfer. SPIE Press, 2007.

[9] P. F. Felzenszwalb, “Deconvolution with a box,” 2024, Technical Report arXiv:2407.11685.

[10] G. A. Jones and J. M. Jones, Elementary Number Theory, 1st ed., ser. Springer Undergraduate Mathematics Series. Springer London, 1998.

[11] Å. Björck, Numerical Methods for Least Squares Problems, 2nd ed. Society for Industrial and Applied Mathematics, 1996.

[12] C. C. Paige and M. A. Saunders, “LSQR: An algorithm for sparse linear equations and sparse least squares,” ACM Transactions on Mathematical Software, vol. 8, pp. 43–71, 1982.

[13] M. McDonnell, “Box-filtering techniques,” Computer Graphics and Image Processing, vol. 17, no. 1, pp. 65–70, 1981.

[14] F. C. Crow, “Summed-area tables for texture mapping,” Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques, 1984.

[15] B. Wronski, I. Garcia-Dorado, M. Ernst, D. Kelly, M. Krainin, C.-K. Liang, M. Levoy, and P. Milanfar, “Handheld multi-frame super-resolution,” ACM Trans. Graph., vol. 38, no. 4, Jul. 2019. [Online]. Available: https://doi.org/10.1145/3306346.3323024
