AQMP: Image compression through Adaptive Quadtree Refinement and Matching Pursuit with Hyperparameter Optimization
Pith reviewed 2026-05-12 02:24 UTC · model grok-4.3
The pith
AQMP uses adaptive quadtree refinement to vary block sizes by local image complexity, allowing matching pursuit to reach up to four times the compression of JPEG at matching structural similarity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AQMP achieves up to 4× higher compression rates than JPEG at comparable SSIM values by dynamically adapting block sizes to local image structure through quadtree refinement and applying matching pursuit on the resulting variable-sized blocks, with the small set of governing hyperparameters optimized via the Tree-Structured Parzen Estimator to trace Pareto fronts between compression efficiency and visual quality.
What carries the argument
Adaptive quadtree refinement, which allocates finer partitions where the image is complex and coarser partitions where it is smooth, before running matching pursuit on each resulting block.
If this is right
- Superior compression ratios compared to fixed-size block matching pursuit at equivalent image quality.
- Significant parallelization opportunities at both the tree-leaf level and during compression of individual nodes.
- Comprehensive Pareto fronts from multi-objective hyperparameter optimization that trace the efficiency-quality trade-off.
- Competitive quality maintained across a broad range of compression regimes on representative test images.
Where Pith is reading between the lines
- The same adaptive partitioning idea could be tested on video sequences by adding temporal consistency constraints to the quadtree decisions.
- Hybrid systems might combine AQMP blocks with standard codecs to handle only the locally complex regions with matching pursuit.
- The released implementation makes it straightforward to measure wall-clock speedups from the described parallel opportunities on modern hardware.
Load-bearing premise
The adaptive quadtree partitioning combined with matching pursuit on variable-sized blocks will deliver consistent gains over fixed-block baselines on images outside the tested set, and the hyperparameter optimization will not overfit to the chosen test images or SSIM metric.
What would settle it
A controlled test on a fresh collection of images never seen during hyperparameter search, showing that AQMP no longer exceeds JPEG compression rates at the same SSIM values.
Figures
read the original abstract
We present AQMP, a novel image codec combining Adaptive Quadtree Refinement with Matching Pursuit. Unlike conventional Matching Pursuit methods that operate on fixed-size sub-images, AQMP dynamically adapts block sizes to local image structure, allocating finer partitions where the image is complex and coarser ones where it is smooth. This adaptivity yields superior compression ratios compared to fixed-size block Matching Pursuit at equivalent image quality, while offering significant parallelization opportunities at both the tree-leaf level and during compression of individual nodes. The algorithm is governed by user-specified accuracy and sparsity parameters alongside a small set of additional hyperparameters. To navigate the trade-off between compression efficiency and visual quality, we perform multi-objective hyperparameter optimization using the Tree-Structured Parzen Estimator, producing comprehensive Pareto fronts. Experimental results show that AQMP achieves up to $4\times$ higher compression rates than JPEG at comparable SSIM values, while maintaining competitive quality across a broad range of compression regimes. Performance evaluation is provided using a representative set of test images. To ensure reproducibility and promote adoption, we have made our implementation publicly available on GitHub under the MIT license.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AQMP, an image compression codec that combines adaptive quadtree refinement (dynamically varying block sizes based on local image complexity) with matching pursuit for sparse representation on variable-sized blocks. Hyperparameters including accuracy and sparsity parameters are tuned via multi-objective Tree-Structured Parzen Estimator optimization to generate Pareto fronts trading off compression rate and visual quality (measured by SSIM). The central empirical claim is that AQMP achieves up to 4× higher compression rates than JPEG at comparable SSIM values across a range of regimes, with a public MIT-licensed implementation provided for reproducibility on a representative set of test images.
Significance. If the results hold under proper controls, the work offers a practical contribution to adaptive image compression by exploiting image structure for variable partitioning and providing open code, which supports reproducibility. The multi-objective optimization producing Pareto fronts is a methodological strength for navigating quality-compression trade-offs. However, the purely empirical nature of the claims, without parameter-free derivations or machine-checked elements, means significance depends entirely on the robustness of the experimental protocol.
major comments (2)
- [Experimental results and hyperparameter optimization sections] Experimental results and hyperparameter optimization sections: The manuscript does not specify whether TPE optimization was performed on a held-out validation split or directly on the same 'representative set of test images' used to report the final 4× compression gain over JPEG. If the latter, the selected operating points can overfit to both image content and the SSIM metric, rendering the headline performance comparison non-generalizable and undermining the central empirical claim.
- [Methods and experimental protocol] Methods and experimental protocol: No details are provided on the exact baselines (e.g., fixed-block matching pursuit variants, other quadtree codecs), the precise definition of 'compression rate' (bits per pixel or ratio), the number of test images, or any statistical measures (standard deviation, multiple random seeds) supporting the 'up to 4×' and 'competitive quality' statements.
minor comments (3)
- [Abstract and Methods] The abstract and methods mention 'a small set of additional hyperparameters' without enumerating them or their ranges, making it difficult to assess the optimization space.
- [Experimental results] No discussion of known limitations of SSIM (e.g., its insensitivity to certain artifacts or poor correlation with human perception at low bitrates) is included, despite its central role in the quality metric.
- [Conclusion/Reproducibility] The public GitHub implementation is mentioned but without instructions on how to reproduce the exact Pareto fronts or the reported JPEG comparisons.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below, acknowledging where the original manuscript was insufficiently clear or rigorous, and describe the specific revisions that will be made.
read point-by-point responses
-
Referee: [Experimental results and hyperparameter optimization sections] Experimental results and hyperparameter optimization sections: The manuscript does not specify whether TPE optimization was performed on a held-out validation split or directly on the same 'representative set of test images' used to report the final 4× compression gain over JPEG. If the latter, the selected operating points can overfit to both image content and the SSIM metric, rendering the headline performance comparison non-generalizable and undermining the central empirical claim.
Authors: We acknowledge the validity of this concern. The TPE optimization was performed directly on the representative test images to generate the Pareto fronts and the reported operating points. This choice was made because the optimization is an integral part of configuring the codec for the given image content. However, we agree that this procedure introduces a risk of overfitting to both the specific images and the SSIM metric, which limits the strength of the generalizability claim. In the revised manuscript we will (i) explicitly state that optimization occurred on the test set, (ii) discuss the implications for the 4× claim, and (iii) add results on a separate held-out collection of images to provide supporting evidence that the performance advantage is not limited to the original test set. revision: yes
-
Referee: [Methods and experimental protocol] Methods and experimental protocol: No details are provided on the exact baselines (e.g., fixed-block matching pursuit variants, other quadtree codecs), the precise definition of 'compression rate' (bits per pixel or ratio), the number of test images, or any statistical measures (standard deviation, multiple random seeds) supporting the 'up to 4×' and 'competitive quality' statements.
Authors: We agree that these omissions reduce the clarity and reproducibility of the experimental protocol. The revised manuscript will add: (1) explicit descriptions of all baselines, including fixed-block-size matching pursuit variants and other quadtree-based codecs; (2) a precise definition of compression rate as bits per pixel (bpp); (3) the exact number and provenance of the test images; and (4) statistical summaries (means and standard deviations) of the reported metrics across the test set. Because the core AQMP algorithm is deterministic once the image and hyperparameters are fixed, multiple random seeds are not applicable; we will state this explicitly. revision: yes
Circularity Check
No circularity in claimed results or derivation
full rationale
The paper describes an algorithmic construction (adaptive quadtree partitioning combined with matching pursuit) controlled by hyperparameters that are optimized via TPE to produce Pareto fronts, then reports empirical compression and SSIM outcomes on a representative set of test images. No mathematical derivation, first-principles prediction, or fitted quantity is presented as reducing to its own inputs by construction; the performance numbers are framed as experimental measurements rather than tautological outputs of the optimization procedure itself. No self-citation load-bearing steps or ansatz smuggling appear in the provided text.
Axiom & Free-Parameter Ledger
free parameters (3)
- accuracy parameter
- sparsity parameter
- additional hyperparameters
axioms (2)
- domain assumption Natural images exhibit spatially varying complexity that can be captured by recursive quadtree partitioning.
- domain assumption A dictionary of basis functions exists from which sparse linear combinations can approximate image blocks.
Reference graph
Works this paper leans on
-
[1]
Alfredo Nava-Tudela,Image Representation and Compression via Sparse Solutions Of Systems of Linear equations, Ph D Thesis, 2012
work page 2012
-
[2]
Ori Bryt and Michael Elad,Compression of facial images using the K- SVD algorithm, Journal of Visual Communication and Image Represen- tation, Volume 19, Issue 4, May 2008, Pages 270-282
work page 2008
-
[3]
Jian-Liang Lin, Wen-Liang Hwang and Soo-Chang Pei,The fine-grained scalable video coding based on matching pursuits, Image Processing
-
[4]
2002 International Conference on, vol
Proceedings. 2002 International Conference on, vol. 2, Pages II-53 - II-56
work page 2002
-
[5]
Pierre Garrigues and Avideh Zakhor,Atom position coding in a match- ing pursuit based video coder, Lecture Notes in Computer Science Vol- ume 3893, 2006, pp 153-160
work page 2006
-
[6]
Peter Deutsch,DEFLATE Compressed Data Format Specification version 1.3, IETF
L. Peter Deutsch,DEFLATE Compressed Data Format Specification version 1.3, IETF. p. 1. sec. Abstract. RFC 1951, 1996
work page 1951
-
[7]
Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli,Image quality assessment: From error visibility to structural similarity, IEEE Trans- actions on Image Processing, vol. 13, no. 4, pp. 600-612, 2004
work page 2004
-
[8]
Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad,Orthogonal Matching Pursuit : recursive function approximation with application to wavelet decompositionin Conf. Rec. 27th Asilomar Conf. Signals, Syst. Comput., 1993, vol. 1
work page 1993
-
[9]
The USC-SIPI image database.http://sipi.usc.edu/database/
Viterbi School of Engineering, University of Southern California. The USC-SIPI image database.http://sipi.usc.edu/database/. 30
-
[10]
S. Mallat and Z. Zhang,Matching pursuits with time-frequency dictio- naries, IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3397-3415, 1993
work page 1993
- [11]
-
[12]
J. M. Shapiro,Embedded image coding using zerotrees of wavelet coef- ficients, IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445-3462, 1993
work page 1993
-
[13]
A. Said and W. A. Pearlman,A new, fast, and efficient image codec based on set partitioning in hierarchical trees, IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243-250, 1996
work page 1996
-
[14]
D. Taubman,High performance scalable image compression with EBCOT, IEEE Transactions on Image Processing, vol. 9, no. 7, pp. 1158-1170, 2000
work page 2000
-
[15]
G. K. Wallace,The JPEG still picture compression standard, Commu- nications of the ACM, vol. 34, no. 4, pp. 30-44, 1991
work page 1991
-
[16]
G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor, and M. Covell,Full resolution image compression with recurrent neural networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5306-5314
work page 2017
-
[17]
J. Ball´ e, V. Laparra, and E. P. Simoncelli,End-to-end optimized image compression, in Proceedings of the International Conference on Learning Representations (ICLR), 2017
work page 2017
-
[18]
A. Blum and J. Spencer,Greedy algorithms for the shortest common superstring, in Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2000, pp. 146-154
work page 2000
-
[19]
F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, and L. Van Gool, Conditional probability models for deep image compression, in Proceed- ings of the IEEE Conference on Computer Vision and Pattern Recogni- tion (CVPR), 2018, pp. 4394-4402
work page 2018
- [20]
-
[21]
M. Elad,Sparse and redundant representations: From theory to appli- cations in signal and image processing, Springer, 2010
work page 2010
- [22]
-
[23]
Mallat,A wavelet tour of signal processing, Academic Press, 1999
S. Mallat,A wavelet tour of signal processing, Academic Press, 1999
work page 1999
-
[24]
G. Strang and T. Nguyen,Wavelets and filter banks, Wellesley- Cambridge Press, 1999
work page 1999
-
[25]
Samet,The Quadtree and Related Hierarchical Data Structures, ACM Computing Surveys, vol
H. Samet,The Quadtree and Related Hierarchical Data Structures, ACM Computing Surveys, vol. 16, no. 2, pp. 187–260, 1984
work page 1984
-
[26]
C. A. Shaffer,Application of alternative quadtree representations (vlsi, data structures), Ph.D. dissertation, University of Maryland at College Park, USA, 1986. (Note: AAI8629008)
work page 1986
-
[27]
M. H. Gross, O. G. Staadt, and R. Gatti,Efficient Triangular Surface Approximations Using Wavelets and Quadtree Data Structures,IEEE Transactions on Visualization and Computer Graphics, vol. 2, no. 2, pp. 130–143, June 1996. doi: 10.1109/2945.506225
-
[28]
F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. W. Marcellin, and A. Bilgin,The Current Role of Image Compression Standards in Med- ical Imaging,Information (Basel), vol. 8, no. 4, p. 131, Dec. 2017, doi: 10.3390/info8040131
-
[29]
D. E. Knuth,Algorithms in Modern Mathematics and Computer Sci- ence, Department of Computer Science, Stanford University, 1980
work page 1980
-
[30]
B.-J. Kim and W. A. Pearlman,An embedded wavelet video coder using three-dimensional set partitioning in hierarchical trees (SPIHT),Pro- ceedings DCC ’97. Data Compression Conference, Snowbird, UT, USA, 1997, pp. 251-260, doi: 10.1109/DCC.1997.582048
-
[31]
James Bergstra, R´ emi Bardenet, Yoshua Bengio, and Bal´ azs K´ egl,Al- gorithms for hyper-parameter optimization, Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS’11), Curran Associates Inc., Red Hook, NY, USA, 2011, pp. 2546–2554. doi: 10.5555/2986459.2986743. 32
-
[32]
533–541, doi: 10.1145/3377930.3389817
Yoshihiko Ozaki, Yuki Tanigaki, Shuhei Watanabe, and Masaki On- ishi,Multiobjective tree-structured parzen estimator for computation- ally expensive optimization problems, Proceedings of the 2020 Ge- netic and Evolutionary Computation Conference, Association for Com- puting Machinery, New York, NY, USA, 2020, pp. 533–541, doi: 10.1145/3377930.3389817
- [33]
-
[34]
Tiglio, M., & Villanueva, A.,Reduced order and surrogate models for gravitational waves.Living Reviews in Relativity,25(1), 2. 2022
work page 2022
-
[35]
Alfred M. Bruckstein, David L. Donoho and Michael Elad,From sparse solutions of systems of equations to sparse modeling of signals and im- ages, SIAM Review, 51(1):34–81, 2009
work page 2009
-
[36]
Cerino, F., Diaz-Pace, J. A., Tassone, E. A., Tiglio, M., & Villegas, A.,Hyperparameter Optimization of an hp-Greedy Reduced Basis for Gravitational Wave Surrogates.Universe,10(1), 6. 2023
work page 2023
-
[37]
Knuth, D. E.,The Art of Computer Programming, Volume 1: Funda- mental Algorithms.Addison-Wesley, 3rd Edition, 1997
work page 1997
-
[38]
Knuth, D. E.,The Art of Computer Programming, Volume 2: Seminu- merical Algorithms.Addison-Wesley, 3rd Edition, 1997
work page 1997
-
[39]
Knuth, D. E.,The Art of Computer Programming, Volume 3: Sorting and Searching.Addison-Wesley, 2nd Edition, 1998
work page 1998
-
[40]
Knuth, D. E.,The Art of Computer Programming, Volume 4A: Combi- natorial Algorithms, Part 1.Addison-Wesley, 2011
work page 2011
-
[41]
Knuth, D. E.,The Art of Computer Programming, Volume 4B: Combi- natorial Algorithms, Part 2.Addison-Wesley, 2022
work page 2022
-
[42]
Pratikakis, I., Gatos, B., & Ntirogiannis, K.,H-DIBCO 2010 - Hand- written Document Image Binarization Competition.Proceedings of the 2010 12th International Conference on Frontiers in Handwriting Recog- nition (ICFHR),2010, 727–732. IEEE Computer Society
work page 2010
-
[43]
Qaisar, S., Bilal, R. M., Iqbal, W., Naureen, M., and Lee, S.,Compres- sive sensing: From theory to applications, a survey.Journal of Commu- nications and Networks, vol. 15, no. 5, pp. 443–456, 2013. 33
work page 2013
-
[44]
Optuna Documentation,Samplers — Optuna documentation, Avail- able at:https://optuna.readthedocs.io/en/stable/reference/ samplers/index.html, Accessed: May 15, 2025. 34
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.