arxiv: 2605.13127 · v1 · submitted 2026-05-13 · 📊 stat.ML · cs.LG· math.PR

Recognition: unknown

State-of-art minibatches via novel DPP kernels: discretization, wavelets, and rough objectives

Hoang-Son Tran , Pranav Gupta , R\'emi Bardenet , Subhroshekhar Ghosh

Authors on Pith no claims yet

Pith reviewed 2026-05-14 18:23 UTC · model grok-4.3

classification 📊 stat.ML cs.LGmath.PR

keywords determinantal point processesminibatch samplingwavelet kernelsvariance reductiondiscretizationlow-rank kernelsrough objectives

0 comments

The pith

Wavelet-based DPPs on Euclidean space discretize to low-rank kernels that preserve superior variance reduction for minibatches on rough objectives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs new determinantal point processes on continuous Euclidean space using wavelets, proving these achieve better accuracy rates than earlier known DPP families for variance reduction. It then supplies a general conversion procedure that turns any such continuous DPP into a discrete kernel on a finite dataset while keeping the variance decay intact and exposing an explicit low-rank factorization. This factorization makes sampling from the resulting DPP cheap. The same construction supplies explicit convergence rates that improve as the regularity of the objective increases, thereby extending DPP-based coresets and minibatches to losses with arbitrarily low smoothness.

Core claim

We propose new DPPs on the Euclidean space based on wavelets, with provably better accuracy guarantees than the best known rates. We introduce a general method to convert such continuous DPPs into discrete kernels, which simultaneously preserves the desired variance decay and reveals a low-rank decomposition of the discrete kernel. This enlarges the class of ML tasks amenable to improvements via DPP-based minibatches and coresets to include objective functions with arbitrarily low regularity, and rate guarantees that explicitly adapt to this regularity.

What carries the argument

Wavelet DPP kernels defined on Euclidean space, together with the discretization map that produces a low-rank discrete kernel while preserving the continuous variance-reduction decay.

If this is right

Minibatch variance reduction now comes with explicit rates that improve with the regularity of the objective function.
The low-rank decomposition renders sampling from the discrete DPP computationally inexpensive for large data sets.
DPP-based subsampling applies directly to non-smooth or low-regularity losses without separate ad-hoc constructions.
The same discretization technique supplies a systematic route from any continuous DPP with good variance properties to a usable discrete kernel.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The discretization procedure could be applied to other continuous DPP families to obtain additional low-rank kernels with provable properties.
Adaptive rates to objective regularity may improve stochastic optimization in settings such as robust or non-parametric learning.
The exposed low-rank structure could be combined with existing fast approximate sampling algorithms to scale further.
Multi-resolution wavelet choices might yield even tighter adaptation to the specific roughness of a given objective.

Load-bearing premise

The discretization step preserves the variance-reduction properties of the continuous wavelet DPPs with only negligible degradation on finite data sets.

What would settle it

An empirical test on a finite data set with a known low-regularity objective showing that variance reduction from the discretized wavelet DPP is no better than that from independent sampling.

Figures

Figures reproduced from arXiv: 2605.13127 by Hoang-Son Tran, Pranav Gupta, R\'emi Bardenet, Subhroshekhar Ghosh.

**Figure 4.** Figure 4: ∥∇f(θt)∥2 25 50 75 100 125 150 175 200 Pegasos iteration t 10 1 t * 2 IID (m = 32) OPE (m = 32) HAAR (m = 32) DB2 (m = 32) [PITH_FULL_IMAGE:figures/full_fig_p017_4.png] view at source ↗

**Figure 7.** Figure 7: d = 1: φ0.75(x) 10 1 10 2 n 10 5 10 4 10 3 10 2 10 1 Variance IID HAAR DB2 DIR OPE [PITH_FULL_IMAGE:figures/full_fig_p051_7.png] view at source ↗

**Figure 9.** Figure 9: d = 1: φbump(x) 51 [PITH_FULL_IMAGE:figures/full_fig_p051_9.png] view at source ↗

**Figure 11.** Figure 11: d = 2: φ0.75(x) 10 1 10 2 n 10 4 10 3 10 2 10 1 Variance IID HAAR DB2 DIR OPE [PITH_FULL_IMAGE:figures/full_fig_p052_11.png] view at source ↗

**Figure 13.** Figure 13: d = 2: φbump(x) 52 [PITH_FULL_IMAGE:figures/full_fig_p052_13.png] view at source ↗

read the original abstract

Determinantal point processes (DPPs) have emerged as a kernelized alternative to vanilla independent sampling for generating efficient minibatches, coresets and other parsimonious representations of large-scale datasets. While theoretical foundations and promising empirical performance have been demonstrated, there are two challenges for current proposals for DPP-based coresets or minibatches. The first is the need for families of DPPs with certain key variance reduction properties, usually constructed in a continuous setting, of which there are few known examples. The second is the need for an ad-hoc construction of a discrete DPP defined on a given dataset, that inherits such variance reduction. In this work, we contribute to the programme of establishing DPPs as a subsampling toolbox for ML by advancing on these two fronts. First, we propose new DPPs on the Euclidean space based on wavelets, with provably better accuracy guarantees than the best known rates. Second, we introduce a general method to convert such continuous DPPs, which are more amenable to proving analytical statements, into discrete kernels, which are pertinent for subsampling tasks such as minibatch and coreset constructions. This conversion mechanism simultaneously preserves the desired variance decay and reveals a low-rank decomposition of the discrete kernel, which makes sampling the corresponding DPP computationally inexpensive. En route, we enlarge the class of ML tasks amenable to improvements via DPP-based minibatches and coresets to include objective functions with arbitrarily low regularity, and rate guarantees that explicitly adapt to this regularity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Wavelet DPPs with better variance decay rates plus a discretization that preserves them and yields low-rank kernels extend DPP subsampling to low-regularity objectives.

read the letter

The punchline is that this paper introduces wavelet-based DPPs on Euclidean space with provably better accuracy guarantees than prior continuous DPP constructions, and pairs them with a general discretization method that preserves the variance decay while producing a low-rank discrete kernel for efficient sampling. What is new here is the use of wavelets to build these DPPs, which lets the method handle objective functions with arbitrarily low regularity. Earlier DPP work for coresets and minibatches was mostly limited to smoother cases, so this broadens the range of ML tasks that can benefit from DPP subsampling. The discretization procedure is also a useful addition because it bridges the continuous setting where proofs are easier to the discrete setting needed for real datasets, all while keeping the good variance properties and exposing low-rank structure. The paper does well in explaining how the wavelet localization properties lead to rate guarantees that adapt to the regularity level. This adaptability is a practical plus. The low-rank revelation from discretization should make sampling the DPP feasible without heavy computation. On the soft spots, the central claims about better rates depend on the specific wavelet constructions and the proofs of the variance decay. Since the abstract only sketches this, the full paper needs to show clear comparisons to the best known rates and confirm there are no gaps in the arguments. The preservation under discretization is stated to be with negligible degradation, but it would help to see how this holds for typical finite datasets or if there are conditions on the data distribution. The low-rank aspect is promising, but if the effective rank is not small enough, the computational savings might not be as large as hoped. This work is aimed at researchers in statistical machine learning who focus on sampling methods, coresets, and efficient optimization techniques. Readers familiar with DPP literature will appreciate the new constructions and the extension to rough objectives. It deserves a serious referee because it tackles specific open challenges in the field with coherent technical ideas. I recommend sending this to peer review so the proofs can be vetted and any practical aspects clarified.

Referee Report

1 major / 3 minor

Summary. The paper proposes new families of determinantal point processes (DPPs) on Euclidean space constructed from wavelets, claiming provably superior variance reduction rates relative to prior continuous DPP constructions. It further develops a general discretization procedure that maps these continuous DPPs to discrete kernels on finite datasets; the procedure is asserted to preserve the target variance decay while exposing an explicit low-rank factorization of the resulting kernel matrix, thereby enabling efficient exact sampling. The framework is extended to objective functions of arbitrarily low regularity, with rate guarantees that adapt to the regularity parameter.

Significance. If the central claims are fully substantiated, the work would meaningfully strengthen the theoretical foundations for DPP-based minibatch and coreset selection in machine learning. The wavelet construction supplies improved accuracy guarantees, the discretization map renders the method practical for finite data while retaining the decay properties, and the low-rank factorization directly addresses computational cost. The explicit adaptation to low-regularity objectives broadens the range of optimization problems for which DPP subsampling can be applied with non-vacuous guarantees. The absence of free parameters in the constructions and the explicit low-rank form are particular strengths.

major comments (1)

[§4] §4 (discretization theorem): the statement that the discrete kernel inherits the continuous variance decay 'with only negligible degradation' lacks an explicit quantitative bound relating the discretization error to the mesh size or sample cardinality; without such a bound it is unclear whether the preservation remains uniform when the underlying dataset is finite and the target objective has low regularity.

minor comments (3)

[Abstract] The abstract and introduction would benefit from a single displayed inequality that makes the claimed rate improvement explicit (e.g., the dependence on the regularity index).
[§2] Notation for the wavelet kernel and the associated DPP measure should be introduced in a dedicated preliminary subsection before the main constructions.
[§5] A short discussion of the numerical stability of the low-rank factorization under finite-precision arithmetic would be useful for practitioners.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on our manuscript. We address the major comment point by point below.

read point-by-point responses

Referee: [§4] §4 (discretization theorem): the statement that the discrete kernel inherits the continuous variance decay 'with only negligible degradation' lacks an explicit quantitative bound relating the discretization error to the mesh size or sample cardinality; without such a bound it is unclear whether the preservation remains uniform when the underlying dataset is finite and the target objective has low regularity.

Authors: We thank the referee for highlighting this point. We agree that an explicit quantitative bound relating the discretization error to mesh size and sample cardinality would strengthen the result and clarify uniformity for finite datasets and low-regularity objectives. In the revised manuscript we will add a corollary to the discretization theorem that supplies such a bound (of order O(h^β) where β depends on the regularity parameter and mesh size h), derived directly from the wavelet kernel decay and the low-rank factorization. This bound will be shown to be negligible under the scaling assumptions already present in the paper. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper's central claims rest on constructing wavelet DPPs with improved variance decay rates via analytical proofs under stated regularity assumptions, followed by a discretization map that inherits the decay and exposes an explicit low-rank factorization. These steps are presented as holding directly from wavelet localization properties and the conversion mechanism, without reducing to fitted parameters, self-referential definitions, or load-bearing self-citations. The adaptation to low-regularity objectives follows from the same localization without hidden conditions or renaming of known results. The argument structure contains no internal reductions by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest primarily on the existence and properties of the new wavelet DPPs and on the preservation property of the discretization map; these are the novel elements rather than additional free parameters or external axioms.

axioms (2)

domain assumption Wavelet-based DPPs exist on Euclidean space and admit explicit variance reduction bounds superior to known constructions
Invoked when stating the first contribution and the provable accuracy guarantees.
domain assumption The discretization map preserves the variance decay rate of the underlying continuous DPP
Central to the second contribution and the claim that discrete kernels remain useful for minibatch tasks.

pith-pipeline@v0.9.0 · 5585 in / 1414 out tokens · 63489 ms · 2026-05-14T18:23:15.698803+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

299 extracted references · 36 canonical work pages · 16 internal anchors

[1]

Miramont, J. M. and Tan, K. A. and Mukherjee, S. S. and Bardenet, R. and Ghosh, S. , date-added =. arXiv preprint arXiv:2504.07720 , title =

work page arXiv
[2]

Miramont, J. M. and Auger, F. and Colominas, M. A. and Laurent, N. and Meignen, S. , date-added =. Signal Processing , title =
[3]

Miramont, J. M. and Bardenet, R. and Chainais, P. and Auger, F. , booktitle =. Adaptive hyperparameter tuning for time-frequency algorithms based on the zeros of the spectrogram , year =
[4]

and Husz

Lacoste--Julien, S. and Husz. Approximate inference for the loss-calibrated. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS) , organization =
[5]

Advances in Neural Information Processing Systems (NeurIPS) , title =

Ku. Advances in Neural Information Processing Systems (NeurIPS) , title =
[6]

Pironio, S. and Ac. Random numbers certified by. Nature , number =
[7]

and Bourdoncle, B

Fyrillas, A. and Bourdoncle, B. and Ma. Certified randomness in tight space , volume =. PRX Quantum , number =
[8]

and Ta-Shma, A

Nisan, N. and Ta-Shma, A. , journal =. Extracting randomness: A survey and new constructions , volume =
[9]

, school =

Colbeck, R. , school =. Quantum and relativistic protocols for secure multi-party computation , year =
[10]

A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations , volume =

Navascu. A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations , volume =. New Journal of Physics , number =
[11]

and Bardenet, R

Lemoine, T. and Bardenet, R. , journal =. Monte
[12]

and Bardenet, R

Fanuel, M. and Bardenet, R. , journal =. On the Number of Steps of
[13]

and Bardenet, R

Fanuel, M. and Bardenet, R. , title =. arXiv preprint arXiv:2503.05906 , year =

work page arXiv
[14]

and Bardenet, R

Fanuel, M. and Bardenet, R. , journal =. Cycling in the forest with
[15]

and Ghosh, S

Bardenet, R. and Ghosh, S. and Simon-Onfroy, H. and Tran, H.-S. , booktitle =. Small coresets via negative dependence:
[16]

, date-added =

Bardenet, R. , date-added =. Turning repulsive point processes into subsampling algorithms , year =
[17]

and De Iorio, M

Argiento, R. and De Iorio, M. , date-added =. Is infinity that far? A Bayesian nonparametric perspective of finite mixture models , volume =. The Annals of Statistics , number =
[18]

and Argiento, R

Beraha, M. and Argiento, R. and M. Journal of Computational and Graphical Statistics , number =
[19]

, date-added =

Pollard, D. , date-added =. A user's guide to measure theoretic probability , year =
[20]

and Varadhan, S

Donsker, M. . and Varadhan, S. R. S. , date-added =. Asymptotic evaluation of certain. Communications on pure and applied mathematics , number =
[21]

and Marzo, J

Levi, M. and Marzo, J. and Ortega-Cerd\`a, J. , date-added =. arXiv preprint arXiv:2311.03204 , title =

work page arXiv
[22]

and Nagaoka, H

Amari, S. and Nagaoka, H. , date-added =. Methods of information geometry , volume =
[23]

and Wang, D

Liu, Q. and Wang, D. , date-added =. Advances in Neural Information Processing Systemseural information processing systems , title =
[24]

and Giannakis, D

Das, S. and Giannakis, D. , date-added =. Journal of Fourier Analysis and Applications , number =
[25]

and Montgomery, M

Giannakis, D. and Montgomery, M. , date-added =. arXiv preprint arXiv:2401.01295 , title =

work page arXiv
[26]

, date-added =

Bach, F. , date-added =. On the equivalence between kernel quadrature rules and random feature expansions , volume =. The Journal of Machine Learning Research , number =
[27]

Murphy, K. P. , date-added =. Probabilistic machine learning: Advanced topics , year =
[28]

Miramont, J. M. and Bardenet, R. and Chainais, P. and Auger, F. , journal =. mcsm-benchs: Benchmarking methods for multi-component signal processing , year =
[30]

and Bardenet, R

Belhadji, A. and Bardenet, R. and Chainais, P. , journal =. Signal reconstruction using determinantal sampling , year =
[31]

and Bardenet, R

Hawat, D. and Bardenet, R. and Lachi. In revision for Scandinavian Journal of Statistics, arXiv preprint arXiv:2308.04825 , title =

work page arXiv
[32]

and Fanuel, M

Bardenet, R. and Fanuel, M. and Feller, A. , date-added =. Journal of Physics A: Mathematical and Theoretical , title =
[33]

, date-added =

Skyrms, B. , date-added =. Dynamic coherence and probability kinematics , volume =. Philosophy of Science , number =
[34]

, booktitle =

Alquier, P. , booktitle =. Non-exponentially weighted aggregation: regret bounds for unbounded loss functions , year =
[35]

and Jewson, J

Knoblauch, J. and Jewson, J. and Damoulas, T. , date-added =. An optimization-centric view on. Journal of Machine Learning Research , number =
[36]

Frazier, D. T. and Nott, D. J. , date-added =. arXiv preprint arXiv:2301.10911 , title =

work page arXiv
[37]

Dawid, A. P. , date-added =. The prequential approach , volume =. Journal of the Royal Statistical Society, Series A , pages =
[38]

and Gr\"

Ramdas, A. and Gr\". arXiv preprint arXiv:2210.01948 , title =

work page arXiv
[39]

, date-added =

Ville, J. , date-added =. \'Etude critique de la notion de collectif , year =
[40]

and Vovk, V

Shafer, G. and Vovk, V. , date-added =. Probability and finance: it's only a game! , volume =
[41]

and Michel, B

Chazal, F. and Michel, B. , date-added =. An introduction to topological data analysis: fundamental and practical aspects for data scientists , volume =. Frontiers in artificial intelligence , pages =
[42]

and Vovk, V

Shafer, G. and Vovk, V. , date-added =. Game-Theoretic Foundations for Probability and Finance , year =
[43]

, date-added =

Macchi, O. , date-added =. Point processes and coincidences -- Contributions to the theory, with applications to statistical optics and optical communication, augmented with a scholion by Suren Poghosyan and Hans Zessin , year =
[44]

, date-added =

Macchi, O. , date-added =. Processus ponctuels et coincidences -- Contributions
[45]

and Sung, K

Jiang, Z. and Sung, K. J. and Kechedzhi, K. and Smelyanskiy, V. N. and Boixo, S. , date-added =. Quantum algorithms to simulate many-body physics of correlated fermions , volume =. Physical Review Applied , number =
[46]

Nielsen, M. A. and Chuang, I. L. , date-added =. Quantum computation and quantum information , year =
[47]

Nielsen, M. A. , date-added =. The fermionic canonical commutation relations and the
[48]

and Rigollet, P

Ghosh, S. and Rigollet, P. , date-added =. Gaussian determinantal processes: A new model for directionality in data , volume =. Proceedings of the National Academy of Sciences , number =
[49]

and Bardenet, R

Fanuel, M. and Bardenet, R. , journal =. Sparsification of the regularized magnetic
[50]

Eldar, Y. C. , date-added =. Sampling theory:
[51]

Limit theory for geometric statistics of point processes having fast decay of correlations , volume =

B. Limit theory for geometric statistics of point processes having fast decay of correlations , volume =. The Annals of Probability , number =
[52]

and Brunel, V.-E

Gartrell, M. and Brunel, V.-E. and Dohmatob, E. and Krichene, S. , date-added =. Advances in Neural Information Processing Systems , title =
[53]

C. F. Gauss , date-added =. Methodus nova integralium valores per approximationem inveniendi , year =
[54]

and Bach, F

Moulines, \'E. and Bach, F. , booktitle =. Non-asymptotic analysis of stochastic approximation algorithms for machine learning , volume =
[55]

and Lin, M

Ghosh, S. and Lin, M. and Sun, D. , date-added =. Signal analysis via the stochastic geometry of spectrogram level sets , volume =. IEEE Transactions on Signal Processing , pages =
[56]

and L\'evy, T

Kassel, A. and L\'evy, T. , date-added =. arXiv preprint arXiv:1910.06312 , title =

work page arXiv 1910
[57]

and Gautier, G

Hawat, D. and Gautier, G. and Bardenet, R. and Lachi\`eze-Rey, R. , date-added =. Statistics and Computing , title =
[58]

, date-added =

Ghosh, S. , date-added =. Determinantal processes and completeness of random exponentials: the critical case , volume =. Probability Theory and Related Fields , number =
[59]

Bass, R. F. and Gr\"ochenig, K. , date-added =. Relevant sampling of band-limited functions , volume =. Illinois Journal of Mathematics , number =
[60]

and Ghosh, S

Bardenet, R. and Ghosh, S. , date-added =. arXiv preprint arXiv:2007.04287 , title =

work page arXiv 2007
[61]

and Fanuel, M

Jaquard, H. and Fanuel, M. and Amblard, P.-O. and Bardenet, R. and Barthelm\'e, S. and Tremblay, N. , booktitle =. arXiv preprint arXiv:2208.14797 , title =

work page arXiv
[62]

Abreu, L. D. , date-added =. arXiv preprint arXiv:2210.06721 , title =

work page arXiv
[63]

and Koliander, G

Holighaus, N. and Koliander, G. and Pr. Characterization of analytic wavelet transforms and a new phaseless reconstruction algorithm , volume =. IEEE Transactions on Signal processing , number =
[64]

and Bardenet, R

Mai, X. and Bardenet, R. , booktitle =. Un processus ponctuel d\'eterminantal pour la s\'election de variables supervis\'ee , year =
[65]

Miramont, J. M. and Bardenet, R. and Chainais, P. and Auger, F. , booktitle =. A public benchmark for denoising and detection methods , year =
[66]

and Bardenet, R

Pascal, B. and Bardenet, R. , booktitle =. Une famille de repr
[67]

, date-added =

Belhadji, A. , date-added =. 2020 , bdsk-url-1 =

2020
[68]

and Bardenet, R

Rouault, M. and Bardenet, R. and Ma. arXiv preprint arXiv:2508.01392 , title =

work page arXiv
[69]

and Bardenet, R

Rouault, M. and Bardenet, R. and Ma. In revision for SIAM Journal on the Mathematics of Data Science (SIMODS), arXiv preprint arXiv:2402.11736 , title =

work page arXiv
[70]

and Feller, A

Bardenet, R. and Feller, A. and Bouttier, J. and Degiovanni, P. and Hardy, A. and Ran. arXiv preprint arXiv:2210.05522 , title =

work page arXiv
[71]

Non-parametric Models for Non-negative Functions , url =

Ulysse Marteau. Non-parametric Models for Non-negative Functions , url =. Advances in Neural Information Processing Systems 33 , date-added =. 2020 , bdsk-url-1 =

2020
[72]

Forrester, P. J. , date-added =. Log-gases and random matrices , year =
[73]

and Hardy, A

Beltr\'an, C. and Hardy, A. , date-added =. Energy of the. Archive for Rational Mechanics and Analysis , number =
[74]

and Rubak, E

Baddeley, A. and Rubak, E. and Turner, R. , date-added =. Spatial point patterns: methodology and applications with
[75]

Cunden, F. D. and Majumdar, S. N. and O'Connell, N. , date-added =. Free fermions and -determinantal processes , volume =. Journal of Physics A: Mathematical and Theoretical , number =
[76]

and Coeurjolly, J.-F

Mazoyer, A. and Coeurjolly, J.-F. and Amblard, P.-O. , date-added =. Projections of determinantal point processes , volume =. Spatial Statistics , pages =
[77]

Couplings for determinantal point processes and their reduced

M. Couplings for determinantal point processes and their reduced. Journal of Applied Probability , number =
[78]

Biscio, C. A. N. and Lavancier, F. , date-added =. Quantifying repulsiveness of determinantal point processes , volume =. Bernoulli , number =
[79]

, booktitle =

Belhadji, A. , booktitle =. An analysis of
[80]

and Bardenet, R

Pascal, B. and Bardenet, R. , date-added =. IEEE Transactions on Signal Processing , title =
[81]

and Bardenet, R

Petrovic, V. and Bardenet, R. and Desolneux, A. , journal =. Repulsive

Showing first 80 references.