A Practical Theory of Generalization in Selectivity Learning

Haoshu Xu; Peizhi Wu; Ryan Marcus; Zachary G. Ives

arxiv: 2409.07014 · v3 · pith:M6Y7KGVQnew · submitted 2024-09-11 · 📊 stat.ML · cs.DB· cs.LG

A Practical Theory of Generalization in Selectivity Learning

Peizhi Wu , Haoshu Xu , Ryan Marcus , Zachary G. Ives This is my paper

Pith reviewed 2026-05-23 21:22 UTC · model grok-4.3

classification 📊 stat.ML cs.DBcs.LG

keywords selectivity estimationquery-driven learningout-of-distribution generalizationsigned measuresPAC learningdatabase query optimizationmachine learning for databases

0 comments

The pith

Selectivity predictors induced by signed measures are learnable and exhibit favorable out-of-distribution generalization bounds beyond the PAC framework under mild assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that query-driven selectivity models based on signed measures can be learned without requiring probability measures, relaxing a key restriction in existing theory. It then derives out-of-distribution generalization error bounds that go beyond what the standard PAC learning framework provides, again under mild assumptions. These results directly motivate two practical strategies for improving how existing selectivity models handle unseen query workloads. The work verifies the approach both theoretically and through experiments measuring prediction accuracy and end-to-end query latency.

Core claim

Selectivity predictors induced by signed measures are learnable and, under mild assumptions, exhibit favorable out-of-distribution generalization error bounds beyond the standard PAC framework; these advances enable two general strategies that improve OOD generalization for query-driven selectivity models while preserving strong in-distribution performance.

What carries the argument

Selectivity predictors induced by signed measures, which relax the probability-measure restriction of prior theory and support OOD error bounds.

If this is right

Query-driven selectivity models can be designed with explicit OOD guarantees rather than relying solely on in-distribution PAC bounds.
Two general improvement strategies become available that raise accuracy on unseen queries without harming performance on training distributions.
Database systems can expect better query planning latency when selectivity estimates must transfer to new workloads.
The gap between practical selectivity estimators and theoretical learning guarantees narrows for the signed-measure class.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar signed-measure constructions could be tested for cardinality estimation tasks that also require cross-distribution robustness.
The OOD bounds may suggest regularization choices during training that explicitly target distribution shift in other database ML settings.
If the mild assumptions prove easy to satisfy in practice, the approach could extend to learned cost models or join-order predictors facing workload drift.

Load-bearing premise

The mild assumptions needed for the out-of-distribution generalization error bounds to hold.

What would settle it

An empirical test in which the proposed OOD strategies produce no measurable improvement in prediction error or latency on workloads drawn from a different distribution than the training data, despite the stated assumptions being satisfied.

Figures

Figures reproduced from arXiv: 2409.07014 by Haoshu Xu, Peizhi Wu, Ryan Marcus, Zachary G. Ives.

**Figure 1.** Figure 1: Left: three data points (𝐴, 𝐵,𝐶) and four ranges. Right Top: predictions from selectivity functions (𝑆1 ∼ 𝑆4) for the four ranges. Right Bottom: corresponding measures (𝜇1 ∼ 𝜇4) that induce 𝑆1 ∼ 𝑆4, including their measure values on 𝐴, 𝐵,𝐶 (technically, the three sets {𝐴}, {𝐵}, {𝐶}). subsets of X closed under complements and countable unions and intersections. For all practical applications, it holds that … view at source ↗

**Figure 2.** Figure 2: Left: relationship between a rectangle query and [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: Training a selectivity model M with SeConCDF. modeling to achieve both strong in-distribution generalization with arbitrary loss functions and improved OOD generalization? SeConCDF addresses this limitation by adopting direct query selectivity modeling (which avoids the negative estimate issue) and introducing a soft constraint on the signed measure, unlike the hard constraint used in NeuroCDF. Specificall… view at source ↗

**Figure 4.** Figure 4: Accuracy on Census with granularity shifts. [PITH_FULL_IMAGE:figures/full_fig_p012_4.png] view at source ↗

**Figure 5.** Figure 5: Per-query latency performance on CEB-1a-varied under granularity shift. Left: In-Dist queries; Right: OOD queries. 2479 3 1345 668 1665 1180 1104 2069 316 951 2713 478 1215 2645 787 2840 2629 752 2695 1633 Query ID (sorted by Postgres running time) 0 20 40 60 80 100 Running Time (s) PostgreSQL (workload time: 606s) MSCN (workload time: 242s) Robust-MSCN* (workload time: 235s) MSCN+CDF (workload time: 242s)… view at source ↗

**Figure 6.** Figure 6: Accuracy of OOD point queries on IMDb-small. (a) center move (b) granularity shift (c) center move (d) granularity shift [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗

**Figure 7.** Figure 7: OOD query latency performance on IMDb-small [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗

read the original abstract

Query-driven machine learning models have emerged as a promising estimation technique for query selectivities. Yet, surprisingly little is known about the efficacy of these techniques from a theoretical perspective, as there exist substantial gaps between practical solutions and state-of-the-art (SOTA) theory based on the Probably Approximately Correct (PAC) learning framework. In this paper, we aim to bridge the gaps between theory and practice. First, we demonstrate that selectivity predictors induced by signed measures are learnable, which relaxes the reliance on probability measures in SOTA theory. More importantly, beyond the PAC learning framework (which only allows us to characterize how the model behaves when both training and test workloads are drawn from the same distribution), we establish, under mild assumptions, that selectivity predictors from this class exhibit favorable out-of-distribution (OOD) generalization error bounds. These theoretical advances provide us with a better understanding of both the in-distribution and OOD generalization capabilities of query-driven selectivity learning, and facilitate the design of two general strategies to improve OOD generalization for existing query-driven selectivity models. We empirically verify that our techniques help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance, while maintaining their superior in-distribution generalization performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper relaxes selectivity models to signed measures and claims OOD bounds beyond PAC, but the assumptions stay unspecified so the central claim is hard to assess.

read the letter

The main point is that this work tries to give OOD generalization bounds for query-driven selectivity predictors by moving from probability measures to signed measures. That relaxation is presented as new, and the authors derive learnability plus error bounds outside the standard PAC i.i.d. setting, then turn the theory into two practical strategies for better OOD performance. They report empirical gains on both prediction error and query latency for OOD workloads while keeping in-distribution results intact. The abstract positions this as bridging the gap between deployed models and existing theory, which is a reasonable goal given how little formal work exists on selectivity estimation generalization. The empirical section appears to test real shifts, which is better than pure theory papers that stop at bounds. The soft spot is the OOD bounds themselves. They are stated to hold under mild assumptions, yet those assumptions are not listed or motivated in the provided text. Without knowing whether they involve bounded Radon-Nikodym derivatives, limited support mismatch, or Lipschitz conditions on the shift, it is impossible to judge if they are actually mild or whether they match the query workload changes used in the experiments. The stress-test note is on target here; this gap makes the beyond-PAC claim unevaluable from what is shown. The derivations and citation pattern look standard for the subfield, with no obvious circularity or invented entities. This paper is for database-systems researchers who build or tune learned selectivity components and want some theoretical handle on distribution shift. A reader already working on ML-for-databases would find the strategies and the signed-measure framing useful even if the bounds need more grounding. It deserves peer review because the problem is real, the direction is concrete, and referees can require the assumptions to be stated explicitly and checked against the workloads.

Referee Report

2 major / 2 minor

Summary. The paper claims that selectivity predictors induced by signed measures are learnable (relaxing the probability-measure restriction in prior PAC-based theory), and that under mild assumptions these predictors admit favorable OOD generalization error bounds that go beyond the standard PAC i.i.d. setting. The theory is used to derive two general strategies for improving OOD performance of existing query-driven selectivity models; these strategies are then shown empirically to improve both prediction accuracy and end-to-end query latency on OOD workloads while preserving in-distribution performance.

Significance. If the OOD bounds are valid under assumptions that are both mild and satisfied by realistic query shifts, the work supplies the first explicit theoretical account of out-of-distribution behavior for this class of estimators and directly informs practical model design. The combination of a signed-measure relaxation, explicit (if still unspecified) OOD bounds, and reproducible empirical validation on latency metrics constitutes a concrete advance over purely PAC-style analyses.

major comments (2)

[Abstract / OOD generalization section] Abstract (OOD paragraph) and the section deriving the OOD bounds: the claim that the predictors 'exhibit favorable out-of-distribution (OOD) generalization error bounds' under 'mild assumptions' is load-bearing for the central contribution, yet the assumptions themselves (e.g., conditions on the Radon-Nikodym derivative between training and test measures, Lipschitz continuity of the shift, or support overlap) are never stated explicitly. Without an enumerated list it is impossible to verify whether the assumptions are weaker than PAC i.i.d., whether they hold for the workload shifts used in the experiments, or whether the derived bounds are non-vacuous.
[Empirical evaluation] The empirical section reporting OOD results: the paper asserts that the proposed strategies 'help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance,' but no quantitative comparison is given against the theoretical OOD bound (e.g., whether observed error lies inside the derived bound or how the bound scales with the observed distribution shift). This leaves the link between the theoretical guarantee and the reported numbers unverified.

minor comments (2)

[Theoretical development] Notation for signed measures and the induced predictors should be introduced once with a clear definition before being used in the learnability statement.
[Practical strategies] The two general strategies for improving OOD generalization are described at a high level; a short algorithmic pseudocode or explicit optimization objective for each would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments correctly identify areas where the presentation of the OOD theory can be strengthened for clarity and verifiability. We address each point below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract / OOD generalization section] Abstract (OOD paragraph) and the section deriving the OOD bounds: the claim that the predictors 'exhibit favorable out-of-distribution (OOD) generalization error bounds' under 'mild assumptions' is load-bearing for the central contribution, yet the assumptions themselves (e.g., conditions on the Radon-Nikodym derivative between training and test measures, Lipschitz continuity of the shift, or support overlap) are never stated explicitly. Without an enumerated list it is impossible to verify whether the assumptions are weaker than PAC i.i.d., whether they hold for the workload shifts used in the experiments, or whether the derived bounds are non-vacuous.

Authors: We agree that an explicit, enumerated list of assumptions is needed for verifiability. The OOD bounds rely on three main conditions: (1) the Radon-Nikodym derivative dP_test/dP_train is bounded by a finite constant M (allowing controlled shift), (2) the selectivity function is Lipschitz continuous w.r.t. a metric on the query feature space, and (3) the training and test supports overlap on a set of positive measure. These are strictly weaker than the i.i.d. assumption of standard PAC learning. In the revision we will insert a dedicated paragraph (or subsection) that enumerates these assumptions, proves they are milder than PAC-i.i.d., verifies they hold for the query shifts used in the experiments (via explicit computation of the Radon-Nikodym norm on the workloads), and shows that the resulting bounds remain non-vacuous for moderate shifts (M ≤ 5). revision: yes
Referee: [Empirical evaluation] The empirical section reporting OOD results: the paper asserts that the proposed strategies 'help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance,' but no quantitative comparison is given against the theoretical OOD bound (e.g., whether observed error lies inside the derived bound or how the bound scales with the observed distribution shift). This leaves the link between the theoretical guarantee and the reported numbers unverified.

Authors: We accept that a direct numerical link between the derived OOD bound and the reported errors is currently missing. The manuscript demonstrates that the two practical strategies (derived from the theory) improve OOD accuracy and latency, but does not tabulate observed error versus the bound value. In the revised version we will add a short subsection (or appendix table) that, for each OOD workload, reports (a) the measured distribution shift (e.g., estimated Radon-Nikodym norm or total variation), (b) the empirical OOD error, and (c) the numerical value of the theoretical bound under the observed shift. We will also note any looseness in the bound. This will make the connection between theory and experiment explicit. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation presented as independent theoretical advance

full rationale

The abstract states that selectivity predictors induced by signed measures are learnable and yield OOD generalization bounds under mild assumptions beyond PAC, but provides no equations, fitted parameters, or self-citations that reduce these claims to their inputs by construction. The derivation chain is therefore self-contained against external benchmarks, with the unspecified assumptions representing a presentation gap rather than a definitional or fitted-input circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available. The central claims rest on unspecified 'mild assumptions' for the OOD bounds; no free parameters, invented entities, or additional axioms are mentioned.

axioms (1)

domain assumption Mild assumptions under which OOD generalization error bounds hold
Abstract states the bounds hold 'under mild assumptions' but does not enumerate them.

pith-pipeline@v0.9.0 · 5758 in / 1186 out tokens · 23562 ms · 2026-05-23T21:22:14.530887+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

selectivity predictors induced by signed measures are learnable... OOD generalization error bounds
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

fat-shattering dimension of Ssgn(C) is finite... er_P(Ŝ) ≤ O(√ε)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages

[1]

https://github.com/learnedsystems/CEB/tree/main

Ceb. https://github.com/learnedsystems/CEB/tree/main

work page
[2]

https://github.com/waltercai/pqo-opensource

Modified postgresql. https://github.com/waltercai/pqo-opensource

work page
[3]

https://github.com/andreaskipf/learnedcardinalities

MSCN code. https://github.com/andreaskipf/learnedcardinalities. Last Accessed Date: 2025-04-05

work page 2025
[4]

https://github.com/huxiao2010/Selectivity/

PtsHist and QuadHist code. https://github.com/huxiao2010/Selectivity/. Last Accessed Date: 2025-04-05

work page 2025
[5]

Foundations of databases, vol

Abiteboul, S., Hull, R., and Vianu, V. Foundations of databases, vol. 8. Addison- Wesley Reading, 1995

work page 1995
[6]

In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp

Aboulnaga, A., and Chaudhuri, S.Self-tuning histograms: building histograms without looking at data. In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp. 181–192

work page 1999
[7]

Scale-sensitive dimensions, uniform convergence, and learnability

Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44 , 4 (1997), 615–631

work page 1997
[8]

Learning Set Cardinality in Distance Nearest Neighbours

Anagnostopoulos, C., and Triantafillou, P. Learning Set Cardinality in Distance Nearest Neighbours. In 2015 IEEE International Conference on Data Mining (2015), IEEE, pp. 691–696

work page 2015
[9]

Learning to accurately COUNT with query-driven predictive analytics

Anagnostopoulos, C., and Triantafillou, P. Learning to accurately COUNT with query-driven predictive analytics. In 2015 IEEE International Conference on Big Data (Big Data) (2015), IEEE, pp. 14–23

work page 2015
[10]

Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality

Anagnostopoulos, C., and Triantafillou, P. Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality. ACM Transactions on Knowledge Discovery from Data 11 , 4 (2017), 1–46

work page 2017
[11]

L., and Long, P

Bartlett, P. L., and Long, P. M.More theorems about scale-sensitive dimensions and learning. In Proceedings of the eighth annual conference on Computational learning theory (1995), ACM Press, pp. 392–401

work page 1995
[12]

Bishop, C. M. Pattern recognition and machine learning . Information science and statistics. Springer, 2006

work page 2006
[13]

Exploiting statistics on query expressions for optimization

Bruno, N., and Chaudhuri, S. Exploiting statistics on query expressions for optimization. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 263–274

work page 2002
[14]

STHoles: a multidimensional workload-aware histogram

Bruno, N., Chaudhuri, S., and Gravano, L. STHoles: a multidimensional workload-aware histogram. In Proceedings of the 2001 ACM SIGMOD international conference on Management of data (2001), ACM, pp. 211–222

work page 2001
[15]

M., and Roussopoulos, N

Chen, C. M., and Roussopoulos, N. Adaptive selectivity estimation using query feedback. In Proceedings of the 1994 ACM SIGMOD international conference on Management of data (1994), ACM, pp. 161–172

work page 1994
[16]

XGBoost: A Scalable Tree Boosting System

Chen, T., and Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), ACM, pp. 785–794

work page 2016
[17]

B., Suciu, D., and Balazinska, M

Deeds, K. B., Suciu, D., and Balazinska, M. SafeBound: A Practical System for Generating Cardinality Bounds. Proceedings of the ACM on Management of Data 1, 1 (2023), 1–26

work page 2023
[18]

Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388

Ding, B., Chaudhuri, S., Gehrke, J., and Narasayya, V.DSB: a decision support benchmark for workload-driven and traditional database systems. Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388

work page 2021
[19]

Probability: theory and examples, fifth edition ed

Durrett, R. Probability: theory and examples, fifth edition ed. No. 49 in Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge ; New York, NY, 2019

work page 2019
[20]

Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228

Dutt, A., W ang, C., Narasayya, V., and Chaudhuri, S.Efficiently approximat- ing selectivity functions using low overhead regression models. Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228

work page 2020
[21]

Selectivity estimation for range predicates using lightweight models

Dutt, A., W ang, C., Nazi, A., Kandula, S., Narasayya, V., and Chaudhuri, S. Selectivity estimation for range predicates using lightweight models. Proceedings of the VLDB Endowment 12 , 9 (2019), 1044–1057

work page 2019
[22]

Deep Residual Learning for Image Recognition

He, K., Zhang, X., Ren, S., and Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), IEEE, pp. 770–778

work page 2016
[23]

Convolution and Cross- Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries

Heddes, M., Nunes, I., Givargis, T., and Nicolau, A. Convolution and Cross- Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries. Proceedings of the ACM on Management of Data 2 , 3 (2024), 1–26

work page 2024
[24]

Probability of error, equivocation, and the Chernoff bound

Hellman, M., and Raviv, J. Probability of error, equivocation, and the Chernoff bound. IEEE Transactions on Information Theory 16 , 4 (1970), 368–372

work page 1970
[25]

DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005

Hilprecht, B., Schmidt, A., Kulessa, M., Molina, A., Kersting, K., and Binnig, C. DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005

work page 2020
[26]

K., Panigrahi, D., Roy, S., and Yang, J

Hu, X., Liu, Y., Xiu, H., Agarwal, P. K., Panigrahi, D., Roy, S., and Yang, J. Selectivity Functions of Range Queries are Learnable. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 959–972

work page 2022
[27]

E., and Christodoulakis, S

Ioannidis, Y. E., and Christodoulakis, S. On the propagation of errors in the size of join results. In Proceedings of the 1991 ACM SIGMOD international conference on Management of data (1991), ACM Press, pp. 268–277

work page 1991
[28]

J., and Schapire, R

Kearns, M. J., and Schapire, R. E. Efficient distribution-free learning of proba- bilistic concepts. Journal of Computer and System Sciences 48 , 3 (1994), 464–497

work page 1994
[29]

J., V azirani, U

Kearns, M. J., V azirani, U. V., and V azirani, U. V.An introduction to computa- tional learning theory, 2. print ed. MIT Press, Cambridge, Mass., 1997

work page 1997
[30]

Learned Cardi- nality Estimation: An In-depth Study

Kim, K., Jung, J., Seo, I., Han, W.-S., Choi, K., and Chong, J. Learned Cardi- nality Estimation: An In-depth Study. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 1214–1227

work page 2022
[31]

Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

Kim, K., Lee, S., Kim, I., and Han, W.-S.ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

work page 2024
[32]

A., and Kemper, A

Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P. A., and Kemper, A. Learned cardinalities: Estimating correlated joins with deep learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR 2019 (2019)

work page 2019
[33]

How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215

Leis, V., Gubichev, A., Mirchev, A., Boncz, P., Kemper, A., and Neumann, T. How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215

work page 2015
[34]

ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads

Li, P., Wei, W., Zhu, R., Ding, B., Zhou, J., and Lu, H. ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads. Proceedings of the VLDB Endowment 17 , 2 (2023), 197–210

work page 2023
[35]

Lim, L., Wang, M., and Vitter, J. S. SASH: A self-adaptive histogram set for dynamically changing workloads. In Proceedings of 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, September 9-12, 2003 (2003), pp. 369–380

work page 2003
[36]

Certified monotonic neural networks

Liu, X., Han, X., Zhang, N., and Liu, Q. Certified monotonic neural networks. In Advances in Neural Information Processing Systems 33 (2020)

work page 2020
[37]

Lynch, C. A. Selectivity estimation and query optimization in large databases with highly skewed distribution of column values. In Proceedings of the VLDB Endowment (1988), pp. 240–251

work page 1988
[38]

J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T

Markl, V., Haas, P. J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T. M. Consistent selectivity estimation via maximum entropy. The VLDB Journal 16, 1 (2006), 55–76

work page 2006
[39]

M., and Raman, V

Markl, V., Lohman, G. M., and Raman, V. LEO: An autonomic query optimizer for DB2. IBM Systems Journal 42 , 1 (2003), 98–106

work page 2003
[40]

Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes

Meng, Z., Cao, X., and Cong, G. Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes. Proceedings of the ACM on Management of Data 1, 4 (2023), 1–26

work page 2023
[41]

Proceedings of the VLDB Endowment 2, 1 (2009), 982–993

Moerkotte, G., Neumann, T., and Steidl, G.Preventing bad plans by bounding the impact of cardinality estimation errors. Proceedings of the VLDB Endowment 2, 1 (2009), 982–993

work page 2009
[42]

Flow-loss: learning cardinality estimates that matter

Negi, P., Marcus, R., Kipf, A., Mao, H., Tatbul, N., Kraska, T., and Alizadeh, M. Flow-loss: learning cardinality estimates that matter. Proceedings of the VLDB Endowment 14, 11 (2021), 2019–2032

work page 2021
[43]

Robust Query Driven Cardinality Estimation under Changing Workloads

Negi, P., Wu, Z., Kipf, A., Tatbul, N., Marcus, R., Madden, S., Kraska, T., and Alizadeh, M. Robust Query Driven Cardinality Estimation under Changing Workloads. Proceedings of the VLDB Endowment 16 , 6 (2023), 1520–1533

work page 2023
[44]

QuickSel: Quick Selectivity Learning with Mixture Models

Park, Y., Zhong, S., and Mozafari, B. QuickSel: Quick Selectivity Learning with Mixture Models. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (2020), ACM, pp. 1017–1033

work page 2020
[45]

TPC-DS, taking decision support benchmarking to the next level

Poess, M., Smith, B., Kollar, L., and Larson, P. TPC-DS, taking decision support benchmarking to the next level. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 582–587

work page 2002
[46]

Sample-Efficient Cardinality Estimation Using Geometric Deep Learning

Reiner, S., and Grossniklaus, M. Sample-Efficient Cardinality Estimation Using Geometric Deep Learning. Proceedings of the VLDB Endowment 17, 4 (2023), 740–752

work page 2023
[47]

S., and Tesman, B

Roberts, F. S., and Tesman, B. Applied combinatorics, second edition ed. CRC Press, 2009

work page 2009
[48]

Learning from label proportions: A mutual contami- nation framework

Scott, C., and Zhang, J. Learning from label proportions: A mutual contami- nation framework. Advances in neural information processing systems 33 (2020), 22256–22267

work page 2020
[49]

G., Astrahan, M

Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., and Price, T. G. Access path selection in a relational database management system. In Proceedings of the 1979 ACM SIGMOD international conference on Management of A Practical Theory of Generalization in Selectivity Learning Conference’17, July 2017, Washington, DC, USA data (1979), ACM Pre...

work page 1979
[50]

Stein, E. M. Real Analysis: Measure Theory, Integration, and Hilbert Spaces , 1st ed ed. Princeton University Press, 2009

work page 2009
[51]

M., Markl, V., and Kandil, M

Stillger, M., Lohman, G. M., Markl, V., and Kandil, M. LEO - db2’s learning optimizer. In Proceedings of the VLDB Endowment (2001), pp. 19–28

work page 2001
[52]

An end-to-end learning-based cost estimator

Sun, J., and Li, G. An end-to-end learning-based cost estimator. Proceedings of the VLDB Endowment 13 , 3 (2019), 307–319

work page 2019
[53]

L., Li, S., Mao, Z., and Tang, B

Wang, F., Yan, X., Yiu, M. L., Li, S., Mao, Z., and Tang, B. Speeding Up End- to-end Query Execution via Learning-based Progressive Cardinality Estimation. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–25

work page 2023
[54]

Decomposition—a strategy for query processing

Wong, E., and Youssefi, K. Decomposition—a strategy for query processing. ACM Transactions on Database Systems 1 , 3 (1976), 223–241

work page 1976
[55]

Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222

Wu, C., Jindal, A., Amizadeh, S., Patel, H., Le, W., Qiao, S., and Rao, S.Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222

work page 2018
[56]

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

Wu, P., and Cong, G. A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation. In Proceedings of the 2021 International Conference on Management of Data (2021), ACM, pp. 2009–2022

work page 2021
[57]

Wu, P., and Ives, Z. G. Modeling Shifting Workloads for Learned Database Systems. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

work page 2024
[58]

FactorJoin: A New Cardinality Estimation Framework for Join Queries

Wu, Z., Negi, P., Alizadeh, M., Kraska, T., and Madden, S. FactorJoin: A New Cardinality Estimation Framework for Join Queries. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–27

work page 2023
[59]

K., and Yang, J

Xiu, H., Agarwal, P. K., and Yang, J. PARQO: Penalty-Aware Robust Plan Selection in Query Optimization. Proceedings of the VLDB Endowment 17 , 13 (2024), 4627–4640

work page 2024
[60]

NeuroCard: one cardinality estimator for all tables

Yang, Z., Kamsetty, A., Luan, S., Liang, E., Duan, Y., Chen, X., and Stoica, I. NeuroCard: one cardinality estimator for all tables. Proceedings of the VLDB Endowment 14, 1 (2020), 61–73

work page 2020
[61]

Understanding deep learning (still) requires rethinking generalization

Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. Understanding deep learning (still) requires rethinking generalization. Communications of the ACM 64, 3 (2021), 107–115

work page 2021
[62]

Advances in Neural Information Processing Systems 35 (2022), 26933–26942

Zhang, J., W ang, Y., and Scott, C.Learning from label proportions by learning with label noise. Advances in Neural Information Processing Systems 35 (2022), 26933–26942

work page 2022
[63]

Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding

Zhang, K., Wang, H., Lu, Y., Li, Z., Shu, C., Yan, Y., and Yang, D. Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding. In 2024 IEEE 40th International Conference on Data Engineering (ICDE) (2024), IEEE, pp. 56–69

work page 2024

[1] [1]

https://github.com/learnedsystems/CEB/tree/main

Ceb. https://github.com/learnedsystems/CEB/tree/main

work page

[2] [2]

https://github.com/waltercai/pqo-opensource

Modified postgresql. https://github.com/waltercai/pqo-opensource

work page

[3] [3]

https://github.com/andreaskipf/learnedcardinalities

MSCN code. https://github.com/andreaskipf/learnedcardinalities. Last Accessed Date: 2025-04-05

work page 2025

[4] [4]

https://github.com/huxiao2010/Selectivity/

PtsHist and QuadHist code. https://github.com/huxiao2010/Selectivity/. Last Accessed Date: 2025-04-05

work page 2025

[5] [5]

Foundations of databases, vol

Abiteboul, S., Hull, R., and Vianu, V. Foundations of databases, vol. 8. Addison- Wesley Reading, 1995

work page 1995

[6] [6]

In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp

Aboulnaga, A., and Chaudhuri, S.Self-tuning histograms: building histograms without looking at data. In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp. 181–192

work page 1999

[7] [7]

Scale-sensitive dimensions, uniform convergence, and learnability

Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44 , 4 (1997), 615–631

work page 1997

[8] [8]

Learning Set Cardinality in Distance Nearest Neighbours

Anagnostopoulos, C., and Triantafillou, P. Learning Set Cardinality in Distance Nearest Neighbours. In 2015 IEEE International Conference on Data Mining (2015), IEEE, pp. 691–696

work page 2015

[9] [9]

Learning to accurately COUNT with query-driven predictive analytics

Anagnostopoulos, C., and Triantafillou, P. Learning to accurately COUNT with query-driven predictive analytics. In 2015 IEEE International Conference on Big Data (Big Data) (2015), IEEE, pp. 14–23

work page 2015

[10] [10]

Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality

Anagnostopoulos, C., and Triantafillou, P. Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality. ACM Transactions on Knowledge Discovery from Data 11 , 4 (2017), 1–46

work page 2017

[11] [11]

L., and Long, P

Bartlett, P. L., and Long, P. M.More theorems about scale-sensitive dimensions and learning. In Proceedings of the eighth annual conference on Computational learning theory (1995), ACM Press, pp. 392–401

work page 1995

[12] [12]

Bishop, C. M. Pattern recognition and machine learning . Information science and statistics. Springer, 2006

work page 2006

[13] [13]

Exploiting statistics on query expressions for optimization

Bruno, N., and Chaudhuri, S. Exploiting statistics on query expressions for optimization. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 263–274

work page 2002

[14] [14]

STHoles: a multidimensional workload-aware histogram

Bruno, N., Chaudhuri, S., and Gravano, L. STHoles: a multidimensional workload-aware histogram. In Proceedings of the 2001 ACM SIGMOD international conference on Management of data (2001), ACM, pp. 211–222

work page 2001

[15] [15]

M., and Roussopoulos, N

Chen, C. M., and Roussopoulos, N. Adaptive selectivity estimation using query feedback. In Proceedings of the 1994 ACM SIGMOD international conference on Management of data (1994), ACM, pp. 161–172

work page 1994

[16] [16]

XGBoost: A Scalable Tree Boosting System

Chen, T., and Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), ACM, pp. 785–794

work page 2016

[17] [17]

B., Suciu, D., and Balazinska, M

Deeds, K. B., Suciu, D., and Balazinska, M. SafeBound: A Practical System for Generating Cardinality Bounds. Proceedings of the ACM on Management of Data 1, 1 (2023), 1–26

work page 2023

[18] [18]

Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388

Ding, B., Chaudhuri, S., Gehrke, J., and Narasayya, V.DSB: a decision support benchmark for workload-driven and traditional database systems. Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388

work page 2021

[19] [19]

Probability: theory and examples, fifth edition ed

Durrett, R. Probability: theory and examples, fifth edition ed. No. 49 in Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge ; New York, NY, 2019

work page 2019

[20] [20]

Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228

Dutt, A., W ang, C., Narasayya, V., and Chaudhuri, S.Efficiently approximat- ing selectivity functions using low overhead regression models. Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228

work page 2020

[21] [21]

Selectivity estimation for range predicates using lightweight models

Dutt, A., W ang, C., Nazi, A., Kandula, S., Narasayya, V., and Chaudhuri, S. Selectivity estimation for range predicates using lightweight models. Proceedings of the VLDB Endowment 12 , 9 (2019), 1044–1057

work page 2019

[22] [22]

Deep Residual Learning for Image Recognition

He, K., Zhang, X., Ren, S., and Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), IEEE, pp. 770–778

work page 2016

[23] [23]

Convolution and Cross- Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries

Heddes, M., Nunes, I., Givargis, T., and Nicolau, A. Convolution and Cross- Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries. Proceedings of the ACM on Management of Data 2 , 3 (2024), 1–26

work page 2024

[24] [24]

Probability of error, equivocation, and the Chernoff bound

Hellman, M., and Raviv, J. Probability of error, equivocation, and the Chernoff bound. IEEE Transactions on Information Theory 16 , 4 (1970), 368–372

work page 1970

[25] [25]

DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005

Hilprecht, B., Schmidt, A., Kulessa, M., Molina, A., Kersting, K., and Binnig, C. DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005

work page 2020

[26] [26]

K., Panigrahi, D., Roy, S., and Yang, J

Hu, X., Liu, Y., Xiu, H., Agarwal, P. K., Panigrahi, D., Roy, S., and Yang, J. Selectivity Functions of Range Queries are Learnable. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 959–972

work page 2022

[27] [27]

E., and Christodoulakis, S

Ioannidis, Y. E., and Christodoulakis, S. On the propagation of errors in the size of join results. In Proceedings of the 1991 ACM SIGMOD international conference on Management of data (1991), ACM Press, pp. 268–277

work page 1991

[28] [28]

J., and Schapire, R

Kearns, M. J., and Schapire, R. E. Efficient distribution-free learning of proba- bilistic concepts. Journal of Computer and System Sciences 48 , 3 (1994), 464–497

work page 1994

[29] [29]

J., V azirani, U

Kearns, M. J., V azirani, U. V., and V azirani, U. V.An introduction to computa- tional learning theory, 2. print ed. MIT Press, Cambridge, Mass., 1997

work page 1997

[30] [30]

Learned Cardi- nality Estimation: An In-depth Study

Kim, K., Jung, J., Seo, I., Han, W.-S., Choi, K., and Chong, J. Learned Cardi- nality Estimation: An In-depth Study. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 1214–1227

work page 2022

[31] [31]

Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

Kim, K., Lee, S., Kim, I., and Han, W.-S.ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

work page 2024

[32] [32]

A., and Kemper, A

Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P. A., and Kemper, A. Learned cardinalities: Estimating correlated joins with deep learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR 2019 (2019)

work page 2019

[33] [33]

How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215

Leis, V., Gubichev, A., Mirchev, A., Boncz, P., Kemper, A., and Neumann, T. How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215

work page 2015

[34] [34]

ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads

Li, P., Wei, W., Zhu, R., Ding, B., Zhou, J., and Lu, H. ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads. Proceedings of the VLDB Endowment 17 , 2 (2023), 197–210

work page 2023

[35] [35]

Lim, L., Wang, M., and Vitter, J. S. SASH: A self-adaptive histogram set for dynamically changing workloads. In Proceedings of 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, September 9-12, 2003 (2003), pp. 369–380

work page 2003

[36] [36]

Certified monotonic neural networks

Liu, X., Han, X., Zhang, N., and Liu, Q. Certified monotonic neural networks. In Advances in Neural Information Processing Systems 33 (2020)

work page 2020

[37] [37]

Lynch, C. A. Selectivity estimation and query optimization in large databases with highly skewed distribution of column values. In Proceedings of the VLDB Endowment (1988), pp. 240–251

work page 1988

[38] [38]

J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T

Markl, V., Haas, P. J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T. M. Consistent selectivity estimation via maximum entropy. The VLDB Journal 16, 1 (2006), 55–76

work page 2006

[39] [39]

M., and Raman, V

Markl, V., Lohman, G. M., and Raman, V. LEO: An autonomic query optimizer for DB2. IBM Systems Journal 42 , 1 (2003), 98–106

work page 2003

[40] [40]

Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes

Meng, Z., Cao, X., and Cong, G. Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes. Proceedings of the ACM on Management of Data 1, 4 (2023), 1–26

work page 2023

[41] [41]

Proceedings of the VLDB Endowment 2, 1 (2009), 982–993

Moerkotte, G., Neumann, T., and Steidl, G.Preventing bad plans by bounding the impact of cardinality estimation errors. Proceedings of the VLDB Endowment 2, 1 (2009), 982–993

work page 2009

[42] [42]

Flow-loss: learning cardinality estimates that matter

Negi, P., Marcus, R., Kipf, A., Mao, H., Tatbul, N., Kraska, T., and Alizadeh, M. Flow-loss: learning cardinality estimates that matter. Proceedings of the VLDB Endowment 14, 11 (2021), 2019–2032

work page 2021

[43] [43]

Robust Query Driven Cardinality Estimation under Changing Workloads

Negi, P., Wu, Z., Kipf, A., Tatbul, N., Marcus, R., Madden, S., Kraska, T., and Alizadeh, M. Robust Query Driven Cardinality Estimation under Changing Workloads. Proceedings of the VLDB Endowment 16 , 6 (2023), 1520–1533

work page 2023

[44] [44]

QuickSel: Quick Selectivity Learning with Mixture Models

Park, Y., Zhong, S., and Mozafari, B. QuickSel: Quick Selectivity Learning with Mixture Models. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (2020), ACM, pp. 1017–1033

work page 2020

[45] [45]

TPC-DS, taking decision support benchmarking to the next level

Poess, M., Smith, B., Kollar, L., and Larson, P. TPC-DS, taking decision support benchmarking to the next level. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 582–587

work page 2002

[46] [46]

Sample-Efficient Cardinality Estimation Using Geometric Deep Learning

Reiner, S., and Grossniklaus, M. Sample-Efficient Cardinality Estimation Using Geometric Deep Learning. Proceedings of the VLDB Endowment 17, 4 (2023), 740–752

work page 2023

[47] [47]

S., and Tesman, B

Roberts, F. S., and Tesman, B. Applied combinatorics, second edition ed. CRC Press, 2009

work page 2009

[48] [48]

Learning from label proportions: A mutual contami- nation framework

Scott, C., and Zhang, J. Learning from label proportions: A mutual contami- nation framework. Advances in neural information processing systems 33 (2020), 22256–22267

work page 2020

[49] [49]

G., Astrahan, M

Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., and Price, T. G. Access path selection in a relational database management system. In Proceedings of the 1979 ACM SIGMOD international conference on Management of A Practical Theory of Generalization in Selectivity Learning Conference’17, July 2017, Washington, DC, USA data (1979), ACM Pre...

work page 1979

[50] [50]

Stein, E. M. Real Analysis: Measure Theory, Integration, and Hilbert Spaces , 1st ed ed. Princeton University Press, 2009

work page 2009

[51] [51]

M., Markl, V., and Kandil, M

Stillger, M., Lohman, G. M., Markl, V., and Kandil, M. LEO - db2’s learning optimizer. In Proceedings of the VLDB Endowment (2001), pp. 19–28

work page 2001

[52] [52]

An end-to-end learning-based cost estimator

Sun, J., and Li, G. An end-to-end learning-based cost estimator. Proceedings of the VLDB Endowment 13 , 3 (2019), 307–319

work page 2019

[53] [53]

L., Li, S., Mao, Z., and Tang, B

Wang, F., Yan, X., Yiu, M. L., Li, S., Mao, Z., and Tang, B. Speeding Up End- to-end Query Execution via Learning-based Progressive Cardinality Estimation. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–25

work page 2023

[54] [54]

Decomposition—a strategy for query processing

Wong, E., and Youssefi, K. Decomposition—a strategy for query processing. ACM Transactions on Database Systems 1 , 3 (1976), 223–241

work page 1976

[55] [55]

Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222

Wu, C., Jindal, A., Amizadeh, S., Patel, H., Le, W., Qiao, S., and Rao, S.Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222

work page 2018

[56] [56]

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

Wu, P., and Cong, G. A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation. In Proceedings of the 2021 International Conference on Management of Data (2021), ACM, pp. 2009–2022

work page 2021

[57] [57]

Wu, P., and Ives, Z. G. Modeling Shifting Workloads for Learned Database Systems. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27

work page 2024

[58] [58]

FactorJoin: A New Cardinality Estimation Framework for Join Queries

Wu, Z., Negi, P., Alizadeh, M., Kraska, T., and Madden, S. FactorJoin: A New Cardinality Estimation Framework for Join Queries. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–27

work page 2023

[59] [59]

K., and Yang, J

Xiu, H., Agarwal, P. K., and Yang, J. PARQO: Penalty-Aware Robust Plan Selection in Query Optimization. Proceedings of the VLDB Endowment 17 , 13 (2024), 4627–4640

work page 2024

[60] [60]

NeuroCard: one cardinality estimator for all tables

Yang, Z., Kamsetty, A., Luan, S., Liang, E., Duan, Y., Chen, X., and Stoica, I. NeuroCard: one cardinality estimator for all tables. Proceedings of the VLDB Endowment 14, 1 (2020), 61–73

work page 2020

[61] [61]

Understanding deep learning (still) requires rethinking generalization

Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. Understanding deep learning (still) requires rethinking generalization. Communications of the ACM 64, 3 (2021), 107–115

work page 2021

[62] [62]

Advances in Neural Information Processing Systems 35 (2022), 26933–26942

Zhang, J., W ang, Y., and Scott, C.Learning from label proportions by learning with label noise. Advances in Neural Information Processing Systems 35 (2022), 26933–26942

work page 2022

[63] [63]

Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding

Zhang, K., Wang, H., Lu, Y., Li, Z., Shu, C., Yan, Y., and Yang, D. Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding. In 2024 IEEE 40th International Conference on Data Engineering (ICDE) (2024), IEEE, pp. 56–69

work page 2024