A Practical Theory of Generalization in Selectivity Learning
Pith reviewed 2026-05-23 21:22 UTC · model grok-4.3
The pith
Selectivity predictors induced by signed measures are learnable and exhibit favorable out-of-distribution generalization bounds beyond the PAC framework under mild assumptions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Selectivity predictors induced by signed measures are learnable and, under mild assumptions, exhibit favorable out-of-distribution generalization error bounds beyond the standard PAC framework; these advances enable two general strategies that improve OOD generalization for query-driven selectivity models while preserving strong in-distribution performance.
What carries the argument
Selectivity predictors induced by signed measures, which relax the probability-measure restriction of prior theory and support OOD error bounds.
If this is right
- Query-driven selectivity models can be designed with explicit OOD guarantees rather than relying solely on in-distribution PAC bounds.
- Two general improvement strategies become available that raise accuracy on unseen queries without harming performance on training distributions.
- Database systems can expect better query planning latency when selectivity estimates must transfer to new workloads.
- The gap between practical selectivity estimators and theoretical learning guarantees narrows for the signed-measure class.
Where Pith is reading between the lines
- Similar signed-measure constructions could be tested for cardinality estimation tasks that also require cross-distribution robustness.
- The OOD bounds may suggest regularization choices during training that explicitly target distribution shift in other database ML settings.
- If the mild assumptions prove easy to satisfy in practice, the approach could extend to learned cost models or join-order predictors facing workload drift.
Load-bearing premise
The mild assumptions needed for the out-of-distribution generalization error bounds to hold.
What would settle it
An empirical test in which the proposed OOD strategies produce no measurable improvement in prediction error or latency on workloads drawn from a different distribution than the training data, despite the stated assumptions being satisfied.
Figures
read the original abstract
Query-driven machine learning models have emerged as a promising estimation technique for query selectivities. Yet, surprisingly little is known about the efficacy of these techniques from a theoretical perspective, as there exist substantial gaps between practical solutions and state-of-the-art (SOTA) theory based on the Probably Approximately Correct (PAC) learning framework. In this paper, we aim to bridge the gaps between theory and practice. First, we demonstrate that selectivity predictors induced by signed measures are learnable, which relaxes the reliance on probability measures in SOTA theory. More importantly, beyond the PAC learning framework (which only allows us to characterize how the model behaves when both training and test workloads are drawn from the same distribution), we establish, under mild assumptions, that selectivity predictors from this class exhibit favorable out-of-distribution (OOD) generalization error bounds. These theoretical advances provide us with a better understanding of both the in-distribution and OOD generalization capabilities of query-driven selectivity learning, and facilitate the design of two general strategies to improve OOD generalization for existing query-driven selectivity models. We empirically verify that our techniques help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance, while maintaining their superior in-distribution generalization performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that selectivity predictors induced by signed measures are learnable (relaxing the probability-measure restriction in prior PAC-based theory), and that under mild assumptions these predictors admit favorable OOD generalization error bounds that go beyond the standard PAC i.i.d. setting. The theory is used to derive two general strategies for improving OOD performance of existing query-driven selectivity models; these strategies are then shown empirically to improve both prediction accuracy and end-to-end query latency on OOD workloads while preserving in-distribution performance.
Significance. If the OOD bounds are valid under assumptions that are both mild and satisfied by realistic query shifts, the work supplies the first explicit theoretical account of out-of-distribution behavior for this class of estimators and directly informs practical model design. The combination of a signed-measure relaxation, explicit (if still unspecified) OOD bounds, and reproducible empirical validation on latency metrics constitutes a concrete advance over purely PAC-style analyses.
major comments (2)
- [Abstract / OOD generalization section] Abstract (OOD paragraph) and the section deriving the OOD bounds: the claim that the predictors 'exhibit favorable out-of-distribution (OOD) generalization error bounds' under 'mild assumptions' is load-bearing for the central contribution, yet the assumptions themselves (e.g., conditions on the Radon-Nikodym derivative between training and test measures, Lipschitz continuity of the shift, or support overlap) are never stated explicitly. Without an enumerated list it is impossible to verify whether the assumptions are weaker than PAC i.i.d., whether they hold for the workload shifts used in the experiments, or whether the derived bounds are non-vacuous.
- [Empirical evaluation] The empirical section reporting OOD results: the paper asserts that the proposed strategies 'help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance,' but no quantitative comparison is given against the theoretical OOD bound (e.g., whether observed error lies inside the derived bound or how the bound scales with the observed distribution shift). This leaves the link between the theoretical guarantee and the reported numbers unverified.
minor comments (2)
- [Theoretical development] Notation for signed measures and the induced predictors should be introduced once with a clear definition before being used in the learnability statement.
- [Practical strategies] The two general strategies for improving OOD generalization are described at a high level; a short algorithmic pseudocode or explicit optimization objective for each would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments correctly identify areas where the presentation of the OOD theory can be strengthened for clarity and verifiability. We address each point below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Abstract / OOD generalization section] Abstract (OOD paragraph) and the section deriving the OOD bounds: the claim that the predictors 'exhibit favorable out-of-distribution (OOD) generalization error bounds' under 'mild assumptions' is load-bearing for the central contribution, yet the assumptions themselves (e.g., conditions on the Radon-Nikodym derivative between training and test measures, Lipschitz continuity of the shift, or support overlap) are never stated explicitly. Without an enumerated list it is impossible to verify whether the assumptions are weaker than PAC i.i.d., whether they hold for the workload shifts used in the experiments, or whether the derived bounds are non-vacuous.
Authors: We agree that an explicit, enumerated list of assumptions is needed for verifiability. The OOD bounds rely on three main conditions: (1) the Radon-Nikodym derivative dP_test/dP_train is bounded by a finite constant M (allowing controlled shift), (2) the selectivity function is Lipschitz continuous w.r.t. a metric on the query feature space, and (3) the training and test supports overlap on a set of positive measure. These are strictly weaker than the i.i.d. assumption of standard PAC learning. In the revision we will insert a dedicated paragraph (or subsection) that enumerates these assumptions, proves they are milder than PAC-i.i.d., verifies they hold for the query shifts used in the experiments (via explicit computation of the Radon-Nikodym norm on the workloads), and shows that the resulting bounds remain non-vacuous for moderate shifts (M ≤ 5). revision: yes
-
Referee: [Empirical evaluation] The empirical section reporting OOD results: the paper asserts that the proposed strategies 'help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance,' but no quantitative comparison is given against the theoretical OOD bound (e.g., whether observed error lies inside the derived bound or how the bound scales with the observed distribution shift). This leaves the link between the theoretical guarantee and the reported numbers unverified.
Authors: We accept that a direct numerical link between the derived OOD bound and the reported errors is currently missing. The manuscript demonstrates that the two practical strategies (derived from the theory) improve OOD accuracy and latency, but does not tabulate observed error versus the bound value. In the revised version we will add a short subsection (or appendix table) that, for each OOD workload, reports (a) the measured distribution shift (e.g., estimated Radon-Nikodym norm or total variation), (b) the empirical OOD error, and (c) the numerical value of the theoretical bound under the observed shift. We will also note any looseness in the bound. This will make the connection between theory and experiment explicit. revision: yes
Circularity Check
No significant circularity; derivation presented as independent theoretical advance
full rationale
The abstract states that selectivity predictors induced by signed measures are learnable and yield OOD generalization bounds under mild assumptions beyond PAC, but provides no equations, fitted parameters, or self-citations that reduce these claims to their inputs by construction. The derivation chain is therefore self-contained against external benchmarks, with the unspecified assumptions representing a presentation gap rather than a definitional or fitted-input circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Mild assumptions under which OOD generalization error bounds hold
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
selectivity predictors induced by signed measures are learnable... OOD generalization error bounds
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
fat-shattering dimension of Ssgn(C) is finite... er_P(Ŝ) ≤ O(√ε)
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
https://github.com/learnedsystems/CEB/tree/main
Ceb. https://github.com/learnedsystems/CEB/tree/main
-
[2]
https://github.com/waltercai/pqo-opensource
Modified postgresql. https://github.com/waltercai/pqo-opensource
-
[3]
https://github.com/andreaskipf/learnedcardinalities
MSCN code. https://github.com/andreaskipf/learnedcardinalities. Last Accessed Date: 2025-04-05
work page 2025
-
[4]
https://github.com/huxiao2010/Selectivity/
PtsHist and QuadHist code. https://github.com/huxiao2010/Selectivity/. Last Accessed Date: 2025-04-05
work page 2025
-
[5]
Abiteboul, S., Hull, R., and Vianu, V. Foundations of databases, vol. 8. Addison- Wesley Reading, 1995
work page 1995
-
[6]
In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp
Aboulnaga, A., and Chaudhuri, S.Self-tuning histograms: building histograms without looking at data. In Proceedings of the 1999 ACM SIGMOD international conference on Management of data (1999), ACM, pp. 181–192
work page 1999
-
[7]
Scale-sensitive dimensions, uniform convergence, and learnability
Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44 , 4 (1997), 615–631
work page 1997
-
[8]
Learning Set Cardinality in Distance Nearest Neighbours
Anagnostopoulos, C., and Triantafillou, P. Learning Set Cardinality in Distance Nearest Neighbours. In 2015 IEEE International Conference on Data Mining (2015), IEEE, pp. 691–696
work page 2015
-
[9]
Learning to accurately COUNT with query-driven predictive analytics
Anagnostopoulos, C., and Triantafillou, P. Learning to accurately COUNT with query-driven predictive analytics. In 2015 IEEE International Conference on Big Data (Big Data) (2015), IEEE, pp. 14–23
work page 2015
-
[10]
Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality
Anagnostopoulos, C., and Triantafillou, P. Query-Driven Learning for Pre- dictive Analytics of Data Subspace Cardinality. ACM Transactions on Knowledge Discovery from Data 11 , 4 (2017), 1–46
work page 2017
-
[11]
Bartlett, P. L., and Long, P. M.More theorems about scale-sensitive dimensions and learning. In Proceedings of the eighth annual conference on Computational learning theory (1995), ACM Press, pp. 392–401
work page 1995
-
[12]
Bishop, C. M. Pattern recognition and machine learning . Information science and statistics. Springer, 2006
work page 2006
-
[13]
Exploiting statistics on query expressions for optimization
Bruno, N., and Chaudhuri, S. Exploiting statistics on query expressions for optimization. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 263–274
work page 2002
-
[14]
STHoles: a multidimensional workload-aware histogram
Bruno, N., Chaudhuri, S., and Gravano, L. STHoles: a multidimensional workload-aware histogram. In Proceedings of the 2001 ACM SIGMOD international conference on Management of data (2001), ACM, pp. 211–222
work page 2001
-
[15]
Chen, C. M., and Roussopoulos, N. Adaptive selectivity estimation using query feedback. In Proceedings of the 1994 ACM SIGMOD international conference on Management of data (1994), ACM, pp. 161–172
work page 1994
-
[16]
XGBoost: A Scalable Tree Boosting System
Chen, T., and Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), ACM, pp. 785–794
work page 2016
-
[17]
B., Suciu, D., and Balazinska, M
Deeds, K. B., Suciu, D., and Balazinska, M. SafeBound: A Practical System for Generating Cardinality Bounds. Proceedings of the ACM on Management of Data 1, 1 (2023), 1–26
work page 2023
-
[18]
Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388
Ding, B., Chaudhuri, S., Gehrke, J., and Narasayya, V.DSB: a decision support benchmark for workload-driven and traditional database systems. Proceedings of the VLDB Endowment 14 , 13 (2021), 3376–3388
work page 2021
-
[19]
Probability: theory and examples, fifth edition ed
Durrett, R. Probability: theory and examples, fifth edition ed. No. 49 in Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge ; New York, NY, 2019
work page 2019
-
[20]
Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228
Dutt, A., W ang, C., Narasayya, V., and Chaudhuri, S.Efficiently approximat- ing selectivity functions using low overhead regression models. Proceedings of the VLDB Endowment 13 , 12 (2020), 2215–2228
work page 2020
-
[21]
Selectivity estimation for range predicates using lightweight models
Dutt, A., W ang, C., Nazi, A., Kandula, S., Narasayya, V., and Chaudhuri, S. Selectivity estimation for range predicates using lightweight models. Proceedings of the VLDB Endowment 12 , 9 (2019), 1044–1057
work page 2019
-
[22]
Deep Residual Learning for Image Recognition
He, K., Zhang, X., Ren, S., and Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), IEEE, pp. 770–778
work page 2016
-
[23]
Heddes, M., Nunes, I., Givargis, T., and Nicolau, A. Convolution and Cross- Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries. Proceedings of the ACM on Management of Data 2 , 3 (2024), 1–26
work page 2024
-
[24]
Probability of error, equivocation, and the Chernoff bound
Hellman, M., and Raviv, J. Probability of error, equivocation, and the Chernoff bound. IEEE Transactions on Information Theory 16 , 4 (1970), 368–372
work page 1970
-
[25]
DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005
Hilprecht, B., Schmidt, A., Kulessa, M., Molina, A., Kersting, K., and Binnig, C. DeepDB: learn from data, not from queries!Proceedings of the VLDB Endowment 13, 7 (2020), 992–1005
work page 2020
-
[26]
K., Panigrahi, D., Roy, S., and Yang, J
Hu, X., Liu, Y., Xiu, H., Agarwal, P. K., Panigrahi, D., Roy, S., and Yang, J. Selectivity Functions of Range Queries are Learnable. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 959–972
work page 2022
-
[27]
Ioannidis, Y. E., and Christodoulakis, S. On the propagation of errors in the size of join results. In Proceedings of the 1991 ACM SIGMOD international conference on Management of data (1991), ACM Press, pp. 268–277
work page 1991
-
[28]
Kearns, M. J., and Schapire, R. E. Efficient distribution-free learning of proba- bilistic concepts. Journal of Computer and System Sciences 48 , 3 (1994), 464–497
work page 1994
-
[29]
Kearns, M. J., V azirani, U. V., and V azirani, U. V.An introduction to computa- tional learning theory, 2. print ed. MIT Press, Cambridge, Mass., 1997
work page 1997
-
[30]
Learned Cardi- nality Estimation: An In-depth Study
Kim, K., Jung, J., Seo, I., Han, W.-S., Choi, K., and Chong, J. Learned Cardi- nality Estimation: An In-depth Study. In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data (2022), ACM, pp. 1214–1227
work page 2022
-
[31]
Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27
Kim, K., Lee, S., Kim, I., and Han, W.-S.ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27
work page 2024
-
[32]
Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P. A., and Kemper, A. Learned cardinalities: Estimating correlated joins with deep learning. In 9th Biennial Conference on Innovative Data Systems Research, CIDR 2019 (2019)
work page 2019
-
[33]
How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215
Leis, V., Gubichev, A., Mirchev, A., Boncz, P., Kemper, A., and Neumann, T. How good are query optimizers, really? Proceedings of the VLDB Endowment 9 , 3 (2015), 204–215
work page 2015
-
[34]
ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads
Li, P., Wei, W., Zhu, R., Ding, B., Zhou, J., and Lu, H. ALECE: An Attention- based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads. Proceedings of the VLDB Endowment 17 , 2 (2023), 197–210
work page 2023
-
[35]
Lim, L., Wang, M., and Vitter, J. S. SASH: A self-adaptive histogram set for dynamically changing workloads. In Proceedings of 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, September 9-12, 2003 (2003), pp. 369–380
work page 2003
-
[36]
Certified monotonic neural networks
Liu, X., Han, X., Zhang, N., and Liu, Q. Certified monotonic neural networks. In Advances in Neural Information Processing Systems 33 (2020)
work page 2020
-
[37]
Lynch, C. A. Selectivity estimation and query optimization in large databases with highly skewed distribution of column values. In Proceedings of the VLDB Endowment (1988), pp. 240–251
work page 1988
-
[38]
J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T
Markl, V., Haas, P. J., Kutsch, M., Megiddo, N., Srivastava, U., and Tran, T. M. Consistent selectivity estimation via maximum entropy. The VLDB Journal 16, 1 (2006), 55–76
work page 2006
-
[39]
Markl, V., Lohman, G. M., and Raman, V. LEO: An autonomic query optimizer for DB2. IBM Systems Journal 42 , 1 (2003), 98–106
work page 2003
-
[40]
Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes
Meng, Z., Cao, X., and Cong, G. Selectivity Estimation for Queries Containing Predicates over Set-Valued Attributes. Proceedings of the ACM on Management of Data 1, 4 (2023), 1–26
work page 2023
-
[41]
Proceedings of the VLDB Endowment 2, 1 (2009), 982–993
Moerkotte, G., Neumann, T., and Steidl, G.Preventing bad plans by bounding the impact of cardinality estimation errors. Proceedings of the VLDB Endowment 2, 1 (2009), 982–993
work page 2009
-
[42]
Flow-loss: learning cardinality estimates that matter
Negi, P., Marcus, R., Kipf, A., Mao, H., Tatbul, N., Kraska, T., and Alizadeh, M. Flow-loss: learning cardinality estimates that matter. Proceedings of the VLDB Endowment 14, 11 (2021), 2019–2032
work page 2021
-
[43]
Robust Query Driven Cardinality Estimation under Changing Workloads
Negi, P., Wu, Z., Kipf, A., Tatbul, N., Marcus, R., Madden, S., Kraska, T., and Alizadeh, M. Robust Query Driven Cardinality Estimation under Changing Workloads. Proceedings of the VLDB Endowment 16 , 6 (2023), 1520–1533
work page 2023
-
[44]
QuickSel: Quick Selectivity Learning with Mixture Models
Park, Y., Zhong, S., and Mozafari, B. QuickSel: Quick Selectivity Learning with Mixture Models. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (2020), ACM, pp. 1017–1033
work page 2020
-
[45]
TPC-DS, taking decision support benchmarking to the next level
Poess, M., Smith, B., Kollar, L., and Larson, P. TPC-DS, taking decision support benchmarking to the next level. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data (2002), ACM, pp. 582–587
work page 2002
-
[46]
Sample-Efficient Cardinality Estimation Using Geometric Deep Learning
Reiner, S., and Grossniklaus, M. Sample-Efficient Cardinality Estimation Using Geometric Deep Learning. Proceedings of the VLDB Endowment 17, 4 (2023), 740–752
work page 2023
-
[47]
Roberts, F. S., and Tesman, B. Applied combinatorics, second edition ed. CRC Press, 2009
work page 2009
-
[48]
Learning from label proportions: A mutual contami- nation framework
Scott, C., and Zhang, J. Learning from label proportions: A mutual contami- nation framework. Advances in neural information processing systems 33 (2020), 22256–22267
work page 2020
-
[49]
Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., and Price, T. G. Access path selection in a relational database management system. In Proceedings of the 1979 ACM SIGMOD international conference on Management of A Practical Theory of Generalization in Selectivity Learning Conference’17, July 2017, Washington, DC, USA data (1979), ACM Pre...
work page 1979
-
[50]
Stein, E. M. Real Analysis: Measure Theory, Integration, and Hilbert Spaces , 1st ed ed. Princeton University Press, 2009
work page 2009
-
[51]
Stillger, M., Lohman, G. M., Markl, V., and Kandil, M. LEO - db2’s learning optimizer. In Proceedings of the VLDB Endowment (2001), pp. 19–28
work page 2001
-
[52]
An end-to-end learning-based cost estimator
Sun, J., and Li, G. An end-to-end learning-based cost estimator. Proceedings of the VLDB Endowment 13 , 3 (2019), 307–319
work page 2019
-
[53]
L., Li, S., Mao, Z., and Tang, B
Wang, F., Yan, X., Yiu, M. L., Li, S., Mao, Z., and Tang, B. Speeding Up End- to-end Query Execution via Learning-based Progressive Cardinality Estimation. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–25
work page 2023
-
[54]
Decomposition—a strategy for query processing
Wong, E., and Youssefi, K. Decomposition—a strategy for query processing. ACM Transactions on Database Systems 1 , 3 (1976), 223–241
work page 1976
-
[55]
Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222
Wu, C., Jindal, A., Amizadeh, S., Patel, H., Le, W., Qiao, S., and Rao, S.Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment 12 , 3 (2018), 210–222
work page 2018
-
[56]
A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation
Wu, P., and Cong, G. A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation. In Proceedings of the 2021 International Conference on Management of Data (2021), ACM, pp. 2009–2022
work page 2021
-
[57]
Wu, P., and Ives, Z. G. Modeling Shifting Workloads for Learned Database Systems. Proceedings of the ACM on Management of Data 2 , 1 (2024), 1–27
work page 2024
-
[58]
FactorJoin: A New Cardinality Estimation Framework for Join Queries
Wu, Z., Negi, P., Alizadeh, M., Kraska, T., and Madden, S. FactorJoin: A New Cardinality Estimation Framework for Join Queries. Proceedings of the ACM on Management of Data 1 , 1 (2023), 1–27
work page 2023
-
[59]
Xiu, H., Agarwal, P. K., and Yang, J. PARQO: Penalty-Aware Robust Plan Selection in Query Optimization. Proceedings of the VLDB Endowment 17 , 13 (2024), 4627–4640
work page 2024
-
[60]
NeuroCard: one cardinality estimator for all tables
Yang, Z., Kamsetty, A., Luan, S., Liang, E., Duan, Y., Chen, X., and Stoica, I. NeuroCard: one cardinality estimator for all tables. Proceedings of the VLDB Endowment 14, 1 (2020), 61–73
work page 2020
-
[61]
Understanding deep learning (still) requires rethinking generalization
Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. Understanding deep learning (still) requires rethinking generalization. Communications of the ACM 64, 3 (2021), 107–115
work page 2021
-
[62]
Advances in Neural Information Processing Systems 35 (2022), 26933–26942
Zhang, J., W ang, Y., and Scott, C.Learning from label proportions by learning with label noise. Advances in Neural Information Processing Systems 35 (2022), 26933–26942
work page 2022
-
[63]
Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding
Zhang, K., Wang, H., Lu, Y., Li, Z., Shu, C., Yan, Y., and Yang, D. Duet: Ef- ficient and Scalable Hybrid Neural Relation Understanding. In 2024 IEEE 40th International Conference on Data Engineering (ICDE) (2024), IEEE, pp. 56–69
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.