Learned Cardinalities: Estimating Correlated Joins with Deep Learning
We describe a new deep learning approach to cardinality estimation. MSCN is a multi-set convolutional network, tailored to representing relational query plans, that employs set semantics to capture query features and true cardinalities. MSCN builds on sampling-based estimation, addressing its weaknesses when no sampled tuples qualify a predicate, and in capturing join-crossing correlations. Our evaluation of MSCN using a real-world dataset shows that deep learning significantly enhances the quality of cardinality estimation, which is the core problem in query optimization.
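The weakness of sampling-based estimation mentioned in the abstract, where no sampled tuples qualify a predicate, can be illustrated with a small sketch. This is not MSCN and not code from the paper. The table, the every-100th-row sample, and the names `estimate_from_sample`, `wide`, and `narrow` are all assumptions made up for illustration.

```python
def estimate_from_sample(sample, table_size, predicate):
    """Scale the fraction of qualifying sample rows up to the full table."""
    qualifying = sum(1 for row in sample if predicate(row))
    return qualifying / len(sample) * table_size


# Hypothetical single-column table: the integers 0..9999.
table = list(range(10_000))

# Deterministic stand-in for a random sample: every 100th row (100 rows).
sample = table[::100]


def wide(x):
    """Unselective predicate: half of the table qualifies."""
    return x < 5_000


def narrow(x):
    """Selective predicate: only 50 rows qualify, none of them sampled."""
    return 1 <= x <= 50


true_wide = sum(1 for x in table if wide(x))
true_narrow = sum(1 for x in table if narrow(x))

est_wide = estimate_from_sample(sample, len(table), wide)
est_narrow = estimate_from_sample(sample, len(table), narrow)

print("wide:   true =", true_wide, "estimate =", est_wide)
print("narrow: true =", true_narrow, "estimate =", est_narrow)
```

For the unselective predicate the scaled sample count matches the true cardinality, but for the selective one the sample contains no qualifying row, so the estimate collapses to zero even though 50 rows match. A zero estimate carries no information about how selective the predicate really is, which is the situation the abstract says MSCN is meant to address.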
Forward citations
Cited by 3 Pith papers
- SynQL: A Controllable and Scalable Rule-Based Framework for SQL Workload Synthesis for Performance Benchmarking
  SynQL synthesizes diverse, execution-ready SQL workloads by deterministically traversing foreign-key graphs to populate ASTs, yielding high topological entropy and cost-model training data with R² ≥ 0.79 on held-out sets.
- Cortex AISQL: A Production SQL Engine for Unstructured Data
  Snowflake's Cortex AISQL adds native semantic operations to SQL via AI-aware optimization, adaptive model cascades, and semantic join rewriting, delivering 2-70x speedups in production workloads.
- An Approach Based on Bayesian Networks for Query Selectivity Estimation
  Chow-Liu trees approximate attribute distributions per relation to relax independence assumptions, yielding an order of magnitude better selectivity estimates on TPC-DS than prior methods while staying efficient.