arxiv: 1809.00677 · v2 · pith:TNWYLYEZnew · submitted 2018-09-03 · 💻 cs.DB

Learned Cardinalities: Estimating Correlated Joins with Deep Learning

classification 💻 cs.DB
keywords deep, estimation, learning, mscn, query, cardinalities, cardinality, addressing

We describe a new deep learning approach to cardinality estimation. MSCN is a multi-set convolutional network, tailored to representing relational query plans, that employs set semantics to capture query features and true cardinalities. MSCN builds on sampling-based estimation, addressing its weaknesses when no sampled tuples qualify a predicate, and in capturing join-crossing correlations. Our evaluation of MSCN using a real-world dataset shows that deep learning significantly enhances the quality of cardinality estimation, which is the core problem in query optimization.
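
The two ideas the abstract leans on can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `estimate_from_sample` and `mean_pool` are hypothetical, and average pooling is only one common way to obtain order-independent ("set") features, assumed here for illustration. The first function shows the weakness of plain sampling-based estimation when no sampled tuple satisfies a predicate; the second shows a set representation whose output does not depend on element order.

```python
def estimate_from_sample(sample, predicate, table_size):
    """Scale the fraction of qualifying sampled rows up to the full table.

    Hypothetical helper: `sample` is assumed to be a uniform sample of the table.
    """
    qualifying = sum(1 for row in sample if predicate(row))
    return qualifying / len(sample) * table_size


def mean_pool(vectors):
    """Average a collection of equal-length feature vectors.

    The result is the same for any ordering of `vectors`, which is the
    property a set-based query representation needs (assumed pooling choice).
    """
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# A selective predicate that none of the sampled rows happens to satisfy:
# the estimate collapses to zero even though matching rows may exist.
sample = [1, 2, 3, 4]
zero_estimate = estimate_from_sample(sample, lambda row: row == 500, table_size=1000)

# A predicate that half of the sampled rows satisfy scales up as expected.
half_estimate = estimate_from_sample(sample, lambda row: row % 2 == 0, table_size=1000)

# Pooling two orderings of the same set of feature vectors gives one result.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = mean_pool(features)
pooled_reordered = mean_pool(list(reversed(features)))
```

When the sample contains no qualifying tuple, the sampling estimate carries no information beyond "rare", which is the case the abstract says MSCN is meant to address.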

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SynQL: A Controllable and Scalable Rule-Based Framework for SQL Workload Synthesis for Performance Benchmarking

    cs.DB 2026-04 unverdicted novelty 7.0

    SynQL synthesizes diverse, execution-ready SQL workloads by deterministically traversing foreign-key graphs to populate ASTs, yielding high topological entropy and cost-model training data with R² ≥ 0.79 on held-out sets.

  2. Cortex AISQL: A Production SQL Engine for Unstructured Data

    cs.DB 2025-11 unverdicted novelty 6.0

    Snowflake's Cortex AISQL adds native semantic operations to SQL via AI-aware optimization, adaptive model cascades, and semantic join rewriting, delivering 2-70x speedups in production workloads.

  3. An Approach Based on Bayesian Networks for Query Selectivity Estimation

    cs.DB 2019-07 unverdicted novelty 6.0

    Chow-Liu trees approximate attribute distributions per relation to relax independence assumptions, yielding an order of magnitude better selectivity estimates on TPC-DS than prior methods while staying efficient.