SiloFuse: Cross-silo Synthetic Data Generation with Latent Tabular Diffusion Models

Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen · 2024 · arXiv 0146.2024

Canonical reference. 100% of citing Pith papers cite this work as background.

44 Pith papers citing it

citation-role summary

background 8

citation-polarity summary

background 8

representative citing papers

ReSequel: Robust LLM-assisted Query Rewriting and Optimization using Templatization and Sampling

cs.DB · 2026-06-18 · conditional · novelty 7.0

ReSequel uses LLMs guided by metadata-derived templates and sampling-based verification to rewrite SQL queries, delivering up to 16x workload speedups over native DBMSs and 22x over prior LLM baselines across eight benchmarks and three systems.

A Fast Gaussian Mechanism under Continual Observation, with Applications

cs.DS · 2026-06-10 · unverdicted · novelty 7.0

A new data structure samples any entry of the noise vector in constant time while exactly reproducing the binary tree Gaussian mechanism distribution, applied to DP CountSketches for improved range counting and join size estimation.

Arbitrage-free Data Pricing

cs.GT · 2026-06-09 · unverdicted · novelty 7.0

The paper shows that arbitrage-free information pricing is computationally hard in general, provides a branch-and-bound algorithm, and proves that for threshold utilities arbitrage-freeness reduces to Blackwell dominance, unifying prior query and model pricing results.

Generative Conversational Recommender System

cs.IR · 2026-05-21 · unverdicted · novelty 7.0

A single autoregressive model for conversational recommendation that uses semantic item IDs, predicts response intent and target first, then generates the response, reporting up to 29% Recall@1 gains.

DARE-EEG: A Foundation Model for Mining Dual-Aligned Representation of EEG

cs.AI · 2026-05-18 · unverdicted · novelty 7.0

DARE-EEG is a self-supervised EEG foundation model that enforces mask-invariance via contrastive mask alignment and momentum anchor alignment, plus conv-linear-probing for heterogeneous setups, achieving SOTA accuracy and cross-dataset portability.

Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

A rule-based strikingness measure is added to TKGR metrics to weight rare events higher, revealing that models weaken on striking events and ensemble gains come mostly from trivial fits.

U-HNSW: An Efficient Graph-based Solution to ANNS Under Universal Lp Metrics

cs.DB · 2026-05-03 · unverdicted · novelty 7.0

U-HNSW is the first graph-based index for approximate nearest neighbor search under all Lp metrics (0 < p <= 2) simultaneously, using L1/L2 HNSW graphs plus early-termination verification to beat MLSH query times.

Limitations of LTI Koopman Modeling for Nonlinear Control Systems

math.OC · 2026-04-28 · unverdicted · novelty 7.0

Exact LTI Koopman models for nonlinear control systems require affine linear dynamics under controllability and coordinate inclusion assumptions.

SynHAT: A Two-stage Coarse-to-Fine Diffusion Framework for Synthesizing Human Activity Traces

cs.AI · 2026-04-16 · unverdicted · novelty 7.0

SynHAT uses a novel two-stage spatio-temporal diffusion framework with Latent Spatio-Temporal U-Net to synthesize realistic human activity traces, outperforming baselines by 52% on spatial and 33% on temporal metrics across four cities.

NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions

cs.DB · 2026-04-13 · conditional · novelty 7.0 · 2 refs

NL2SQLBench is a new modular benchmarking framework that evaluates LLM NL2SQL methods across three core modules on existing datasets, exposing large accuracy gaps and computational inefficiency.

GRAB-ANNS: High-Throughput Indexing and Hybrid Search via GPU-Native Bucketing

cs.DB · 2026-03-31 · unverdicted · novelty 7.0

GRAB-ANNS is a new GPU graph index that achieves up to 240x higher hybrid search throughput via bucket layouts and hybrid intra/inter-bucket edges.

Sublime: Sublinear Error & Space for Unbounded Skewed Streams

cs.DS · 2026-03-15 · unverdicted · novelty 7.0 · 2 refs

Sublime generalizes Count-Min and Count Sketch with dynamically elongating counters and expanding counter arrays to deliver sublinear error growth and lower memory use on skewed unbounded streams.

An LLM-Guided Query-Aware Inference System for GNN Models on Large Knowledge Graphs

cs.LG · 2026-03-04 · unverdicted · novelty 7.0

KG-WISE decomposes GNN models and uses LLM-generated query templates for partial loading of relevant components, achieving up to 28x faster inference and 98% lower memory on KGs with up to 42 million nodes while preserving accuracy.

Learned Static Function Data Structures

cs.DS · 2025-10-31 · accept · novelty 7.0

Learned static functions combine per-key ML-predicted prefix codes with classic static function storage to compress static key-value mappings beyond zero-order entropy limits.

Dynamic read & write optimization with TurtleKV

cs.DB · 2025-09-12 · conditional · novelty 7.0

TurtleKV uses a balanced TurtleTree on-disk structure and flexible memory tuning knobs to deliver strong performance across inserts, mixed workloads, point queries, and scans in YCSB tests, matching or beating SplinterDB, RocksDB, and WiredTiger.

Diffusion and Flow Matching Models for Tabular Data: A Survey

cs.LG · 2025-02-24 · unverdicted · novelty 7.0

First dedicated survey organizing diffusion and flow matching models for tabular data synthesis, imputation, anomaly detection, and related tasks, covering literature from 2015 to 2026 and highlighting open problems.

EcoTable: Cost-effective Table Integration in Data Lakes for Natural Language Queries

cs.DB · 2026-06-25 · unverdicted · novelty 6.0 · 2 refs

EcoTable is the first NL-based data integration framework that builds a join-likelihood graph, uses two-stage schema linking and Steiner tree search to find paths, then generates transformations with LLMs, reporting >30% accuracy gain and 5x lower cost on four real-world datasets.

Can Aggregate Invariants Accelerate Continuous Subgraph Matching? Limits, Laws, and a Dynamic Spectral Index

cs.AI · 2026-06-23 · unverdicted · novelty 6.0

Spectral aggregate tests prune up to 51% of candidates in CSM but leave enumeration intermediates unchanged beyond initial bindings across tested workloads.

Disk-Based Interval Indexes Under the Increasing Ending Time Assumption

cs.DB · 2026-06-22 · unverdicted · novelty 6.0

CEB and TIDE are two-layer append-only B+-tree indexes for intervals under the increasing ending time assumption that claim smaller size, faster insertions, and superior query performance over prior art.

A Risk Decomposition Framework for Pre-Hoc Fine-Tuning Prediction

cs.LG · 2026-06-16 · unverdicted · novelty 6.0

Formulates pre-hoc fine-tuning prediction as stochastic estimation, proves lower bound on optimization variance decay rate, and introduces a three-regime predictability phase diagram.

SIDInspector: A Mapping-First Diagnostic Resource for Semantic-ID Tokenizers

cs.IR · 2026-06-09 · accept · novelty 6.0

SIDInspector provides a standardized adapter contract and mapping-level probes for Semantic-ID tokenizers, with empirical contrasts showing high aliasing in GRID-style exports and superior prefix alignment from deterministic controls on Musical items.

ANNS-AMP: Accelerating Approximate Nearest Neighbor Search via Adaptive Mixed-Precision Computing

cs.PF · 2026-06-05 · unverdicted · novelty 6.0

ANNS-AMP adapts distance-computation precision to vector-space regions via a lightweight cluster-level predictor and a bit-serial accelerator, delivering 163.76x/10.57x/2.06x average speedups and 1100x/39.41x/6.66x energy reductions versus CPU/GPU/custom baselines with <2.7% accuracy loss.

ANN Search: Recall What Matters

cs.IR · 2026-06-03 · conditional · novelty 6.0

ANN search quality is better assessed by 1/Ratio@k than Recall@k because the former tracks downstream task utility more closely while allowing substantially lower computational cost.

Language Models Compare Quantities Using Number-specific and Unit-specific Heuristics

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

LMs compare unit quantities via number-specific and unit-specific heuristics rather than unified scale conversion, evidenced by degraded accuracy near boundaries, linear surrogate predictions, and causal subspace interventions.
