archive

Every paper Pith has read. Search by title, abstract, or pith.

446 papers in cs.DB · page 4

  1. cs.DB 2026-04-24 reviewed
    Bounded self-joins make fact relevance as easy as query evaluation

    How Hard is it to Decide if a Fact is Relevant to a Query?

    Meghyn Bienvenu +2

  2. cs.DB 2026-04-24 reviewed
    Unified model pivots database migration across heterogeneous systems

    A Model-Driven Approach to Database Migration with a Unified Data Model

    María J. Ortín +2

  3. cs.DB 2026-04-24 reviewed
    Maximal-clique index speeds filtered nearest-neighbor search

    MCI: A Maximal Clique Index for Efficient Arbitrary-Filtered Approximate Nearest Neighbor Search

    Xiaowei Ye +5

  4. cs.DB 2026-04-23 reviewed
    ESPRESSO scales keyword search over Solid pods with privacy

    Implementation and Privacy Guarantees for Scalable Keyword Search on SOLID-based Decentralized Data with Granular Visibility Constraints

    Mohamed Ragab +7

  5. cs.LG 2026-04-23 reviewed
    Best tabular embedding model varies by task and level

    Towards Universal Tabular Embeddings: A Benchmark Across Data Tasks

    Liane Vogel +5

  6. cs.LO 2026-04-23 reviewed
    ASP(Q) first implements globally-optimal repairs for inconsistent data

    Using ASP(Q) to Handle Inconsistent Prioritized Data

    Meghyn Bienvenu +3

  7. cs.DC 2026-04-23 reviewed
    Delta Lake loads fastest, Iceberg saves most space

    Research on the efficiency of data loading and storage in Data Lakehouse architectures for the formation of analytical data systems

    Ivan Borodii +1

  8. cs.DB 2026-04-23 reviewed
    Query algebra and wrappers replace LLM agents for enterprise data

    An Alternate Agentic AI Architecture (It's About the Data)

    Fabian Wenz +4

  9. cs.DB 2026-04-23 reviewed
    SQLyzr adds diverse metrics and realism to text-to-SQL evaluation

    A Demonstration of SQLyzr: A Platform for Fine-Grained Text-to-SQL Evaluation and Analysis

    Sepideh Abedini +1

  10. cs.DL 2026-04-22 reviewed
    Only 150k scientific posters shared across 86 platforms

    The State of Scientific Poster Sharing and Reuse

    Aydan Gasimova +5

  11. cs.AR 2026-04-22 reviewed
    FPGA level-wise batch search speeds B+ tree lookups 4.9x

    Efficient Batch Search Algorithm for B+ Tree Index Structures with Level-Wise Traversal on FPGAs

    Max Tzschoppe +3

  12. cs.LO 2026-04-22 reviewed
    ShEx and SHACL match on large recursive fragments via duality

    Common Foundations for Recursive Shape Languages

    Shqiponja Ahmetaj +11

  13. cs.LO 2026-04-22 reviewed
    ShEx and SHACL fragments match via fixpoint duality

    Common Foundations for Recursive Shape Languages

    Shqiponja Ahmetaj +11

  14. cs.IR 2026-04-22 reviewed
    Self-aware embeddings double RAG accuracy on versioned queries

    Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge

    Naizhong Xu

  15. cs.DB 2026-04-22 reviewed
    New framework checks isolation levels without database internals

    Making Transaction Isolation Checking Practical

    Jian Zhang +2

  16. cs.RO 2026-04-22 reviewed
    Vision-based tactile dataset scales bimanual robot data

    VTouch++: A Multimodal Dataset with Vision-Based Tactile Enhancement for Bimanual Manipulation

    Qianxi Hua +6

  17. cs.DB 2026-04-22 reviewed
    Low-dim stats cut noise in private power-law exponent estimates

    Estimating Power-Law Exponent with Edge Differential Privacy

    Adam Tan +2

  18. cs.DB 2026-04-22 reviewed
    ML model predicts query slot-time before execution

    Pre-Execution Query Slot-Time Prediction in Cloud Data Warehouses: A Feature-Scoped Machine Learning Approach

    Prashant Kumar Pathak

  19. cs.DB 2026-04-22 reviewed
    LLM agent finds minimal data sets for analysis at 83% F1

    An Agentic Approach to Metadata Reasoning

    Jiani Zhang +4

  20. cs.DB 2026-04-22 reviewed
    Garfield cuts RFANNS index size 4.4x and raises throughput 120x

    A GPU-Accelerated Framework for Multi-Attribute Range Filtered Approximate Nearest Neighbor Search

    Zhonggen Li +4

  21. cs.DB 2026-04-22 reviewed
    First GPU Datalog engine uses WCOJ to avoid memory blowup

    Scaling Worst-Case Optimal Datalog to GPUs

    Yihao Sun +4

  22. cs.DB 2026-04-21 reviewed
    GPU pipeline speeds 3D polyhedral spatial joins by 9x

    3DPipe: A Pipelined GPU Framework for Scalable Generalized Spatial Join over Polyhedral Objects

    Lyuheng Yuan +3

  23. cs.LG 2026-04-21 reviewed
    RaBitQ outperforms TurboQuant on most quantization tasks

    Revisiting RaBitQ and TurboQuant: A Symmetric Comparison of Methods, Theory, and Experiments

    Jianyang Gao +7

  24. cs.DB 2026-04-21 reviewed
    Online schema alignment recovers full results in decentralized queries

    Demonstrating Online Schema Alignment in Decentralized Knowledge Graphs Querying

    Bryan-Elliott Tam +2

  25. cs.DB 2026-04-21 reviewed
    Monotonic embeddings prune more vertices in subgraph matching

    LIVE: Learnable Monotonic Vertex Embedding for Efficient Exact Subgraph Matching (Technical Report)

    Yutong Ye +7

  26. cs.DB 2026-04-21 reviewed
    Heuristic partitioning cuts multi-tenant query P95 latency from 61s to 2s

    Heuristic Search Space Partitioning for Low-Latency Multi-Tenant Cloud Queries

    Prashant Kumar Pathak +2

  27. cs.AI 2026-04-21 reviewed
    Tool-augmented LLMs beat static ones on warehouse graph reasoning

    DW-Bench: Benchmarking LLMs on Data Warehouse Graph Topology Reasoning

    Ahmed G.A.H Ahmed +1

  28. cs.DB 2026-04-20 reviewed
    Open data model v3 fixes wastewater surveillance data sharing

    The Public Health and Environmental Surveillance Open Data Model (PHES-ODM) Version 3: An Open, Relational Data Model and Interoperability Framework for Wastewater Surveillance

    Mathew Thomson +8

  29. cs.AI 2026-04-20 reviewed
    Modular adapters beat fine-tuning on hard SQL queries

    LeGo-Code: Can Modular Curriculum Learning Advance Complex Code Generation? Insights from Text-to-SQL

    Salmane Chafik +2

  30. cs.SI 2026-04-20 reviewed
    Topology grouping cuts token use 50-90% in LLM social simulations

    Topology-Aware LLM-Driven Social Simulation: A Unified Framework for Efficient and Realistic Agent Dynamics

    Yuwei Xu +5

  31. cs.CL 2026-04-20 reviewed
    Syntactic tests flag contamination in old NL2SQL benchmarks

    SPENCE: A Syntactic Probe for Detecting Contamination in NL2SQL Benchmarks

    Mohammadtaher Safarzadeh +4

  32. cs.AI 2026-04-19 reviewed
    Database probing and rule checks raise text-to-SQL accuracy 5%

    PV-SQL: Synergizing Database Probing and Rule-based Verification for Text-to-SQL Agents

    Yuan Tian +1

  33. cs.DB 2026-04-19 reviewed
    Branchable databases slow reads up to 4000x as agent branches deepen

    BranchBench: Aligning Database Branching with Agentic Demands

    Elaine Ang +5

  34. cs.AI 2026-04-18 reviewed
    New benchmark finds AI agents falter on complex personalized home tasks

    PersonalHomeBench: Evaluating Agents in Personalized Smart Homes

    Nikhil Verma +7

  35. cs.AI 2026-04-18 reviewed
    Agents falter in smart homes as tasks grow complex

    PersonalHomeBench: Evaluating Agents in Personalized Smart Homes

    Nikhil Verma +7

  36. cs.DB 2026-04-17 reviewed
    Flipped indexing delivers 6.5x lower GPU query latency with dynamic updates

    FliX: Flipped-Indexing for Scalable GPU Queries and Updates

    Rosina Kharal +3

  37. cs.SE 2026-04-17 reviewed
    QMutBench gives 700k quantum mutants to benchmark tests

    QMutBench: A Dataset of Quantum Circuit Mutants

    Eñaut Mendiluze Usandizaga +3

  38. cs.DB 2026-04-17 reviewed
    Policy structure dictates database optimizer plans

    Compliance in Databases: A Study of Structural Policies and Query Optimization

    Ahana Pradhan +3

  39. cs.DB 2026-04-17 reviewed
    Agent autonomy pushes humans to supervisor roles in visual analytics

    Exploring Agentic Visual Analytics: A Co-Evolutionary Framework of Roles and Workflows

    Tianqi Luo +2

  40. cs.CV 2026-04-17 reviewed
    Event cameras enable lip-motion speaker ID across new views and lights

    NeuroLip: An Event-driven Spatiotemporal Learning Framework for Cross-Scene Lip-Motion-based Visual Speaker Recognition

    Junguang Yao +3

  41. cs.DB 2026-04-17 reviewed
    Response feedback backpropagates to refine KG-RAG by 7.34%

    EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation

    Zhenbo Fu +7

  42. cs.DB 2026-04-16 reviewed
    Small model attention prunes long docs to 10% for big QA

    SAGE: Selective Attention-Guided Extraction for Token-Efficient Document Indexing

    Xinzhi Wang +7

  43. cs.AI 2026-04-16 reviewed
    Layer treats LLMs and web as databases for natural language data queries

    Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal Data-Centric Applications

    Moin Aminnaseri +19

  44. cs.DB 2026-04-16 reviewed
    SQL and Python agreement on tiny database picks correct queries

    DPC: Training-Free Text-to-SQL Candidate Selection via Dual-Paradigm Consistency

    Boyan Li +3

  45. cs.DB 2026-04-16 reviewed
    Four-layer architecture unifies reconciliation and anomaly detection

    Data Engineering Patterns for Cross-System Reconciliation in Regulated Enterprises: Architecture, Anomaly Detection, and Governance

    Zhijun Qiu

  46. cs.DB 2026-04-16 reviewed
    PP-FP-tree finds top keyword k-core communities in public-private graphs

    Efficient Community Search on Attributed Public-Private Graphs

    Yuqi Chen +2

  47. cs.DB 2026-04-16 reviewed
    RELOAD is a learned query optimizer that reduces individual query performance regressions…

    RELOAD: A Robust and Efficient Learned Query Optimizer for Database Systems

    Seokwon Lee +5

  48. cs.DB 2026-04-15 reviewed
    PIM hardware speeds R-tree queries up to 3.66x with less energy

    Parallel R-tree-based Spatial Query Processing on a Commercial Processing-in-Memory System

    Tasmia Jannat +2

  49. cs.AI 2026-04-15 reviewed
    Beliefs and policies declaratively control LLM pipelines

    Credo: Declarative Control of LLM Pipelines via Beliefs and Policies

    Duo Lu +2

  50. cs.CL 2026-04-15 reviewed
    GLOW hybrid boosts open-world QA on incomplete KGs by 38% on average

    Leveraging LLM-GNN Integration for Open-World Question Answering over Knowledge Graphs

    Hussein Abdallah +3