archive

Every paper Pith has read.

446 papers in cs.DB · page 3

  1. cs.AI 2026-05-04 reviewed
    Event languages mapped to one Temporal Datalog engine for streams

    Efficient Temporal Datalog Materialisation for Composite Event Recognition

    Periklis Mantenoglou

  2. cs.DB 2026-05-04 reviewed
    eBPF scheduler doubles throughput for time-sensitive DB tasks

    Unfair by design: eBPF-based scheduling of mixed database workloads

    Carl-Elliott Bilodeau-Savaria +3

  3. cs.DB 2026-05-04 reviewed
    2-bit vectors build ANN graphs for 16x faster search

    QuIVer: Rethinking ANN Graph Topology via Training-Free Binary Quantization

    Wenxuan Xiao +2

  4. cs.DB 2026-05-04 reviewed
    2-bit quantization builds ANN graphs without training

    QuIVer: Rethinking ANN Graph Topology via Training-Free Binary Quantization

    Wenxuan Xiao +2

  5. cs.DB 2026-05-04 reviewed
    Binary quantization builds ANN graphs for 88% recall

    QuIVer: Rethinking ANN Graph Topology via Training-Free Binary Quantization

    Wenxuan Xiao +2

  6. cs.DB 2026-05-03 reviewed
    Dual HNSW graphs enable fast search for any Lp metric

    U-HNSW: An Efficient Graph-based Solution to ANNS Under Universal Lp Metrics

    Huayi Wang +2

  7. cs.CR 2026-05-03 reviewed
    Predictions let private query streams reach near-offline utility

    LAPRAS: Learning-Augmented PRivate Answering for linear query Streams

    Pranay Mundra +3

  8. cs.DC 2026-05-03 reviewed
    Decentralized geohash sampling cuts geospatial stream latency

    Decentralized Stratified Sampling for Low-Latency Approximate Geospatial Data Stream Processing in Edge-Cloud Architectures

    Isam Mashhour Al Jawarneh +3

  9. cs.CR 2026-05-03 reviewed
    Prompt-conditioned masking traces RAG poison to exact characters

    Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

    Huining Cui +1

  10. cs.DB 2026-05-02 reviewed
    This paper proposes Action Units as structured extensions to knowledge representations…

    Actionable Understanding: Action Units for Bridging the Knowledge-Action Gap in Post-FAIR Knowledge Infrastructures

    Lars Vogt

  11. cs.DB 2026-05-02 reviewed
    Lattice merges co-accessed vectors to cut authorized search cost

    Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing

    Shanshan Han +2

  12. cs.DB 2026-05-02 reviewed
    Lattice method balances duplication and search for authorized vectors

    Don't Stir the Pot! Authorized Vector Data Retrieval via Access-Aware Indexing

    Shanshan Han +2

  13. cs.DB 2026-05-02 reviewed
    Five patterns decouple writes from reads in search engines

    Write-Read Decoupling in Modern Large-Scale Search Engines: Architectures, Techniques, and Emerging Approaches

    Xin Liang +6

  14. cs.DB 2026-05-01 reviewed
    Team projects built into database courses raise grades and teamwork scores

    Complete Integration of Team Project-based Learning into a Database Syllabus

    S. Iserte +5

  15. cs.DB 2026-05-01 reviewed
    One abstraction unifies database evolution

    Living Databases: A Unified Model for Continuous Schema Evolution, Versioning, and Transformations

    Amol Deshpande

  16. cs.DB 2026-05-01 reviewed
    Execution-verified renamings recover Text-to-SQL accuracy on noisy schemas

    EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

    Jiaqian Wang +4

  17. cs.CR 2026-05-01 reviewed
    Framework makes shuffle-DP protocols resist poisoning

    Defense against Poisoning Attacks under Shuffle-DP

    Siyi Wang +8

  18. cs.DB 2026-05-01 reviewed
    SPARQL multiset patterns match Datalog and relational algebra

    Multiset semantics in SPARQL, Relational Algebra and Datalog

    Renzo Angles +2

  19. cs.DB 2026-04-30 reviewed
    Two-phase sampling cuts online aggregation cost up to 3x

    Index-Assisted Stratified Sampling for Online Aggregation

    Yunnan Yu +1

  20. cs.DB 2026-04-30 reviewed
    Tailwind speeds TPC-H queries 1.38x on average

    Tailwind: A Practical Framework for Query Accelerators

    Geoffrey X. Yu +2

  21. cs.CL 2026-04-30 reviewed
    Templates from past queries boost Text-to-SQL accuracy 36%

    Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding

    Smit Jivani +2

  22. cs.CV 2026-04-30 reviewed
    GUI agents hit exact states only 23 percent of the time

    FineState-Bench: Benchmarking State-Conditioned Grounding for Fine-grained GUI State Setting

    Fengxian Ji +7

  23. cs.AI 2026-04-30 reviewed
    ObjectGraph cuts agent document tokens by 95% without accuracy loss

    ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era

    Mohit Dubey +1

  24. cs.DB 2026-04-29 reviewed
    Synthetic databases reveal 3-14 percent drops in text-to-SQL accuracy

    SynSQL: Synthesizing Relational Databases for Robust Evaluation of Text-to-SQL Systems

    Mohammadamin Habibollah +1

  25. cs.DB 2026-04-29 reviewed
    One model unifies table discovery from text and table queries

    Unified Data Discovery across Query Modalities and User Intents

    Tingting Wang +6

  26. cs.DB 2026-04-29 reviewed
    Graphify turns GraphQL into single optimized Gremlin queries in linear time

    Graphify: Automated Synthesis of Type-Safe Graph Backends via $O(S)$ GraphQL-to-Gremlin Transpilation

    Johannes Graf

  27. cs.SD 2026-04-29 reviewed
    Non-speech audio reveals spurious correlations in speech data

    A Toolkit for Detecting Spurious Correlations in Speech Datasets

    Lara Gauder +5

  28. cs.DB 2026-04-29 reviewed
    LLM search aligns pivot table schemas at 88% accuracy

    PiLLar: Matching for Pivot Table Schema via LLM-guided Monte-Carlo Tree Search

    Yunjun Gao +3

  29. cs.DB 2026-04-29 reviewed
    LLM assistant cuts big data support tickets by 20.8%

    SiriusHelper: An LLM Agent-Based Operations Assistant for Big Data Platforms

    Yu Shen +16

  30. cs.DB 2026-04-28 reviewed
    Evergreen converts verification of claims in LLM-generated semantic aggregates into…

    Evergreen: Efficient Claim Verification for Semantic Aggregates

    Alexander W. Lee +5

  31. cs.DB 2026-04-28 reviewed
    CacheRAG turns stateless KGQA planning into cached learning

    CacheRAG: A Semantic Caching System for Retrieval-Augmented Generation in Knowledge Graph Question Answering

    Yushi Sun +1

  32. cs.IR 2026-04-28 reviewed
    Semantic search runs on 166 million clinical notes at $4k per month

    Health System Scale Semantic Search Across Unstructured Clinical Notes

    Faith Wavinya Mutinda +16

  33. cs.DS 2026-04-28 reviewed
    Streaming sampler approximates graphlets in constant passes

    An Efficient Streaming Algorithm for Approximating Graphlet Distributions

    Marco Bressan +3

  34. cs.DB 2026-04-28 reviewed
    Negative patterns raise viral classification accuracy

    Mining Negative Sequential Patterns to Improve Viral Genomic Feature Representation and Classification

    Wenxi Zhu +2

  35. cs.DB 2026-04-28 reviewed
    VisualNeo connects visual queries to Neo4j for graph searches

    VisualNeo: Bridging the Gap between Visual Query Interfaces and Graph Query Engines

    Kai Huang +7

  36. cs.DB 2026-04-28 reviewed
    Algorithms hide all sensitive cross-level utility patterns without fakes

    Cross-level Privacy Preserving Utility Mining

    Jiahong Cai +2

  37. cs.LG 2026-04-28 reviewed
    RL learns to clean tabular data for foundation model priors

    Prior-Aligned Data Cleaning for Tabular Foundation Models

    Laure Berti-Equille

  38. cs.DC 2026-04-27 reviewed
    Fixed-input lock keeps Spark policy outputs identical under repartitioning

    Spark Policy Toolkit: Semantic Contracts and Scalable Execution for Policy Learning in Spark

    Zeyu Bai

  39. cs.CR 2026-04-27 reviewed
    Dynamic attacks slow ALEX lookups up to 2.8x

    Poisoning Learned Index Structures: Static and Dynamic Adversarial Attacks on ALEX

    Allen Jue

  40. cs.DB 2026-04-27 reviewed
    Autoencoder rewrites speed hybrid vector queries 2x on average

    BoomHQ: Learning to Boost Multiple Hybrid Queries on Vector DBMSs

    Ermu Qiu +6

  41. cs.DB 2026-04-27 reviewed
    Sliding window finds dense patterns exactly without gap parameters

    Exact Mining of Dense Patterns via Direct Evaluation of Local Interval Frequency Using a Sliding Window

    Taihei Takahashi +3

  42. cs.IR 2026-04-27 reviewed
    Late materialization slashes storage for long user sequences in DLRMs

    Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale

    Liang Guo +10

  43. cs.DB 2026-04-27 reviewed
    IM chat turns natural language into complete data reports

    DataClaw: An Autonomous Data Agent with Instant Messaging Integration

    Huahang Li +5

  44. cs.CL 2026-04-27 reviewed
    RL distills agentic reasoning into private product mapping models

    EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce

    Minhyeong Yu +1

  45. cs.DB 2026-04-26 reviewed
    SEMA-SQL mixes SQL with LLM semantics for natural language database questions

    SEMA-SQL: Beyond Traditional Relational Querying with Large Language Models

    Yin Lin +6

  46. cs.DB 2026-04-26 reviewed
    SEMA-SQL mixes SQL with LLM reasoning for semantic database queries

    SEMA-SQL: Beyond Traditional Relational Querying with Large Language Models

    Yin Lin +6

  47. cs.DS 2026-04-24 reviewed
    Branchwidth approximates submodular width to within 3/2

    Cuts and Gauges for Submodular Width

    Matthias Lanzinger

  48. cs.DB 2026-04-24 reviewed
    Dataset released for 10,000 early AI agents on Ethereum

    A dataset of early blockchain-registered AI agents on Ethereum

    Yulin Liu

  49. cs.DB 2026-04-24 reviewed
    Atomic RDF Datasets can serve as standardized messages for streaming

    It's Time to Standardize RDF Messages

    Pieter Colpaert +1

  50. cs.LO 2026-04-24 reviewed
    Formal library verifies chase as universal model

    The Chase in Lean -- Crafting a Formal Library for Existential Rule Research

    Lukas Gerlach