Beyond Similarity Search: A Unified Data Layer for Production RAG Systems

2026 · cs.IR · arXiv 2605.03275

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Retrieval-Augmented Generation (RAG) systems have become the standard architecture for grounding large language models in organizational knowledge. Yet production deployments consistently expose a gap between clean prototype performance and real-world reliability. This paper identifies three root causes of that gap: data staleness, tenant data leakage, and query composition explosion. All three trace back to the conventional split-system data layer. We propose and evaluate a unified data layer built on PostgreSQL with native vector search (pgvector) and HNSW indexing. Controlled benchmarks on 50,000 documents show 92% latency reduction for date-filtered queries, 74% for tenant-scoped queries, zero synchronization inconsistency, and complete elimination of cross-tenant data leakage with 93% less synchronization code. We additionally discuss a recommended hybrid tier architecture
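As a rough illustration of what a single-store query might look like, the sketch below builds one parameterized SQL statement that combines a tenant filter, a date filter, and a pgvector cosine-distance ordering. It is not taken from the paper: the table name `chunks`, its columns, the 3-dimensional embedding, and the helper names `to_vector_literal` and `build_scoped_query` are all hypothetical, and the `%s` placeholders assume a psycopg-style driver.

```python
from datetime import datetime, timezone

# Hypothetical schema (names and the tiny vector size are illustrative only).
# It would be run once against a PostgreSQL database with pgvector installed.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id         bigserial   PRIMARY KEY,
    tenant_id  text        NOT NULL,
    updated_at timestamptz NOT NULL,
    content    text        NOT NULL,
    embedding  vector(3)   NOT NULL
);
CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw
    ON chunks USING hnsw (embedding vector_cosine_ops);
"""


def to_vector_literal(embedding):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_scoped_query(tenant_id, since, embedding, k=5):
    """Return (sql, params) for a tenant- and date-filtered nearest-neighbour query.

    The metadata filters and the vector ordering live in one statement, so no
    second system has to be kept in sync. `<=>` is pgvector's cosine distance.
    """
    sql = (
        "SELECT id, content FROM chunks "
        "WHERE tenant_id = %s AND updated_at >= %s "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    params = (tenant_id, since, to_vector_literal(embedding), k)
    return sql, params


if __name__ == "__main__":
    sql, params = build_scoped_query(
        "acme", datetime(2026, 1, 1, tzinfo=timezone.utc), [0.1, 0.2, 1]
    )
    print(sql)
    print(params)
```

With a psycopg connection, the returned pair could be passed directly to `cursor.execute(sql, params)`. Whether the planner actually uses the HNSW index for a filtered query depends on filter selectivity and the pgvector version, so this should be read as a shape for the query rather than a tuned configuration.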

representative citing papers

BatchBench: Toward a Workload-Aware Benchmark for Autoscaling Policies in Big Data Batch Processing -- A Proposed Framework

cs.IR · 2026-05-12 · unverdicted · novelty 5.0

BatchBench is a proposed framework with workload taxonomy, parameterized generator, five-axis evaluation harness, and standardized agent interface to enable fair comparison of autoscaling policies.

citing papers explorer

BatchBench: Toward a Workload-Aware Benchmark for Autoscaling Policies in Big Data Batch Processing -- A Proposed Framework cs.IR · 2026-05-12 · unverdicted · none · ref 28 · internal anchor
