FD-RAG: Federated Dual-System Retrieval-Augmented Generation
read the original abstract
Retrieval-augmented generation (RAG) has emerged as a paradigm for grounding large language models in external knowledge, yet most existing RAG systems assume centralized knowledge access and ample computation. These assumptions break down in edge environments, where knowledge is fragmented across devices, raw data cannot be shared, and repeated LLM calls are prohibitively expensive. We propose FD-RAG, a federated dual-system RAG framework that decouples lightweight memory access from on-demand LLM reasoning for decentralized deployment. Specifically, FD-RAG learns semantic-aware adaptive hypergraphs over local corpora and distills them into compact QA memories. At inference time, it answers well-covered queries via direct memory matching and invokes LLM-based reasoning only when necessary, while tracing retrieved memories to hypergraph-grounded evidence. To mitigate cross-device knowledge fragmentation, FD-RAG aggregates anonymized memories across devices without exposing raw documents. Experiments on QA benchmarks show that FD-RAG improves accuracy by up to 7.8\% while reducing latency by 8.4$\times$ compared with strong local and federated baselines. We also provide theoretical analysis establishing an $\mathcal{O}(1/\epsilon^{2})$ convergence rate for the proposed hypergraph learning, supporting its tractable deployment in edge settings.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Mesh Inference: A Formal Model of Collective Inference Without a Center
Mesh inference allows a network of agents to reach the centralized optimum through local relaxations of a coupled free energy using only admitted observations, with convergence guaranteed by M-matrix properties in the...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.