Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33:9459–9474, 2020.
12 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary: use method 1
citing papers explorer
- Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models
  PGR expands user queries into plausible future steps via Tree-of-Thought or chains and uses them as retrieval probes, delivering nearly 3x recall gains on the new MemoryQuest benchmark for low-similarity memory retrieval.
- Uncertainty Propagation in LLM-Based Systems
  This paper introduces a systems-level conceptual framing and a three-level taxonomy (intra-model, system-level, socio-technical) for uncertainty propagation in compound LLM applications, along with engineering insights and open challenges.
- XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception
  The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.
- Towards Continuous Sign Language Conversation from Isolated Signs
  Constructs continuous sign conversation data from isolated signs using retrieval and diffusion models to train a direct sign-to-sign conversational AI.
- Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks
  A context-aware Sentinel-Strategist system for RAG selectively applies defenses to block membership inference and data poisoning while recovering most retrieval utility compared to always-on defense stacks.
- Knowledge Is Not Static: Order-Aware Hypergraph RAG for Language Models
  OKH-RAG represents knowledge as ordered hyperedges and retrieves coherent interaction sequences via a learned transition model, outperforming permutation-invariant RAG baselines on order-sensitive QA tasks.
- HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling
  HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, outperforming full-context baselines while cutting computational cost by 92.6% on the LOCOMO and LongMemEval benchmarks.
- Web Retrieval-Aware Chunking (W-RAC) for Efficient and Cost-Effective Retrieval-Augmented Generation Systems
  W-RAC decouples extraction from semantic planning via structured units and LLM grouping to match traditional retrieval performance at roughly 10x lower LLM token cost.
- CacheClip: Accelerating RAG with Effective KV Cache Reuse
  CacheClip accelerates RAG prefill by up to 3.33x via auxiliary-model-guided selective KV recomputation while retaining 85-91% of full-attention quality on NIAH and LongBench.
- GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval
  GraphRAG with 7-8B local LLMs on 8GB VRAM hardware builds knowledge graphs from EHR docs and answers queries, with Llama 3.1 creating the largest graph, Qwen 2.5 scoring highest on quality, and models below ~7B failing to complete the pipeline.
- MERIT: Modular Framework for Multimodal Misinformation Detection with Web-Grounded Reasoning
  MERIT achieves 81.65% F1 on MMFakeBench for multimodal misinformation detection via a four-module framework, outperforming zero-shot baselines like GPT-4V with MMD-Agent at 74.0% F1, with gains attributed to architectural design.
- Event-Centric World Modeling with Memory-Augmented Retrieval for Embodied Decision-Making
  An event-centric framework encodes environments as semantic events and retrieves weighted prior maneuvers from a knowledge bank to enable interpretable, physics-aware decision-making for UAVs.