{"total":11,"items":[{"citing_arxiv_id":"2605.30027","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark","primary_cat":"cs.CV","submitted_at":"2026-05-28T14:50:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"DocRetriever introduces a framework using layout-aware sparse embeddings for hybrid encoding without OCR and a generalizable reasoning-augmented reranker for few-shot settings, plus the MultiDocR benchmark for evaluation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.20724","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"CALMem : Application-Layer Dual Memory for Conversational AI","primary_cat":"cs.IR","submitted_at":"2026-05-20T05:23:57+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CALMem delivers virtually unbounded effective context for LLM conversations via an application-layer dual memory architecture with intra-session retrieval and token-adaptive injection.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.04496","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SCOUT: Active Information Foraging for Long-Text Understanding with Decoupled Epistemic States","primary_cat":"cs.CL","submitted_at":"2026-05-06T04:55:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SCOUT achieves state-of-the-art long-text understanding with up to 8x lower token use by actively foraging for sparse query-relevant information and updating a compact provenance-grounded epistemic 
state.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.26622","ref_index":3,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory","primary_cat":"cs.CL","submitted_at":"2026-04-29T12:49:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"OCR-Memory encodes agent trajectories as images with visual anchors and retrieves verbatim text via locate-and-transcribe, yielding gains on long-horizon benchmarks under strict context limits.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"which are often necessary for debugging, faithful retrospection, and grounded decision-making. Context Compression. Instead of selecting or abstracting history, recent methods aim to compress the context itself via latent memory representations, learned compression policies, token pruning, or streaming-friendly inference mechanisms (Zhang et al., 2025; Kang et al., 2025; Jiang et al., 2023; [3] [2] [1] Optical Retriever Optical Context Retrieval Visual Memory Bank Agent Interaction Screenshots Multi-Resolution Visual History Set-of-Mark Visual Grounding Locate-and-Transcribe Deterministic Text Fetching Agent Context \"How to specify a task?\" Specifying a task (pipeline_tag) You can specify the pipeline_tag in the model card metadata. The pipeline_tag indicates the type of task the model is intended for."},{"citing_arxiv_id":"2604.06179","ref_index":19,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ARIA: Adaptive Retrieval Intelligence Assistant -- A Multimodal RAG Framework for Domain-Specific Engineering Education","primary_cat":"cs.IR","submitted_at":"2026-02-04T01:08:24+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ARIA is a 
multimodal RAG framework that filters domain-specific questions with 97.5% accuracy and outperforms ChatGPT-5 on pedagogical quality for a university civil engineering course.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.11913","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LSTM-MAS: A Long Short-Term Memory Inspired Multi-Agent System for Long-Context Understanding","primary_cat":"cs.CL","submitted_at":"2026-01-17T05:16:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LSTM-MAS uses a chained multi-agent architecture modeled on LSTM input, forget, and output gates to improve long-context QA performance and reduce hallucinations compared with prior multi-agent baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.03724","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemOS: A Memory OS for AI System","primary_cat":"cs.CL","submitted_at":"2025-07-04T17:21:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Key-Value Cache Mechanism vLLM [29], StreamingLLM [30], H2O [31], LESS [32], KVQuant [33], RetrievalAttention [34], Memory 3 [1] Hidden State Steering Steer [35], ICV [36], ActAdd [37], StyleVec [38], CAA [39], FreeCtrl [40], EasyEdit2 [41] Activation Circuit Modulation SAC [42], DESTEIN [43], LM-Steer [44] Long-term Explicit Non-parametric Retrieval-Augmented Generation kNN-LMs [45, 46], MEMWALKER [9], Graph 
RAG [10], LightRAG [11], NodeRAG [47, 48], HyperGraphRAG [49], HippoRAG [50, 51], PGRAG [52], Zep [53], A-MEM [54], Mem0 [55] Implicit Parametric Knowledge BERT [56], RLHF [57], CTRL [58], SLayer [59] Modular Parameter Adaptation LoRA [60], PRAG [61], DyPRAG [62], SERAC [63], CaliNet [64], DPM [65], GRACE [66] Parametric Memory Editing ROME [67], MEMIT [68], AlphaEdit [69], AnyEdit [70],"},{"citing_arxiv_id":"2410.10813","ref_index":62,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory","primary_cat":"cs.CL","submitted_at":"2024-10-14T17:59:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LongMemEval benchmarks long-term memory in chat assistants, revealing 30% accuracy drops across sustained interactions and proposing indexing-retrieval-reading optimizations that boost performance.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2409.10102","ref_index":39,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Trustworthiness in Retrieval-Augmented Generation Systems: A Survey","primary_cat":"cs.IR","submitted_at":"2024-09-16T09:06:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces Trust-RAG Compass framework and TRC Bench benchmark to assess RAG trustworthiness across factuality, robustness, fairness, transparency, accountability, and privacy, with evaluations showing performance gaps between LLMs.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Rerankers, often using cross-encoder architectures, better measure the similarity between the query and retrieved documents, pushing more relevant documents forward and removing less relevant ones. 
Another optimization component is the refiner, which summarizes or compresses retrieved content using techniques like prompting the LLM to summarize [39, 40], or training a summarizer through supervised fine-tuning or reinforcement learning [41-43]. Despite the flexibility of Advanced RAG, its sequential structure limits adaptability in complex scenarios, such as queries requiring step-by-step reasoning. Modular RAG. As RAG research evolves, it has entered the modular RAG stage, where components are treated as"},{"citing_arxiv_id":"2404.10981","ref_index":13,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Survey on Retrieval-Augmented Text Generation for Large Language Models","primary_cat":"cs.IR","submitted_at":"2024-04-17T01:27:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A survey that categorizes RAG methods for LLMs into four retrieval-centric stages, reviews their evolution and evaluation, and outlines challenges and future directions.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"to suit user-specific factors such as intended audience, situational context, and personal preferences, shaping the response to be both contextually relevant and user-centric. This dual focus on , Vol. 1, No. 1, Article . Publication date: August 2018. 6 Huang et al. 
RAG Pre-Retrieval Indexing REALM [42]; kNN-LMs [72]; RAG [83]; Webgpt [100]; RETRO [9]; MEMWALKER [13]; Atlas [94]; Chameleon [63]; AiSAQ [126]; PipeRAG [64]; LRUS-CoverTree [93] Query Manipulation Webgpt [100]; DSP [73]; CoK [86]; IRCOT [131]; Query2doc [137]; Step-Back [163]; PROMPTAGATOR [27]; KnowledGPT [140]; Rewrite-Retrieve-Read [94]; FLARE [65]; RQ-RAG [12]; RARG [159]; DRAGIN [124] Data Modification RA-DIT [89]; RECITE [125]; UPRISE [20]; GENREAD [156]; KnowledGPT [140]; Selfmem [21]; RARG [159]"},{"citing_arxiv_id":"2402.17753","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Evaluating Very Long-Term Conversational Memory of LLM Agents","primary_cat":"cs.CL","submitted_at":"2024-02-27T18:42:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Creates LoCoMo benchmark dataset for very long-term LLM conversational memory and shows current models struggle with lengthy dialogues and long-range temporal dynamics.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}