{"total":15,"items":[{"citing_arxiv_id":"2606.29706","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ARMOR: Adaptive Retriever Optimization for Low-Resource Telecom Question Answering","primary_cat":"cs.IR","submitted_at":"2026-06-29T02:18:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ARMOR optimizes retrievers via joint RAG-likelihood and InfoNCE training with regularization toward the base encoder, yielding improved retrieval and QA on telecom benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.20608","ref_index":28,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"CourseBlueprint: A Structured Pipeline for Adaptive Pedagogical Video Generation Grounded in Course Corpora","primary_cat":"cs.CY","submitted_at":"2026-05-22T02:02:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CourseBlueprint builds a typed pipeline over a 23-lecture biomedical imaging corpus to generate prerequisite-aware, learner-adaptive videos with auditable engagement contracts and slide grounding.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16347","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support","primary_cat":"cs.LG","submitted_at":"2026-05-08T03:54:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"HPC-LLM fine-tunes Llama 3.1 8B via QLoRA on 9k-24k HPC examples and adds dense retrieval to deliver practical support for job scheduling, MPI, and GPU workflows, approaching the performance of larger general models at lower 
memory and latency cost.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03312","ref_index":49,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents","primary_cat":"cs.MA","submitted_at":"2026-05-05T02:57:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MemFlow routes queries by intent to tiered memory operations, nearly doubling accuracy of a 1.7B SLM on long-horizon benchmarks compared to full-context baselines.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Retrieval, compression, and evidence preparation. RAG [22, 21, 16] established the retrieved-memory paradigm; FiD [18], Contriever [17], and ColBERTv2 [38] improved passage fusion and retrieval precision. Later systems refine retrieval decisions through reasoning state or critique, including IRCoT [42], Active RAG [20], Self-RAG [3], CRAG [45], and RAFT [49]; hierarchical and ranking methods such as RAPTOR [39], GraphRAG [11], and RankRAG [47] improve synthesis [Figure 2: Overview of the MemFlow pipeline.]"},{"citing_arxiv_id":"2605.01284","ref_index":68,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation","primary_cat":"cs.CV","submitted_at":"2026-05-02T06:40:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Chain of Evidence introduces a retriever-agnostic visual attribution method for iRAG that reasons over 
document screenshots with VLMs to output precise bounding boxes, outperforming text baselines on Wiki-CoE and SlideVQA.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.27415","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ChipLingo: A Systematic Training Framework for Large Language Models in EDA","primary_cat":"cs.LG","submitted_at":"2026-04-30T04:35:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"ChipLingo trains LLMs on EDA data via corpus construction, domain-adaptive pretraining, and RAG scenario alignment, reaching 59.7% accuracy with an 8B model and 70.02% with a 32B model on a new internal EDA benchmark.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Compared to these works, the EDA domain exhibits more complex knowledge structures. EDA knowledge exists not only in technical documentation but is also distributed across engineering experience, tool commands, and design workflows in various knowledge forms. 2.2 Domain Adaptation and Continued Pretraining Domain Adaptation is an important approach for building domain-specific models [10, 11, 7]. A common method involves performing continued pretraining on domain corpora based on general foundation models to enhance model understanding of domain knowledge [12, 7, 10]. Inspired by research such as RAFT [10], this paper explores a QA-augmented domain continued pretraining strategy. 
Unlike traditional approaches that use only documents for pretraining,"},{"citing_arxiv_id":"2604.04036","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MisEdu-RAG: A Misconception-Aware Dual-Hypergraph RAG for Novice Math Teachers","primary_cat":"cs.IR","submitted_at":"2026-04-05T09:31:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MisEdu-RAG builds concept and instance hypergraphs for two-stage retrieval of pedagogical knowledge and student errors, improving feedback quality on the MisstepMath benchmark by 10.95% token-F1 and up to 15.3% on response dimensions.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"- RQ2: How well does MisEdu-RAG align with the novice teachers' needs, and to what extent do its outputs support its instructional usability? We examine these questions on MisstepMath [2], a diverse student misconception dataset for math teacher training, observing significant improvements in both retrieval precision and instructional response quality over strong baselines, including LLM generation [17], StandardRAG [30], and HyperGraphRAG [10]. Our Contributions are as follows. • We propose a dual-hypergraph RAG method in Education to assist novice math teachers in solving students' misconceptions. • Our qualitative analysis indicates the need for developing AI-assisted systems that help novice teachers address student misconceptions and meet their demand for timely instructional support."},{"citing_arxiv_id":"2604.09666","ref_index":42,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Do We Still Need GraphRAG? 
Benchmarking RAG and GraphRAG for Agentic Search Systems","primary_cat":"cs.IR","submitted_at":"2026-04-01T07:21:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Agentic search narrows the gap between dense RAG and GraphRAG but does not remove GraphRAG's advantage on complex multi-hop reasoning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2601.12538","ref_index":268,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agentic Reasoning for Large Language Models","primary_cat":"cs.AI","submitted_at":"2026-01-18T18:58:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"INTERS [257] extends this direction by performing instruction-based fine-tuning over a diverse, multi-task dataset compiled from over 40 sources, capturing a wide spectrum of retrieval-reasoning patterns. This class of methods benefits from scalable data generation pipelines [266, 267, 23], which minimize the need for human annotation. Instructional reformulation techniques [268, 257, 269] further enhance generalization by aligning tasks with human-preferred formats and reasoning. RL-Based Agentic Search. These methods optimize retrieval-aware behaviors through reward signals that reflect answer quality, factuality, or user preferences. 
WebGPT [258] introduces reward modeling to supervise search-augmented chains aligned with human judgment, while RAG-RL [259] formulates retrieval"},{"citing_arxiv_id":"2506.04565","ref_index":226,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Standalone LLMs to Integrated Intelligence: A Survey of Compound Al Systems","primary_cat":"cs.MA","submitted_at":"2025-06-05T02:34:43+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2505.17086","ref_index":101,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning","primary_cat":"cs.CL","submitted_at":"2025-05-20T18:33:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Mujica-MyGo decomposes multi-turn RAG interactions via multi-agent workflows and applies minimalist policy gradient optimization to improve performance on QA benchmarks while avoiding long-context problems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.13957","ref_index":96,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Supervising the search process produces reliable and generalizable information-seeking agents","primary_cat":"cs.CL","submitted_at":"2025-02-19T18:56:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Process supervision via RAG-Gym produces more reliable and generalizable 
search agents, with gains driven by higher-quality queries on out-of-domain multi-hop tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2404.18416","ref_index":298,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Capabilities of Gemini Models in Medicine","primary_cat":"cs.AI","submitted_at":"2024-04-29T04:11:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Med-Gemini sets new records on 10 of 14 medical benchmarks including 91.1% on MedQA-USMLE, beats GPT-4V by 44.5% on multimodal tasks, and surpasses humans on medical text summarization.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2401.15884","ref_index":39,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Corrective Retrieval Augmented Generation","primary_cat":"cs.CL","submitted_at":"2024-01-29T04:36:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CRAG improves RAG robustness via a retrieval quality evaluator that triggers web augmentation and a decompose-recompose filter to focus on relevant information, yielding better results on short- and long-form generation tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2312.10997","ref_index":173,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Retrieval-Augmented Generation for Large Language Models: A Survey","primary_cat":"cs.CL","submitted_at":"2023-12-18T07:47:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A survey of RAG paradigms, components, benchmarks, and challenges for improving LLMs on knowledge-intensive 
tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"For example, CRAG [67] trains a lightweight retrieval evaluator to assess the overall quality of the retrieved documents for a query and triggers different knowledge retrieval actions based on confidence levels. D. Scaling laws of RAG End-to-end RAG models and pre-trained models based on RAG are still one of the focuses of current researchers [173]. The parameters of these models are one of the key factors. While scaling laws [174] are established for LLMs, their applicability to RAG remains uncertain. Initial studies like RETRO++ [44] have begun to address this, yet the parameter count in RAG models still lags behind that of LLMs. The possibility of an Inverse Scaling Law 10, where smaller models outperform larger ones, is particularly intriguing and"}],"limit":50,"offset":0}