{"total":21,"items":[{"citing_arxiv_id":"2605.19738","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TERGAD: Structure-Aware Text-Enhanced Representations for Graph Anomaly Detection","primary_cat":"cs.CL","submitted_at":"2026-05-19T12:09:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TERGAD augments graph anomaly detection by converting node topological properties into LLM-generated semantic embeddings that are fused with original attributes via a gated dual-branch autoencoder for joint reconstruction-based anomaly scoring.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19366","ref_index":198,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Accurate, Efficient, and Explainable Deep Learning Approaches for Environmental Science Problems","primary_cat":"cs.LG","submitted_at":"2026-05-19T04:58:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The work introduces WaLeF/FIDLAr for flood forecasting, CoDiCast for probabilistic weather, and Hypercube-RAG for explainable environmental QA, claiming superior accuracy, efficiency, and interpretability over baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18747","ref_index":182,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Code as Agent Harness","primary_cat":"cs.CL","submitted_at":"2026-05-18T17:59:03+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed open challenges.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"AHE [281] Telemetry-driven optimization Cost, decisions, latency, failures Context, tools, validators GEPA [18] Reflective prompt evolution Scores, feedback, critiques Prompts and instructions EvoMAC [328] Workflow topology evolution Handoffs, idle roles, loops Agent roles and graph SEW [312] Self-evolving workflow Workflow scores, failures Stage order and roles Live-SWE [182] Online agent evolution Live issue trajectories Policies, tools, memory GroundedTTA [232] Test-time adaptation State-action evidence Adaptation rules RLEF [104] Execution-feedback learning Execution rewards, failures Feedback reward signal DeepEval [300] Evaluation harness Scenario and metric traces Regression suites, gates FeedbackEval [23] Repair evaluation benchmark Feedback-task scores Failure taxonomy and eval set"},{"citing_arxiv_id":"2605.18025","ref_index":64,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TeleCom-Bench: How Far Are Large Language Models from Industrial Telecommunication Applications?","primary_cat":"cs.AI","submitted_at":"2026-05-18T08:14:49+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TeleCom-Bench reveals LLMs reach 90% on telecom intent and entity tasks but drop to 30% on solution generation and root cause analysis in live network scenarios.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13050","ref_index":49,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Context Training with Active Information Seeking","primary_cat":"cs.CL","submitted_at":"2026-05-13T06:15:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Active information seeking via search tools, when combined with multi-candidate context pruning during training, produces consistent gains on translation, health, and reasoning tasks over naive tool addition or no-tool baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12061","ref_index":166,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory","primary_cat":"cs.AI","submitted_at":"2026-05-12T12:47:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SAGE is a self-evolving agentic graph-memory engine that dynamically constructs and refines structured memory graphs via writer-reader feedback, yielding performance gains on multi-hop QA, open-domain retrieval, and long-term agent benchmarks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"If the reader reward bias is at mostϵ ϕ, and the writer update improves the surrogate reward by bUϕ(Gθ′)− bUϕ(Gθ)≥∆,(163) then the true utility satisfies U ⋆(Gθ′)−U ⋆(Gθ)≥∆−2ϵ ϕ.(164) Proof.By the bias assumption, U ⋆(Gθ′)≥ bUϕ(Gθ′)−ϵ ϕ, U ⋆(Gθ)≤ bUϕ(Gθ) +ϵ ϕ.(165) Subtracting the two inequalities gives U ⋆(Gθ′)−U ⋆(Gθ)≥ bUϕ(Gθ′)− bUϕ(Gθ)−2ϵ ϕ ≥∆−2ϵ ϕ.(166) 34 Corollary F.4(Reader calibration reduces writer optimization bias).If the reader is calibrated from ϕ to ϕ′ and reduces the reward bias from ϵϕ to ϵϕ′, where ϵϕ′ < ϵ ϕ, then for the same surrogate reward improvement∆, the lower bound on true utility improvement increases by 2(ϵϕ −ϵ ϕ′).(167) Proof. By Theorem 5.3, the true utility improvement lower bound before calibration is ∆−2ϵ ϕ, and"},{"citing_arxiv_id":"2605.07517","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation","primary_cat":"cs.IR","submitted_at":"2026-05-08T09:50:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LARAG improves RAG answer quality on hyperlinked technical documentation by using author-defined links for retrieval, achieving higher BERTScore while using fewer chunks and tokens than standard embedding-based RAG.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"fine-tuning framework for attributed graph embedding.Advances in neural information processing systems 36(2023), 13308-13325. [42] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., and et al.Llama: Open and efficient foundation language models.arXiv preprint arXiv:2302.13971(2023). [43] Xiao, T., and Zhu, J.Foundations of large language models.arXiv preprint arXiv:2501.09223 (2025). [44] Xiao, T., and Zhu, J.Natural Language Processing: Neural Networks and Large Language Models. NiuTrans, 2025. [45] Zhang, Q., Chen, S., Bei, Y., Yuan, Z., Zhou, H., Hong, Z., Chen, H., Xiao, Y., Zhou, C., Dong, J., et al.A survey of graph retrieval-augmented generation for customized large language"},{"citing_arxiv_id":"2605.02106","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The Dynamic Gist-Based Memory Model (DGMM): A Memory-Centric Architecture for Artificial Intelligence","primary_cat":"cs.AI","submitted_at":"2026-05-04T00:02:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"DGMM is proposed as an explicit graph-structured memory architecture for AI that enables persistent episodic memory, cue-based recall, and context-dependent interpretation without retraining.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.00702","ref_index":63,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory","primary_cat":"cs.CL","submitted_at":"2026-05-01T14:45:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.17843","ref_index":123,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research","primary_cat":"cs.HC","submitted_at":"2026-04-20T05:53:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AVA is a specialized GenAI platform for development policy research that provides verifiable syntheses from World Bank reports and is associated with 2.4-3.9 hours of weekly time savings in a large-scale user evaluation.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. InProceedings of the 2020 conference on fairness, accountability, and transparency. 295-305. [122] Zelun Tony Zhang and Heinrich Hußmann. 2021. How to Manage Output Uncertainty: Targeting the Actual End User Problem in Interactions with AI.. In IUI Workshops. [123] Jiawei Zhou, Yixuan Zhang, Qianni Luo, Andrea G Parker, and Munmun De Choudhury. 2023. Synthetic lies: Understanding ai-generated misinfor- mation and evaluating algorithmic and human solutions. InProceedings of the 2023 CHI conference on human factors in computing systems. 1-20. [124] Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, and Maarten Sap. 2024."},{"citing_arxiv_id":"2604.15676","ref_index":104,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation","primary_cat":"cs.DB","submitted_at":"2026-04-17T03:54:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EvoRAG adds a feedback-driven backpropagation step that attributes response quality to individual knowledge-graph triplets and updates the graph to raise reasoning accuracy by 7.34 percent over prior KG-RAG methods.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Retrieval-Augmented Generation (RAG) [17, 35, 101] empowers Large Language Models (LLMs) to improve response quality by leveraging external knowledge, and has been widely adopted in various domains [80, 83, 94]. Among RAG paradigms, Knowledge Graph-based RAG (KG-RAG) [60, 97, 104] has gained increasing at- tention for its ability to transform the textual corpus into structured knowledge graphs (KGs) [104], capturing rich semantic informa- tion and entity-level relations. Given a query 𝑞, the core idea of KG-RAG is to retrieve a relevant knowledge subgraph (KSG) from the KG and feed it together with 𝑞 to the LLM for response gen- eration. The retrieved KSG is typically organized into a sequence Low Quality! Human Feedback Input Feedback-driven Backpropagation"},{"citing_arxiv_id":"2604.12138","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation","primary_cat":"cs.AI","submitted_at":"2026-04-13T23:39:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Opinion-aware RAG with LLM opinion extraction and entity-linked graphs improves retrieval diversity by 26-42% over factual baselines on e-commerce forum data.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Retrieval augmented generation evaluation in the era of large language models: A comprehensive survey.arXiv preprint arXiv:2504.14891, 2024. [6] Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, and Bin Cui. Retrieval-augmented generation for ai-generated content: A survey.arXiv preprint arXiv:2402.19473, 2024. [7] Yucheng Wang, Xiaohan Li, Yongbin Gao, Jiawei Chen, and Zhiyuan Liu. A systematic literature review of rag: Techniques, metrics, and challenges.arXiv preprint arXiv:2501.13958, 2025. [8] Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, and Qing Li. A survey on rag meeting llms: Towards retrieval- augmented large language models."},{"citing_arxiv_id":"2605.18765","ref_index":36,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-04-11T10:16:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"STAR is a semantic-tuned and tail-adaptive retriever for GraphRAG that uses cross-attention interaction learning and path-weighted contrastive learning to mitigate Semantic Shortcut Bias and Long-Tail Path Bias, reporting 1.8% retrieval and 2.2% QA gains.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02545","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling","primary_cat":"cs.AI","submitted_at":"2026-04-02T21:54:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"UNKNOWN","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Repurposing competency questions as runtime executable plans creates a controlled neuro-symbolic RAG architecture that produces evidence-closed stories from knowledge graphs.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"logical leap presents a significant dilemma for applications in cultural heritage. LLMs are prone to factual invention and distortion - a phenomenon widely known as \"hallucination\" - which renders them fundamentally unsuitable for domains where factuality is paramount [20,36]. For museums, archives, and ed- ucational platforms, factual veracity is not merely a desirable feature but an ethical and institutional imperative [34,37]. The core of this issue is a deep epistemological conflict. LLMs are probabilistic systems; their mastery lies in statistical correlation and linguistic fluency, not in the representation of factual, verifiable knowledge [18]. In contrast, cultural heritage is a domain of evidence, where the value of a statement is intrinsically tied to its provenance [11]."},{"citing_arxiv_id":"2603.14828","ref_index":14,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Toward Robust GraphRAG: Mitigating Retrieval Drift and Hallucination from Imperfect Knowledge Graphs","primary_cat":"cs.IR","submitted_at":"2026-03-16T05:08:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CS-RAG is a GraphRAG framework that plans queries as ordered atomic constraints, uses anchor-relation aware retrieval, applies sufficiency checks, and falls back to text recovery to reduce drift and hallucination from imperfect KGs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.22762","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Behavioral Intelligence Platforms: From Event Streams to Autonomous Insight via Probabilistic Journey Graphs, Behavioral Knowledge Extraction, and Grounded Language Generation","primary_cat":"cs.IR","submitted_at":"2026-03-12T16:47:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"BIP turns event streams into autonomous insights by modeling journeys as absorbing Markov chains, extracting facts via knowledge graphs, and generating narratives constrained to verified data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20859","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"KGiRAG: An Iterative GraphRAG Approach for Responding Sensemaking Queries","primary_cat":"cs.IR","submitted_at":"2026-03-02T10:38:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An iterative feedback-driven GraphRAG architecture produces higher semantic quality and relevance on HotPotQA queries than single-shot baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20844","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AtomicRAG: Atom-Entity Graphs for Retrieval-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-02-10T05:57:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AtomicRAG replaces chunk-based and triple-based GraphRAG with atom-entity graphs that store facts as atomic units and use personalized PageRank plus relevance filtering to achieve higher retrieval accuracy and reasoning robustness on five benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.20136","ref_index":58,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation","primary_cat":"cs.CL","submitted_at":"2025-12-23T07:54:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"M³KG-RAG improves multimodal reasoning in large language models by constructing multi-hop knowledge graphs and selectively pruning retrieved context with GRASP.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.03724","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemOS: A Memory OS for AI System","primary_cat":"cs.CL","submitted_at":"2025-07-04T17:21:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[5] Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, and Bin Cui. Retrieval-augmented generation for ai-generated content: A survey.CoRR, abs/2402.19473, 2024. 31 [6] Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Qianyu Guo, Meng Wang, and Haofen Wang. Retrieval-augmented generation for large language models: A survey.CoRR, abs/2312.10997, 2023. [7] Qinggang Zhang, Shengyuan Chen, Yuanchen Bei, Zheng Yuan, Huachi Zhou, Zijin Hong, Junnan Dong, Hao Chen, Yi Chang, and Xiao Huang. A survey of graph retrieval-augmented generation for customized large language models. CoRR, abs/2501.13958, 2025. [8] Bo Ni, Zheyuan Liu, Leyao Wang, Yongjia Lei, Yuying Zhao, Xueqi Cheng, Qingkai Zeng, Luna Dong, Yinglong"},{"citing_arxiv_id":"2502.12911","ref_index":44,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Knapsack Optimization-based Schema Linking for LLM-based Text-to-SQL Generation","primary_cat":"cs.CL","submitted_at":"2025-02-18T14:53:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"KaSLA applies knapsack optimization hierarchically to schema linking for LLM text-to-SQL, claiming better results than large models and improved SQL generation on Spider and BIRD.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}