Title resolution pending

Nelson F · 2024

18 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

The Context Gathering Decision Process: A POMDP Framework for Agentic Search

cs.AI · 2026-05-07 · accept · novelty 7.0

Framing LLM agent loops as a Context Gathering Decision Process POMDP yields a predicate-based belief state that boosts multi-hop reasoning up to 11.4% and an exhaustion gate that cuts token use up to 39% with no performance loss.

MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents

cs.MA · 2026-05-05 · unverdicted · novelty 7.0

MemFlow routes queries by intent to tiered memory operations, nearly doubling accuracy of a 1.7B SLM on long-horizon benchmarks compared to full-context baselines.

Screening Is Enough

cs.LG · 2026-04-01 · unverdicted · novelty 7.0

Multiscreen replaces softmax attention with screening to provide absolute query-key relevance, resulting in models with 30% fewer parameters that maintain stable performance at long contexts.

APWA: A Distributed Architecture for Parallelizable Agentic Workflows

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

APWA is a distributed multi-agent architecture that decomposes parallelizable agentic workflows into non-interfering subproblems for scalable execution on heterogeneous resources.

PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

PAI-2 improves factual correctness in LLM answers by 4% on average across benchmarks using adaptive graph traversal and planning, with 6% gains from traversal algorithms and 18% from enabled planning.

CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

CANTANTE uses contrastive rollouts to attribute system rewards to individual agents, enabling better prompt optimization than prior methods on programming, math, and QA benchmarks.

Self-Consolidating Language Models: Continual Knowledge Incorporation from Context

cs.CL · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

SCoL trains LLMs via meta-reinforcement learning to generate layer-specific update instructions that improve knowledge acquisition and retention from context streams over standard baselines.

STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

LLM agents struggle to detect and act on implicit memory conflicts, with top models scoring 55.2% on the new STALE benchmark of 400 scenarios; the CUPMem prototype strengthens state-aware revision.

Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

Event-Causal RAG segments videos into events represented as SES graphs, merges them into a causal knowledge graph, and uses bidirectional retrieval to supply relevant event chains to a video foundation model for improved long-video question answering.

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

cs.AI · 2026-04-27 · unverdicted · novelty 6.0

SSRP separates planning from execution in LLM agents to overcome the Attention Latch, delivering 715X resilience gains over ReAct baselines on MultiWOZ tasks.

ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants

cs.DC · 2026-04-16 · unverdicted · novelty 6.0

Argus generates GPU kernels achieving 99-104% of hand-optimized throughput on key LLM kernels by enforcing compile-time data-flow invariants via a tag-based DSL and an in-context RL planner.

Gym-Anything: Turn any Software into an Agent Environment

cs.LG · 2026-04-07 · unverdicted · novelty 6.0

Gym-Anything turns arbitrary software into agent environments via multi-agent setup and auditing, creating CUA-World with 10K+ long-horizon tasks and showing that trajectory distillation plus test-time auditing improves small VLMs.

CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation

cs.IR · 2026-04-07 · conditional · novelty 6.0

CUE-R uses REMOVE, REPLACE, and DUPLICATE interventions on individual evidence items to quantify their per-item utility in RAG along correctness, grounding faithfulness, and confidence axes.

MeMo: Memory as a Model

cs.CL · 2026-05-14 · unverdicted · novelty 5.0 · 2 refs

MeMo encodes new knowledge into a separate memory model that integrates with frozen LLMs, showing strong performance on QA benchmarks while avoiding catastrophic forgetting and working without access to model weights.

Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 2 refs

SLIM dynamically optimizes the active external skill set in agentic RL via leave-one-skill-out marginal contribution estimates and lifecycle operations, delivering a 7.1% average gain over baselines on ALFWorld and SearchQA while showing some skills remain externally useful.

AgenticPosesRanker: An Agentic AI Framework for Physically Grounded Ranking of Protein-Ligand Docking Poses

q-bio.BM · 2026-05-05 · conditional · novelty 5.0

AgenticPosesRanker ranks docking poses using six deterministic physical tools and LLM reasoning, achieving 50% best-pose accuracy that matches the Smina baseline on a balanced 10-system, 162-pose benchmark.

Rewriting the Response Path: Silent Tampering and Provider-Signed Defense in BYOK LLM Agents

cs.CR · 2026-05-04 · conditional · novelty 5.0

A malicious BYOK relay can rewrite an LLM agent's execution-bearing response fields after safety alignment, achieving 73.5-99.1% attack success on agent benchmarks while bypassing model defenses.

Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction

cs.LG · 2026-05-18 · unverdicted · novelty 4.0

Structural protection of boundary tokens in globally capped KV cache eviction recovers 69-90% of full-cache quality at 13% retention and dominates differences among scoring policies.

citing papers explorer

Showing 18 of 18 citing papers.

The Context Gathering Decision Process: A POMDP Framework for Agentic Search · cs.AI · 2026-05-07 · accept · none · ref 21
MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents · cs.MA · 2026-05-05 · unverdicted · none · ref 23
Screening Is Enough · cs.LG · 2026-04-01 · unverdicted · none · ref 24
APWA: A Distributed Architecture for Parallelizable Agentic Workflows · cs.AI · 2026-05-14 · unverdicted · none · ref 33
PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents · cs.CL · 2026-05-13 · unverdicted · none · ref 35
CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution · cs.CL · 2026-05-13 · unverdicted · none · ref 25
Self-Consolidating Language Models: Continual Knowledge Incorporation from Context · cs.CL · 2026-05-08 · unverdicted · none · ref 1 · 2 links
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid? · cs.CL · 2026-05-07 · unverdicted · none · ref 22
Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios · cs.AI · 2026-05-07 · unverdicted · none · ref 25
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols · cs.AI · 2026-04-27 · unverdicted · none · ref 14
ARGUS: Agentic GPU Optimization Guided by Data-Flow Invariants · cs.DC · 2026-04-16 · unverdicted · none · ref 39
Gym-Anything: Turn any Software into an Agent Environment · cs.LG · 2026-04-07 · unverdicted · none · ref 27
CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation · cs.IR · 2026-04-07 · conditional · none · ref 7
MeMo: Memory as a Model · cs.CL · 2026-05-14 · unverdicted · none · ref 41 · 2 links
Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning · cs.LG · 2026-05-11 · unverdicted · none · ref 32 · 2 links
AgenticPosesRanker: An Agentic AI Framework for Physically Grounded Ranking of Protein-Ligand Docking Poses · q-bio.BM · 2026-05-05 · conditional · none · ref 29
Rewriting the Response Path: Silent Tampering and Provider-Signed Defense in BYOK LLM Agents · cs.CR · 2026-05-04 · conditional · none · ref 57
Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction · cs.LG · 2026-05-18 · unverdicted · none · ref 34
