The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.
Detectiveqa: Evaluating long-context reasoning on detective novels, 2025 b
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
representative citing papers
MemoryAgentBench is a new multi-turn benchmark assessing four memory competencies in LLM agents—accurate retrieval, test-time learning, long-range understanding, and selective forgetting—showing that existing methods fall short.
citing papers explorer
-
Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs
The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.
-
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions
MemoryAgentBench is a new multi-turn benchmark assessing four memory competencies in LLM agents—accurate retrieval, test-time learning, long-range understanding, and selective forgetting—showing that existing methods fall short.