Retrieval Is Not Enough: Why Organizational AI Needs Epistemic Infrastructure

Carlo Ferrero; Federico Bottino; Nicholas Dosio; Pierfrancesco Beneventano

arxiv: 2604.11759 · v2 · pith:3L7WU6I5new · submitted 2026-04-13 · 💻 cs.AI

Pith reviewed 2026-05-25 06:23 UTC · model grok-4.3

classification 💻 cs.AI

keywords organizational AI · epistemic infrastructure · RAG · knowledge representation · AI agents · contradiction tracking · modeled ignorance

The pith

Organizational AI performance is capped by missing epistemic structure rather than retrieval accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current systems surface relevant documents but cannot distinguish binding decisions from abandoned ideas, settled facts from open questions, or known content from organizational ignorance. The paper claims this absence of computable epistemic properties sets a hard ceiling on what AI agents can reliably do inside organizations. It introduces the OIDA framework to represent knowledge as typed objects that carry commitment strength, class-specific decay, signed contradictions, and modeled ignorance via a QUESTION primitive with inverse decay. An Epistemic Quality Score is defined to measure these properties; in the initial comparison the OIDA retrieval condition scores below a full-context baseline (0.530 vs. 0.848), a gap the paper attributes primarily to a 28.1× difference in token budget. Convergence of the scoring engine is proved under a maximum-degree condition.

Core claim

The ceiling on organizational AI is epistemic fidelity—the ability to treat commitment strength, contradiction status, and organizational ignorance as first-class computable properties—rather than retrieval fidelity; OIDA achieves this by structuring knowledge into typed objects maintained by a deterministic Knowledge Gravity Engine and by introducing QUESTION objects that increase in urgency as ignorance persists.

What carries the argument

OIDA framework: typed Knowledge Objects carrying epistemic class, importance scores with class-specific decay, signed contradiction edges, and the QUESTION primitive for modeled ignorance with inverse decay, all maintained by the Knowledge Gravity Engine.
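The shape of such a store can be sketched in a few lines. This is a minimal illustration, not the authors' schema: the class names other than QUESTION, the field names, the `[0, 1]` importance range, and the `KnowledgeGraph` helper are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from enum import Enum


class EpistemicClass(Enum):
    # Hypothetical class names; only QUESTION is named in the source.
    DECISION = "decision"
    HYPOTHESIS = "hypothesis"
    FACT = "fact"
    QUESTION = "question"


@dataclass
class KnowledgeObject:
    object_id: str
    epistemic_class: EpistemicClass
    importance: float  # assumed to lie in [0, 1]
    text: str = ""


@dataclass
class KnowledgeGraph:
    objects: dict = field(default_factory=dict)
    # (source_id, target_id) -> sign; +1 supports, -1 contradicts
    edges: dict = field(default_factory=dict)

    def add(self, obj):
        self.objects[obj.object_id] = obj

    def link(self, source_id, target_id, sign):
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        if source_id not in self.objects or target_id not in self.objects:
            raise KeyError("both endpoints must exist")
        self.edges[(source_id, target_id)] = sign

    def contradictions_of(self, object_id):
        """Ids of objects joined to object_id by a negative edge, in either direction."""
        out = []
        for (src, dst), sign in self.edges.items():
            if sign == -1 and object_id in (src, dst):
                out.append(dst if src == object_id else src)
        return sorted(out)
```

Storing the sign on the edge rather than in the text is what makes a contradiction queryable: `contradictions_of` answers "what disputes this decision?" without reading any document content.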

If this is right

Organizations can surface unresolved questions with increasing urgency instead of only retrieving known content.
Contradictions become explicit signed edges that affect downstream scores deterministically rather than remaining hidden in retrieved text.
Importance and commitment can decay at class-specific rates, allowing abandoned hypotheses to lose influence automatically.
The Knowledge Gravity Engine converges under a proved sufficient condition (max degree below 7) and is reported as empirically robust up to degree 43, enabling reliable maintenance of large knowledge graphs.
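The contrast between class-specific decay and the inverse decay of a QUESTION can be made concrete with a small sketch. The functional forms here are assumptions: the source does not give the paper's formulas, so exponential half-lives, the half-life values, and the capped doubling rule are illustrative stand-ins only.

```python
# Hypothetical half-lives in days per epistemic class (illustrative values only).
HALF_LIFE_DAYS = {"decision": 180.0, "fact": 365.0, "hypothesis": 14.0}


def decayed_importance(initial, epistemic_class, age_days):
    """Exponential decay with a class-specific half-life (assumed form)."""
    half_life = HALF_LIFE_DAYS[epistemic_class]
    return initial * 0.5 ** (age_days / half_life)


def question_urgency(initial, age_days, doubling_days=30.0, cap=1.0):
    """Inverse decay (assumed form): urgency grows while a question stays open, up to `cap`."""
    return min(cap, initial * 2.0 ** (age_days / doubling_days))


if __name__ == "__main__":
    for day in (0, 14, 28):
        h = decayed_importance(1.0, "hypothesis", day)
        q = question_urgency(0.2, day)
        print(f"day {day:2d}: hypothesis importance {h:.3f}, question urgency {q:.3f}")
```

Under these assumed rules an untouched hypothesis fades while an unanswered question climbs, which is the qualitative behaviour the list above describes.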

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Adoption would require organizations to annotate or infer epistemic classes for existing documents rather than treating all content as equivalent.
The approach could extend to multi-agent settings where different agents carry different commitment levels to the same claim.
If EQS proves stable, it offers a way to benchmark retrieval systems on epistemic rather than semantic metrics.

Load-bearing premise

The Epistemic Quality Score with its five components supplies a valid, non-circular measure of epistemic quality that can be compared across systems even when token budgets differ by large factors.

What would settle it

Running the pre-registered equal-token-budget ablation (E4) and obtaining no statistically significant EQS advantage for the OIDA condition over standard RAG would falsify the claim that epistemic infrastructure raises performance beyond retrieval alone.

Figures

Figures reproduced from arXiv: 2604.11759 by Carlo Ferrero, Federico Bottino, Nicholas Dosio, Pierfrancesco Beneventano.

**Figure 2.** K-score dynamics over 28 days under stationary inputs.

Original abstract

Organizational knowledge used by AI agents typically lacks epistemic structure: retrieval systems surface semantically relevant content without distinguishing binding decisions from abandoned hypotheses, contested claims from settled ones, or known facts from unresolved questions. We argue that the ceiling on organizational AI is not retrieval fidelity but epistemic fidelity: the system's ability to represent commitment strength, contradiction status, and organizational ignorance as computable properties. We present OIDA, a framework that structures organizational knowledge as typed Knowledge Objects carrying epistemic class, importance scores with class-specific decay, and signed contradiction edges. The Knowledge Gravity Engine maintains scores deterministically with proved convergence guarantees (sufficient condition: max degree < 7; empirically robust to degree 43). OIDA introduces QUESTION-as-modeled-ignorance: a primitive with inverse decay that surfaces what an organization does not know with increasing urgency, a mechanism absent from all surveyed systems. We describe the Epistemic Quality Score (EQS), a five-component evaluation methodology with explicit circularity analysis. In a controlled comparison (n = 10 response pairs), OIDA's RAG condition (3,868 tokens) achieves EQS 0.530 vs. 0.848 for a full-context baseline (108,687 tokens); the 28.1× token budget difference is the primary confound. The QUESTION mechanism is statistically validated (Fisher p = 0.0325, OR = 21.0). The formal properties are established; the decisive ablation at equal token budget (E4) is pre-registered and not yet run.
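The size of the confound can be checked directly from the figures quoted in the abstract. The short calculation below uses only those reported numbers; the variable and function names are introduced here for illustration.

```python
def budget_ratio(large_tokens, small_tokens):
    """How many times larger one context budget is than the other."""
    return large_tokens / small_tokens


rag_tokens = 3_868             # OIDA RAG condition, as reported
full_context_tokens = 108_687  # full-context baseline, as reported

ratio = budget_ratio(full_context_tokens, rag_tokens)
eqs_gap = 0.848 - 0.530        # baseline EQS minus OIDA RAG EQS, as reported

print(f"token budget ratio: {ratio:.1f}x")
print(f"EQS gap: {eqs_gap:.3f}")
```

The ratio rounds to the 28.1× stated in the abstract, and the EQS gap of 0.318 is in the baseline's favour, which is why the comparison cannot separate the effect of structure from the effect of context length.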

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper argues that organizational AI performance is limited by lack of epistemic structure in knowledge (distinguishing commitments, contradictions, and ignorance) rather than retrieval fidelity alone. It introduces the OIDA framework, which represents knowledge as typed Knowledge Objects with epistemic classes, class-specific decay, and signed contradiction edges; the Knowledge Gravity Engine, which maintains scores with a proved convergence guarantee (sufficient condition: max degree <7); the QUESTION primitive for modeling organizational ignorance via inverse decay; and the Epistemic Quality Score (EQS), a five-component metric with circularity analysis. A controlled comparison (n=10) reports OIDA-RAG EQS of 0.530 versus 0.848 for a full-context baseline, but notes the 28.1× token-budget confound (3,868 vs. 108,687 tokens); the QUESTION mechanism is statistically validated (Fisher p=0.0325, OR=21.0), while the equal-budget ablation E4 is pre-registered but unexecuted.

Significance. If the EQS metric and ablation results hold, the distinction between semantic retrieval and epistemic fidelity could usefully redirect research on organizational AI agents toward explicit modeling of commitment strength, contradictions, and ignorance. The formal convergence guarantee for the Knowledge Gravity Engine and the statistical validation of the QUESTION primitive are concrete strengths that could be cited even if the head-to-head comparison requires further controls.

major comments (3)

[Abstract] The central empirical claim that OIDA primitives yield superior epistemic fidelity rests on the EQS comparison (0.530 vs. 0.848), yet this comparison is performed under a 28.1× token-budget difference explicitly identified as the primary confound. Because the pre-registered equal-budget ablation (E4) has not been executed, the data do not yet isolate the contribution of epistemic structure from raw context length.
[EQS methodology] The five-component EQS is described with explicit circularity analysis, but the reported scores derive from an uncontrolled token-budget comparison; without the equal-token ablation or an independent external benchmark, it remains unclear whether EQS differences reflect epistemic quality rather than token volume.
[Knowledge Gravity Engine] The convergence guarantee (max degree <7) is formally established and independent of the EQS comparison, but the manuscript does not demonstrate that this guarantee translates into measurable EQS gains once token budget is controlled; the empirical link between the OIDA primitives and epistemic fidelity therefore remains untested.
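To see why a bound on maximum degree can act as a sufficient condition for convergence, consider a toy propagation rule. This is not the paper's update rule, which is not shown in the material reviewed here. It assumes scores are updated as K ← base + c · (signed sum of neighbour scores) with a hypothetical coupling c = 1/7, so the map is a sup-norm contraction whenever c · max_degree < 1, that is, whenever max degree is below 7.

```python
def max_degree(edges, n):
    """Largest number of edges touching any of the n nodes."""
    deg = [0] * n
    for i, j, _sign in edges:
        deg[i] += 1
        deg[j] += 1
    return max(deg) if n else 0


def propagate(base, edges, coupling, tol=1e-10, max_iter=10_000):
    """Iterate K <- base + coupling * (signed neighbour sum) until the update is below tol.

    Returns (scores, iterations); raises RuntimeError if max_iter is exhausted.
    """
    n = len(base)
    neighbours = [[] for _ in range(n)]
    for i, j, sign in edges:  # undirected signed edge
        neighbours[i].append((j, sign))
        neighbours[j].append((i, sign))
    scores = list(base)
    for iteration in range(1, max_iter + 1):
        new = [
            base[i] + coupling * sum(s * scores[j] for j, s in neighbours[i])
            for i in range(n)
        ]
        delta = max(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            return scores, iteration
    raise RuntimeError("did not converge")


def star(leaves):
    """Hub node 0 joined to `leaves` nodes, with alternating edge signs."""
    return [(0, k, 1 if k % 2 else -1) for k in range(1, leaves + 1)]


if __name__ == "__main__":
    coupling = 1.0 / 7.0  # hypothetical value chosen so the bound reads "degree < 7"
    for leaves in (6, 43):
        edges = star(leaves)
        base = [0.5] * (leaves + 1)
        _scores, steps = propagate(base, edges, coupling)
        print(leaves, coupling * max_degree(edges, leaves + 1) < 1, steps)
```

In this toy model the 43-leaf star also converges even though it violates the degree bound, which shows only that a sufficient condition need not be necessary; it says nothing about the paper's own degree-43 experiments.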

minor comments (1)

[Abstract / Methods] The abstract and methods description provide no error bars, variance estimates, or full implementation details for the n=10 comparison, which limits inspectability of the EQS results.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive review. The manuscript already flags the token-budget confound and the unexecuted E4 ablation as central limitations. Below we respond point-by-point, agreeing where the critique is accurate and clarifying the scope of our claims. We will make targeted revisions to strengthen the presentation of limitations without overstating the current empirical results.

Referee: [Abstract] The central empirical claim that OIDA primitives yield superior epistemic fidelity rests on the EQS comparison (0.530 vs. 0.848), yet this comparison is performed under a 28.1× token-budget difference explicitly identified as the primary confound. Because the pre-registered equal-budget ablation (E4) has not been executed, the data do not yet isolate the contribution of epistemic structure from raw context length.

Authors: We agree. The abstract already states that the 28.1× token difference is the primary confound and that E4 is pre-registered but unexecuted. The reported EQS numbers are therefore illustrative rather than conclusive. We will revise the abstract to foreground this limitation more explicitly and to separate the formal contributions (convergence guarantee, QUESTION validation) from the confounded head-to-head comparison. revision: yes
Referee: [EQS methodology] The five-component EQS is described with explicit circularity analysis, but the reported scores derive from an uncontrolled token-budget comparison; without the equal-token ablation or an independent external benchmark, it remains unclear whether EQS differences reflect epistemic quality rather than token volume.

Authors: We concur that the current EQS numbers cannot isolate epistemic structure from token volume. The EQS methodology itself (including the circularity analysis) is independent of any particular comparison and is intended as a reusable instrument. We will add a dedicated limitations paragraph in the EQS section reiterating that the reported scores are preliminary and that E4 is required to test whether the metric tracks epistemic fidelity once context length is controlled. revision: yes
Referee: [Knowledge Gravity Engine] The convergence guarantee (max degree <7) is formally established and independent of the EQS comparison, but the manuscript does not demonstrate that this guarantee translates into measurable EQS gains once token budget is controlled; the empirical link between the OIDA primitives and epistemic fidelity therefore remains untested.

Authors: The convergence result is a purely formal theorem whose proof does not rely on the EQS experiments. The manuscript does not claim that the guarantee has been shown to produce EQS improvements under equal token budgets; that link is precisely what the pre-registered E4 ablation is designed to test. We will add a short forward-reference in the Knowledge Gravity Engine section noting that empirical validation of downstream EQS impact awaits execution of E4. revision: partial

standing simulated objections not resolved

Execution of the pre-registered equal-budget ablation E4, which would be required to isolate the contribution of epistemic structure from token volume.

Circularity Check

0 steps flagged

No circularity; formal convergence and EQS stand independent of the flagged token confound

The paper's core derivations—the Knowledge Gravity Engine convergence (sufficient condition max degree <7 with proved guarantees) and the QUESTION primitive—are presented as mathematically established properties independent of the EQS metric or any empirical comparison. The EQS itself is introduced with an explicit circularity analysis, and the single head-to-head evaluation is reported with the 28.1× token gap explicitly called out as the primary confound while noting the equal-budget ablation as pre-registered but unrun. No self-definitional loops, fitted parameters renamed as predictions, load-bearing self-citations, or ansatzes smuggled via prior work appear in the derivation chain; the formal results and provisional empirical observations remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 3 invented entities

Abstract-only review; ledger populated from named components only. The convergence guarantee and EQS validity rest on unshown mathematics and evaluation design.

free parameters (1)

max degree threshold for convergence
Sufficient condition stated as max degree <7; empirical robustness claimed to degree 43.

axioms (1)

domain assumption: Knowledge Gravity Engine scores converge deterministically under the stated degree condition
Invoked to support the maintenance of importance scores.

invented entities (3)

Knowledge Object with epistemic class and class-specific decay (no independent evidence)
purpose: Carry commitment strength and importance as computable properties
Core data structure of OIDA; no independent evidence supplied in abstract.
QUESTION primitive with inverse decay (no independent evidence)
purpose: Surface organizational ignorance with increasing urgency
Novel mechanism claimed absent from surveyed systems; no external validation shown.
Epistemic Quality Score (EQS) (no independent evidence)
purpose: Five-component evaluation methodology
Defined to compare systems; circularity analysis mentioned but not detailed.

pith-pipeline@v0.9.0 · 5827 in / 1671 out tokens · 29341 ms · 2026-05-25T06:23:19.334590+00:00 · methodology
