pith. machine review for the scientific record. sign in

arxiv: 2604.14930 · v1 · submitted 2026-04-16 · 💻 cs.CL

Recognition: unknown

IE as Cache: Information Extraction Enhanced Agentic Reasoning

Authors on Pith no claims yet

Pith reviewed 2026-05-10 10:45 UTC · model grok-4.3

classification 💻 cs.CL
keywords information extractionagentic reasoninglarge language modelscognitive cachemulti-step inferencereasoning accuracy
0
0 comments X

The pith

Repurposing information extraction as a reusable cache improves accuracy in multi-step language model reasoning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Traditional information extraction produces structured data from text but treats the output as a one-time result rather than something to keep and consult during later steps. This paper proposes IE-as-Cache, a method that keeps the extracted structures alive as a compact memory resource while the model reasons. Query-driven extraction pulls out only what the current step needs, and cache-aware reasoning decides when to update or reuse that information to cut noise. The design borrows the idea of fast, small caches from computer memory systems. Tests on several large language models and hard reasoning benchmarks show measurable gains in final answer correctness.

Core claim

The paper establishes that information extraction can be turned into an active, reusable cognitive cache for agentic reasoning: query-driven extraction builds compact intermediate records, cache-aware reasoning consults and refreshes them across steps, and the result is higher accuracy on complex benchmarks because noise is filtered while task-critical details remain available.

What carries the argument

IE-as-Cache framework that pairs query-driven extraction with cache-aware reasoning, modeled on hierarchical computer memory, to hold and reuse compact intermediate information during multi-step inference.

If this is right

  • Agentic systems gain accuracy by keeping extracted facts available across multiple reasoning steps instead of re-processing raw text each time.
  • Large language models can manage longer inference chains when compact caches reduce the amount of noise that reaches later steps.
  • Information extraction shifts from a terminal task to an ongoing support layer inside reasoning loops.
  • Benchmarks that reward reuse of intermediate structure will favor models that implement this caching behavior.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same caching idea could be applied to other preprocessing steps such as summarization or entity linking inside agent loops.
  • Integration with retrieval-augmented generation might further stabilize the cache contents on very long tasks.
  • Real-world deployment would require testing how often the cache must be refreshed to avoid carrying forward stale facts.
  • The approach points toward new evaluation metrics that measure memory efficiency alongside final accuracy.

Load-bearing premise

Query-driven extraction combined with cache-aware reasoning can dynamically maintain compact intermediate information that filters noise without discarding details critical for multi-step inference.

What would settle it

A controlled run on one of the paper's benchmarks in which the cache is turned off or forced to drop a known necessary fact, followed by a clear drop in reasoning accuracy relative to the cached version.

Figures

Figures reproduced from arXiv: 2604.14930 by Defu Lian, Enhong Chen, Hang Lv, Hao Wang, Hongchao Gu, Sheng Liang, Wei Guo, Yong Liu.

Figure 1
Figure 1. Figure 1: Cognitive analogy between hierarchical computer memory and text [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Comparison of context mechanisms. Unlike (a) and (b) which treat raw text as a static read-only block, (c) IE-as-Cache introduces a dynamic [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
read the original abstract

Information Extraction aims to distill structured, decision-relevant information from unstructured text, serving as a foundation for downstream understanding and reasoning. However, it is traditionally treated merely as a terminal objective: once extracted, the resulting structure is often consumed in isolation rather than maintained and reused during multi-step inference. Moving beyond this, we propose \textit{IE-as-Cache}, a framework that repurposes IE as a cognitive cache to enhance agentic reasoning. Drawing inspiration from hierarchical computer memory, our approach combines query-driven extraction with cache-aware reasoning to dynamically maintain compact intermediate information and filter noise. Experiments on challenging benchmarks across diverse LLMs demonstrate significant improvements in reasoning accuracy, indicating that IE can be effectively repurposed as a reusable cognitive resource and offering a promising direction for future research on downstream uses of IE.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes IE-as-Cache, a framework that repurposes information extraction as a reusable cognitive cache for agentic reasoning in LLMs. It combines query-driven extraction with cache-aware reasoning to dynamically maintain compact intermediate information while filtering noise, and reports significant reasoning accuracy gains on challenging benchmarks across diverse LLMs.

Significance. If the results hold, the work provides a concrete way to treat IE outputs as persistent intermediate state rather than one-shot artifacts, which could improve efficiency and accuracy in multi-step LLM reasoning tasks. The hierarchical-memory analogy and emphasis on reuse are promising directions for integrating structured extraction into agent pipelines.

major comments (2)
  1. [Method (cache-aware reasoning and maintenance)] The core mechanism relies on query-driven extraction at each reasoning turn to populate and maintain the cache, yet the description of the cache policy (relevance scoring, size limits, and eviction rules) provides no mechanism or evaluation for retaining facts that are irrelevant to the current query but required for a later step. This forward-visibility gap is load-bearing for the multi-step inference claim, because any pruned fact that later becomes necessary would cause the observed accuracy gains to disappear even though the source text contained it.
  2. [Experiments] The experimental results assert significant accuracy improvements across LLMs and benchmarks, but the manuscript supplies no ablation isolating the contribution of the cache component versus the base LLM or the extraction model, nor any error analysis of cases where the cache dropped critical future-relevant facts. Without these, it is impossible to confirm that the gains derive from the proposed reuse mechanism rather than incidental factors.
minor comments (2)
  1. [Abstract] The abstract and introduction use the term 'significant improvements' without numerical values or specific benchmark names; adding these would make the central claim easier to evaluate at a glance.
  2. [Method] Notation for the cache state (e.g., how extracted triples or spans are keyed and retrieved) is introduced informally; a small table or pseudocode snippet would clarify the data structures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and constructive feedback on our manuscript. We address each major comment point by point below, indicating planned revisions where appropriate.

read point-by-point responses
  1. Referee: [Method (cache-aware reasoning and maintenance)] The core mechanism relies on query-driven extraction at each reasoning turn to populate and maintain the cache, yet the description of the cache policy (relevance scoring, size limits, and eviction rules) provides no mechanism or evaluation for retaining facts that are irrelevant to the current query but required for a later step. This forward-visibility gap is load-bearing for the multi-step inference claim, because any pruned fact that later becomes necessary would cause the observed accuracy gains to disappear even though the source text contained it.

    Authors: We appreciate the referee identifying this critical design consideration. The IE-as-Cache framework is intended to treat extracted information as reusable intermediate state across reasoning steps, with query-driven extraction adding to the cache while filtering noise. However, the current manuscript provides only a high-level description of cache maintenance and does not specify detailed eviction rules or evaluate retention of facts with delayed utility. This represents a genuine limitation in the presented method. We will revise the method section to include a more explicit account of the cache policy (including relevance scoring and size limits) and add a dedicated limitations paragraph discussing the forward-visibility challenge, along with directions for future extensions such as lookahead-based retention. revision: partial

  2. Referee: [Experiments] The experimental results assert significant accuracy improvements across LLMs and benchmarks, but the manuscript supplies no ablation isolating the contribution of the cache component versus the base LLM or the extraction model, nor any error analysis of cases where the cache dropped critical future-relevant facts. Without these, it is impossible to confirm that the gains derive from the proposed reuse mechanism rather than incidental factors.

    Authors: We agree that isolating the cache's contribution and providing error analysis would strengthen the experimental claims. The reported results compare full IE-as-Cache against non-cached baselines across LLMs and benchmarks, but the manuscript lacks fine-grained ablations (e.g., cache disabled) and targeted error analysis on dropped facts. We will incorporate an ablation study and error analysis section in the revised manuscript to better attribute performance gains to the reuse mechanism and examine cases involving potential information loss. revision: yes

Circularity Check

0 steps flagged

No circularity; framework proposal and experimental claims are independent of inputs

full rationale

The paper introduces IE-as-Cache as a conceptual framework combining query-driven extraction and cache-aware reasoning, with no equations, fitted parameters, or self-referential definitions present. The central claim of accuracy gains rests on benchmark experiments across LLMs rather than any derivation that reduces to its own inputs by construction. No self-citation chains, ansatzes smuggled via prior work, or renamings of known results appear as load-bearing elements. The derivation chain is self-contained and externally falsifiable via the reported experiments.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are identifiable from the provided text.

pith-pipeline@v0.9.0 · 5441 in / 939 out tokens · 31747 ms · 2026-05-10T10:45:49.355957+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

67 extracted references · 43 canonical work pages · 8 internal anchors

  1. [1]

    Information extraction: Beyond document retrieval,

    R. Gaizauskas and Y . Wilks, “Information extraction: Beyond document retrieval,”Journal of documentation, vol. 54, no. 1, pp. 70–105, 1998

  2. [2]

    Information extraction,

    S. Sarawagiet al., “Information extraction,”Foundations and Trends® in Databases, vol. 1, no. 3, pp. 261–377, 2008

  3. [3]

    Information extraction: Distilling structured data from unstructured text,

    A. McCallum, “Information extraction: Distilling structured data from unstructured text,”Queue, vol. 3, no. 9, pp. 48–57, 2005

  4. [4]

    A survey of named entity recognition and classification,

    D. Nadeau and S. Sekine, “A survey of named entity recognition and classification,” inNamed Entities: Recognition, classification and use. John Benjamins publishing company, 2009, pp. 3–28

  5. [5]

    A survey on deep learning for named entity recognition,

    J. Li, A. Sun, J. Han, and C. Li, “A survey on deep learning for named entity recognition,”IEEE transactions on knowledge and data engineering, vol. 34, no. 1, pp. 50–70, 2020

  6. [6]

    Relation extraction: A survey,

    S. Pawar, G. K. Palshikar, and P. Bhattacharyya, “Relation extraction: A survey,”arXiv preprint arXiv:1712.05191, 2017

  7. [7]

    A review on entity relation extraction,

    Q. Zhang, M. Chen, and L. Liu, “A review on entity relation extraction,” in2017 second international conference on mechanical, control and computer engineering (ICMCCE). IEEE, 2017, pp. 178–183

  8. [8]

    A survey of event extraction from text,

    W. Xiang and B. Wang, “A survey of event extraction from text,”IEEE Access, vol. 7, pp. 173 111–173 137, 2019

  9. [9]

    Event extraction as machine reading comprehension,

    J. Liu, Y . Chen, K. Liu, W. Bi, and X. Liu, “Event extraction as machine reading comprehension,” inProceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), B. Webber, T. Cohn, Y . He, and Y . Liu, Eds. Online: Association for Computational Linguistics, Nov. 2020, pp. 1641–1651. [Online]. Available: https://aclanthol...

  10. [10]

    Large language models for generative information extraction: A survey

    D. Xu, W. Chen, W. Peng, C. Zhang, T. Xu, X. Zhao, X. Wu, Y . Zheng, Y . Wang, and E. Chen, “Large language models for generative information extraction: A survey,” 2024. [Online]. Available: https://arxiv.org/abs/2312.17617

  11. [11]

    Agentic reasoning: A streamlined framework for enhancing llm reasoning with agentic tools, 2025

    J. Wu, J. Zhu, and Y . Liu, “Agentic reasoning: Reasoning llms with tools for the deep research,” 2025. [Online]. Available: https://arxiv.org/abs/2502.04644

  12. [12]

    Long range arena: A benchmark for efficient transformers,

    Y . Tay, M. Dehghani, S. Abnar, Y . Shen, D. Bahri, P. Pham, J. Rao, L. Yang, S. Ruder, and D. Metzler, “Long range arena: A benchmark for efficient transformers,” 2020. [Online]. Available: https://arxiv.org/abs/2011.04006

  13. [13]

    Large Language Models Can Be Easily Distracted by Irrelevant Context

    F. Shi, X. Chen, K. Misra, N. Scales, D. Dohan, E. Chi, N. Sch ¨arli, and D. Zhou, “Large language models can be easily distracted by irrelevant context,” 2023. [Online]. Available: https://arxiv.org/abs/2302.00093

  14. [14]

    Lost in the Middle: How Language Models Use Long Contexts

    N. F. Liu, K. Lin, J. Hewitt, A. Paranjape, M. Bevilacqua, F. Petroni, and P. Liang, “Lost in the middle: How language models use long contexts,” 2023. [Online]. Available: https://arxiv.org/abs/2307.03172

  15. [15]

    A survey on in-context learning,

    Q. Dong, L. Li, D. Dai, C. Zheng, J. Ma, R. Li, H. Xia, J. Xu, Z. Wu, B. Chang, X. Sun, L. Li, and Z. Sui, “A survey on in-context learning,” inProceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Y . Al-Onaizan, M. Bansal, and Y .-N. Chen, Eds. Miami, Florida, USA: Association for Computational Linguistics, Nov. 2024, p...

  16. [16]

    MemGPT: Towards LLMs as Operating Systems

    C. Packer, S. Wooders, K. Lin, V . Fang, S. G. Patil, I. Stoica, and J. E. Gonzalez, “Memgpt: Towards llms as operating systems,” 2024. [Online]. Available: https://arxiv.org/abs/2310.08560

  17. [17]

    Unified structure generation for universal information extraction,

    Y . Lu, Q. Liu, D. Dai, X. Xiao, H. Lin, X. Han, L. Sun, and H. Wu, “Unified structure generation for universal information extraction,” inProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association for Computational Linguisti...

  18. [18]

    Extract, define, canonicalize: An llm- based framework for knowledge graph construction,

    B. Zhang and H. Soh, “Extract, define, canonicalize: An llm- based framework for knowledge graph construction,” 2024. [Online]. Available: https://arxiv.org/abs/2404.03868

  19. [19]

    A survey on application of knowledge graph,

    X. Zou, “A survey on application of knowledge graph,” inJournal of Physics: Conference Series, vol. 1487, no. 1. IOP Publishing, 2020, p. 012016

  20. [20]

    Towards large reasoning models: A survey of reinforced reasoning with large language models.arXiv preprint arXiv:2501.09686, 2025

    F. Xu, Q. Hao, Z. Zong, J. Wang, Y . Zhang, J. Wang, X. Lan, J. Gong, T. Ouyang, F. Menget al., “Towards large reasoning models: A survey of reinforced reasoning with large language models,”arXiv preprint arXiv:2501.09686, 2025

  21. [21]

    A scientific-article key-insight extraction system based on multi-actor of fine-tuned open-source large language models,

    Z. Song, G.-Y . Hwang, X. Zhang, S. Huang, and B.-K. Park, “A scientific-article key-insight extraction system based on multi-actor of fine-tuned open-source large language models,”Scientific Reports, vol. 15, no. 1, p. 1608, 2025

  22. [22]

    DeepKE: A deep learning based knowledge extraction toolkit for knowledge base population,

    N. Zhang, X. Xu, L. Tao, H. Yu, H. Ye, S. Qiao, X. Xie, X. Chen, Z. Li, and L. Li, “DeepKE: A deep learning based knowledge extraction toolkit for knowledge base population,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, W. Che and E. Shutova, Eds. Abu Dhabi, UAE: Association for Computati...

  23. [23]

    A datasets A.1 Mathematical Reasoning Datasets We evaluate CLoT on four mathematical QA datasets and one multiple-choice dataset:

    D.-H. Zhu, Y .-J. Xiong, J.-C. Zhang, X.-J. Xie, and C.-M. Xia, “Understanding before reasoning: Enhancing chain-of-thought with iterative summarization pre-prompting,” 2025. [Online]. Available: https://arxiv.org/abs/2501.04341

  24. [24]

    Does prompt formatting have any impact on llm performance?

    J. He, M. Rungta, D. Koleczek, A. Sekhon, F. X. Wang, and S. Hasan, “Does prompt formatting have any impact on llm performance?” 2024. [Online]. Available: https://arxiv.org/abs/2411.10541

  25. [25]

    Chain-of- table: Evolving tables in the reasoning chain for table understanding,

    Z. Wang, H. Zhang, C.-L. Li, J. M. Eisenschlos, V . Perot, Z. Wang, L. Miculicich, Y . Fujii, J. Shang, C.-Y . Lee, and T. Pfister, “Chain-of- table: Evolving tables in the reasoning chain for table understanding,”

  26. [26]
  27. [27]

    Table as thought: Exploring structured thoughts in llm reasoning,

    Z. Sun, N. Deng, H. Yu, and J. You, “Table as thought: Exploring structured thoughts in llm reasoning,” 2025. [Online]. Available: https://arxiv.org/abs/2501.02152

  28. [28]

    Instruct and extract: Instruction tuning for on-demand information extraction,

    Y . Jiao, M. Zhong, S. Li, R. Zhao, S. Ouyang, H. Ji, and J. Han, “Instruct and extract: Instruction tuning for on-demand information extraction,” inProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds. Singapore: Association for Computational Linguistics, Dec. 2023, pp. 10 030–10 05...

  29. [29]

    On-demand information extraction,

    S. Sekine, “On-demand information extraction,” inProceedings of the COLING/ACL 2006 Main Conference Poster Sessions, 2006, pp. 731– 738

  30. [30]

    Jianxing Liao, Tian Zhang, Xiao Feng, Yusong Zhang, Rui Yang, Haorui Wang, Bosi Wen, Ziying Wang, and Runzhi Shi

    S. Liang, H. Lv, Z. Wen, Y . Wu, Y . Zhang, H. Wang, and Y . Liu, “Adaptive schema-aware event extraction with retrieval-augmented generation,” 2025. [Online]. Available: https://arxiv.org/abs/2505.08690

  31. [31]

    ReAct: Synergizing Reasoning and Acting in Language Models

    S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y . Cao, “React: Synergizing reasoning and acting in language models,” 2023. [Online]. Available: https://arxiv.org/abs/2210.03629

  32. [32]

    Tact: Advancing complex aggregative reasoning with information extraction tools,

    A. Caciularu, A. Jacovi, E. B. David, S. Goldshtein, T. Schuster, J. Herzig, G. Elidan, and A. Globerson, “Tact: Advancing complex aggregative reasoning with information extraction tools,” 2024

  33. [33]

    arXiv preprint arXiv:2406.04520 , year=

    H. S. Zheng, S. Mishra, H. Zhang, X. Chen, M. Chen, A. Nova, L. Hou, H.-T. Cheng, Q. V . Le, E. H. Chi, and D. Zhou, “Natural plan: Benchmarking llms on natural language planning,” 2024. [Online]. Available: https://arxiv.org/abs/2406.04520

  34. [34]

    Qmsum: A new benchmark for query-based multi-domain meeting summarization,

    M. Zhong, D. Yin, T. Yu, A. Zaidi, M. Mutuma, R. Jha, A. H. Awadallah, A. Celikyilmaz, Y . Liu, X. Qiu, and D. Radev, “Qmsum: A new benchmark for query-based multi-domain meeting summarization,”

  35. [35]

    Qmsum: A new benchmark for query-based multi-domain meeting summarization

    [Online]. Available: https://arxiv.org/abs/2104.05938

  36. [36]

    The llama 3 herd of models,

    A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, and A. M. et al., “The llama 3 herd of models,” 2024

  37. [37]

    Qwen3 technical report,

    A. Yang, A. Li, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Gao, C. Huang, C. Lv, C. Zheng, D. Liu, F. Zhou, F. Huang, F. Hu, H. Ge, H. Wei, H. Lin, J. Tang, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Zhou, J. Lin, K. Dang, K. Bao, K. Yang, L. Yu, L. Deng, M. Li, M. Xue, M. Li, P. Zhang, P. Wang, Q. Zhu, R. Men, R. Gao, S. Liu, S. Luo, T. ...

  38. [38]

    Qwen3 Technical Report

    [Online]. Available: https://arxiv.org/abs/2505.09388

  39. [39]

    Rouge: A package for automatic evaluation of summaries,

    C.-Y . Lin, “Rouge: A package for automatic evaluation of summaries,” inText summarization branches out, 2004, pp. 74–81

  40. [40]

    Sentence-t5: Scalable sentence encoders from pre- trained text-to-text models,

    J. Ni, G. Hernandez Abrego, N. Constant, J. Ma, K. Hall, D. Cer, and Y . Yang, “Sentence-t5: Scalable sentence encoders from pre- trained text-to-text models,” inFindings of the Association for Computational Linguistics: ACL 2022, S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 186...

  41. [41]

    Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

    J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, “Chain-of-thought prompting elicits reasoning in large language models,” 2023. [Online]. Available: https://arxiv.org/abs/2201.11903

  42. [42]

    Learning from Emptiness: De-biasing Listwise Rerankers with Content-Agnostic Probability Calibration

    H. Lv, H. Gu, R. Yang, L. Li, Z. Chen, D. Lian, H. Wang, and E. Chen, “Learning from emptiness: De-biasing listwise rerankers with content-agnostic probability calibration,” 2026. [Online]. Available: https://arxiv.org/abs/2604.10150

  43. [43]

    Killing two birds with one stone: Unifying retrieval and ranking with a single generative recommendation model,

    L. Zhang, K. Song, Y . Q. Lee, W. Guo, H. Wang, Y . Li, H. Guo, Y . Liu, D. Lian, and E. Chen, “Killing two birds with one stone: Unifying retrieval and ranking with a single generative recommendation model,”

  44. [44]

    Available: https://arxiv.org/abs/2504.16454

    [Online]. Available: https://arxiv.org/abs/2504.16454

  45. [45]

    Efficient personalized reranking with semi-autoregressive generation and online knowledge distillation,

    K. Cheng, H. Wang, W. Guo, W. Liu, Y . Liu, Y . Li, and E. Chen, “Efficient personalized reranking with semi-autoregressive generation and online knowledge distillation,” 2026. [Online]. Available: https://arxiv.org/abs/2603.07107

  46. [46]

    Rapid: Efficient retrieval-augmented long text generation with writing planning and information discovery,

    H. Gu, D. Li, K. Dong, H. Zhang, H. Lv, H. Wang, D. Lian, Y . Liu, and E. Chen, “Rapid: Efficient retrieval-augmented long text generation with writing planning and information discovery,” 2025. [Online]. Available: https://arxiv.org/abs/2503.00751

  47. [47]

    Rag-igbench: Innovative evaluation for rag-based interleaved generation in open-domain question answering,

    R. Zhang, Y . Huang, C. Lu, Q. Wang, Y . Gao, Y . Wu, Y . Hu, Y . Xu, W. Wang, H. Wang, and E. Chen, “Rag-igbench: Innovative evaluation for rag-based interleaved generation in open-domain question answering,” 2025. [Online]. Available: https://arxiv.org/abs/2512.05119

  48. [48]

    Selfaug: Mitigating catastrophic forgetting in retrieval-augmented generation via distribution self- alignment,

    Y . Huang, R. Zhang, Q. Wang, C. Lu, Y . Gao, Y . Wu, Y . Hu, X. Zhi, G. Liu, X. Li, H. Wang, and E. Chen, “Selfaug: Mitigating catastrophic forgetting in retrieval-augmented generation via distribution self- alignment,” 2025. [Online]. Available: https://arxiv.org/abs/2509.03934

  49. [49]

    The next paradigm is user-centric agent, not platform-centric service,

    L. Zhang, H. Lv, Q. Pan, K. Wang, Y . Huang, X. Miao, Y . Xu, W. Guo, Y . Liu, H. Wang, and E. Chen, “The next paradigm is user-centric agent, not platform-centric service,” 2026. [Online]. Available: https://arxiv.org/abs/2602.15682

  50. [50]

    Costeer: Collaborative decoding-time personalization via local delta steering,

    H. Lv, S. Liang, H. Wang, H. Gu, Y . Wu, W. Guo, D. Lian, Y . Liu, and E. Chen, “Costeer: Collaborative decoding-time personalization via local delta steering,” 2026. [Online]. Available: https://arxiv.org/abs/ 2507.04756

  51. [51]

    Specsteer: Synergizing local context and global reasoning for efficient personalized generation,

    H. Lv, S. Liang, H. Wang, Y . Zhang, H. Gu, W. Guo, D. Lian, Y . Liu, and E. Chen, “Specsteer: Synergizing local context and global reasoning for efficient personalized generation,” 2026. [Online]. Available: https://arxiv.org/abs/2603.16219

  52. [52]

    Thought-augmented planning for llm-powered interactive recommender agent,

    H. Yu, Y . Wu, H. Wang, W. Guo, Y . Liu, Y . Li, Y . Ye, J. Du, and E. Chen, “Thought-augmented planning for llm-powered interactive recommender agent,” 2025. [Online]. Available: https://arxiv.org/abs/2506.23485

  53. [53]

    Generative large recommendation models: Emerging trends in llms for recommendation,

    H. Wang, W. Guo, L. Zhang, J. Y . Chin, Y . Ye, H. Guo, Y . Liu, D. Lian, R. Tang, and E. Chen, “Generative large recommendation models: Emerging trends in llms for recommendation,” 2025. [Online]. Available: https://arxiv.org/abs/2502.13783

  54. [54]

    A survey of user lifelong behavior modeling: Perspectives on efficiency and effectiveness,

    R. Zhou, Q. Jia, B. Chen, P. Xu, Y . Sun, S. Lou, C. Fu, M. Fu, G. Shen, Z. Zhou, J. Jiao, N. Zhou, S. Guan, Y . Qi, S. Wang, X. Luo, Q. Hu, C. Ma, X. Lv, Q. Luo, Y . Ye, L. Zhang, D. Lian, R. Tang, G. Zhou, H. Li, K. Gai, H. Wang, and E. Chen, “A survey of user lifelong behavior modeling: Perspectives on efficiency and effectiveness,”Preprints, January 2...

  55. [55]

    Exploring user retrieval integration towards large language models for cross-domain sequential recommendation,

    T. Shen, H. Wang, J. Zhang, S. Zhao, L. Li, Z. Chen, D. Lian, and E. Chen, “Exploring user retrieval integration towards large language models for cross-domain sequential recommendation,” 2024. [Online]. Available: https://arxiv.org/abs/2406.03085

  56. [56]

    A unified framework for adaptive representation enhancement and inversed learning in cross-domain recommendation,

    L. Zhang, H. Wang, S. Zhang, M. Yin, Y . Han, J. Zhang, D. Lian, and E. Chen, “A unified framework for adaptive representation enhancement and inversed learning in cross-domain recommendation,”

  57. [57]

    Available: https://arxiv.org/abs/2404.00268

    [Online]. Available: https://arxiv.org/abs/2404.00268

  58. [58]

    Fuxi-alpha: Scaling recommendation model with feature interaction enhanced transformer.arXiv preprint arXiv:2502.03036, 2025

    Y . Ye, W. Guo, J. Y . Chin, H. Wang, H. Zhu, X. Lin, Y . Ye, Y . Liu, R. Tang, D. Lian, and E. Chen, “Fuxi-α: Scaling recommendation model with feature interaction enhanced transformer,” 2025. [Online]. Available: https://arxiv.org/abs/2502.03036

  59. [59]

    Fuxi-linear: Unleashing the power of linear attention in long-term time-aware sequential recommendation.arXiv preprint arXiv:2602.23671, 2026

    Y . Ye, W. Guo, H. Wang, L. Zhang, H. Chang, H. Zhu, Y . Ye, Y . Liu, D. Lian, and E. Chen, “Fuxi-linear: Unleashing the power of linear attention in long-term time-aware sequential recommendation,” 2026. [Online]. Available: https://arxiv.org/abs/2602.23671

  60. [60]

    Optimizing sequential recommendation models with scaling laws and approximate entropy,

    T. Shen, H. Wang, C. Wu, J. Y . Chin, W. Guo, Y . Liu, H. Guo, D. Lian, R. Tang, and E. Chen, “Optimizing sequential recommendation models with scaling laws and approximate entropy,” 2025. [Online]. Available: https://arxiv.org/abs/2412.00430

  61. [61]

    Why thinking hurts? diagnosing and rectifying the reasoning shift in foundation recommender models,

    L. Zhang, Y . Huang, H. Lv, M. Yin, L. Li, Z. Chen, H. Wang, and E. Chen, “Why thinking hurts? diagnosing and rectifying the reasoning shift in foundation recommender models,” 2026. [Online]. Available: https://arxiv.org/abs/2602.16587

  62. [62]

    Can recommender systems teach themselves? a recursive self-improving framework with fidelity control,

    L. Zhang, H. Wang, Z. Liu, M. Yin, Y . Huang, J. Li, W. Guo, Y . Liu, H. Guo, D. Lian, and E. Chen, “Can recommender systems teach themselves? a recursive self-improving framework with fidelity control,” 2026. [Online]. Available: https://arxiv.org/abs/2602.15659

  63. [63]

    From feature interaction to feature generation: A generative paradigm of ctr prediction models,

    M. Yin, J. Pan, H. Wang, X. Wang, S. Zhang, J. Jiang, D. Lian, and E. Chen, “From feature interaction to feature generation: A generative paradigm of ctr prediction models,” 2025. [Online]. Available: https://arxiv.org/abs/2512.14041

  64. [64]

    Dlf: Enhancing explicit-implicit interaction via dynamic low-order-aware fusion for ctr prediction,

    K. Wang, H. Wang, W. Guo, Y . Liu, J. Lin, D. Lian, and E. Chen, “Dlf: Enhancing explicit-implicit interaction via dynamic low-order-aware fusion for ctr prediction,” inProceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, ser. SIGIR ’25. ACM, Jul. 2025, p. 2213–2223. [Online]. Available: http://d...

  65. [65]

    Enhancing ctr prediction with de-correlated expert networks,

    J. Wang, M. Yin, H. Wang, and E. Chen, “Enhancing ctr prediction with de-correlated expert networks,” 2025. [Online]. Available: https://arxiv.org/abs/2505.17925

  66. [66]

    A universal framework for compressing embeddings in ctr prediction,

    K. Wang, H. Wang, K. Song, W. Guo, K. Cheng, Z. Li, Y . Liu, D. Lian, and E. Chen, “A universal framework for compressing embeddings in ctr prediction,” 2025. [Online]. Available: https://arxiv.org/abs/2502.15355

  67. [67]

    SPARD: Self-Paced Curriculum for RL Alignment via Integrating Reward Dynamics and Data Utility

    X. Zhi, P. zhou, C. Lu, H. Lv, Y . Liang, R. Zhang, Y . Gao, Y . WU, Y . Hu, H. Gu, D. Lian, H. Wang, and E. Chen, “Spard: Self-paced curriculum for rl alignment via integrating reward dynamics and data utility,” 2026. [Online]. Available: https://arxiv.org/abs/2604.07837 APPENDIXA BROADERCONNECTIONS ANDPOTENTIALDOWNSTREAM IMPLICATIONS OFIE-AS-CACHE The m...