pith. machine review for the scientific record.

arxiv: 2605.14169 · v1 · submitted 2026-05-13 · 💻 cs.CL

Recognition: no theorem link

BOOKMARKS: Efficient Active Storyline Memory for Role-playing

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 04:48 UTC · model grok-4.3

classification 💻 cs.CL
keywords role-playing agents · memory systems · search-based memory · storyline consistency · bookmarks · active grounding · synchronization · RPA memory

The pith

BOOKMARKS replaces recurrent summarization with search-based bookmarks that keep task-relevant details alive across long role-play storylines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing RPA memory methods rely on recurrent summarization that compresses the story and inevitably drops important details over long horizons. BOOKMARKS instead treats memory as a set of bookmarks, each defined as the answer to a task-relevant question fixed at a particular point in the storyline. For any current task the system selects reusable bookmarks or initializes fresh ones at the start, then synchronizes their answers forward to the present story point so they remain accurate and ready for reuse. This design supplies active grounding for task-specific facts while performing only passive updates, avoiding full recomputation each turn. Experiments on 85 characters drawn from 16 artifacts show clear gains over standard baselines, confirming that search-based memory improves consistency without the compression penalty.
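
A minimal sketch can make the bookmark abstraction concrete. The representation below is an illustration under the paper's definition (a question, its answer, and the storyline point where that answer was fixed); the class and field names are assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass

# Sketch of the bookmark unit described above. All names here are
# illustrative assumptions, not the paper's implementation.
@dataclass
class Bookmark:
    question: str     # task-relevant question, e.g. "Where is the heroine now?"
    answer: str       # answer as of `story_point`
    story_point: int  # storyline index at which `answer` was last valid

    def is_stale(self, current_point: int) -> bool:
        # A bookmark needs synchronization once the story moves past it.
        return self.story_point < current_point
```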

Core claim

BOOKMARKS is a search-based memory framework that actively initializes, maintains, and updates task-relevant bookmarks for role-playing agents. Each bookmark is structured as the answer to a question at a specific storyline point. For the current task the framework selects existing reusable bookmarks or creates new ones at the storyline beginning, then applies synchronization to bring their answers up to the present story point for efficient reuse in future grounding rounds. The implementation supports concept, behavior, and state searches, each driven by its own efficient synchronization method.
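
Read literally, this describes a simple control flow per grounding round. The sketch below is a hedged reconstruction of that flow, reusing the Bookmark sketch above; `answer_over` is a hypothetical LLM helper standing in for the paper's concept, behavior, and state searches, each of which has its own synchronization method in the actual system.

```python
from typing import Callable, List

def grounding_round(
    task_questions: List[str],
    memory: List[Bookmark],        # Bookmark as sketched earlier
    storyline: List[str],          # one story event per storyline point
    answer_over: Callable[[str, str, List[str]], str],  # hypothetical LLM call
) -> List[Bookmark]:
    # One grounding round: select reusable bookmarks or initialize new ones
    # at the storyline beginning, then synchronize stale answers forward.
    current = len(storyline)
    grounded: List[Bookmark] = []
    for q in task_questions:
        # Select an existing reusable bookmark for this question, if any.
        bm = next((b for b in memory if b.question == q), None)
        if bm is None:
            # Initialize a fresh bookmark at the storyline beginning.
            bm = Bookmark(question=q, answer="", story_point=0)
            memory.append(bm)
        if bm.is_stale(current):
            # Passive update: revise the old answer using only the story
            # segment this bookmark has not yet seen.
            unseen = storyline[bm.story_point:current]
            bm.answer = answer_over(q, bm.answer, unseen)
            bm.story_point = current
        grounded.append(bm)
    return grounded
```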

What carries the argument

A bookmark, defined as the answer to a question at a specific point in the storyline, which carries active task grounding and enables passive forward synchronization.

If this is right

  • Role-playing agents can sustain longer storylines while retaining task-specific details that summarization discards.
  • Computation is reduced because bookmarks are updated passively rather than regenerated recurrently (see the toy cost sketch after this list).
  • Bookmarks initialized for one task become reusable across later grounding rounds without re-initialization.
  • Separate support for concept, behavior, and state searches gives the memory system flexibility to handle different kinds of character information.
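
A toy cost model makes the computation bullet concrete. Under the simplifying assumption that an update call costs in proportion to the number of story events it reads (an assumption of this sketch, not a measurement from the paper), recurrent summarization re-reads the whole history every turn, while passive synchronization reads each event only once per active bookmark.

```python
def recurrent_summarization_cost(turns: int) -> int:
    # Re-summarizing the full history each turn reads 1 + 2 + ... + T
    # events in total: quadratic in the number of turns.
    return sum(t for t in range(1, turns + 1))

def passive_bookmark_cost(turns: int, active_bookmarks: int) -> int:
    # Each event is read once per active bookmark when answers are
    # synchronized forward: linear in turns for a fixed working set.
    return turns * active_bookmarks

print(recurrent_summarization_cost(100))               # 5050
print(passive_bookmark_cost(100, active_bookmarks=3))  # 300
```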

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bookmark-and-synchronization pattern could be applied to other long-context agents that need to track evolving facts without full context recompression.
  • If question selection or synchronization occasionally introduces small errors, those errors could compound over hundreds of turns and would need explicit correction mechanisms.
  • Evaluating the system on live user-driven conversations rather than static artifacts would test whether the performance gains hold outside curated storylines.

Load-bearing premise

Task-relevant questions for bookmarks can be reliably selected or initialized, and synchronization methods can accurately update their answers without introducing errors or missing critical context as the storyline advances.

What would settle it

If BOOKMARKS fails to produce higher consistency scores than recurrent summarization baselines when evaluated on the same 85 characters from 16 artifacts, the claim that search-based memory is more effective would be refuted.

Figures

Figures reproduced from arXiv: 2605.14169 by Jingbo Shang, Kun Zhou, Letian Peng, Longfei Yun, Yiming Huang, Yupeng Hou, Ziche Liu.

Figure 1
Figure 1: BOOKMARKS grounds role-playing by actively searching for useful information from the preceding storyline, while passively updating previous search results for efficiency. view at source ↗
Figure 2
Figure 2: The role-playing grounding workflow of BOOKMARKS. view at source ↗
Figure 3
Figure 3: The hit rate (matching an existing bookmark) and efficiency analysis of BOOKMARKS. view at source ↗
Figure 4
Figure 4: Case study comparing BOOKMARKS and conventional profiling, based on multiple action prediction. view at source ↗
original abstract

Memory systems are critical for role-playing agents (RPAs) to maintain long-horizon consistency. However, existing RPA memory methods (e.g., profiling) mainly rely on recurrent summarization, whose compression inevitably discards important details. To address this issue, we propose a search-based memory framework called BOOKMARKS, which actively initializes, maintains, and updates task-relevant pieces of bookmarks for the current task (e.g., character acting). A bookmark is structured as the answer to a question at a specific point in the storyline. For each current task, BOOKMARKS selects reusable existing bookmarks or initializes new ones (at storyline beginning) with useful questions. These bookmarks are then synchronized to the current story point, with their answers updated accordingly, so they can be efficiently reused in future grounding rounds. Compared with recurrent summarization, BOOKMARKS offers (1) active grounding for capturing task-specific details and (2) passive updating to avoid unnecessary computation. In implementation, BOOKMARKS supports concept, behavior, and state searches, each powered by an efficient synchronization method. BOOKMARKS significantly outperforms RPA memory baselines on 85 characters from 16 artifacts, demonstrating the effectiveness of search-based memory for RPAs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes BOOKMARKS, a search-based memory framework for role-playing agents that actively selects or initializes task-relevant bookmarks (structured as question-answer pairs tied to specific storyline points), synchronizes their answers to the current narrative state via concept/behavior/state searches, and reuses them for efficient grounding. It claims this avoids the detail loss of recurrent summarization, offering active grounding and passive updating, with significant outperformance over RPA memory baselines demonstrated on 85 characters from 16 artifacts.

Significance. If the empirical claims are supported by detailed metrics, controls, and ablations, the approach could meaningfully advance long-horizon consistency in RPAs by providing a non-compressive alternative to profiling methods, with potential for broader use in interactive narrative systems where task-specific details must be preserved without full history recomputation.

major comments (2)
  1. [Abstract] The central claim that BOOKMARKS 'significantly outperforms RPA memory baselines' on 85 characters from 16 artifacts is asserted without any reported metrics (e.g., accuracy, consistency scores), baseline descriptions, statistical tests, or ablation results, rendering the empirical contribution impossible to evaluate from the provided text.
  2. [Method] Synchronization description: The synchronization methods for updating bookmark answers as the storyline evolves are outlined at a high level for concept, behavior, and state searches, but no error-rate analysis, failure-case examination (e.g., narrative contradictions or context loss), or fidelity comparison versus recurrent summarization is supplied; this directly undermines validation of the claimed advantage over compression-based methods.
minor comments (1)
  1. [Abstract] The term 'artifacts' is introduced without definition or reference to a specific dataset or collection of stories; this should be clarified with a citation or brief description for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and completeness.

point-by-point responses
  1. Referee: [Abstract] The central claim that BOOKMARKS 'significantly outperforms RPA memory baselines' on 85 characters from 16 artifacts is asserted without any reported metrics (e.g., accuracy, consistency scores), baseline descriptions, statistical tests, or ablation results, rendering the empirical contribution impossible to evaluate from the provided text.

    Authors: We agree that the abstract would be strengthened by including specific quantitative results. The full manuscript reports these details in the Experiments section, including consistency and accuracy metrics, baseline descriptions (e.g., recurrent summarization and profiling methods), ablation studies, and statistical tests. We will revise the abstract to incorporate key performance numbers, baseline references, and significance indicators. revision: yes

  2. Referee: [Method] Synchronization description: The synchronization methods for updating bookmark answers as the storyline evolves are outlined at a high level for concept, behavior, and state searches, but no error-rate analysis, failure-case examination (e.g., narrative contradictions or context loss), or fidelity comparison versus recurrent summarization is supplied; this directly undermines validation of the claimed advantage over compression-based methods.

    Authors: The synchronization procedures are specified in the Method section, including the three search types with algorithmic descriptions. We acknowledge the absence of dedicated error-rate analysis and failure-case discussion. We will add a new subsection providing quantitative error rates, examples of edge cases such as narrative contradictions, and fidelity comparisons (e.g., information retention metrics) against recurrent summarization. revision: yes

Circularity Check

0 steps flagged

No significant circularity: empirical claims rest on external benchmarks

full rationale

The paper introduces BOOKMARKS as a search-based memory framework for role-playing agents, with bookmarks defined as question-answer pairs synchronized across storyline points. All load-bearing claims are empirical comparisons against RPA baselines on 85 characters from 16 artifacts. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. The synchronization and initialization procedures are presented as engineering choices evaluated by performance metrics, not as results forced by definition or prior self-work. The evaluation chain is therefore checked against external data rather than closed on itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on the domain assumption that useful questions can be chosen to capture task-relevant details and that synchronization preserves accuracy; the bookmark entity itself is newly introduced without external validation.

axioms (1)
  • domain assumption: Task-relevant bookmarks can be initialized at storyline beginnings and synchronized to current points without critical information loss.
    Invoked to justify passive updating and reuse across grounding rounds.
invented entities (1)
  • Bookmark: no independent evidence
    purpose: Structured memory unit consisting of the answer to a question at a specific storyline point.
    Newly postulated memory primitive enabling search-based rather than summarization-based retention.

pith-pipeline@v0.9.0 · 5526 in / 1149 out tokens · 42048 ms · 2026-05-15T04:48:45.692385+00:00 · methodology

