
arxiv: 2605.06898 · v1 · submitted 2026-05-07 · 💻 cs.AI

Self-Programmed Execution for Language-Model Agents

Pith reviewed 2026-05-11 01:04 UTC · model grok-4.3

classification 💻 cs.AI
keywords self-programmed execution · language model agents · agentic machines · Spell language · self-modification · orchestration policy · side-effect management · Lisp-based execution

The pith

Language models can act as agents by generating and executing their own orchestrator programs rather than following any fixed policy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces self-programmed execution (SPE), an architecture in which the model's own completion serves as the program that controls state transitions between turns. This replaces the usual fixed harness that dictates orchestration rules in advance. To make this practical, it defines agentic machines, in which an SPE state is one from which a completion can load any state of an embedded copy of the machine, and supplies the Spell language, a Lisp variant that lets programs edit themselves while structuring side effects so that re-evaluation does not repeat prior actions. Experiments with unmodified frontier models show they can already carry out demanding agent tasks under this regime. A reader would care because the result implies agents need not be locked into designer-specified control flows and could instead develop their own strategies for managing their execution.

Core claim

The paper establishes that a language model can operate as an agent without any fixed orchestration policy. It formalizes this via agentic machines in which an SPE state is one from which a model completion can load any state of an embedded copy of the machine. The practical realization uses the Spell language, in which programs edit and re-evaluate themselves and effectful expressions such as model invocations are arranged so that re-evaluation after editing does not replay side effects. Experiments confirm that existing frontier models, without any training for SPE or Spell, can already succeed at challenging agentic tasks under this setup.
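To make the contrast concrete, the following Python sketch places a fixed orchestrator next to a self-programmed step. It is a toy under stated assumptions, not the paper's formalism: `State`, `fixed_step`, `spe_step`, and the stub `toy_model` are hypothetical names, and the completion is modelled as a Python expression string rather than a Spell program.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class State:
    """Hypothetical machine state: just the context shown to the model."""
    context: str


Model = Callable[[str], str]


def fixed_step(model: Model, state: State) -> State:
    """Fixed orchestration: the harness decides how a completion updates state."""
    completion = model(state.context)
    # Hard-coded policy: always append the completion to the context.
    return State(context=state.context + "\n" + completion)


def spe_step(model: Model, state: State) -> State:
    """Self-programmed step: the completion is a program that yields the next state."""
    completion = model(state.context)
    # The harness only evaluates; the completion chooses the next state.
    next_state = eval(completion, {"State": State, "state": state})
    if not isinstance(next_state, State):
        raise TypeError("completion must evaluate to a State")
    return next_state


def toy_model(context: str) -> str:
    """Stub standing in for a language model (assumption for this sketch)."""
    return "State(context='summary: ' + state.context[:5])"


start = State(context="hello world")
appended = fixed_step(toy_model, start)   # old context is always retained
rewritten = spe_step(toy_model, start)    # completion replaced the context
```

Under `fixed_step` the next context always contains the previous one, whereas under `spe_step` the completion can construct any `State` it likes, which is the toy analogue of a completion loading an arbitrary state.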

What carries the argument

Self-programmed execution (SPE) realized through agentic machines and the Spell Lisp-based language, in which a model completion becomes the orchestrator program that can edit and re-evaluate itself without replaying side effects.

If this is right

  • Existing models can already complete complex agentic tasks without any pre-specified turn-to-turn orchestration.
  • No external harness needs to impose a fixed orchestration policy once the model outputs its own executable program.
  • Training models specifically for self-programmed execution could allow them to discover and refine their own orchestration strategies.
  • The architecture separates the model’s generative role from any rigid control structure, enabling fully model-driven state management.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Agents built this way could dynamically change their own reasoning loops mid-task without external intervention.
  • The approach may scale to systems that maintain long-running self-modifying control programs across many interactions.
  • It opens the possibility of measuring and comparing different self-orchestration patterns that models learn when trained under SPE.
  • Integration with other self-referential mechanisms could let agents optimize their own resource use or error recovery.

Load-bearing premise

The same data can function simultaneously as model context and executable program while preventing unintended replay of side effects during self-modification and re-evaluation.
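One way to picture this premise is a harness that, after running an effect once, rewrites the program so the effect call is replaced by a binding of its result. The sketch below is an illustration only and is not Spell's actual mechanism: the `name <- effect(...)` trailing-line syntax, `run_turn`, and the `read_file` effect are all invented for this example.

```python
calls = []  # records every time a real side effect runs


def read_file(path: str) -> str:
    """Hypothetical effect: pretends to read a file and logs that it ran."""
    calls.append(path)
    return f"<contents of {path}>"


EFFECTS = {"read_file": read_file}


def run_turn(program: list[str]) -> list[str]:
    """Re-evaluate the whole program, run its trailing effect once, rewrite it.

    Body lines have the form `name = expr` and are pure.
    The last line has the invented form `name <- effect_expr`.
    """
    env: dict = {}
    *body, trailing = program
    for line in body:
        exec(line, {}, env)  # effect functions are not bound in this phase
    name, expr = (part.strip() for part in trailing.split("<-", 1))
    value = eval(expr, dict(EFFECTS), env)  # effects are bound only here
    # Replace the effect call with its result so re-evaluation cannot replay it.
    return body + [f"{name} = {value!r}"]


# Turn 1: the program text is both the context and the executable.
program = ["path = 'main.py'", "code <- read_file(path)"]
program = run_turn(program)

# Turn 2: the (stub) model extends the program; everything is re-evaluated.
program = run_turn(program + ["other <- read_file('test.py')"])
```

After the second turn, `calls` holds one entry per effect even though the first two lines were evaluated twice, because the second evaluation saw a plain binding where the first effect call used to be.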

What would settle it

Running the provided Spell programs with frontier models on the reported agentic tasks and observing either repeated unintended side effects on re-evaluation or consistent failure to complete the tasks without a fixed external policy.
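A check of this kind could be run over an effect trace. The sketch below assumes the harness can log each executed effect together with a stable identifier for the expression that issued it; that logging format, `EffectEvent`, and `find_replays` are hypothetical and are not taken from the released code.

```python
from typing import NamedTuple


class EffectEvent(NamedTuple):
    turn: int   # harness turn in which the effect actually executed
    site: str   # assumed stable id of the expression that issued the effect
    call: str   # printable form of the call, kept for reporting


def find_replays(trace: list[EffectEvent]) -> dict[str, list[int]]:
    """Return the sites whose effect executed more than once, with their turns."""
    turns_by_site: dict[str, list[int]] = {}
    for event in trace:
        turns_by_site.setdefault(event.site, []).append(event.turn)
    return {site: turns for site, turns in turns_by_site.items() if len(turns) > 1}


trace = [
    EffectEvent(1, "site-a", "read_file('main.py')"),
    EffectEvent(2, "site-b", "read_file('test.py')"),
    EffectEvent(3, "site-a", "read_file('main.py')"),  # same expression ran again
]
replays = find_replays(trace)
```

A non-empty result only flags candidates: a program may legitimately issue the same call from a newly written expression, so the site identifier, not the call text, is what the check keys on.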

Figures

Figures reproduced from arXiv: 2605.06898 by Luke J. O'Connor.

Figure 1
Figure 1: Three agent architectures. Colored boxes distinguish program logic that is implemented in the harness (blue) vs. written by the model (yellow). In all cases, these programs are executed by a harness runtime which is external to the model. (a) In ReAct [Yao et al., 2023], the model selects from a prescribed action space. An orchestrator program runs the agent loop, maintaining state (e.g., conversation hist…
Figure 2
Figure 2: Accuracy and fatal Spell-error rate by model. Each model was run on a set of 32 Terminal-Bench 1.1 and SWE-bench Lite tasks. A task was counted as a fatal error if its final turn produced an unrecovered Spell/runtime error. GPT-5.4 and Opus 4.6 were configured with medium reasoning effort, GLM-5.1 and Qwen3.6 Plus with high effort, and Kimi-K2.6 with default effort.
Figure 3
Figure 3: Comparison with Codex CLI on coding benchmarks. Left: Terminal-Bench 1.1. Right: SWE-bench Lite. Each point is one full benchmark run with GPT-5.4 at low, medium, or high reasoning effort. For numerical results, see Appendix C.5.
[Plot text following the caption: axis "Resolved tasks (%)"; series Terminal-Bench 1.1 (n=80), SWE-bench Lite (n=300), LongBench v2 (n=200), AppWorld dev (n=57); per-series cost labels truncated in extraction.]
Figure 5
Figure 5: Mean cached input, uncached input, and output tokens per task in medium-effort
Original abstract

At the heart of existing language model agents is a fixed orchestrator program responsible for the state transition between consecutive turns. This paper introduces self-programmed execution (SPE), an agent architecture in which the model completion is itself the orchestrator program, and the harness evaluates this program but does not impose its own orchestration policy. I formalize this idea using agentic machines: an SPE state is one from which a model completion can load any state of an embedded copy of the machine, meaning that it is subject to no fixed turn-to-turn orchestration policy. Realizing SPE in practice is nontrivial because the same data is both model context and executable program. I therefore introduce Spell, a Lisp-based language in which programs can edit and re-evaluate themselves, and effectful expressions like model invocations are structured such that re-evaluating an edited program does not replay its side effects. Experiments with existing models, not trained for SPE or Spell, show that frontier models can operate in this regime and accomplish challenging agentic tasks. These results demonstrate how an LM can act as an agent without any fixed orchestration policy, and they raise the question of what self-orchestration strategies might be learned by a model trained for self-programmed execution. Code is available at https://github.com/lukejoconnor/spell.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces self-programmed execution (SPE) for LM agents, in which the model completion itself acts as the orchestrator program evaluated by a harness with no fixed turn-to-turn policy. It defines agentic machines such that an SPE state allows a model completion to load any state of an embedded machine copy. To realize this, the paper presents Spell, a Lisp-based language supporting self-editing and re-evaluation where effectful operations (e.g., model calls) are wrapped to avoid replay on edits. Experiments with untrained frontier models are reported to show successful performance on challenging agentic tasks, with code released.

Significance. If the central claims hold, this architecture could enable more adaptive LM agents free of hardcoded orchestration loops, opening questions about learned self-orchestration strategies. The release of code and the parameter-free formalization of SPE states are strengths that support reproducibility and further exploration.

major comments (2)
  1. [Spell language section] § on Spell and effectful expressions: the claim that wrapping prevents replay of side effects after arbitrary self-edits relies on the Lisp evaluator and specific form structure, but the paper does not demonstrate robustness against model-generated code that might re-bind or quote effectful forms (e.g., via (let ((f (lambda () (model-call)))) ...)). This is load-bearing for the 'no fixed orchestration' definition of SPE states.
  2. [Experiments] Experiments section: the abstract asserts success on 'challenging agentic tasks' with frontier models, but without reported task definitions, quantitative metrics, baselines, or failure modes, it is difficult to assess whether the results support the claim that models operate in the SPE regime rather than via prompt engineering that avoids edge cases.
minor comments (2)
  1. The GitHub link is provided; confirm it includes the exact Spell evaluator and prompt templates used in the reported runs.
  2. [Formalization] Notation for agentic machine states could be clarified with a small diagram or pseudocode example of a state transition.
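The re-binding concern in major comment 1 can be illustrated with a small Python analogue. This is a sketch of one possible guard, assuming effect functions are simply left unbound while the program body runs and bound only while a trailing expression is evaluated; `run` and `model_call` are hypothetical names, and whether Spell's evaluator behaves this way for every form is exactly the question the referee raises.

```python
def model_call(prompt: str) -> str:
    """Hypothetical effect standing in for a model invocation."""
    return f"reply to {prompt!r}"


def run(body: str, trailing: str) -> object:
    """Execute `body` without effect bindings, then `trailing` with them."""
    env: dict = {}
    exec(body, env)                 # model_call is unbound during this phase
    env["model_call"] = model_call  # bound only for the trailing expression
    try:
        return eval(trailing, env)
    finally:
        del env["model_call"]       # unbind again once the trailing phase ends


# Wrapping the effect in a lambda is allowed: defining it calls nothing.
ok = run("f = lambda: model_call('hi')", "f()")

# Invoking the wrapped effect inside the body still fails, because the name
# is looked up when the lambda is called, not when it is defined.
try:
    run("f = lambda: model_call('hi')\nx = f()", "x")
    body_call_blocked = False
except NameError:
    body_call_blocked = True
```

The guard works here because Python resolves global names at call time. An evaluator that captured bindings when the closure was created would need a different argument, which is why a demonstration specific to Spell's own semantics would be the stronger answer.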

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Spell language section] § on Spell and effectful expressions: the claim that wrapping prevents replay of side effects after arbitrary self-edits relies on the Lisp evaluator and specific form structure, but the paper does not demonstrate robustness against model-generated code that might re-bind or quote effectful forms (e.g., via (let ((f (lambda () (model-call)))) ...)). This is load-bearing for the 'no fixed orchestration' definition of SPE states.

    Authors: We agree that robustness to arbitrary model-generated Lisp forms is central to the SPE definition. The current manuscript describes the wrapping mechanism for effectful expressions and relies on the evaluator's treatment of these forms to prevent replay. In the revised version we will add a dedicated subsection with concrete examples and a short argument showing that common constructs (let, lambda, quote, and similar) cannot bypass the wrapper, because the harness maintains separate evaluation state that is not captured by re-binding or quoting within the model completion. This will strengthen the formal claim without altering the core architecture. revision: yes

  2. Referee: [Experiments] Experiments section: the abstract asserts success on 'challenging agentic tasks' with frontier models, but without reported task definitions, quantitative metrics, baselines, or failure modes, it is difficult to assess whether the results support the claim that models operate in the SPE regime rather than via prompt engineering that avoids edge cases.

    Authors: The experiments are presented as qualitative demonstrations that untrained frontier models can successfully execute in the SPE regime on non-trivial agentic tasks. We acknowledge that the current text provides limited quantitative detail. In the revision we will expand the experiments section to include explicit task definitions, the success criteria applied, comparison against standard fixed-orchestrator baselines where feasible, and a summary of observed failure modes. These additions will make it easier to evaluate whether the models are genuinely operating without fixed turn-to-turn policy. revision: yes

Circularity Check

0 steps flagged

No circularity: SPE and Spell are independent definitions with experimental support.

Full rationale

The paper introduces SPE as a new agent architecture defined via agentic machines and Spell as a Lisp variant for self-editing without side-effect replay. These are presented as architectural proposals rather than derivations from fitted parameters or prior results. The central claim rests on experiments with unmodified frontier models, not on any self-citation load-bearing step, uniqueness theorem, or renaming of known patterns. No equations reduce by construction to inputs, and the formalization is self-contained without smuggling ansatzes or calling fitted quantities predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The central claim depends on the novel SPE architecture and Spell language, plus the assumption that existing models can handle self-modifying code without side-effect replay.

axioms (1)
  • domain assumption: Language models can generate and safely execute self-modifying programs where data serves as both context and code without unintended side-effect replay.
    Required for the Spell design and SPE state definition to function as described.
invented entities (3)
  • Self-programmed execution (SPE) · no independent evidence
    purpose: Agent architecture in which the model completion itself serves as the orchestrator program with no fixed turn-to-turn policy.
    Newly introduced formalization using agentic machines.
  • Spell · no independent evidence
    purpose: Lisp-based language enabling programs to edit and re-evaluate themselves while structuring effectful expressions to avoid replaying side effects.
    Newly proposed implementation language.
  • Agentic machines · no independent evidence
    purpose: Formal model where an SPE state allows a model completion to load any state of an embedded machine copy.
    New formal concept for defining states without fixed orchestration.

pith-pipeline@v0.9.0 · 5522 in / 1253 out tokens · 45723 ms · 2026-05-11T01:04:49.881129+00:00 · methodology

