arXiv: 2605.22221 · v1 · pith:GJOPH4CZnew · submitted 2026-05-21 · 💻 cs.LG · cs.AI · cs.LO

Can Transformers Learn to Verify During Backtracking Search?

Pith reviewed 2026-05-22 07:15 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.LO
keywords backtracking search · transformers · selective state attention · constraint satisfaction · history dependence · reasoning · search verification

The pith

A fixed attention mask lets transformers base backtracking decisions only on the current state rather than on search history.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that when transformers are trained to predict continue-or-backtrack on full cumulative traces of solver steps, they do not learn to ignore irrelevant prior history. This happens because state features get scattered across the sequence and because the model can latch onto trajectory-specific patterns. The authors introduce localization to keep state features local and Selective State Attention, which is a fixed mask that blocks attention to non-state tokens. With this, the model produces the same decision for two different paths that arrive at identical states. This matters for any system where a language model generates its own search tree, since the right next step should not depend on how it got there.

Core claim

Decoder-only transformers trained autoregressively on cumulative solver traces fail to make decisions that depend only on the current search state, due to scattered state features and history entanglement. Selective State Attention addresses this by applying a fixed attention mask that enforces attention only within the current state block, allowing the model to emit identical continue-or-backtrack predictions for same-state pairs reached via different histories.

What carries the argument

Selective State Attention (SSA), a fixed attention mask over the input trace that restricts the model's attention to tokens representing the current search state.
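
A minimal sketch of what such a mask could look like, assuming each token is tagged with the index of the decision block it belongs to. The names `block_causal_mask` and `visible_positions` are hypothetical, and the layout is a simplification: the caption of Figure 4 suggests the paper's mask also includes a separately handled prefix region, which is not modeled here.

```python
from typing import List


def block_causal_mask(block_ids: List[int]) -> List[List[bool]]:
    """allowed[i][j] is True when query position i may attend to key position j.

    A position sees only earlier-or-equal positions (causal) that belong to
    the same block, so tokens from previous blocks are never visible.
    """
    n = len(block_ids)
    return [
        [j <= i and block_ids[j] == block_ids[i] for j in range(n)]
        for i in range(n)
    ]


def visible_positions(block_ids: List[int], query: int) -> List[int]:
    """Indices that the given query position is allowed to attend to."""
    row = block_causal_mask(block_ids)[query]
    return [j for j, ok in enumerate(row) if ok]


# Two hypothetical traces with different histories but a 3-token final block.
trace_a = [0, 0, 1, 1, 1]         # one earlier block, then the current block
trace_b = [0, 0, 0, 1, 2, 2, 2]   # two earlier blocks, then the current block

print(visible_positions(trace_a, len(trace_a) - 1))  # last token of trace_a
print(visible_positions(trace_b, len(trace_b) - 1))  # last token of trace_b
```

In both traces the final token sees exactly the three tokens of its own block. The absolute indices differ, so a real implementation would also need to decide how positions are encoded within a block; that choice is outside this sketch.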

If this is right

  • The model will give consistent verification answers after propagation detects a contradiction, independent of path.
  • Training requires no changes to data, loss, or model parameters.
  • The approach applies across constraint satisfaction problems like 3-SAT and planning domains like Blocks World.
  • Similar issues may arise in any autoregressive model that searches by generating its own intermediate steps.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • At inference time, trimming the context down to the current state alone could achieve similar isolation without needing the mask during training.
  • This structural fix might help in theorem provers or planners that use transformers to explore proof or plan trees.
  • The diagnostic of history entanglement could be applied to other sequential tasks where state determinism is required.
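
The context-clearing idea in the first bullet can be sketched as follows. This assumes a hypothetical trace format in which every state block opens with a `<state>` marker and the state is serialized in a canonical order; neither assumption comes from the paper.

```python
from typing import List, Sequence

STATE_START = "<state>"  # hypothetical delimiter, not the paper's token


def clear_to_current_state(trace: Sequence[str], marker: str = STATE_START) -> List[str]:
    """Return the suffix of the trace that starts at the last state marker."""
    for i in range(len(trace) - 1, -1, -1):
        if trace[i] == marker:
            return list(trace[i:])
    raise ValueError("trace contains no state block")


# Two hypothetical paths that reach the same assignment {x1=T, x2=F}.
path_a = ["<state>", "x1=T", "<state>", "x1=T", "x2=F"]
path_b = ["<state>", "x2=F", "<state>", "x2=F", "x1=T", "<state>", "x1=T", "x2=F"]

print(clear_to_current_state(path_a))
print(clear_to_current_state(path_b))
```

After clearing, both paths yield the same token sequence, so any deterministic model fed that sequence returns the same decision for both. The sketch only works if equal states serialize identically, which is why the canonical ordering assumption matters.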

Load-bearing premise

The test domains and the way same-state pairs are constructed are representative enough that matching decisions on those pairs implies the model has learned state-based verification in general.

What would settle it

Run SSA and the baseline on a new backtracking problem, construct pairs of trajectories that reach the identical partial solution state through different sequences, and check if their continue-or-backtrack outputs match exactly.
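
A rough sketch of that check, assuming exact match on the emitted decision as the agreement criterion. The function `agreement_rate` and the two toy predictors are hypothetical stand-ins for illustration; they are not the paper's models, and the toy traces are not data from the paper.

```python
from typing import Callable, Iterable, Sequence, Tuple

Trace = Sequence[str]


def agreement_rate(
    predict: Callable[[Trace], str],
    pairs: Iterable[Tuple[Trace, Trace]],
) -> float:
    """Fraction of same-state pairs on which both traces get the same decision."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    matches = sum(predict(a) == predict(b) for a, b in pairs)
    return matches / len(pairs)


def state_only(trace: Trace) -> str:
    # Toy predictor: reads only the final token, which plays the role of the state.
    return "backtrack" if trace[-1] == "conflict" else "continue"


def history_dependent(trace: Trace) -> str:
    # Toy caricature of history entanglement: the decision depends on trace length.
    return "backtrack" if len(trace) > 3 else "continue"


# Each pair ends in the same toy state but gets there via a different history.
pairs = [
    (["a", "conflict"], ["b", "c", "d", "conflict"]),
    (["a", "ok"], ["b", "c", "d", "ok"]),
]

print(agreement_rate(state_only, pairs))         # 1.0
print(agreement_rate(history_dependent, pairs))  # 0.0
```

A state-based verifier should score 1.0 on any such set of pairs. A lower score shows only that history leaks into the decision; it does not say which of the two decisions is correct.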

Figures

Figures reproduced from arXiv: 2605.22221 by Katsumi Inoue, Tony Ribeiro, Tuan Nguyen, Yin Jun Phua.

Figure 1: SSA versus causal attention on a 3-SAT instance with decision blocks
Figure 2: Three trace encodings for tree traversal:
Figure 3: All-correct verification accuracy on star-tree traversal as a function
Figure 4: SSA attention mask. Blue: full access; green: causal within the prefix;
Figure 5: Schematic of two constructs introduced in this section.
Figure 6: Solve rates across three domains (mean ± std). Left: primary 3-SAT under three inference protocols. Center: 3-SAT n=75 phase-transition (8 seeds, cumulative inference only) with per-seed points. Right: Backtracking parsing under two protocols. reliably as trajectories grow. On backtracking parsing of an ambiguous expression grammar (state block contains only the input tokens, cursor, and parser stack), S…
Figure 7: BFS and DFS training curves on tree traversal under the natural en
Figure 8: Solve rate under synthetic corruption of oracle dead-end decisions. False
Figure 9: Solve rate versus conditional false-prune rate
Figure 10: Length-OOD comparison (3 seeds, held-out validation, shaded regions
Figure 11: Padding-control counterfactual (3 seeds, 600 trials per model and
Figure 12: Per-seed solve rates for the train-from-scratch ablation. Error bars:

Original abstract

Backtracking search underlies classical constraint solvers, planners, and theorem provers. Recent transformer-based reasoning systems explore search trees over their own intermediate steps. A common training recipe fits an autoregressive next-token loss on offline solver traces. The model's input at each step is a cumulative trace of all prior decisions. The optimal continue-or-backtrack predictor depends only on the current search state, since two trajectories reaching the same state admit the same viable continuations. We show that decoder-only transformers trained on cumulative traces fail this requirement in two ways: the trace can scatter state features across many positions (scattered retrieval), and the predictor can condition on the trajectory rather than the state (history entanglement). We address scattered retrieval with localization, a trace-level fix that rewrites each decision block to expose state features locally. We address history entanglement with Selective State Attention (SSA), a fixed attention mask that enforces state-based decisions structurally without modifying training data, objective, or parameters. We focus on reactive verification, after propagation has exposed a contradiction. We test SSA on 3-SAT, graph coloring, Blocks World, and backtracking parsing. On same-state pairs that differ only in prior history, SSA emits identical decisions while a cumulative-trained causal baseline does not. Our contribution is a diagnostic of transformer behavior on serialized trajectory data, paired with a structural fix. Pretrained language models that search over their own reasoning steps may face the same failure. Our analysis opens up inference-time context clearing as a candidate way to apply the same isolation without retraining.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that decoder-only transformers trained autoregressively on cumulative solver traces for backtracking search fail to produce state-dependent decisions due to scattered state features across the trace and history entanglement. It introduces localization (a trace rewrite exposing state features locally) and Selective State Attention (SSA), a fixed attention mask enforcing attention only to the current state block. Focused on reactive verification after propagation, experiments on 3-SAT, graph coloring, Blocks World, and backtracking parsing show SSA yields identical decisions on same-state pairs differing only in prior history, unlike a cumulative causal baseline. The contribution is positioned as a diagnostic plus structural fix, with inference-time context clearing suggested as an alternative.

Significance. If the empirical result holds, the work is significant for identifying a concrete failure mode in transformer-based search over serialized trajectories and offering a parameter-free structural intervention (SSA) that does not alter data, loss, or model size. The same-state pair diagnostic is a clear, falsifiable test of state-based verification. Credit is due for the reproducible experimental setup across four domains and for highlighting relevance to pretrained LLMs performing internal search.

major comments (2)
  1. [§4] §4 (Experiments, same-state pair results): The central claim that SSA produces identical decisions while the baseline does not rests on the construction and evaluation of same-state pairs. The manuscript provides no equations, pseudocode, or precise definition of the identity metric (exact token match vs. distribution distance) nor reports error bars or run-to-run variance. This detail is load-bearing because, without it, the observed identity could arise from pair-generation artifacts rather than the mask, as the stress-test concern highlights.
  2. [§3.2] §3.2 (Selective State Attention): The claim that the fixed mask enforces state-based decisions 'structurally without side effects on learning dynamics' is not supported by any ablation on gradient flow, expressivity for non-contradiction steps, or comparison to learned masks. Because the mask is presented as the key load-bearing fix for history entanglement, absence of this analysis leaves the generality of the result open.
minor comments (2)
  1. [§1] The abstract and §1 mention 'inference-time context clearing' as a candidate application but provide no concrete implementation sketch or preliminary result.
  2. [§3.2] Notation for the attention mask in SSA would benefit from an explicit diagram or matrix example showing which positions remain visible after localization.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of the work's significance and for the constructive major comments. We address each point below and commit to revisions that strengthen the presentation without altering the core claims.

Point-by-point responses
  1. Referee: [§4] §4 (Experiments, same-state pair results): The central claim that SSA produces identical decisions while the baseline does not rests on the construction and evaluation of same-state pairs. The manuscript provides no equations, pseudocode, or precise definition of the identity metric (exact token match vs. distribution distance) nor reports error bars or run-to-run variance. This detail is load-bearing because, without it, the observed identity could arise from pair-generation artifacts rather than the mask, as the stress-test concern highlights.

    Authors: We agree that the precise definition and supporting statistics are necessary to substantiate the central claim and to address potential artifacts. In the revised manuscript we will add equations and pseudocode that formally define same-state pair construction (identical current search state reached via distinct histories) and the identity metric (exact token-level match on the verification decision). We will also report mean and standard deviation of the identity rate across five independent training runs with different random seeds, confirming that the effect is robust rather than an artifact of any single pair-generation procedure. revision: yes

  2. Referee: [§3.2] §3.2 (Selective State Attention): The claim that the fixed mask enforces state-based decisions 'structurally without side effects on learning dynamics' is not supported by any ablation on gradient flow, expressivity for non-contradiction steps, or comparison to learned masks. Because the mask is presented as the key load-bearing fix for history entanglement, absence of this analysis leaves the generality of the result open.

    Authors: The fixed mask in SSA is applied identically during training and inference and simply masks out attention logits outside the current state block, so those positions receive zero attention weight; the loss, optimizer, and parameter updates therefore remain unchanged for all attended positions. We will add a short clarifying paragraph in §3.2 that explains this construction and notes that expressivity for non-contradiction steps is preserved because the mask never removes state features required for the verification decision. A direct comparison to learned masks would require a different training regime and is therefore outside the scope of the present parameter-free intervention; we view such an ablation as valuable future work rather than a prerequisite for the current result. revision: partial
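
To illustrate the masking step described in the second response, here is a small sketch of a masked softmax over one row of attention logits. It assumes the common implementation in which disallowed logits are replaced by negative infinity before the softmax; whether the paper does exactly this is not stated in the text above. Note that setting a logit to 0 would not hide a position, since exp(0) = 1 still contributes weight.

```python
import math
from typing import List


def masked_softmax(logits: List[float], allowed: List[bool]) -> List[float]:
    """Softmax over the allowed positions; disallowed positions get weight 0."""
    masked = [x if ok else -math.inf for x, ok in zip(logits, allowed)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("at least one position must be allowed")
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: the first two positions are history, the last two are state.
allowed = [False, False, True, True]
logits_a = [5.0, -1.0, 0.3, 0.7]   # one history
logits_b = [-2.0, 9.0, 0.3, 0.7]   # a different history, same state logits

weights_a = masked_softmax(logits_a, allowed)
weights_b = masked_softmax(logits_b, allowed)
print(weights_a == weights_b)  # True: history logits cannot affect the weights
```

Because the history positions receive exactly zero weight, changing their logits leaves the attention output untouched, which is the structural property the response relies on.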

Circularity Check

0 steps flagged

No circularity: empirical identity result on same-state pairs is not forced by construction

full rationale

The paper's central claim is an empirical observation that SSA produces identical decisions on constructed same-state pairs (differing only in prior history) while a cumulative baseline does not. This rests on experimental comparison across domains rather than any derivation, equation, or fitted parameter that reduces to its own inputs by definition. No self-definitional steps, predictions called from fits, load-bearing self-citations, or ansatzes smuggled via prior work appear in the described chain. The SSA mask is presented as a structural intervention whose effect is measured externally on test pairs, making the analysis self-contained against the benchmark of decision identity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard transformer architecture assumptions and the empirical construction of same-state test pairs; no free parameters are introduced and the new entity is the SSA mask itself.

axioms (1)
  • [domain assumption] Decoder-only transformers trained autoregressively on cumulative traces can be evaluated on state-equivalence pairs.
    Invoked when constructing the diagnostic test for identical decisions on same-state pairs.
invented entities (1)
  • Selective State Attention (SSA) [no independent evidence]
    purpose: Fixed attention mask that enforces state-based decisions without modifying training data or parameters.
    New structural component introduced to address history entanglement.
