A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.
Najoung Kim and Tal Linzen
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
R-EMID metric with upper bound shows user shifts pose highest risk to role-playing model generalization, with co-evolving RL as most effective mitigation.
LLMs show strong spatial generalization to unseen maps in shortest-path tasks but fail length scaling due to recursive instability, with data coverage setting hard limits.
LLMs solve compositional factual recall either by computing intermediates or directly, with mechanism choice correlated to translation geometry in embedding spaces.
citing papers explorer
-
Training Transformers as a Universal Computer
A transformer trained on random meaningless MicroPy programs generalizes to execute diverse human-written programs, providing empirical evidence it can act as a universal computer.
-
Understanding Generalization in Role-Playing Models via Information Theory
R-EMID metric with upper bound shows user shifts pose highest risk to role-playing model generalization, with co-evolving RL as most effective mitigation.
-
Generalization in LLM Problem Solving: The Case of the Shortest Path
LLMs show strong spatial generalization to unseen maps in shortest-path tasks but fail length scaling due to recursive instability, with data coverage setting hard limits.
-
How Do Language Models Compose Functions?
LLMs solve compositional factual recall either by computing intermediates or directly, with mechanism choice correlated to translation geometry in embedding spaces.
- Learning to Theorize the World from Observation