The IsalProgram Programming Language
Pith reviewed 2026-05-19 18:43 UTC · model grok-4.3
The pith
IsalProgram is an assembly-like language where every finite string is a valid program, it is regular, and it uses no memory addresses or variable names.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that IsalProgram, defined over a fixed set of instructions and executed on a virtual machine using a circular doubly linked list with three data pointers and two code pointers, forms a regular language in which every string is valid and computations are possible without explicit addressing.
What carries the argument
The virtual machine's circular doubly linked list navigated by three data pointers and controlled by two code pointers, which enables address-free execution.
If this is right
- Program synthesis systems can generate any string without worrying about syntax validity.
- The space of programs can be searched using edit distance metrics such as Levenshtein distance.
- Neural networks can target this language more easily due to its uniform and regular structure.
- Analysis of what is computable can proceed from the pointer navigation rules on the circular list.
Where Pith is reading between the lines
- Genetic programming could benefit by never producing invalid individuals since all strings run.
- Equivalence of programs might be decidable more readily because of the finite automaton basis.
- Extensions to more pointers or different list structures could be tested for greater computational power.
Load-bearing premise
That the described virtual machine with its circular doubly linked list and pointers can carry out the intended computations while maintaining the language's regularity and lack of addresses.
What would settle it
Finding a string over the instruction alphabet that cannot be executed by the virtual machine or that requires more than a finite number of states to recognize would falsify the claims.
Figures
read the original abstract
We introduce IsalProgram (Instruction Set and Language for Programming), a novel assembly-like programming language with three distinctive theoretical properties: (1) it is a regular language in the sense of formal language theory, meaning its programs are accepted by a finite automaton; (2) every finite string over the instruction alphabet is a syntactically valid program; and (3) it makes no explicit use of memory addresses or variable names, absolute or relative. Programs are finite sequences of tokens drawn from a fixed instruction set, and are executed on a virtual machine whose sole data structure is a circular doubly linked list (CDLL) navigated by three data pointers, with control flow governed by two code pointers. We give a complete formal definition of the language and its virtual machine, prove its regularity, and demonstrate its expressive power. We further discuss IsalProgram's potential advantages as a target language for neural program synthesis, the amenability of its program space to metric-based exploration via the Levenshtein edit distance, and directions for analyzing computability and complexity within this framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces IsalProgram, an assembly-like language whose syntax consists of all finite strings over a fixed instruction alphabet (hence the language is Σ*, accepted by a trivial one-state finite automaton). It defines a virtual machine whose only data structure is a circular doubly linked list navigated by three data pointers, with control flow using two code pointers, to realize execution without any explicit memory addresses or variable names. The paper supplies a complete formal definition of the language and VM, proves syntactic regularity, demonstrates expressive power, and discusses applications to neural program synthesis via Levenshtein distance on the program space.
Significance. If the formal definition and proof hold, the work supplies a programming language whose entire program space is regular and metrizable by edit distance, which could simplify neural program synthesis and metric-based search. The CDLL-based VM with pointer navigation is a distinctive design that eliminates addresses while aiming to preserve computational expressiveness; this framework may also support new analyses of computability and complexity in an address-free setting.
major comments (2)
- The regularity claim follows immediately from defining the language as Σ* (every string is syntactically valid), but the manuscript should explicitly state in the formal definition section whether the VM semantics impose any runtime constraints that could affect the claimed properties (1)–(3) for arbitrary strings.
- The expressive-power demonstration must show concrete computations (e.g., basic arithmetic or control structures) that succeed using only the three data pointers and two code pointers on the CDLL; without such examples the claim that the address-free design remains sufficiently powerful is not yet load-bearing.
minor comments (3)
- Notation for the three data pointers and two code pointers should be introduced with a diagram or table in the VM definition section to improve readability.
- The discussion of Levenshtein distance on programs would benefit from a small worked example showing distance between two short valid strings and its relation to program similarity.
- A reference to standard results on regular languages (e.g., Sipser or Hopcroft & Ullman) would help situate the trivial FA acceptance argument.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and the specific suggestions for improvement. We respond to each major comment below.
read point-by-point responses
-
Referee: The regularity claim follows immediately from defining the language as Σ* (every string is syntactically valid), but the manuscript should explicitly state in the formal definition section whether the VM semantics impose any runtime constraints that could affect the claimed properties (1)–(3) for arbitrary strings.
Authors: We agree that an explicit statement would strengthen the presentation. Properties (1) and (2) are purely syntactic and defined by the language being Σ* with acceptance by a trivial automaton; they are independent of execution. Property (3) concerns the static language design. The VM semantics describe runtime navigation and state changes on the CDLL but do not retroactively constrain which strings are valid programs. We will add a clarifying paragraph in the formal definition section stating that runtime behavior (including potential non-termination or pointer errors) does not affect syntactic validity or the address-free character of any string in Σ*. revision: yes
-
Referee: The expressive-power demonstration must show concrete computations (e.g., basic arithmetic or control structures) that succeed using only the three data pointers and two code pointers on the CDLL; without such examples the claim that the address-free design remains sufficiently powerful is not yet load-bearing.
Authors: The manuscript demonstrates expressive power by establishing that the three data pointers and two code pointers suffice to manipulate the CDLL in ways that simulate standard register-machine operations. To make this concrete as requested, we will insert worked examples in the expressive-power section: (a) incrementing a value by advancing a data pointer and updating the node payload, (b) implementing a conditional branch via code-pointer adjustment based on a list-node test, and (c) a simple loop that traverses and sums nodes. These examples will be added without altering the existing formal claims. revision: yes
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper defines IsalProgram directly as the set of all finite sequences of tokens from a fixed instruction alphabet, which by construction equals Σ* and is therefore regular (accepted by the trivial one-state finite automaton). The virtual machine is specified from first principles as a circular doubly linked list navigated by three data pointers and two code pointers, with no addresses or names. Regularity, validity of every string, and address-free operation follow immediately from these definitions using standard formal language theory and a new machine model. No load-bearing step reduces to a fitted parameter, self-citation chain, imported uniqueness theorem, or smuggled ansatz; the expressive-power demonstration and computability discussion are likewise derived directly from the given syntax and semantics without circular reduction to the inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Programs are finite sequences of tokens from a fixed instruction set executed on a CDLL VM with three data pointers and two code pointers.
- standard math Regular languages are accepted by finite automata.
invented entities (1)
-
IsalProgram virtual machine based on circular doubly linked list
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
virtual machine whose sole data structure is a circular doubly linked list (CDLL) navigated by three data pointers, with control flow governed by two code pointers
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Introduction to Automata Theory, Languages, and Computation , author =
-
[2]
Introduction to the Theory of Computation , author =
-
[3]
The Art of Assembly Language , author =
-
[4]
Patterson, David A. and Hennessy, John L. , edition =. Computer Organization and Design
-
[5]
Turing, Alan M. , journal =. On Computable Numbers, with an Application to the
-
[6]
Computation: Finite and Infinite Machines , author =
-
[7]
The Art of Computer Programming, Volume 1: Fundamental Algorithms , author =
-
[8]
Soviet Physics Doklady , volume =
Binary codes capable of correcting deletions, insertions, and reversals , author =. Soviet Physics Doklady , volume =
-
[9]
A guided tour to approximate string matching , author =. 2001 , publisher =
work page 2001
-
[10]
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code , author =. arXiv preprint arXiv:2107.03374 , year =
work page internal anchor Pith review Pith/arXiv arXiv
-
[11]
Program Synthesis with Large Language Models
Program Synthesis with Large Language Models , author =. arXiv preprint arXiv:2108.07732 , year =
work page internal anchor Pith review Pith/arXiv arXiv
-
[12]
Advances in Neural Information Processing Systems , volume =
Attention Is All You Need , author =. Advances in Neural Information Processing Systems , volume =
-
[13]
Computational Complexity , author =
-
[14]
Genetic Programming: On the Programming of Computers by Means of Natural Selection , author =
-
[15]
Representation of Events in Nerve Nets and Finite Automata , author =. Automata Studies , editor =
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.