pith. sign in

arxiv: 2605.17008 · v1 · pith:37HGZ426new · submitted 2026-05-16 · 💻 cs.PL · cs.AI· cs.CL

The IsalProgram Programming Language

Pith reviewed 2026-05-19 18:43 UTC · model grok-4.3

classification 💻 cs.PL cs.AIcs.CL
keywords programming languageregular languageassembly languagevirtual machinecircular doubly linked listprogram synthesisedit distance
0
0 comments X

The pith

IsalProgram is an assembly-like language where every finite string is a valid program, it is regular, and it uses no memory addresses or variable names.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents IsalProgram as a programming language with three special properties that set it apart from conventional languages. It is regular, so its programs can be recognized by a finite automaton. Any sequence of its instructions forms a legal program with no syntax errors possible. It operates without referring to memory locations or variable names by using pointers on a circular list instead. The authors provide a full definition, prove the regularity, and show what it can compute, suggesting uses in machine learning for generating programs.

Core claim

The central claim is that IsalProgram, defined over a fixed set of instructions and executed on a virtual machine using a circular doubly linked list with three data pointers and two code pointers, forms a regular language in which every string is valid and computations are possible without explicit addressing.

What carries the argument

The virtual machine's circular doubly linked list navigated by three data pointers and controlled by two code pointers, which enables address-free execution.

If this is right

  • Program synthesis systems can generate any string without worrying about syntax validity.
  • The space of programs can be searched using edit distance metrics such as Levenshtein distance.
  • Neural networks can target this language more easily due to its uniform and regular structure.
  • Analysis of what is computable can proceed from the pointer navigation rules on the circular list.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Genetic programming could benefit by never producing invalid individuals since all strings run.
  • Equivalence of programs might be decidable more readily because of the finite automaton basis.
  • Extensions to more pointers or different list structures could be tested for greater computational power.

Load-bearing premise

That the described virtual machine with its circular doubly linked list and pointers can carry out the intended computations while maintaining the language's regularity and lack of addresses.

What would settle it

Finding a string over the instruction alphabet that cannot be executed by the virtual machine or that requires more than a finite number of states to recognize would falsify the claims.

Figures

Figures reproduced from arXiv: 2605.17008 by Ezequiel L\'opez-Rubio.

Figure 1
Figure 1. Figure 1: Circular doubly linked list (CDLL) with four nodes. The three data pointers (p [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Full pointer architecture of the IsalProgram VM. Two instruction-level pointers (IP and JP) index into the [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Function call protocol in IsalProgram. Kx saves the return address into the stack, and jumps to the function whose address is stored at the node pointed by x. After the function has completed its calculations, a final R at the end of the function body returns control to the caller. 7 Functions IsalProgram supports a structured notion of functions that avoids any literal memory address constant in the code.… view at source ↗
read the original abstract

We introduce IsalProgram (Instruction Set and Language for Programming), a novel assembly-like programming language with three distinctive theoretical properties: (1) it is a regular language in the sense of formal language theory, meaning its programs are accepted by a finite automaton; (2) every finite string over the instruction alphabet is a syntactically valid program; and (3) it makes no explicit use of memory addresses or variable names, absolute or relative. Programs are finite sequences of tokens drawn from a fixed instruction set, and are executed on a virtual machine whose sole data structure is a circular doubly linked list (CDLL) navigated by three data pointers, with control flow governed by two code pointers. We give a complete formal definition of the language and its virtual machine, prove its regularity, and demonstrate its expressive power. We further discuss IsalProgram's potential advantages as a target language for neural program synthesis, the amenability of its program space to metric-based exploration via the Levenshtein edit distance, and directions for analyzing computability and complexity within this framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces IsalProgram, an assembly-like language whose syntax consists of all finite strings over a fixed instruction alphabet (hence the language is Σ*, accepted by a trivial one-state finite automaton). It defines a virtual machine whose only data structure is a circular doubly linked list navigated by three data pointers, with control flow using two code pointers, to realize execution without any explicit memory addresses or variable names. The paper supplies a complete formal definition of the language and VM, proves syntactic regularity, demonstrates expressive power, and discusses applications to neural program synthesis via Levenshtein distance on the program space.

Significance. If the formal definition and proof hold, the work supplies a programming language whose entire program space is regular and metrizable by edit distance, which could simplify neural program synthesis and metric-based search. The CDLL-based VM with pointer navigation is a distinctive design that eliminates addresses while aiming to preserve computational expressiveness; this framework may also support new analyses of computability and complexity in an address-free setting.

major comments (2)
  1. The regularity claim follows immediately from defining the language as Σ* (every string is syntactically valid), but the manuscript should explicitly state in the formal definition section whether the VM semantics impose any runtime constraints that could affect the claimed properties (1)–(3) for arbitrary strings.
  2. The expressive-power demonstration must show concrete computations (e.g., basic arithmetic or control structures) that succeed using only the three data pointers and two code pointers on the CDLL; without such examples the claim that the address-free design remains sufficiently powerful is not yet load-bearing.
minor comments (3)
  1. Notation for the three data pointers and two code pointers should be introduced with a diagram or table in the VM definition section to improve readability.
  2. The discussion of Levenshtein distance on programs would benefit from a small worked example showing distance between two short valid strings and its relation to program similarity.
  3. A reference to standard results on regular languages (e.g., Sipser or Hopcroft & Ullman) would help situate the trivial FA acceptance argument.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and the specific suggestions for improvement. We respond to each major comment below.

read point-by-point responses
  1. Referee: The regularity claim follows immediately from defining the language as Σ* (every string is syntactically valid), but the manuscript should explicitly state in the formal definition section whether the VM semantics impose any runtime constraints that could affect the claimed properties (1)–(3) for arbitrary strings.

    Authors: We agree that an explicit statement would strengthen the presentation. Properties (1) and (2) are purely syntactic and defined by the language being Σ* with acceptance by a trivial automaton; they are independent of execution. Property (3) concerns the static language design. The VM semantics describe runtime navigation and state changes on the CDLL but do not retroactively constrain which strings are valid programs. We will add a clarifying paragraph in the formal definition section stating that runtime behavior (including potential non-termination or pointer errors) does not affect syntactic validity or the address-free character of any string in Σ*. revision: yes

  2. Referee: The expressive-power demonstration must show concrete computations (e.g., basic arithmetic or control structures) that succeed using only the three data pointers and two code pointers on the CDLL; without such examples the claim that the address-free design remains sufficiently powerful is not yet load-bearing.

    Authors: The manuscript demonstrates expressive power by establishing that the three data pointers and two code pointers suffice to manipulate the CDLL in ways that simulate standard register-machine operations. To make this concrete as requested, we will insert worked examples in the expressive-power section: (a) incrementing a value by advancing a data pointer and updating the node payload, (b) implementing a conditional branch via code-pointer adjustment based on a list-node test, and (c) a simple loop that traverses and sums nodes. These examples will be added without altering the existing formal claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper defines IsalProgram directly as the set of all finite sequences of tokens from a fixed instruction alphabet, which by construction equals Σ* and is therefore regular (accepted by the trivial one-state finite automaton). The virtual machine is specified from first principles as a circular doubly linked list navigated by three data pointers and two code pointers, with no addresses or names. Regularity, validity of every string, and address-free operation follow immediately from these definitions using standard formal language theory and a new machine model. No load-bearing step reduces to a fitted parameter, self-citation chain, imported uniqueness theorem, or smuggled ansatz; the expressive-power demonstration and computability discussion are likewise derived directly from the given syntax and semantics without circular reduction to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claim rests on a newly defined virtual machine model and the assumption that this model preserves regularity while enabling computation; no numerical fitting is described.

axioms (2)
  • domain assumption Programs are finite sequences of tokens from a fixed instruction set executed on a CDLL VM with three data pointers and two code pointers.
    This is the core execution model stated in the abstract as the basis for all claimed properties.
  • standard math Regular languages are accepted by finite automata.
    Invoked to establish the first distinctive property.
invented entities (1)
  • IsalProgram virtual machine based on circular doubly linked list no independent evidence
    purpose: Provide data storage and navigation without memory addresses or variable names
    Newly postulated data structure and pointer system introduced to realize the three properties.

pith-pipeline@v0.9.0 · 5707 in / 1551 out tokens · 50299 ms · 2026-05-19T18:43:42.547269+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 2 internal anchors

  1. [1]

    Introduction to Automata Theory, Languages, and Computation , author =

  2. [2]

    Introduction to the Theory of Computation , author =

  3. [3]

    The Art of Assembly Language , author =

  4. [4]

    and Hennessy, John L

    Patterson, David A. and Hennessy, John L. , edition =. Computer Organization and Design

  5. [5]

    , journal =

    Turing, Alan M. , journal =. On Computable Numbers, with an Application to the

  6. [6]

    Computation: Finite and Infinite Machines , author =

  7. [7]

    The Art of Computer Programming, Volume 1: Fundamental Algorithms , author =

  8. [8]

    Soviet Physics Doklady , volume =

    Binary codes capable of correcting deletions, insertions, and reversals , author =. Soviet Physics Doklady , volume =

  9. [9]

    2001 , publisher =

    A guided tour to approximate string matching , author =. 2001 , publisher =

  10. [10]

    Evaluating Large Language Models Trained on Code

    Evaluating Large Language Models Trained on Code , author =. arXiv preprint arXiv:2107.03374 , year =

  11. [11]

    Program Synthesis with Large Language Models

    Program Synthesis with Large Language Models , author =. arXiv preprint arXiv:2108.07732 , year =

  12. [12]

    Advances in Neural Information Processing Systems , volume =

    Attention Is All You Need , author =. Advances in Neural Information Processing Systems , volume =

  13. [13]

    Computational Complexity , author =

  14. [14]

    Genetic Programming: On the Programming of Computers by Means of Natural Selection , author =

  15. [15]

    Automata Studies , editor =

    Representation of Events in Nerve Nets and Finite Automata , author =. Automata Studies , editor =