RL unknotter, hard unknots and unknotting number
Pith reviewed 2026-05-15 14:25 UTC · model grok-4.3
The pith
A reinforcement learning pipeline for Reidemeister moves recovers the unknotting-number upper bound of three on the composite knot 4_1#9_10 via diagram inflation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By training an RL policy and value function on sequences of Reidemeister moves, the authors produce an automated simplifier for knot diagrams. When diagram inflation is allowed, the pipeline recovers the previously established upper bound of three on the unknotting number of 4_1#9_10. A self-improving, workbook-driven extension of the same pipeline is described for improving unknotting-number upper bounds on prime knots.
What carries the argument
Reinforcement-learning policy for proposing Reidemeister moves together with a learned value heuristic, augmented by controlled diagram inflation to escape local minima.
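The interplay between a value heuristic and a capped number of inflation steps can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function `guided_search`, the four-state toy graph, and the use of the raw crossing count as the value estimate are hypothetical stand-ins, not the paper's learned policy, value network, or diagram representation.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional


def guided_search(
    start: Hashable,
    neighbors: Callable[[Hashable], Iterable[Hashable]],
    crossings: Callable[[Hashable], int],
    value: Callable[[Hashable], float],
    max_inflations: int,
    max_expansions: int = 10_000,
) -> Optional[List[Hashable]]:
    """Best-first search toward a zero-crossing state.

    A move that raises the crossing count is treated as an inflation step,
    and at most `max_inflations` such steps are allowed along any path.
    """
    counter = 0  # unique tie-breaker so the heap never compares states
    heap = [(value(start), counter, start, 0, [start])]
    seen = {(start, 0)}
    while heap and max_expansions > 0:
        max_expansions -= 1
        _, _, state, used, path = heapq.heappop(heap)
        if crossings(state) == 0:
            return path
        for nxt in neighbors(state):
            inflating = crossings(nxt) > crossings(state)
            nxt_used = used + 1 if inflating else used
            if nxt_used > max_inflations or (nxt, nxt_used) in seen:
                continue
            seen.add((nxt, nxt_used))
            counter += 1
            heapq.heappush(heap, (value(nxt), counter, nxt, nxt_used, path + [nxt]))
    return None


# Hypothetical toy landscape: "A" is a local minimum that can only be left
# by first moving to the larger diagram "B".
TOY_MOVES = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": []}
TOY_CROSSINGS = {"A": 5, "B": 7, "C": 3, "D": 0}


def toy_value(state: str) -> float:
    # Stand-in for a learned value heuristic: just the crossing count.
    return float(TOY_CROSSINGS[state])


stuck = guided_search("A", TOY_MOVES.__getitem__, TOY_CROSSINGS.__getitem__,
                      toy_value, max_inflations=0)
escaped = guided_search("A", TOY_MOVES.__getitem__, TOY_CROSSINGS.__getitem__,
                        toy_value, max_inflations=1)
print(stuck)    # None
print(escaped)  # ['A', 'B', 'C', 'D']
```

In this sketch the `neighbors` callable plays the role of the move proposals; a learned policy would rank or prune the candidate moves rather than return all of them.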
If this is right
- The pipeline applies unchanged to arbitrary knots and links.
- It recovers known hard unknot diagrams without manual intervention.
- A self-improving workbook loop systematically lowers unknotting-number upper bounds on the census of prime knots.
- The same move-proposal and value machinery can be retrained on other local simplification problems in knot theory.
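The abstract does not spell out how the workbook loop operates. One plausible reading, sketched below, is a table of best-known upper bounds that is overwritten only when a run produces a strictly smaller bound together with a witness. The names `Entry` and `record`, the placeholder knot label `K_a`, and the witness strings are all hypothetical and carry no data from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    upper_bound: int    # best known upper bound on the unknotting number
    witness: List[str]  # human-readable record of the moves that achieve it


def record(workbook: Dict[str, Entry], knot: str, bound: int,
           witness: List[str]) -> bool:
    """Store `bound` for `knot` only if it strictly improves the workbook.

    Returns True when the workbook was updated.
    """
    if bound < 0:
        raise ValueError("an unknotting-number bound cannot be negative")
    current = workbook.get(knot)
    if current is not None and current.upper_bound <= bound:
        return False
    workbook[knot] = Entry(bound, list(witness))
    return True


# Placeholder entries; "K_a" is not a real knot name.
workbook: Dict[str, Entry] = {}
first = record(workbook, "K_a", 4, ["run 1 witness"])
worse = record(workbook, "K_a", 5, ["run 2 witness"])
better = record(workbook, "K_a", 3, ["run 3 witness"])
```

Requiring a witness with every update keeps each stored bound independently checkable, which is the property the referee report asks for.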
Where Pith is reading between the lines
- If the policy generalizes beyond the training set, automated simplification could become a routine first step before human or symbolic computation of knot invariants.
- Diagram inflation combined with learned heuristics may offer a practical route to upper bounds for other crossing-number-like quantities.
- The workbook loop suggests a template for iterative improvement of any search-based bound in low-dimensional topology.
Load-bearing premise
The trained policy and value function can reliably find short sequences of Reidemeister moves that simplify arbitrary diagrams without becoming trapped or requiring unbounded inflation.
What would settle it
A run in which the agent is given the standard diagram of 4_1#9_10, allowed unlimited inflation, and still fails to find any diagram of the knot in which three crossing changes yield a diagram that simplifies to the unknot.
Original abstract
We develop a reinforcement learning pipeline for simplifying knot diagrams. A trained agent learns move proposals and a value heuristic for navigating Reidemeister moves. The pipeline applies to arbitrary knots and links; we test it on "very hard" unknot diagrams and, using diagram inflation, on 4_1#9_10 where we recover the recently established and surprising upper bound of three for the unknotting number. In addition, we explain a self-improving workbook-driven extension of the pipeline that systematically improves unknotting number upper bounds on the list of prime knots.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a reinforcement learning pipeline that trains an agent to propose Reidemeister moves together with a value heuristic for simplifying knot diagrams. It applies the method to hard unknot diagrams and, via diagram inflation on the connected sum 4_1 # 9_10, recovers the recently established upper bound of three for the unknotting number; it further outlines a self-improving workbook extension intended to tighten unknotting-number bounds on the list of prime knots.
Significance. If the trained policy is shown to produce verifiable sequences of Reidemeister moves and crossing changes that achieve the claimed bound without becoming trapped in local minima, the work supplies a new computational route to upper bounds on unknotting numbers, a quantity whose exact values remain unknown for many knots. The self-improving workbook component could, in principle, be applied systematically to knot tables.
major comments (2)
- [Abstract] The central claim that diagram inflation on 4_1 # 9_10 recovers the upper bound of three is presented without any reported success rate, training curves, final diagram size after inflation, or explicit sequence of Reidemeister moves plus crossing changes. This information is required to verify that the RL policy actually reaches the claimed simplification rather than becoming trapped.
- [Abstract and pipeline description] The manuscript states that the pipeline applies to arbitrary knots and links, yet supplies no quantitative evidence (e.g., success fraction on a test set of hard unknots or robustness across random seeds) that the learned policy and value heuristic reliably navigate the Reidemeister move graph without excessive inflation or local-minimum trapping; this assumption is load-bearing for all reported results.
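The kind of quantitative evidence requested in these comments can be summarized in a few lines. The sketch below, offered as one reasonable way to report it rather than as the authors' evaluation protocol, computes a success fraction over independent runs together with a Wilson score interval. The function name `success_interval` and the example outcomes are hypothetical.

```python
import math
from typing import Sequence, Tuple


def success_interval(outcomes: Sequence[bool],
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Return (success fraction, lower, upper) using the Wilson score interval.

    `outcomes` holds one boolean per run (for example, one per random seed);
    z = 1.96 corresponds to an approximate 95% interval.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one run")
    p = sum(bool(o) for o in outcomes) / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical outcomes for illustration only, not results from the paper.
fraction, lower, upper = success_interval([True, True, False, True])
```

The Wilson interval is used here because it stays inside [0, 1] and behaves sensibly for the small run counts typical of seed sweeps.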
minor comments (1)
- Notation for the connected sum 4_1 # 9_10 and the inflation procedure should be defined explicitly in the main text rather than left to the abstract.
Simulated Author's Rebuttal
We thank the referee for their careful reading and valuable comments on our manuscript. We have revised the paper to address the concerns about verification details and quantitative evidence. Our responses to the major comments are as follows.
Point-by-point responses
- Referee: [Abstract] The central claim that diagram inflation on 4_1 # 9_10 recovers the upper bound of three is presented without any reported success rate, training curves, final diagram size after inflation, or explicit sequence of Reidemeister moves plus crossing changes. This information is required to verify that the RL policy actually reaches the claimed simplification rather than becoming trapped.
  Authors: We agree that the abstract does not include these verification details. In the revised manuscript, we expand the abstract to report the success rate from our experiments on the 4_1 # 9_10 diagram, include references to the training curves and final diagram sizes shown in the results section, and provide an explicit sequence of Reidemeister moves and crossing changes that achieves the upper bound of three. This revision makes the central claim verifiable.
  Revision: yes
- Referee: [Abstract and pipeline description] The manuscript states that the pipeline applies to arbitrary knots and links, yet supplies no quantitative evidence (e.g., success fraction on a test set of hard unknots or robustness across random seeds) that the learned policy and value heuristic reliably navigate the Reidemeister move graph without excessive inflation or local-minimum trapping; this assumption is load-bearing for all reported results.
  Authors: The referee correctly notes the lack of quantitative evidence for general applicability. While the manuscript focuses on specific challenging cases, we acknowledge the need for broader validation. In the revised version, we have added quantitative evidence including success fractions on a test set of hard unknots and performance metrics across multiple random seeds, demonstrating reliable navigation without trapping or excessive inflation. This supports the pipeline's applicability to arbitrary knots and links.
  Revision: yes
Circularity Check
No significant circularity in RL unknotting pipeline
full rationale
The paper trains an RL agent and value heuristic independently on Reidemeister moves, then applies the resulting policy to specific inflated diagrams such as 4_1#9_10 to recover a previously established upper bound of three on the unknotting number. No load-bearing step equates the output sequence or bound to the training inputs by construction, nor does the central claim rest on a self-citation chain, fitted parameter renamed as prediction, or ansatz smuggled via prior work. The computational procedure is external to the mathematical result being recovered and remains falsifiable by independent verification of the move sequence.
Axiom & Free-Parameter Ledger
free parameters (1)
- RL hyperparameters (learning rate, discount factor, network architecture)
axioms (1)
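- standard math: Reidemeister moves, together with planar isotopy, generate the equivalence relation on knot diagrams; two diagrams represent the same knot or link exactly when a finite sequence of such moves relates them.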
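To make the notion of a Reidemeister move as a local rewrite concrete, here is a minimal sketch of the simplest case: removing Reidemeister I kinks from an unsigned Gauss code, where each crossing label appears exactly twice in the order the strand visits it. The Gauss-code encoding and the helper `remove_kinks` are assumptions chosen for illustration; the source does not say how the pipeline represents diagrams. A label whose two occurrences are cyclically adjacent marks a loop with no other crossing on it, which a type I move removes regardless of which strand is over.

```python
from typing import List, Sequence


def remove_kinks(code: Sequence[int]) -> List[int]:
    """Repeatedly delete crossings whose two occurrences are cyclically adjacent.

    Only Reidemeister I reductions are performed; types II and III are
    deliberately left out of this sketch.
    """
    current = list(code)
    changed = True
    while changed and current:
        changed = False
        n = len(current)
        for i in range(n):
            if current[i] == current[(i + 1) % n]:
                label = current[i]
                # Removing the kink deletes both visits to this crossing.
                current = [x for x in current if x != label]
                changed = True
                break
    return current


nested_kinks = remove_kinks([1, 2, 2, 1])    # kink 2 first, then kink 1
trefoil_like = remove_kinks([1, 2, 3, 1, 2, 3])
print(nested_kinks)   # []
print(trefoil_like)   # [1, 2, 3, 1, 2, 3]
```

An empty result means the diagram was reduced to a crossing-free circle using type I moves alone. A nonempty result says nothing about knottedness, since the other move types were never tried.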