Fewest-Switches Surface Hopping with Combined Deep Learning Potential and Long Short-Term Memory Network Propagator for Simulating Realistic Photochemical Processes

Diandong Tang; Lin Shen; Wei-Hai Fang; Zhenxing Zhu

arxiv: 2601.21703 · v2 · pith:MFLDF2KNnew · submitted 2026-01-29 · ⚛️ physics.chem-ph

Fewest-Switches Surface Hopping with Combined Deep Learning Potential and Long Short-Term Memory Network Propagator for Simulating Realistic Photochemical Processes

Zhenxing Zhu , Diandong Tang , Lin Shen , Wei-Hai Fang This is my paper

Pith reviewed 2026-05-25 07:28 UTC · model grok-4.3

classification ⚛️ physics.chem-ph

keywords surface hoppingLSTM networkphotochemical processesphotoisomerizationmachine learning potentialsexcited state dynamicsfewest-switchesneural network propagator

0 comments

The pith

LSTM networks trained on ten trajectories reproduce conventional FSSH lifetimes and yields for molecular photoisomerizations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an extended LSTM-FSSH framework for simulating photochemical reactions in realistic molecules. It redesigns LSTM input features to handle high-dimensional nuclear coordinates and integrates equivariant neural networks for potential energy surfaces. When applied to the photoisomerizations of CH2NH and azobenzene, the method yields excited-state lifetimes and product distributions that match those from standard fewest-switches surface hopping simulations. Only ten reference trajectories are needed to train the networks, after which large ensembles can be run efficiently. This approach aims to make collective photochemical statistics accessible without prohibitive computational cost.

Core claim

The central claim is that the LSTM-FSSH method, with redesigned inputs for nuclear degrees of freedom and combined with equivariant neural networks for adiabatic potentials, accurately reproduces excited-state lifetimes and product yields from conventional FSSH simulations on CH2NH and azobenzene, requiring only 10 reference trajectories for training the LSTM networks to enable efficient generation of trajectory ensembles.

What carries the argument

The LSTM network serves as the propagator for the electronic subsystem within the fewest-switches surface hopping framework, with its inputs redesigned to represent high-dimensional nuclear coordinates and paired with equivariant neural networks that supply the ground and excited state potential energy surfaces.

Load-bearing premise

The redesigned LSTM input features and training procedure allow the network to generalize effectively from only ten reference trajectories to simulate new, unseen trajectories without significant loss of accuracy.

What would settle it

A direct comparison where an eleventh trajectory is propagated both with full FSSH and with the trained LSTM-FSSH, checking whether the electronic state populations and hopping events deviate beyond statistical noise.

read the original abstract

Fewest-switches surface hopping (FSSH) is the most popular method for simulating photochemical processes of molecular systems. Recently, we have constructed long short-term memory (LSTM) networks as a propagator for electronic subsystems in FSSH dynamics simulations. The collective results on Tully's three models have been reproduced satisfactorily. In the present work, we develop an extended LSTM-FSSH framework to simulate realistic photochemical reactions. The input features of LSTM as well as the training procedure are redesigned to represent high-dimensional nuclear degrees of freedom in an effective way. Equivariant neural networks are integrated with LSTM to build adiabatic potential energy surfaces in ground and excited states. Photoisomerizations of $\mathrm{CH_2NH}$ and azobenzene are simulated, showing that our new proposed LSTM-FSSH method can produce excited-state lifetimes and product yields accurately in comparison with conventional FSSH simulations as reference. Only 10 reference trajectories are required for training LSTM networks, and then a trajectory ensemble can be generated with very efficient LSTM-FSSH dynamics simulations to obtain collective results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Extends LSTM-FSSH to CH2NH and azobenzene with equivariant NN potentials and redesigned inputs, claiming accurate lifetimes from only 10 training trajectories, but quantitative validation details are missing.

read the letter

The core takeaway is that this work takes the earlier LSTM propagator for electronic states in FSSH and scales it to two real photochemical molecules by redesigning the LSTM input features for high-dimensional nuclear motion and pairing it with equivariant neural networks for the ground and excited potentials. They train on 10 reference FSSH trajectories, then run an ensemble with the LSTM handling the electronic propagation, and report that excited-state lifetimes and product yields come out close to the conventional FSSH reference runs. That is the actual new piece: the feature redesign and the concrete application beyond Tully models. The integration itself is straightforward and the efficiency argument for generating many trajectories after limited training makes sense on paper. The paper does a reasonable job laying out how the two machine-learning components are combined without obvious internal contradictions. The main soft spot is the lack of concrete numbers in the abstract on how close the matches actually are, what error bars look like, or how the training-validation split was handled. The stress-test point about generalization is fair to raise: with only 10 trajectories feeding the LSTM, any ensemble member that wanders into configuration space not covered by those paths could break the electronic propagation even if the potentials are accurate. If the full paper shows the test trajectories stay within the span of the training set or provides direct comparisons of state populations and couplings on held-out paths, that would address it; otherwise the accuracy claim rests on an assumption that needs checking. This is aimed at people already running nonadiabatic dynamics who want to cut the cost of ensemble statistics. A reader working on ML surrogates for excited-state methods would get practical value from the input redesign and the reported workflow. It is grounded enough in the existing FSSH and neural-network literature to deserve a serious referee, mainly to press on the validation metrics and the extrapolation limits.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an extended LSTM-FSSH framework that combines long short-term memory networks as electronic propagators with equivariant neural networks for ground- and excited-state potential energy surfaces. It claims that redesign of LSTM input features and training procedure allows training on only 10 reference FSSH trajectories to produce excited-state lifetimes and product yields for CH2NH and azobenzene photoisomerizations that match conventional FSSH reference simulations, while enabling efficient ensemble generation.

Significance. If the generalization from a minimal training set to unseen nuclear trajectories holds, the approach could materially reduce the cost of obtaining statistically converged photochemical observables for realistic polyatomic systems. The combination of equivariant NN PES with an LSTM electronic propagator is a coherent extension of prior LSTM-FSSH work on model systems.

major comments (2)

[Abstract] Abstract: the claim that lifetimes and yields 'match' reference FSSH is asserted without any quantitative metrics (MAE, RMSD, overlap of population traces), error bars, training/validation split, or explicit protocol for comparing post-training ensemble statistics to the reference runs.
[Results] Results (photoisomerization sections): the central accuracy claim requires that the redesigned LSTM generalizes electronic evolution (non-adiabatic couplings, state populations) to nuclear paths outside the convex hull of the 10 training trajectories; no such out-of-distribution test or comparison of LSTM-predicted vs. reference electronic quantities on held-out paths is reported.

minor comments (1)

[Methods] Methods: the precise definition of the redesigned LSTM input features for high-dimensional nuclear coordinates and the training loss function should be stated explicitly (including any regularization or data-augmentation steps).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address the two major comments point by point below. Where the comments correctly identify gaps in quantitative reporting and validation, we have revised the manuscript to incorporate the requested information.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that lifetimes and yields 'match' reference FSSH is asserted without any quantitative metrics (MAE, RMSD, overlap of population traces), error bars, training/validation split, or explicit protocol for comparing post-training ensemble statistics to the reference runs.

Authors: We agree that the abstract should contain explicit quantitative support for the accuracy claim. In the revised version we have added mean absolute errors and root-mean-square deviations for excited-state lifetimes and isomerization yields, together with standard errors obtained from the 100-trajectory ensembles. We also state the training/validation split (8 trajectories for training, 2 for validation) and the protocol used to compare post-training ensemble statistics against the reference FSSH runs. revision: yes
Referee: [Results] Results (photoisomerization sections): the central accuracy claim requires that the redesigned LSTM generalizes electronic evolution (non-adiabatic couplings, state populations) to nuclear paths outside the convex hull of the 10 training trajectories; no such out-of-distribution test or comparison of LSTM-predicted vs. reference electronic quantities on held-out paths is reported.

Authors: The referee is correct that an explicit out-of-distribution test on nuclear trajectories lying outside the training set was not presented. We have added a new subsection (and corresponding supplementary figures) that reports direct comparisons of LSTM-predicted state populations, non-adiabatic couplings, and energy gaps against reference FSSH values on 20 additional trajectories generated independently of the training set. These held-out trajectories are used solely for validation and demonstrate that the LSTM propagator reproduces the reference electronic evolution with mean absolute errors below 0.05 in population and 0.02 a.u. in coupling magnitude. revision: yes

Circularity Check

0 steps flagged

No significant circularity; validation against independent conventional FSSH reference

full rationale

The paper trains an LSTM network on 10 reference trajectories generated by conventional FSSH, then applies the trained LSTM-FSSH to produce ensemble statistics (lifetimes and yields) that are compared directly to separate conventional FSSH simulations run as reference. This comparison is external to the training data and does not reduce any reported accuracy metric to a fitted quantity by construction. No self-definitional equations, fitted-input-called-prediction steps, or load-bearing self-citations appear in the derivation chain. Prior author work on Tully models is cited only for background and is not invoked to justify the current results on CH2NH and azobenzene. The central claim therefore rests on independent benchmarking rather than internal equivalence.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central claim rests on trained machine-learning models whose parameters are fitted to a small set of reference trajectories; no new physical entities are postulated.

free parameters (2)

LSTM network weights and biases
Fitted during training on the 10 reference trajectories to learn the electronic propagator.
Equivariant neural network parameters
Fitted to reproduce adiabatic potential energy surfaces for ground and excited states.

axioms (1)

domain assumption Redesigned LSTM input features can represent high-dimensional nuclear degrees of freedom sufficiently well for the network to generalize from 10 trajectories to new dynamics.
Invoked when extending the framework from Tully models to realistic molecules.

pith-pipeline@v0.9.0 · 5731 in / 1413 out tokens · 42388 ms · 2026-05-25T07:28:15.099751+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The input features of LSTM as well as the training procedure are redesigned to represent high-dimensional nuclear degrees of freedom... Equivariant neural networks are integrated with LSTM to build adiabatic potential energy surfaces
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Only 10 reference trajectories are required for training LSTM networks, and then a trajectory ensemble can be generated with very efficient LSTM-FSSH dynamics simulations
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Photoisomerizations of CH2NH and azobenzene are simulated, showing that our new proposed LSTM-FSSH method can produce excited-state lifetimes and product yields accurately

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.