Efficient state transition algorithm with guaranteed optimality
Pith reviewed 2026-05-22 18:33 UTC · model grok-4.3
The pith
An enhanced state transition algorithm adds predictive translations, adaptive controls, and a zero-gradient-style stop rule to reach the optimal solution without preset iteration limits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The enhanced STA with novel translation transformations inspired by predictive modeling, adaptive parameter control strategies, and a dedicated termination condition analogous to the zero gradient condition ensures that the algorithm converges at the optimal solution.
What carries the argument
The dedicated termination condition modeled on the zero-gradient condition of mathematical programming, which halts iteration once no further improvement is detected and thereby certifies optimality.
If this is right
- Users no longer need to choose a maximum iteration count by intuition before running the algorithm.
- The method accelerates search in late stages where the landscape is flat, reducing wasted iterations.
- Experimental comparisons show the enhanced version outperforms the original STA on the tested problems.
- The combination of historical-data translations and adaptive parameters produces more diverse candidate solutions at each step.
Where Pith is reading between the lines
- The same termination logic could be tested on other population-based optimizers that suffer from premature stopping.
- Because the method records historical data, it may lend itself to sequential or data-stream optimization settings where past evaluations remain useful.
- If the termination test proves landscape-independent, hybrid schemes that combine this STA with local gradient refinement become easier to justify.
Load-bearing premise
The termination condition reliably locates the global optimum instead of halting at a local flat region or saddle point on any given problem landscape.
What would settle it
Execute the algorithm on a standard benchmark function whose global minimum value is known in advance; if the algorithm stops at a point whose objective value exceeds that known minimum, the optimality guarantee does not hold.
Figures
read the original abstract
The state transition algorithm (STA), as an intelligent optimization method grounded in constructivist learning, has been demonstrated to be highly effective in solving complex optimization problems. However, the standard STA suffers from slow convergence, particularly in the later stages when dealing with flat landscapes. Additionally, users are required to set the maximum number of iterations based on intuition. To address these issues, an enhanced STA with guaranteed optimality is introduced. This improvement involves three key components. First, novel translation transformations, inspired by predictive modeling, are developed to generate a broader set of candidate solutions by leveraging historical data. Second, adaptive parameter control strategies are incorporated to accelerate convergence. Finally, a dedicated termination condition is designed to ensure that the algorithm converges at the optimal solution, analogous to the zero gradient condition in mathematical programming. The comprehensive experimental results validate the effectiveness and superiority of the proposed method. The source codes for ESTA and EXSTA will be publicly available at: https://github.com/tiezhongyu2005/ESTA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces an enhanced State Transition Algorithm (ESTA) that augments the standard STA with three components: novel translation transformations inspired by predictive modeling to broaden candidate solution generation using historical data, adaptive parameter control strategies to accelerate convergence especially in flat landscapes, and a dedicated termination condition designed to be analogous to the zero-gradient condition in mathematical programming. The authors assert that these changes eliminate the need for user-specified maximum iterations and guarantee convergence to the optimal solution, with comprehensive experiments validating superiority over the baseline STA. Source code is promised to be released publicly.
Significance. An optimization method that combines improved candidate generation, adaptive tuning, and a theoretically justified stopping rule with guaranteed optimality would be valuable for non-convex problems in operations research and engineering. The public code release supports reproducibility. However, the significance is limited by the absence of a demonstrated link between the termination rule and global optimality, which is the central claim.
major comments (2)
- [Abstract] Abstract: The claim that the dedicated termination condition 'ensures that the algorithm converges at the optimal solution' is load-bearing for the paper's title and abstract but is not supported by any derivation. The zero-gradient analogy identifies stationary points; in non-convex landscapes this does not imply global optimality without additional elements such as optimality-gap bounds or exhaustive verification, which are not described.
- [Abstract] The description of the termination condition provides no explicit computation from the algorithm's internal state (e.g., how the 'small-change' threshold is derived without post-hoc parameters or fitted models). This leaves open whether the rule certifies a global solution or merely a local flat region, directly undermining the 'guaranteed optimality' assertion.
minor comments (2)
- The abstract states that source codes 'will be publicly available' at a GitHub link; the manuscript should include the exact commit or DOI for the released code to ensure reproducibility.
- Notation for the translation transformations and adaptive parameters should be defined more explicitly with equations rather than descriptive text to aid readers in implementation.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying the need to strengthen the theoretical justification of the termination condition. We respond to each major comment below and indicate the revisions we will make.
read point-by-point responses
-
Referee: [Abstract] Abstract: The claim that the dedicated termination condition 'ensures that the algorithm converges at the optimal solution' is load-bearing for the paper's title and abstract but is not supported by any derivation. The zero-gradient analogy identifies stationary points; in non-convex landscapes this does not imply global optimality without additional elements such as optimality-gap bounds or exhaustive verification, which are not described.
Authors: We agree that the zero-gradient analogy alone identifies stationary points rather than guaranteeing global optimality in non-convex settings. The manuscript presents the termination rule as a practical mechanism that halts when state transitions become negligible under the adaptive controls and historical translations, which are intended to promote escape from local flat regions. No formal convergence theorem establishing global optimality is derived in the current version. We will revise the abstract and add a dedicated subsection on convergence properties that explicitly states the condition detects stationarity and that global optimality is supported empirically on the tested benchmarks rather than proven theoretically. We will also soften the title and abstract wording from 'guaranteed optimality' to 'with a theoretically motivated stopping criterion'. revision: yes
-
Referee: [Abstract] The description of the termination condition provides no explicit computation from the algorithm's internal state (e.g., how the 'small-change' threshold is derived without post-hoc parameters or fitted models). This leaves open whether the rule certifies a global solution or merely a local flat region, directly undermining the 'guaranteed optimality' assertion.
Authors: We acknowledge that the abstract provides only a high-level description. In the full manuscript the threshold is computed from the norm of recent translation vectors and the current adaptive parameter range, using statistics from the historical data buffer without external fitting. To make this fully explicit we will expand the methodology section with the precise formula, an algorithmic listing of the check, and a short discussion distinguishing the rule from simple stagnation detection. This revision will clarify that the condition operates on internal state quantities while still noting its limitation to stationarity rather than certified global optimality. revision: yes
Circularity Check
No circularity: termination condition and transformations presented as independent design choices
full rationale
The paper describes three explicit enhancements to STA: novel translation transformations using historical data, adaptive parameter control, and a termination condition modeled after the zero-gradient test. None of these are shown to reduce by construction to fitted inputs or prior self-citations within the provided abstract and description. The optimality guarantee is asserted via the designed termination rule rather than derived from equations that presuppose the result. Experimental validation is referenced as external support. No load-bearing self-citation chain or self-definitional loop is exhibited.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
a specific termination condition is designed to guarantee that the STA can stop automatically at an optimal point, which is equivalent to the zero gradient in mathematical programming
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Theorem 1. ... the sequence {f(Bestk)} generated by the STA can at least converge to a local minimum
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
A statistical study on parameter selection of operators in continuous state transition algorithm,
X. Zhou, C. Yang, and W. Gui, “A statistical study on parameter selection of operators in continuous state transition algorithm,” IEEE Transactions on Cybernetics, vol. 49, no. 10, pp. 3722–3730, 2019
work page 2019
-
[2]
The principle of state transition algorithm and its applications,
——, “The principle of state transition algorithm and its applications,” Acta Automatica Sinica , vol. 46, no. 11, pp. 2260–2274, 2020
work page 2020
-
[3]
J. Tang, G. Liu, and Q. Pan, “A review on representative swarm intelligence algorithms for solving optimization problems: Applications and trends,” IEEE/CAA Journal of Automatica Sinica, vol. 8, no. 10, pp. 1627–1643, 2021
work page 2021
-
[4]
I. Supena, A. Darmuki, and A. Hariyadi, “The influence of 4c (con- structive, critical, creativity, collaborative) learning model on students’ learning outcomes.” International Journal of Instruction , vol. 14, no. 3, pp. 873–892, 2021
work page 2021
-
[5]
X. Zhou, C. Yang, and W. Gui, “State transition algorithm,” Journal of Industrial and Management Optimization , vol. 8, no. 4, pp. 1039–1056, 2012
work page 2012
-
[6]
X. Zhou, X. Wang, T. Huang, and C. Yang, “Hybrid intelligence assisted sample average approximation method for chance constrained dynamic optimization,” IEEE Transactions on Industrial Informatics , vol. 17, no. 9, pp. 6409–6418, 2021
work page 2021
-
[7]
Stackelberg game approach for robust optimization with fuzzy variables,
J. Han, C. Yang, C.-C. Lim, X. Zhou, and P. Shi, “Stackelberg game approach for robust optimization with fuzzy variables,” IEEE Transactions on Fuzzy Systems , vol. 30, no. 1, pp. 258–269, 2022
work page 2022
-
[8]
Data-driven state tran- sition algorithm for fuzzy chance-constrained dynamic optimization,
F. Lin, X. Zhou, C. Li, T. Huang, and C. Yang, “Data-driven state tran- sition algorithm for fuzzy chance-constrained dynamic optimization,” IEEE Transactions on Neural Networks and Learning Systems , vol. 34, no. 9, pp. 5322–5331, 2023. 11 TABLE VI: Comparison of the average GradNorm with other algorithms under maxStalls Fcn Dim GL25 CLPSO CMAES SaDE GWO...
work page 2023
-
[9]
Y . Dong, C. Wang, H. Zhang, and X. Zhou, “A novel multi-objective optimization framework for optimal integrated energy system planning with demand response under multiple uncertainties,” Information Sci- ences, vol. 663, p. 120252, 2024
work page 2024
-
[10]
X. Zhou, W. Tan, Y . Sun, T. Huang, and C. Yang, “Multi-objective optimization and decision making for integrated energy system using sta and fuzzy topsis,” Expert systems with applications , vol. 240, p. 122539, 2024
work page 2024
-
[11]
Interpretable multiobjec- tive feature selection via visualization in froth flotation process,
X. Zhou, M. Li, Y . Du, C. Yang, and S. Wen, “Interpretable multiobjec- tive feature selection via visualization in froth flotation process,” IEEE Transactions on Industrial Informatics , vol. 21, no. 3, pp. 2530–2539, 2025
work page 2025
-
[12]
M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear programming: theory and algorithms . John wiley & sons, 2006
work page 2006
-
[13]
New stopping criterion for genetic algorithms,
H. Aytug and G. J. Koehler, “New stopping criterion for genetic algorithms,” European Journal of Operational Research, vol. 126, no. 3, pp. 662–674, 2000
work page 2000
-
[14]
Termination criteria in evolutionary algorithms: A survey
S. N. Ghoreishi, A. Clausen, and B. N. Jørgensen, “Termination criteria in evolutionary algorithms: A survey.” in IJCCI, 2017, pp. 373–384
work page 2017
-
[15]
Termination detection strategies in evolutionary algorithms: a survey,
Y . Liu, A. Zhou, and H. Zhang, “Termination detection strategies in evolutionary algorithms: a survey,” in Proceedings of the Genetic and Evolutionary Computation Conference , 2018, pp. 1063–1070
work page 2018
-
[16]
S. Kukkonen and C. A. C. Coello, “A simple and effective termination condition for both single-and multi-objective evolutionary algorithms,” in 2019 IEEE Congress on Evolutionary Computation (CEC) . IEEE, 2019, pp. 3053–3059
work page 2019
-
[17]
A stopping criterion for multi-objective optimization evolutionary algorithms,
L. Mart ´ı, J. Garc´ıa, A. Berlanga, and J. M. Molina, “A stopping criterion for multi-objective optimization evolutionary algorithms,” Information 13 Sciences, vol. 367, pp. 700–718, 2016
work page 2016
-
[18]
Maximum number of generations as a stopping criterion considered harmful,
M. Ravber, S.-H. Liu, M. Mernik, and M. ˇCrepinˇsek, “Maximum number of generations as a stopping criterion considered harmful,” Applied Soft Computing, vol. 128, p. 109478, 2022
work page 2022
-
[19]
A dynamic state transition algorithm with application to sensor network localization,
X. Zhou, P. Shi, C.-C. Lim, C. Yang, and W. Gui, “A dynamic state transition algorithm with application to sensor network localization,” Neurocomputing, vol. 273, pp. 237–250, 2018
work page 2018
-
[20]
Self-paced ARIMA for robust time series prediction,
Y . Li, K. Wu, and J. Liu, “Self-paced ARIMA for robust time series prediction,” Knowledge-Based Systems, vol. 269, p. 110489, 2023
work page 2023
-
[21]
Time series forecasting using a hybrid ARIMA and neural network model,
G. P. Zhang, “Time series forecasting using a hybrid ARIMA and neural network model,” Neurocomputing, vol. 50, pp. 159–175, 2003
work page 2003
-
[22]
Global and local real-coded genetic algorithms based on parent-centric crossover operators,
C. Garc ´ıa-Mart´ınez, M. Lozano, F. Herrera, D. Molina, and A. M. S´anchez, “Global and local real-coded genetic algorithms based on parent-centric crossover operators,” European Journal of Operational Research, vol. 185, no. 3, pp. 1088–1113, 2008
work page 2008
-
[23]
Comprehensive learning particle swarm optimizer for global optimization of multimodal functions,
J. J. Liang, A. K. Qin, P. N. Suganthan, and S. Baskar, “Comprehensive learning particle swarm optimizer for global optimization of multimodal functions,” IEEE transactions on evolutionary computation , vol. 10, no. 3, pp. 281–295, 2006
work page 2006
-
[24]
N. Hansen, S. D. M ¨uller, and P. Koumoutsakos, “Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (cma-es),” Evolutionary computation , vol. 11, no. 1, pp. 1–18, 2003
work page 2003
-
[25]
Differential evolution algorithm with strategy adaptation for global numerical optimization,
A. K. Qin, V . L. Huang, and P. N. Suganthan, “Differential evolution algorithm with strategy adaptation for global numerical optimization,” IEEE transactions on Evolutionary Computation , vol. 13, no. 2, pp. 398–417, 2009
work page 2009
-
[26]
S. Mirjalili, S. M. Mirjalili, and A. Lewis, “Grey wolf optimizer,” Advances in engineering software , vol. 69, pp. 46–61, 2014
work page 2014
-
[27]
The whale optimization algorithm,
S. Mirjalili and A. Lewis, “The whale optimization algorithm,” Advances in engineering software , vol. 95, pp. 51–67, 2016
work page 2016
-
[28]
Harris hawks optimization: Algorithm and applications,
A. A. Heidari, S. Mirjalili, H. Faris, I. Aljarah, M. Mafarja, and H. Chen, “Harris hawks optimization: Algorithm and applications,” Future generation computer systems , vol. 97, pp. 849–872, 2019
work page 2019
-
[29]
Slime mould algorithm: A new method for stochastic optimization,
S. Li, H. Chen, M. Wang, A. A. Heidari, and S. Mirjalili, “Slime mould algorithm: A new method for stochastic optimization,”Future generation computer systems, vol. 111, pp. 300–323, 2020
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.