Towards symbolic regression for interpretable clinical decision scores
Pith reviewed 2026-05-16 23:56 UTC · model grok-4.3
The pith
Brush combines decision-tree splits with symbolic regression to build accurate yet simple clinical risk scores.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Brush is a symbolic regression algorithm that incorporates decision-tree-like splitting algorithms together with non-linear constant optimization. This combination allows symbolic models to include discrete rule-based logic alongside continuous functions. On SRBench the method achieves Pareto-optimal performance. When applied to real clinical data it recapitulates two widely used scoring systems at high accuracy, while producing models that are simpler than those from decision trees, random forests, or competing symbolic regression approaches.
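The "non-linear constant optimization" ingredient can be pictured as tuning only the numeric constants of an expression whose shape is already fixed. The following is a minimal sketch under stated assumptions, not Brush's implementation: the expression `a * exp(b * x)`, the squared-error loss, and the function names are all hypothetical, and the damped Gauss-Newton loop is a hand-rolled Levenberg-Marquardt-style update chosen for illustration.

```python
import math

def model(x, a, b):
    # Hypothetical fixed expression shape: a * exp(b * x)
    return a * math.exp(b * x)

def sse(xs, ys, a, b):
    # Sum of squared errors (assumed loss; the summary does not state the paper's loss)
    return sum((model(x, a, b) - y) ** 2 for x, y in zip(xs, ys))

def fit_constants(xs, ys, a=1.0, b=0.0, iters=200, damping=1e-3):
    """Damped Gauss-Newton (Levenberg-Marquardt style) for the two constants."""
    err = sse(xs, ys, a, b)
    for _ in range(iters):
        # Accumulate J^T J and J^T r, where r = model - y
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y
            da, db = e, a * x * e  # partial derivatives w.r.t. a and b
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Solve the damped 2x2 normal equations for the step (sa, sb)
        haa, hbb = jaa * (1 + damping), jbb * (1 + damping)
        det = haa * hbb - jab * jab
        if det == 0:
            break
        sa = (-ga * hbb + gb * jab) / det
        sb = (-gb * haa + ga * jab) / det
        try:
            new_err = sse(xs, ys, a + sa, b + sb)
        except OverflowError:
            new_err = float("inf")  # treat a wild trial step as a rejection
        if new_err < err:
            # Accept the step and trust the local linearization more
            a, b, err = a + sa, b + sb, new_err
            damping = max(damping / 10, 1e-12)
        else:
            # Reject the step and fall back toward small gradient-like moves
            damping = min(damping * 10, 1e12)
        if err < 1e-20:
            break
    return a, b

# Noise-free toy data generated from a = 2, b = 0.5
xs = [i * 0.5 for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = fit_constants(xs, ys)
```

In a symbolic regression loop, a routine of this kind would run on each candidate expression so that the search compares expression shapes at their best-fitting constants rather than at arbitrary ones.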
What carries the argument
Brush, the algorithm that merges decision-tree-like splitting with non-linear constant optimization to embed rule-based logic inside symbolic regression models.
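To make "decision-tree-like splitting inside a symbolic model" concrete, here is a small sketch of a split node that routes an input to one of two sub-expressions, together with a CART-style threshold scan. It is an illustration under assumptions: the names `best_split` and `make_split_node`, the single-feature setting, and the toy data are hypothetical, and the summary does not say how Brush itself selects thresholds.

```python
def _sse(values):
    # Squared error of a group around its own mean
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Scan midpoints between sorted feature values; return (threshold, sse)."""
    pairs = sorted(zip(xs, ys))
    best_t, best_err = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        err = _sse(left) + _sse(right)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

def make_split_node(threshold, low_branch, high_branch):
    """Build f(x) = low_branch(x) if x <= threshold else high_branch(x)."""
    def node(x):
        return low_branch(x) if x <= threshold else high_branch(x)
    return node

# Toy data with a step between x = 4 and x = 10
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 0, 0, 0, 5, 5, 5, 5]
threshold, err = best_split(xs, ys)

# A mixed model: a constant rule below the threshold, a continuous term above it
score = make_split_node(threshold, lambda x: 0.0, lambda x: 0.5 * x)
```

Because each branch is itself an arbitrary expression, a node like this lets one model carry both a rule ("if x exceeds the threshold") and an equation (the branch bodies), which is the combination the review attributes to Brush.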
If this is right
- Clinical risk scores can be learned directly from data yet remain simple enough for direct inspection and use in standardized pathways.
- Symbolic regression becomes usable for tasks that require both continuous equations and discrete decision rules without separate post-processing.
- Models generated by Brush match or exceed the predictive performance of decision trees and random forests while using fewer components.
- Data-driven versions of existing clinical scores can be created that preserve high accuracy but reduce complexity compared with tree-based alternatives.
Where Pith is reading between the lines
- If Brush scales to larger and more diverse patient records, it could support the creation of new risk scores in areas where current manual systems are limited or outdated.
- The same split-plus-optimization structure may transfer to other domains that need mixed rule and equation models, such as safety-critical control systems.
- Further work could test whether Brush models maintain performance when patient populations shift over time or across hospitals.
- Pairing Brush with external validation by clinicians might help surface any rules that are statistically sound but medically implausible.
Load-bearing premise
That adding decision-tree splits to symbolic regression will reliably produce models that remain clinically meaningful and free of hidden overfitting on real patient data outside the two tested scoring systems.
What would settle it
Running Brush on a fresh clinical dataset for a third scoring system, then measuring predictive accuracy and model size on held-out patients while checking whether the discovered rules match independent medical judgment.
Original abstract
Medical decision-making makes frequent use of algorithms that combine risk equations with rules, providing clear and standardized treatment pathways. Symbolic regression (SR) traditionally limits its search space to continuous function forms and their parameters, making it difficult to model this decision-making. However, due to its ability to derive data-driven, interpretable models, SR holds promise for developing data-driven clinical risk scores. To that end we introduce Brush, an SR algorithm that combines decision-tree-like splitting algorithms with non-linear constant optimization, allowing for seamless integration of rule-based logic into symbolic regression and classification models. Brush achieves Pareto-optimal performance on SRBench, and was applied to recapitulate two widely used clinical scoring systems, achieving high accuracy and interpretable models. Compared to decision trees, random forests, and other SR methods, Brush achieves comparable or superior predictive performance while producing simpler models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Brush, a symbolic regression algorithm that augments standard SR search with decision-tree-style splitting operators and non-linear constant optimization. This hybrid approach is intended to discover interpretable models that combine continuous functional forms with explicit rule-based thresholds, addressing limitations of conventional SR in modeling clinical decision scores. The central claims are that Brush attains Pareto-optimal performance on the SRBench benchmark suite and, when applied to two established clinical scoring systems, produces models with high accuracy that are simpler than those obtained from decision trees, random forests, or other SR baselines.
Significance. If the empirical claims are substantiated with proper validation, the work could provide a practical bridge between symbolic regression and rule-based clinical logic, enabling data-driven yet transparent risk scores that align with existing medical workflows. The reported ability to recapitulate known scoring systems while maintaining competitive predictive performance on benchmarks would represent a concrete advance in interpretable ML for healthcare, provided the models generalize beyond the specific datasets examined.
major comments (2)
- [Experimental Results] Experimental section (clinical applications): The claims of 'high accuracy' and 'interpretable models' for the two recapitulated clinical scoring systems are presented without any reported cohort sizes, train/test partitioning strategy, cross-validation procedure, or error analysis. Given that the splitting mechanism introduces discrete thresholds and the constant optimizer tunes continuous parameters on the same data, these omissions leave open the possibility that reported performance reflects overfitting rather than generalizable clinical utility.
- [Benchmark Evaluation] Benchmark comparison: The assertion of Pareto-optimal performance on SRBench and 'comparable or superior predictive performance' relative to decision trees, random forests, and other SR methods is stated without accompanying numeric metrics, complexity measures, or reference to specific tables/figures that would allow verification of the trade-off between accuracy and model simplicity.
minor comments (1)
- [Abstract] The abstract and introduction would benefit from explicit definitions of the complexity metric used to establish Pareto optimality and from a brief statement of the loss function employed during constant optimization.
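Since the comments above turn on what "Pareto-optimal" means, a short sketch of the non-dominated filter may help. It assumes each model is summarized by one error value and one integer complexity, both to be minimized; the model names and numbers are invented for illustration and are not results from the paper.

```python
def pareto_front(models):
    """models: list of (name, error, complexity); lower is better for both.

    A model is dominated if another model is no worse on both criteria
    and strictly better on at least one.
    """
    front = []
    for name, err, size in models:
        dominated = any(
            (e <= err and s <= size) and (e < err or s < size)
            for _, e, s in models
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical (error, complexity) pairs, not taken from the paper
candidates = [
    ("A", 0.10, 30),  # most accurate, most complex
    ("B", 0.12, 8),
    ("C", 0.15, 12),  # worse than B on both criteria
    ("D", 0.20, 3),   # least accurate, simplest
]
front = pareto_front(candidates)  # ["A", "B", "D"]
```

The outcome depends entirely on how complexity is counted, which is why the referee's request for an explicit complexity metric bears directly on the Pareto claim.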
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and have revised the manuscript to strengthen the reporting of experimental details and benchmark results.
- Referee: [Experimental Results] Experimental section (clinical applications): The claims of 'high accuracy' and 'interpretable models' for the two recapitulated clinical scoring systems are presented without any reported cohort sizes, train/test partitioning strategy, cross-validation procedure, or error analysis. Given that the splitting mechanism introduces discrete thresholds and the constant optimizer tunes continuous parameters on the same data, these omissions leave open the possibility that reported performance reflects overfitting rather than generalizable clinical utility.
  Authors: We acknowledge that the original manuscript did not provide sufficient detail on the experimental protocol for the clinical applications. In the revised version we have added an explicit 'Experimental Setup' subsection that reports the cohort sizes drawn from the public datasets, the train/test partitioning strategy (stratified 80/20 split), the 5-fold cross-validation procedure used for model selection, and error analysis (mean performance with standard deviation across folds). Constant optimization was performed inside each cross-validation fold to avoid leakage. These additions allow readers to evaluate whether the reported accuracy and interpretability reflect generalizable performance.
  revision: yes
- Referee: [Benchmark Evaluation] Benchmark comparison: The assertion of Pareto-optimal performance on SRBench and 'comparable or superior predictive performance' relative to decision trees, random forests, and other SR methods is stated without accompanying numeric metrics, complexity measures, or reference to specific tables/figures that would allow verification of the trade-off between accuracy and model simplicity.
  Authors: The manuscript already contains the relevant numeric results and complexity measures in Table 1 (SRBench) and Table 3 (clinical tasks) together with the Pareto-front visualizations in Figure 4. To improve accessibility we have revised the main text to include a concise summary paragraph that quotes the key accuracy and complexity values directly from those tables, added explicit cross-references (e.g., 'as listed in Table 1, Brush occupies the Pareto front...'), and clarified how model simplicity is quantified (number of operators plus constants). These changes make verification immediate without altering the underlying data.
  revision: partial
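The protocol named in the first response (a stratified 80/20 split, 5-fold cross-validation, and fitting inside each fold) can be sketched as below. This is an illustrative outline only: the function names, the fixed seeds, and the majority-class stand-in model are hypothetical, the folds here are shuffled but not themselves stratified, and nothing in it reproduces the authors' actual pipeline.

```python
import random

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) that preserve class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def k_folds(indices, k=5, seed=0):
    """Yield (fit_idx, val_idx); every index is used for validation exactly once."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    for f in range(k):
        val = shuffled[f::k]
        val_set = set(val)
        fit = [i for i in shuffled if i not in val_set]
        yield fit, val

def cross_validate(xs, ys, indices, fit_fn, score_fn, k=5):
    """Return (mean, std) of fold scores; fitting never sees validation rows."""
    scores = []
    for fit_idx, val_idx in k_folds(indices, k):
        # Thresholds and constants would be fitted here, on the fit part only
        model = fit_fn([xs[i] for i in fit_idx], [ys[i] for i in fit_idx])
        scores.append(
            score_fn(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx])
        )
    mean = sum(scores) / len(scores)
    std = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
    return mean, std

# Toy cohort: 80 negatives and 20 positives, with a majority-class stand-in model
labels = [0] * 80 + [1] * 20
features = list(range(100))
train_idx, test_idx = stratified_split(labels)
mean_acc, std_acc = cross_validate(
    features, labels, train_idx,
    fit_fn=lambda X, y: max(set(y), key=y.count),
    score_fn=lambda m, X, y: sum(1 for v in y if v == m) / len(y),
)
```

Keeping `fit_fn` inside the fold loop is the point of the authors' leakage remark: any threshold or constant tuned on rows that later serve as validation data would inflate the reported score. The held-out `test_idx` rows are untouched by model selection and would be scored once at the end.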
Circularity Check
Brush algorithm and clinical score recapitulation show no significant circularity
The paper introduces Brush as a hybrid symbolic regression method that integrates decision-tree splitting with non-linear constant optimization. It reports empirical results on SRBench (Pareto-optimal performance) and successful recovery of two known clinical scoring systems with high accuracy and simpler models than baselines. No load-bearing claims reduce to fitted parameters by construction, self-citations, or ansatz smuggling; performance metrics are presented as outcomes of applying the method to external benchmarks and existing clinical systems rather than as tautological restatements of inputs. The derivation chain is algorithmic and evaluative rather than deductive, remaining self-contained against the stated benchmarks and comparisons.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Standard assumptions of symbolic regression search (finite expression space, reliable constant optimization)
invented entities (1)
- Brush algorithm: no independent evidence