
arxiv: 2605.15932 · v1 · pith:UQ4PL3TSnew · submitted 2026-05-15 · 💻 cs.HC

GEMS -- Guided Evolutionary Molecule Design for Sustainable Chemicals

Pith reviewed 2026-05-20 16:16 UTC · model grok-4.3

classification 💻 cs.HC
keywords interactive visual analytics · genetic algorithm · molecule design · sustainable chemicals · human-AI collaboration · de novo design · expert knowledge integration · chemical sustainability

The pith

Domain experts guide genetic algorithms to design sustainable molecules by editing scoring functions and populations through a visual interface without coding.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GEMS, a visual analytics system that lets chemists directly steer a genetic algorithm for creating new molecules aimed at environmental safety. Sparse data on chemical impacts leaves standard machine learning predictions unreliable, while numerical scores alone miss the detailed intuition experts bring. Users adjust the scoring rules and select which molecules stay in the population to steer the search toward better candidates. If this works, the approach combines human judgment with computational evolution to propose safer alternatives where automated tools fall short. The authors show its use in designing antioxidant replacements and report feedback from practicing scientists on its practical value.

Core claim

GEMS is an interactive visual analytics tool that enables domain experts to collaborate with a genetic algorithm for molecule design. Users integrate their knowledge by modifying the scoring function and the molecule population, guiding the evolutionary process without requiring programming knowledge or ML developer support. This addresses the shortcomings of low-fidelity ML oracles caused by sparse environmental impact data and the limits of purely numerical scoring functions in capturing nuanced chemical intuition. The system is demonstrated through a usage scenario focused on sustainable antioxidant alternatives and evaluated via interviews with domain scientists who provided feedback on its usefulness.

What carries the argument

The visual interface allowing direct modification of the scoring function and molecule population to guide the genetic algorithm's evolutionary search for new molecules.
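
The two steering levers named above can be illustrated with a minimal sketch. This is not the GEMS implementation: molecules are stood in for by short strings over a toy alphabet, the two property functions are placeholders, and every name here (`score`, `evolve`, `weights`, `pinned`, `banned`) is hypothetical. It only shows how editing scoring weights and curating the population might feed into one generation of a genetic algorithm.

```python
import random

ALPHABET = "CNOF"  # toy stand-in for molecular building blocks (assumption)

def prop_oxygen(mol):
    """Placeholder property: fraction of 'O' symbols in the candidate."""
    return mol.count("O") / len(mol)

def prop_no_fluorine(mol):
    """Placeholder property: 1.0 when the candidate contains no 'F'."""
    return 1.0 if "F" not in mol else 0.0

PROPERTIES = {"oxygen": prop_oxygen, "no_fluorine": prop_no_fluorine}

def score(mol, weights):
    """Weighted sum of properties; the expert steers by editing `weights`."""
    return sum(w * PROPERTIES[name](mol) for name, w in weights.items())

def mutate(mol, rng):
    """Replace one randomly chosen position with a random symbol."""
    i = rng.randrange(len(mol))
    return mol[:i] + rng.choice(ALPHABET) + mol[i + 1:]

def crossover(a, b, rng):
    """Single-point crossover of two equal-length candidates."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(population, weights, rng, pinned=(), banned=()):
    """One generation: breed, drop banned candidates, keep pinned ones, select by score."""
    size = len(population)
    children = [mutate(crossover(*rng.sample(population, 2), rng), rng)
                for _ in range(size)]
    pool = (set(population) | set(children)) - set(banned)
    # Sort by descending score, breaking ties alphabetically for reproducibility.
    ranked = sorted(pool - set(pinned), key=lambda m: (-score(m, weights), m))
    return (list(pinned) + ranked)[:size]

# Usage: a few unguided generations, then a simulated expert intervention.
rng = random.Random(0)
population = ["".join(rng.choice(ALPHABET) for _ in range(6)) for _ in range(8)]
weights = {"oxygen": 1.0, "no_fluorine": 0.0}
for _ in range(5):
    population = evolve(population, weights, rng)
weights["no_fluorine"] = 2.0  # expert decides fluorine should be penalised
population = evolve(population, weights, rng, pinned=["COOCOO"])
```

In a real system the string operations would be replaced by chemically valid mutation and crossover, and the placeholder properties by predicted or measured ones; the sketch keeps only the control flow.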

If this is right

  • Chemists can contribute domain knowledge to molecule optimization without needing programming expertise or external ML support.
  • Guidance from experts can offset the unreliability of low-fidelity oracles when data on environmental impacts is limited.
  • The tool supports concrete applications such as identifying sustainable alternatives to antioxidants.
  • Collected feedback from domain scientists indicates the interface successfully incorporates chemical intuition into the design loop.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Wider use of such guided systems could shorten the time from expert idea to testable sustainable chemical candidates in industrial settings.
  • The same visual steering approach might extend to other evolutionary design tasks where expert intuition is hard to encode numerically.
  • Pairing GEMS outputs with targeted laboratory validation experiments would provide direct evidence on whether guided designs reduce real-world environmental harm.

Load-bearing premise

Expert modifications to the scoring function and molecule population through the visual interface will produce meaningfully better or more sustainable molecule candidates than an unguided genetic algorithm or low-fidelity ML oracles.

What would settle it

A side-by-side evaluation where molecules generated after expert guidance in GEMS are tested for actual environmental metrics such as toxicity, persistence, or biodegradability and compared against candidates from the unguided algorithm or standard ML methods.
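
A tally for such a side-by-side evaluation might look like the sketch below. The `Measurement` fields, their units, and the threshold parameter names are assumptions made for illustration; neither the paper nor this review specifies an evaluation protocol, and no real measurements are implied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    """Hypothetical lab readout for one candidate; fields and units are assumptions."""
    toxicity: float          # lower is better
    persistence_days: float  # lower is better
    biodegradation: float    # fraction degraded, higher is better

def passes(m, max_toxicity, max_persistence_days, min_biodegradation):
    """True when a candidate satisfies every threshold."""
    return (m.toxicity <= max_toxicity
            and m.persistence_days <= max_persistence_days
            and m.biodegradation >= min_biodegradation)

def pass_rate(measurements, **thresholds):
    """Fraction of candidates that satisfy every threshold."""
    if not measurements:
        raise ValueError("need at least one measurement")
    return sum(passes(m, **thresholds) for m in measurements) / len(measurements)

def compare(guided, unguided, **thresholds):
    """Pass rates for both candidate sets and their difference."""
    g = pass_rate(guided, **thresholds)
    u = pass_rate(unguided, **thresholds)
    return {"guided": g, "unguided": u, "difference": g - u}
```

A pass-rate difference alone would not settle the question; sample sizes and measurement uncertainty would also need reporting.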

Figures

Figures reproduced from arXiv: 2605.15932 by Christina Humer, Coelina Robinson, Franziska Weissbach, Kjell Jorner, Mennatallah El-Assady.

Figure 1. A workflow comparison of traditional genetic algorithm pipelines …
Figure 2. The tool provides three views for exploration: …
Figure 3. The configuration interface (a) allows users to upload a dataset, customize attributes and the scoring function. The expanded custom property tab shows API integration. The score editor (b) allows viewing and modifying the default scoring function.
Figure 4. Interaction interfaces for manual population steering: …
Original abstract

Designing safe and sustainable chemicals is critical to combat chemical pollution in our environment. Machine learning (ML) methods have been developed to aid with de novo molecule design. However, data on the environmental impacts of chemical compounds are sparse, resulting in low-fidelity ML oracles and unreliable candidate proposals. Furthermore, generative ML models rely on numerical scoring functions that cannot fully capture the nuanced chemical intuition of expert scientists required for real-world molecular design. We present GEMS, an interactive visual analytics tool that enables domain experts to directly collaborate with a genetic algorithm for molecule design. Users can integrate their expert knowledge to guide the evolutionary process by modifying the scoring function and molecule population without programming knowledge or ML developer support. A usage scenario demonstrates the system's application in designing sustainable antioxidant alternatives. In an interview session with domain scientists, we collected feedback on the usefulness of GEMS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces GEMS, an interactive visual analytics tool enabling domain experts to collaborate with a genetic algorithm for de novo design of sustainable chemicals. Users modify the scoring function and molecule population through the visual interface without programming or ML developer support. The paper presents a usage scenario for antioxidant alternatives and reports qualitative feedback from an interview session with domain scientists.

Significance. If the guided process demonstrably improves candidate quality on sustainability metrics, the approach could help address sparse environmental impact data and limitations of purely numerical oracles by incorporating expert chemical intuition directly into evolutionary search. The accessibility focus for non-programmers is a practical strength for interdisciplinary applications.

major comments (2)
  1. [Usage Scenario] Usage Scenario section: The description of expert modifications to the scoring function and population is presented narratively but contains no quantitative head-to-head comparison (e.g., predicted environmental impact scores, fraction of candidates meeting sustainability thresholds, or population diversity metrics) between guided and unguided GA runs. This is load-bearing for the claim that the interface enables meaningfully better outcomes.
  2. [Interview Feedback] Interview Feedback section: Feedback is reported qualitatively without specific examples or measures showing that expert-guided changes produced superior molecule candidates relative to the baseline genetic algorithm; this weakens support for the effectiveness of the visual guidance mechanism.
minor comments (2)
  1. [Abstract] The abstract and system description would benefit from a brief statement of the underlying genetic algorithm parameters (population size, mutation rate, etc.) to allow readers to understand the baseline before modifications.
  2. [Figures] Figure captions for the interface screenshots should explicitly label which visual elements correspond to scoring-function editing versus population curation.
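
One of the metrics named in the major comments, population diversity, could be computed along the lines of the sketch below. It assumes each molecule has already been converted to a fingerprint represented as a set of integer bit indices; how those fingerprints are produced is left open, and the function names are hypothetical.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def mean_pairwise_distance(fingerprints):
    """Average Tanimoto distance (1 - similarity) over all unordered pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)
```

Reporting this value per generation for guided and unguided runs would show whether expert curation narrows or widens the search.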

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We respond to each major comment below, clarifying the scope of our HCI-focused contribution while addressing concerns about evidence for the guidance mechanism.

Point-by-point responses
  1. Referee: [Usage Scenario] Usage Scenario section: The description of expert modifications to the scoring function and population is presented narratively but contains no quantitative head-to-head comparison (e.g., predicted environmental impact scores, fraction of candidates meeting sustainability thresholds, or population diversity metrics) between guided and unguided GA runs. This is load-bearing for the claim that the interface enables meaningfully better outcomes.

    Authors: We agree that the Usage Scenario is presented narratively without quantitative head-to-head comparisons. The section is intended to demonstrate the interactive workflow and how domain experts can apply chemical intuition to modify scoring and populations in the absence of reliable numerical oracles. The manuscript's core claim concerns the accessibility of the visual interface for non-programmers rather than algorithmic superiority on sustainability metrics. We have revised the manuscript to include an explicit statement of scope in the Usage Scenario and Discussion sections, noting that quantitative benchmarking of guided versus unguided runs would require a separate optimization-focused evaluation.
    revision: partial

  2. Referee: [Interview Feedback] Interview Feedback section: Feedback is reported qualitatively without specific examples or measures showing that expert-guided changes produced superior molecule candidates relative to the baseline genetic algorithm; this weakens support for the effectiveness of the visual guidance mechanism.

    Authors: The Interview Feedback section reports qualitative responses from domain scientists on the perceived usefulness of the visual guidance features. As is standard for HCI evaluations of interactive systems, the study focused on user experience and the ability to incorporate expert knowledge rather than on collecting quantitative measures of candidate superiority. The feedback supports that experts could effectively steer the process, which aligns with our goal of addressing limitations of purely numerical oracles. We have added more detailed paraphrased examples from the interview session to the revised manuscript to illustrate specific ways the interface supported expert input.
    revision: partial

Circularity Check

0 steps flagged

No significant circularity; descriptive system presentation with qualitative evaluation

Full rationale

The paper presents GEMS as an interactive visual analytics tool for guiding a genetic algorithm in molecule design, illustrated via a usage scenario for antioxidant alternatives and supported by domain scientist interview feedback. No equations, fitted parameters, predictions, or derivation chains appear in the provided text. Claims about expert collaboration and guidance rest on system description and qualitative data rather than any self-referential logic, self-citations, or reductions of outputs to inputs by construction. The work is self-contained as a tool-building and evaluation contribution without circular elements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Central claim depends on the premise that expert chemical intuition can be effectively translated into modifications of scoring functions and populations to overcome sparse environmental data; no free parameters or invented physical entities are evident from the abstract.

axioms (1)
  • [domain assumption] Domain experts possess nuanced chemical intuition that numerical scoring functions cannot fully capture and that can usefully guide evolutionary molecule design.
    Invoked in abstract to justify the need for interactive guidance over pure ML oracles.
invented entities (1)
  • GEMS interactive visual analytics tool [no independent evidence]
    purpose: To allow non-programmer domain experts to guide genetic algorithms for sustainable molecule design.
    New system introduced in the paper; no independent evidence provided beyond the described usage scenario.

pith-pipeline@v0.9.0 · 5683 in / 1265 out tokens · 56075 ms · 2026-05-20T16:16:05.129644+00:00 · methodology


