Towards best practices in low-dimensional semi-supervised latent Bayesian optimization for the design of antimicrobial peptides

Jyler Menard; R. A. Mansbach

arxiv: 2510.17569 · v3 · submitted 2025-10-20 · 💻 cs.LG · physics.comp-ph

Pith reviewed 2026-05-18 05:42 UTC · model grok-4.3

keywords: latent Bayesian optimization, antimicrobial peptides, peptide design, dimensionality reduction, generative models, semi-supervised learning, physicochemical properties, sequence optimization

The pith

Reducing latent space dimensions improves interpretability and can enhance optimization for antimicrobial peptide design.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper explores improvements to latent Bayesian optimization for designing antimicrobial peptides by examining the effects of dimensionality reduction and different ways to organize the latent space with physicochemical properties. The authors test whether lower-dimensional versions of the latent space make the optimization process more effective and easier to understand. They also compare using properties that are easy to calculate but less directly tied to the optimization goal versus properties that are more relevant but sparser. Their findings suggest that reduced dimensions often help with both performance and interpretation, while the best property choice depends on the specific context of the search. This approach addresses the challenge of vast peptide sequence spaces with limited experimental data, potentially speeding up the discovery of new therapeutics against bacterial infections.

Core claim

Employing a dimensionally-reduced version of the latent space is more interpretable and can be advantageous, while the use of less-relevant but more easily-computable physicochemical properties is advantageous to latent space organization in certain contexts and the use of more-relevant but sparser properties associated with the latent Bayesian objective function is advantageous in others.

What carries the argument

Dimensionally-reduced latent spaces organized by varying physicochemical properties for semi-supervised latent Bayesian optimization.
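
A minimal sketch of what a dimensionally-reduced search space could look like, assuming PCA as the reduction step. The abstract does not name the reduction method, so PCA, the helper names `pca_reduce` and `most_correlated_components`, the 64-dimensional synthetic vectors standing in for encoder outputs, and the synthetic `oracle` values are all illustrative assumptions rather than the authors' pipeline.

```python
import numpy as np

def pca_reduce(latents, n_components):
    """Project latent vectors onto their top principal components."""
    centered = latents - latents.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def most_correlated_components(scores, values, k=2):
    """Indices (and Pearson r) of the k components most correlated with values."""
    corrs = np.array([np.corrcoef(scores[:, j], values)[0, 1]
                      for j in range(scores.shape[1])])
    order = np.argsort(-np.abs(corrs))[:k]
    return order, corrs[order]

# Synthetic stand-in data (assumption): one hidden factor t drives both a
# high-variance latent direction and the property of interest.
rng = np.random.default_rng(0)
n, dim = 500, 64
t = rng.normal(size=n)
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)
latents = rng.normal(size=(n, dim)) + 3.0 * t[:, None] * direction
oracle = t + 0.1 * rng.normal(size=n)

reduced = pca_reduce(latents, n_components=8)
top_idx, top_corr = most_correlated_components(reduced, oracle, k=2)
search_space = reduced[:, top_idx]  # 2-D space an optimizer could search
print("selected components:", top_idx, "correlations:", np.round(top_corr, 2))
```

In this toy setup the property-aligned direction also carries the most variance, so the first principal component is selected. In a real latent space that alignment is not guaranteed, which is why the sketch ranks components by correlation instead of by variance.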

If this is right

Dimensionally reduced latent spaces facilitate more efficient optimization in some cases.
Less-relevant physicochemical properties improve latent space organization in certain contexts.
More-relevant but sparser properties improve organization in other contexts.
This provides groundwork for biophysically-motivated procedures in peptide design.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Hybrid use of both types of properties might yield even better results across contexts.
These strategies could extend to designing other types of biomolecules.
Human experts might use the improved interpretability to guide further refinements in peptide sequences.
Validation through wet-lab experiments would test if the computational advantages translate to real peptide performance.

Load-bearing premise

The generative models used produce latent representations that faithfully capture meaningful structure in peptide sequence space, allowing an effective comparison of optimization strategies.

What would settle it

Running the same optimization in the full, unreduced latent space and finding that it locates optimal peptides at least as well would undercut the optimization claim. Finding no measurable gain in interpretability, for example in how tightly similar sequences cluster, would undercut the interpretability claim.

Original abstract

Generative deep learning techniques have demonstrated an impressive capacity for tackling biomolecular design problems in recent years. Despite their high performance, however, they still suffer from a lack of interpretability and rigorous quantification of associated search spaces, which are necessary to unlock their full potential for scientific inquiry beyond efficient design. An area in which they are of particular interest is in the design of antimicrobial peptides, which are a promising class of therapeutics to treat bacterial infections. Discovering and designing such peptides is difficult because of the vast number of possible sequences and comparatively small amount of experimental information. In this work, we perform a theoretical investigation of latent Bayesian optimization for searching through peptide sequence spaces, with a focus on antimicrobial peptides. We investigate (1) whether searching through a dimensionally-reduced variant of the latent design space may facilitate optimization, (2) how organizing latent spaces by differing amounts of more and less relevant information may improve the efficiency of arriving at an optimal peptide design, and (3) the interpretability of the spaces. We find that employing a dimensionally-reduced version of the latent space is more interpretable and can be advantageous, while the use of less-relevant but more easily-computable physicochemical properties is advantageous to latent space organization in certain contexts and the use of more-relevant but sparser properties associated with the latent Bayesian objective function is advantageous in others. This work lays crucial groundwork for biophysically-motivated peptide design procedures, with an especial focus on antimicrobial peptides.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper tests low-dimensional and property-organized variants of latent Bayesian optimization for antimicrobial peptides and reports some context-dependent advantages, but the claims depend on unverified assumptions about latent space quality.

Hi there, the main thing to know is that this work compares a dimensionally reduced latent space against the full version and looks at organizing the space with different mixes of physicochemical properties inside a Bayesian optimization setup for antimicrobial peptide design. They conclude that lower dimensions improve interpretability and can help optimization, while easier but less relevant properties sometimes organize the space better and more relevant but sparser ones help in other cases.

What the paper does reasonably well is apply these existing latent optimization ideas to a real high-stakes problem with very limited labeled data. It tries to address interpretability and search efficiency rather than just chasing better designs, which is a useful direction for the subfield.

The soft spot is exactly what the stress-test flags: the whole comparison rests on the generative model producing a latent space whose geometry actually reflects biologically relevant peptide features. Without clear checks like reconstruction accuracy on held-out sequences or correlation between latent distances and known activity, the reported advantages could just be artifacts of how the embedding was built. The abstract presents concrete findings from a theoretical investigation but does not highlight those validation steps, so the central argument feels lighter than it needs to be.

This is for researchers working on machine learning for peptide or molecular design in low-data regimes. A reader already familiar with VAEs and Bayesian optimization might pick up some practical comparison points, but it is not a big methodological shift. I would send it for peer review so the methods section and any supporting metrics can get proper scrutiny.

Referee Report

2 major / 2 minor

Summary. The paper conducts a theoretical investigation of latent Bayesian optimization (BO) for antimicrobial peptide (AMP) design using generative deep learning models. It examines three questions: (1) whether dimensionally-reduced latent spaces facilitate optimization and improve interpretability, (2) how organizing latent spaces with differing amounts of more- versus less-relevant physicochemical properties affects optimization efficiency, and (3) the interpretability of the resulting spaces. The authors report that reduced latent spaces are more interpretable and can be advantageous, while less-relevant but easily-computable properties aid organization in some contexts and more-relevant but sparser properties (tied to the latent BO objective) are advantageous in others.

Significance. If the empirical findings hold under rigorous validation, the work could help establish practical guidelines for semi-supervised latent BO in biomolecular design, particularly by clarifying trade-offs between dimensionality reduction, property relevance, and interpretability for AMP search. This addresses a genuine gap between high-performing generative models and the need for quantifiable, interpretable search spaces in low-data regimes.

major comments (2)

[Abstract / Introduction] The central claims rest on the untested assumption that the generative models (VAEs or similar) produce latent spaces whose geometry meaningfully reflects biologically relevant peptide structure rather than generic sequence statistics. No reconstruction error on held-out sequences, correlation of latent distances with known antimicrobial activity, or ablation of the semi-supervised signal is referenced in the abstract or described as a validation step; without these, reported advantages in interpretability and optimization efficiency risk being artifacts of the embedding.
[Abstract] The reported findings ('we find that...') are presented as concrete outcomes of a 'theoretical investigation,' yet the abstract provides no quantitative results, error bars, statistical tests, or comparison baselines for the three investigated questions. This makes it impossible to assess whether the context-dependent advantages of property types are statistically significant or generalizable beyond the specific experimental setup.
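
One of the checks named in the first major comment, the correlation between latent distances and a property of interest, can be sketched as follows. This is an illustration on synthetic data, not a result from the paper: the function names, the 4-dimensional toy latent space, and the choice of Spearman rank correlation over pairwise distances are assumptions made for the example. A shuffled-label baseline is included so the organized case has something to be compared against.

```python
import numpy as np

def ordinal_rank(x):
    """Ordinal ranks; ties are not averaged, which is fine for continuous data."""
    ranks = np.empty(len(x), dtype=float)
    ranks[np.argsort(x)] = np.arange(len(x))
    return ranks

def distance_property_spearman(latents, values):
    """Spearman correlation between pairwise latent distance and |property difference|."""
    i, j = np.triu_indices(len(latents), k=1)      # all unordered pairs
    dist = np.linalg.norm(latents[i] - latents[j], axis=1)
    dprop = np.abs(values[i] - values[j])
    return np.corrcoef(ordinal_rank(dist), ordinal_rank(dprop))[0, 1]

# Synthetic stand-in (assumption): the property varies along one stretched axis.
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 4))
latents[:, 0] *= 3.0
values = latents[:, 0].copy()

rho_organized = distance_property_spearman(latents, values)
rho_shuffled = distance_property_spearman(latents, rng.permutation(values))
print(f"organized: {rho_organized:.2f}, shuffled baseline: {rho_shuffled:.2f}")
```

Because pairwise distances are not independent samples, a permutation baseline like the shuffled case above is a more honest reference point than a textbook p-value.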

minor comments (2)

[Methods] Clarify the exact generative model architecture, training procedure, and semi-supervised objective used to construct the latent spaces, including any hyperparameters that could affect the reported comparisons.
[Methods] Define 'more-relevant' versus 'less-relevant' physicochemical properties explicitly and state how relevance is quantified relative to the latent BO objective.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments. We address each major comment point-by-point below, clarifying the scope of our theoretical investigation while committing to targeted revisions that strengthen the manuscript without altering its core contributions.

Referee: [Abstract / Introduction] The central claims rest on the untested assumption that the generative models (VAEs or similar) produce latent spaces whose geometry meaningfully reflects biologically relevant peptide structure rather than generic sequence statistics. No reconstruction error on held-out sequences, correlation of latent distances with known antimicrobial activity, or ablation of the semi-supervised signal is referenced in the abstract or described as a validation step; without these, reported advantages in interpretability and optimization efficiency risk being artifacts of the embedding.

Authors: We appreciate this observation. Our work is explicitly positioned as a theoretical investigation of latent-space properties for Bayesian optimization rather than a re-validation of generative models. The Methods section describes the VAE architectures, semi-supervised training objectives, and property incorporation used to construct the latent spaces. To address the concern directly, we will add a short paragraph to the Introduction that references standard validation practices from the literature (e.g., reconstruction fidelity and property correlation benchmarks for peptide VAEs) and explicitly states that our analysis assumes these established embeddings while focusing on downstream effects of dimensionality reduction and property organization.
revision: partial

Referee: [Abstract] The reported findings ('we find that...') are presented as concrete outcomes of a 'theoretical investigation,' yet the abstract provides no quantitative results, error bars, statistical tests, or comparison baselines for the three investigated questions. This makes it impossible to assess whether the context-dependent advantages of property types are statistically significant or generalizable beyond the specific experimental setup.

Authors: We agree that the current abstract is high-level and omits specific metrics. Detailed quantitative results, including optimization trajectories, interpretability scores, and comparisons across property sets, are provided in the Results section with accompanying figures and tables. We will revise the abstract to include one concise sentence summarizing the key quantitative observations (e.g., relative improvements in optimization efficiency for reduced versus full latent spaces and context-dependent advantages of property relevance), while preserving brevity.
revision: yes

Circularity Check

0 steps flagged

Comparative investigation of latent space variants shows no load-bearing self-definition or fitted predictions.

full rationale

The paper reports findings from a theoretical investigation comparing dimensionally-reduced latent spaces against full versions and different physicochemical property sets for organizing spaces in semi-supervised latent Bayesian optimization. These are presented as empirical outcomes of the comparisons ('we find that...') rather than derivations that reduce by construction to the same fitted quantities or self-cited premises. No equations, uniqueness theorems, or ansatzes are shown that equate a 'prediction' to an input parameter or rename a known result. The work assumes generative models yield meaningful latent representations (an external premise), but the reported advantages in interpretability and context-dependent efficiency do not collapse to self-referential definitions or self-citation chains within the provided text. This is the expected low-circularity outcome for an ablation-style comparison paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review based solely on abstract; detailed ledger cannot be populated without full methods and assumptions.

axioms (1)

domain assumption: Generative deep learning techniques create latent spaces suitable for searching peptide sequence spaces in optimization tasks.
Invoked as the foundation for performing latent Bayesian optimization on antimicrobial peptides.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: We investigate (1) whether further compression of the design space via dimensionality reduction may facilitate optimization, (2) the interpretability of the spaces, and (3) how organizing latent spaces with physicochemical properties may improve the efficiency of optimizing antimicrobial activity.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: We use a Gaussian Process Regression (GPR) model for the surrogate objective function... Log Expected Improvement
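
The quoted passage names a Gaussian process regression surrogate and a Log Expected Improvement acquisition. A minimal NumPy sketch of that pairing on a toy 2-D latent space follows. Everything specific here is an assumption for illustration: the RBF kernel with a fixed length scale, the candidate pool, the `toy_objective` standing in for an activity oracle, and above all the acquisition, which simply takes the logarithm of the analytic EI and is not the numerically stabilized LogEI formulation.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance."""
    sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_query)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_s)
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return k_s.T @ alpha, np.sqrt(var)

def log_expected_improvement(mean, std, best):
    """Naive log of analytic EI for maximization (not the stabilized LogEI)."""
    z = (mean - best) / std
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    return np.log(np.maximum(std * (z * cdf + pdf), 1e-300))

def toy_objective(z):
    """Hypothetical stand-in for an activity oracle; peak at (0.5, -0.5)."""
    return -np.sum((z - np.array([0.5, -0.5]))**2, axis=1)

rng = np.random.default_rng(0)
candidates = rng.uniform(-2.0, 2.0, size=(400, 2))  # points in a 2-D latent space
tried = np.zeros(len(candidates), dtype=bool)
tried[:5] = True                                    # five initial evaluations
initial_best = toy_objective(candidates[tried]).max()

for _ in range(15):
    x = candidates[tried]
    y = toy_objective(x)
    y_std = (y - y.mean()) / y.std()                # standardize for the zero-mean prior
    mean, std = gp_posterior(x, y_std, candidates)
    score = log_expected_improvement(mean, std, y_std.max())
    score[tried] = -np.inf                          # never re-evaluate a point
    tried[np.argmax(score)] = True

final_best = toy_objective(candidates[tried]).max()
print(f"best initial: {initial_best:.3f}, best after BO: {final_best:.3f}")
```

Working in log space matters because EI values become vanishingly small far from promising regions. The clamp at `1e-300` only hides that underflow, which is the problem a stabilized LogEI is meant to address properly.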

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

