Confidence is the key: how conformal prediction enhances the generative design of permeable peptides

Florian David; Gökçe Geylan; Laura van Weesep; Leonardo De Maria; Ola Engkvist; Sunay Chankeshwara

arxiv: 2605.05770 · v1 · submitted 2026-05-07 · 💻 cs.AI

Pith reviewed 2026-05-08 11:31 UTC · model grok-4.3

classification 💻 cs.AI

keywords conformal prediction · generative design · cyclic peptides · permeability prediction · reinforcement learning · uncertainty quantification · applicability domain · peptide optimization

The pith

Conformal prediction improves the reliability and efficiency of RL-guided generative design for permeable cyclic peptides.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that adding conformal prediction to the scoring of a permeability predictor within an RL-based generative model leads to better designed cyclic peptides. By using calibrated uncertainty estimates to adjust rewards, the process avoids suggesting molecules where the predictor is likely unreliable. A sympathetic reader would care because generative models frequently propose structures outside the training data, turning high predicted rewards into false leads. The method discourages such exploration by design. This is presented as the first integration of conformal prediction with generative modeling for this task.

Core claim

We present an RL-guided generative framework that designs permeable cyclic peptides using an uncertainty-aware permeability predictor as the scoring component. To address predictive uncertainty, especially impacted by novel chemistry, we integrate conformal prediction (CP) as our uncertainty quantification method. CP assesses designs based on the calibrated model under a user-defined confidence level. We demonstrate that rewarding generated peptides with CP-informed predictions improves both reliability and efficiency of peptide optimization process. This also discourages exploration outside the predictor's applicability domain.

What carries the argument

Conformal prediction intervals on permeability predictions, used as an uncertainty-aware reward signal inside the reinforcement learning loop of the generative model.
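
To make the mechanism concrete, here is a minimal Python sketch of inductive conformal classification with class-conditional (Mondrian) calibration. It is an illustration under stated assumptions, not the authors' implementation: this summary does not say which CP variant or nonconformity measure the paper uses, so the measure below (one minus the predicted probability of the candidate class) and the names `mondrian_p_values` and `prediction_sets` are hypothetical choices.

```python
import numpy as np


def mondrian_p_values(cal_probs, cal_labels, test_probs):
    """Class-conditional inductive conformal p-values (illustrative sketch).

    cal_probs:  (n_cal, n_classes) predicted probabilities on the calibration set
    cal_labels: (n_cal,) integer true labels of the calibration set
    test_probs: (n_test, n_classes) predicted probabilities for new samples
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)
    n_test, n_classes = test_probs.shape
    p_values = np.zeros((n_test, n_classes))
    for c in range(n_classes):
        # ASSUMPTION: nonconformity = 1 - predicted probability of class c.
        cal_scores = 1.0 - cal_probs[cal_labels == c, c]
        test_scores = 1.0 - test_probs[:, c]
        # Count calibration scores at least as nonconforming as each test score.
        n_ge = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
        p_values[:, c] = (n_ge + 1) / (cal_scores.size + 1)
    return p_values


def prediction_sets(p_values, confidence):
    """Boolean mask: class c is in the set when its p-value exceeds 1 - confidence."""
    return np.asarray(p_values) > (1.0 - confidence)
```

At a user-defined confidence level, a peptide whose set contains only the permeable class is a confident single-label prediction. A set containing both classes, or neither, signals that the predictor cannot commit at that level.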

Load-bearing premise

The conformal prediction intervals remain well-calibrated and useful when the generative model proposes peptides far from the original training distribution of the permeability predictor.

What would settle it

Synthesize a set of peptides generated with the CP-rewarded model and without it, then measure their actual passive membrane permeability to check whether the CP-informed designs show higher agreement with experiments and whether optimization converges with fewer invalid proposals.

Figures

Figures reproduced from arXiv: 2605.05770 by Florian David, Gökçe Geylan, Laura van Weesep, Leonardo De Maria, Ola Engkvist, Sunay Chankeshwara.

**Figure 4.** The number of peptides classified as permeable according to probability under the raw model scoring function (red) and the CP soft scorin[…] peptides predicted to be permeable in a conformally efficient way (B).

Original abstract

Generative models coupled with reinforcement learning (RL), such as REINVENT and PepINVENT, have emerged as a powerful framework for de novo molecular design. During the ideation process these generative frameworks utilize various predictive models as part of the optimization objectives. However, the utility of the predictive models can be limited by their domain of applicability. When RL is used to explore the chemical space with predictive models, it can suggest molecules that lie outside the predictor's domain of applicability. As a result, the predictions may become less reliable, potentially steering designs into high reward but also high uncertainty chemical spaces. This is particularly pronounced for cyclic peptides which show therapeutic promise due to their modifiability and large interaction surfaces but are understudied compared to small molecules. While passive membrane permeation in cyclic peptides has attracted interest, identifying optimal permeable designs remains challenging yet crucial for targeting intracellular sites. We present an RL-guided generative framework that designs permeable cyclic peptides using an uncertainty-aware permeability predictor as the scoring component. To address predictive uncertainty, especially impacted by novel chemistry, we integrate conformal prediction (CP) as our uncertainty quantification method. CP assesses designs based on the calibrated model under a user-defined confidence level. We demonstrate that rewarding generated peptides with CP-informed predictions improves both reliability and efficiency of peptide optimization process. This also discourages exploration outside the predictor's applicability domain. This approach bridges the gap between predictive uncertainty and RL-guided exploration, showing how generative modelling and conformal prediction can be combined for the first time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper plugs conformal prediction into an RL generator for cyclic peptides to score permeability and avoid out-of-domain drift, but the coverage claim rests on an assumption the RL loop itself breaks.

The core move here is to take a permeability predictor, wrap it with conformal prediction at a user-set confidence level, and feed the resulting interval or nonconformity score back into the RL reward for peptide generation. The authors argue this keeps the optimizer from chasing high-reward but unreliable designs and improves both reliability and efficiency. That is the one concrete thing the work contributes: a working example of CP inside the PepINVENT-style loop for this specific task.

Both pieces are established, so the novelty is the integration rather than new theory, but the peptide setting is reasonable because permeability data are sparse and cyclic peptides sit outside most small-molecule training sets. The paper does a clean job stating the problem and showing how the CP wrapper can be dropped in without changing the generator architecture. It also flags that the method discourages exploration outside the predictor's applicability domain, which matches what practitioners actually worry about.

The soft spot is exactly the one the stress-test note raises. Standard CP gives marginal coverage only under exchangeability with the calibration set. RL-driven generation is designed to produce peptides far from that set, so the intervals used for scoring are not guaranteed to maintain their nominal coverage on the very candidates that get the highest rewards. The abstract offers no numbers, no ablation removing the CP term, and no post-optimization calibration check on the generated peptides, so the reliability gain is asserted rather than measured. If the full paper contains those checks and shows measurable improvement in success rate or reduced false positives, the claim holds; otherwise it is an assumption carried over from the CP formalism.

The work is aimed at groups that already run REINVENT-style generators for intracellular peptide targets and need a lightweight way to add uncertainty awareness. A reader in that niche would get a usable template and could test the calibration themselves. It is worth sending to peer review because the idea is practical, the gap it targets is real, and the experiments needed to close the exchangeability issue are straightforward to request. Referees will likely ask for those empirical checks, but the paper is coherent enough on its own terms to deserve that step rather than a desk reject.

Referee Report

2 major / 2 minor

Summary. The manuscript presents an RL-guided generative framework (building on REINVENT/PepINVENT) for de novo design of permeable cyclic peptides. It integrates conformal prediction (CP) as an uncertainty-aware component in the reward function, asserting that CP-informed rewards improve optimization reliability and efficiency while discouraging exploration outside the permeability predictor's applicability domain.

Significance. If the empirical results demonstrate that the CP intervals remain calibrated and useful on RL-generated peptides, the work would provide a concrete method for mitigating predictor misuse in generative molecular design. This is a timely contribution at the intersection of conformal prediction and RL-based molecule generation, with potential to improve robustness in peptide therapeutics discovery.

major comments (2)

[Abstract] Abstract: the central claim that 'rewarding generated peptides with CP-informed predictions improves both reliability and efficiency' is asserted without any quantitative metrics, ablation results, or calibration details in the provided abstract. The full manuscript must supply these to substantiate the reliability gain.
[Methods / CP integration] The conformal prediction integration (described in the methods/setup): standard inductive CP delivers marginal coverage only under exchangeability with the calibration set. The RL generator is explicitly optimized to produce cyclic peptides outside the original permeability training distribution, violating exchangeability. The manuscript must either restore exchangeability or empirically verify that nominal coverage is achieved on the highest-reward (most novel) generated peptides; without this, the asserted reliability improvement is not guaranteed by the CP formalism.
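
The empirical check requested in the second major comment can be sketched as a per-class validity computation: the fraction of peptides whose measured label falls inside the conformal prediction set. The sketch assumes a binary classifier and that measured labels exist for the generated peptides, which this summary does not confirm; `class_validity` is a hypothetical helper, not code from the paper.

```python
import numpy as np


def class_validity(pred_sets, true_labels):
    """Per-class validity: share of samples whose true label is in the prediction set.

    pred_sets:   (n, n_classes) boolean mask, True where class c is in the set
    true_labels: (n,) integer measured labels
    """
    pred_sets = np.asarray(pred_sets, dtype=bool)
    true_labels = np.asarray(true_labels)
    validity = {}
    for c in range(pred_sets.shape[1]):
        of_class = true_labels == c
        if not of_class.any():
            validity[c] = float("nan")  # no samples of this class to evaluate
            continue
        # Correct single-label sets and "Both" sets each contain the true label.
        validity[c] = float(pred_sets[of_class, c].mean())
    return validity
```

If the CP wrapper still holds up on the generated peptides, each class's validity should sit at or above the chosen confidence level. A clear shortfall on the highest-reward designs would indicate that distribution shift has broken calibration.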

minor comments (2)

[Abstract] Abstract: the phrasing 'improves both reliability and efficiency of peptide optimization process' is missing the definite article and could be tightened for clarity.
[Abstract] Abstract: the final sentence 'This also discourages exploration...' should explicitly state the mechanism (e.g., how the CP-based reward penalizes high-uncertainty designs) rather than leaving the causal link implicit.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which have helped us identify areas to strengthen the presentation and rigor of our work. We provide point-by-point responses to the major comments below and indicate the revisions we will make.

Referee: [Abstract] Abstract: the central claim that 'rewarding generated peptides with CP-informed predictions improves both reliability and efficiency' is asserted without any quantitative metrics, ablation results, or calibration details in the provided abstract. The full manuscript must supply these to substantiate the reliability gain.

Authors: We agree that the abstract would be strengthened by including concrete quantitative support for the central claim. In the revised manuscript, we will update the abstract to reference specific metrics from our experiments, including observed gains in optimization efficiency (e.g., reduced number of iterations to reach high-reward permeable designs) and reliability (e.g., lower failure rates due to out-of-domain predictions), while pointing to the ablation studies and calibration results already present in the main text and supplementary materials.

Revision: yes

Referee: [Methods / CP integration] The conformal prediction integration (described in the methods/setup): standard inductive CP delivers marginal coverage only under exchangeability with the calibration set. The RL generator is explicitly optimized to produce cyclic peptides outside the original permeability training distribution, violating exchangeability. The manuscript must either restore exchangeability or empirically verify that nominal coverage is achieved on the highest-reward (most novel) generated peptides; without this, the asserted reliability improvement is not guaranteed by the CP formalism.

Authors: The referee correctly identifies the exchangeability assumption underlying the marginal coverage guarantee of inductive conformal prediction. Our RL generator is intentionally designed to explore beyond the training distribution, so exchangeability cannot be restored without restricting the generative process. However, we have performed empirical checks of coverage on the RL-generated peptides, including the highest-reward (most novel) designs, by measuring the fraction of true permeability values falling within the CP intervals. In the revised manuscript, we will add explicit tables and figures reporting these coverage rates (targeting the nominal 1-alpha level) on the generated set and include a dedicated discussion of the practical utility of CP even under mild distribution shift. This empirical verification directly addresses the concern and supports the observed reliability improvements.

Revision: partial

Circularity Check

0 steps flagged

No circularity: empirical integration of standard CP with RL

The paper presents an applied framework that trains a permeability predictor on external data, applies standard inductive conformal prediction for uncertainty quantification on a calibration set, and feeds the resulting CP-informed scores (e.g., lower prediction bounds) into an RL reward function for peptide generation. No derivation, equation, or central claim reduces by construction to a fitted quantity defined inside the paper, a self-referential definition, or a load-bearing self-citation chain. The asserted improvements in reliability and domain adherence are demonstrated through experimental runs rather than obtained tautologically from the method's inputs. The approach relies on externally validated CP theory and existing RL architectures without collapsing the argument into its own assumptions.
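
One way a CP-informed score could enter the RL reward is sketched below. This is an assumed scheme for illustration only: the paper's actual scoring functions are not specified in this summary, and `cp_reward` and its arguments are hypothetical. The sketch rewards a peptide only when the conformal prediction set at the chosen confidence level contains the permeable class alone.

```python
def cp_reward(p_permeable, p_impermeable, confidence):
    """Hypothetical CP-informed reward in [0, 1] for one generated peptide.

    p_permeable / p_impermeable: conformal p-values for the two classes
    confidence: user-defined confidence level, e.g. 0.8
    """
    significance = 1.0 - confidence
    in_permeable = p_permeable > significance
    in_impermeable = p_impermeable > significance
    if in_permeable and not in_impermeable:
        # Confident single-label "permeable" set: reward grows as the
        # impermeable label is rejected more strongly.
        return 1.0 - p_impermeable
    # "Both", empty, or confident "impermeable": no reward (assumed design choice).
    return 0.0
```

Under this assumed scheme, peptides for which the predictor cannot commit to a single label earn nothing, which is one plausible way a reward could discourage drift outside the applicability domain.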

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that conformal prediction produces calibrated uncertainty estimates that remain informative when the generative model proposes out-of-distribution peptides. No new entities are postulated.

free parameters (1)

user-defined confidence level
The abstract states that CP assesses designs under a user-defined confidence level; this threshold directly controls which predictions are treated as reliable during RL reward computation.
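
The effect of this free parameter can be illustrated with a small sketch that counts how many binary prediction sets are empty, single-label, or contain both labels at a given confidence level. The p-values in the usage note are made-up toy numbers, and `set_type` and `set_type_counts` are hypothetical helper names.

```python
from collections import Counter


def set_type(p_values, confidence):
    """Label a binary conformal prediction set as 'empty', 'single', or 'both'."""
    significance = 1.0 - confidence
    # A class enters the set when its p-value exceeds the significance level.
    n_included = sum(p > significance for p in p_values)
    return ("empty", "single", "both")[n_included]


def set_type_counts(all_p_values, confidence):
    """Count set types over a batch of (p_impermeable, p_permeable) pairs."""
    return Counter(set_type(p, confidence) for p in all_p_values)
```

For the toy pairs `[(0.02, 0.90), (0.15, 0.40), (0.30, 0.35), (0.03, 0.04)]`, moving the confidence level from 0.75 to 0.95 turns one single-label set into a two-label set. In general, demanding higher confidence widens the sets, so fewer designs receive a committed single-label prediction.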

axioms (1)

Domain assumption: The underlying permeability predictor can be wrapped by conformal prediction to produce valid prediction sets at the chosen confidence level.
Standard conformal prediction validity requires exchangeability of calibration and test points; the paper implicitly assumes this holds for the peptide permeability task.
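
For reference, the guarantee at stake can be written out. This is the standard marginal coverage statement for conformal prediction, given here in notation chosen for this note rather than taken from the paper: $\alpha$ is one minus the confidence level, $C_{\alpha}$ is the prediction set, and $n$ is the calibration set size.

```latex
\[
\mathbb{P}\bigl(Y_{n+1} \in C_{\alpha}(X_{n+1})\bigr) \;\ge\; 1 - \alpha
\quad \text{if } (X_1, Y_1), \dots, (X_n, Y_n), (X_{n+1}, Y_{n+1}) \text{ are exchangeable.}
\]
```

The probability is taken jointly over the calibration data and the new point, so the bound holds on average rather than for any particular peptide. It is no longer guaranteed once the new points are drawn from a shifted distribution, as with RL-generated designs.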

pith-pipeline@v0.9.0 · 5592 in / 1277 out tokens · 47084 ms · 2026-05-08T11:31:46.377387+00:00 · methodology

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

[1]

uncertainty

INTRODUCTION The therapeutic potential of peptides is gaining attention due to the potential advantages they offer over other modalities like small molecules1,2. These include their modifiability3,4 and larger interaction surfaces, which can lead to higher affinity and specificity when targeting larger or more shallow surfaces 2,5. Even though peptides ar...

work page
[2]

permeable

RESULTS In the following sections, we will explore the effectiveness of CP as a scoring component in RL, with a focus on improving passive peptide permeability. Our objective is to identify peptides that are predicted to be permeable with high confidence. In this work, the quality of the method is measured by monitoring the efficiency of accessing high-re...

work page
[3]

It steers the generation toward high scoring regions with calibrated confidence

DISCUSSION This research demonstrates that integrating conformal prediction directly into RL-guided molecular design enables more reliable and efficient exploration of chemical space. It steers the generation toward high scoring regions with calibrated confidence. In this research, we investigated how this can be used for de novo design of permeable cyc...

work page
[4]

For the first time, we present an uncertainty-aware generative framework by integrating conformal prediction with reinforcement learning-guided generative design

CONCLUSION In conclusion, our approach addresses the crucial aspect of connecting the uncertainty inherent in predictive models with the strategic exploration and learning behavior dictated by RL. For the first time, we present an uncertainty-aware generative framework by integrating conformal prediction with reinforcement learning-guided generative des...

work page
[5]

METHODS This methods section describes the RL framework for peptide design, the predictive model with CP for permeability prediction, and various strategies used to incorporate this predictor into the RL loop as a scoring component. 5.1. Reinforcement Learning Loop Figure 5: Overview of an example of how RL can be applied to drug design For this research ...

work page
[6]

…"Both" predictions_{class=0} / Number of samples with true label 0 (Equation 4) · Validity_{class=1} = Correct single label_{class=1} + Class…

(Equation 3) Based on the loss, the parameters of the generative model are adjusted to converge towards a space that minimizes the loss. 5.2. Predictive model for Permeability Dataset description For training the permeability classifier , the Parallel Artificial Membrane Permeability Assay (PAMPA)40 dataset from CycPeptMPDB was used 8. This contains data ...

work page 2022
[7]

REFERENCES (1) Choi, J.-S.; Joo, S. H. Recent Trends in Cyclic Peptides as Therapeutic Agents and Biochemical Tools. Biomol. Ther. 2019, 28. https://doi.org/10.4062/biomolther.2019.082. (2) Wang, L.; Wang, N.; Zhang, W.; Cheng, X.; Yan, Z.; Shao, G.; Wang, X.; Wang, R.; Fu, C. Therapeutic Peptides: Current Applications and Future Directions. Signal Transd...

work page doi:10.4062/biomolther.2019.082 2019
[8]

(12) Gentile, F.; Yaacoub, J

https://doi.org/10.1038/s41598-024-58122-7. (12) Gentile, F.; Yaacoub, J. C.; Gleave, J.; Fernandez, M.; Ton, A. -T.; Ban, F.; Stern, A.; Cherkasov, A. Artificial Intelligence –Enabled Virtual Screening of Ultra -Large Chemical Libraries with Deep Docking. Nat. Protoc. 2022, 17 (3), 672–697. https://doi.org/10.1038/s41596-021-00659-2. (13) Liu, W.; Li, J....

work page doi:10.1038/s41598-024-58122-7 2022
[9]

(16) Geylan, G.; De Maria, L.; Engkvist, O.; David, F.; Norinder, U

https://doi.org/10.1021/acs.molpharmaceut.4c00478. (16) Geylan, G.; De Maria, L.; Engkvist, O.; David, F.; Norinder, U. A Methodology to Correctly Assess the Applicability Domain of Cell Membrane Permeability Predictors for Cyclic Peptides. Digit. Discov. 2024, 3 (9), 1761–1775. (17) Olivecrona, M.; Blaschke, T.; Engkvist, O.; Chen, H. Molecular De -Novo ...

work page doi:10.1021/acs.molpharmaceut.4c00478 2024
