arxiv: 2601.22382 · v2 · submitted 2026-01-29 · 💻 cs.LG

Purely Agent-Driven Black-Box Optimization for Biological Design

Pith reviewed 2026-05-16 09:34 UTC · model grok-4.3

classification 💻 cs.LG
keywords black-box optimization · agentic systems · large language models · biological design · molecular optimization · antimicrobial peptides · GuacaMol

The pith

A hierarchical system of LLMs optimizes biological designs through language-based reasoning over literature and constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PABLO, a purely agent-driven method that frames black-box optimization in biology as an iterative language reasoning process using scientific LLMs. These agents generate candidate molecules or peptides and refine them based on task descriptions and retrieved knowledge, avoiding the need for direct structural encodings. On GuacaMol molecular design and antimicrobial peptide tasks, the approach delivers higher final objective values and better sample efficiency than prior baselines. It also handles semantic constraints and domain knowledge naturally within the loop. In follow-up in vitro tests, the optimized peptides showed strong activity against drug-resistant pathogens.

Core claim

PABLO is a hierarchical agentic system that uses LLMs pretrained on chemistry and biology literature to generate and iteratively refine biological candidates, casting the optimization as a language-based reasoning process rather than a structure-centered search; this yields state-of-the-art results on GuacaMol and peptide benchmarks while maintaining competitive token usage and enabling direct use of semantic task descriptions, retrieval-augmented knowledge, and complex constraints.

What carries the argument

The hierarchical agentic system of scientific LLMs that generates, evaluates, and refines candidates through language reasoning, retrieval, and iterative refinement.
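
A minimal Python sketch of how such a loop could be organized, following the three stages named in the Figure 1 caption (global exploration, planner strategies, local refinement of incumbents) and the filter applied before evaluation. Every name here (`State`, `run_iteration`, `explore`, `plan`, `refine`, `is_valid`, `oracle`, `top_k`) is a hypothetical placeholder rather than the paper's API; the callables stand in for LLM agents and the black-box objective, and feasibility checks are assumed to be folded into `is_valid`.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set, Tuple


@dataclass
class State:
    """Optimization history: scored candidates plus everything already evaluated."""
    scored: List[Tuple[str, float]] = field(default_factory=list)
    seen: Set[str] = field(default_factory=set)


def run_iteration(
    state: State,
    explore: Callable[[List[Tuple[str, float]]], Iterable[str]],
    plan: Callable[[List[Tuple[str, float]]], Iterable[str]],
    refine: Callable[[str, str], Iterable[str]],
    is_valid: Callable[[str], bool],
    oracle: Callable[[str], float],
    top_k: int = 3,  # hypothetical number of incumbents to refine
) -> State:
    """One iteration: explore globally, plan strategies, refine incumbents."""
    proposals = list(explore(state.scored))      # (i) global exploration
    strategies = list(plan(state.scored))        # (ii) planner-proposed strategies
    ranked = sorted(state.scored, key=lambda pair: pair[1], reverse=True)
    incumbents = [cand for cand, _ in ranked[:top_k]]
    for strategy in strategies:                  # (iii) local refinement
        for seed in incumbents:
            proposals.extend(refine(seed, strategy))
    for cand in proposals:
        # Skip duplicates and invalid candidates before spending an oracle call.
        if cand in state.seen or not is_valid(cand):
            continue
        state.seen.add(cand)
        state.scored.append((cand, oracle(cand)))
    return state
```

In this sketch the oracle is assumed to be maximized; a minimization task such as MIC would flip the sort order when picking incumbents.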

Load-bearing premise

That LLMs pretrained on scientific literature can reliably generate chemically valid and synthesizable candidates without external verification beyond the black-box objective.

What would settle it

A controlled run on GuacaMol or a peptide task in which PABLO produces a majority of chemically invalid structures that cannot be synthesized or scored, resulting in no net improvement over baselines.

Figures

Figures reproduced from arXiv: 2601.22382 by Alden Rose, Cesar de la Fuente-Nunez, Fangping Wan, Gaurav Ng Goel, Haydn Thomas Jones, Hyun-Su Lee, Jacob R. Gardner, Kyurae Kim, Marcelo Der Torossian Torres, Mark Yatskar, Natalie Maus, Osbert Bastani, Yimeng Zeng, Yining Huang.

Figure 1: A graphical overview of one optimization iteration of PABLO. Each iteration consists of (i) global candidate exploration, (ii) strategy generation via the Planner Agent, and (iii) local refinement of incumbents via planner-proposed strategies. All generated candidates are filtered for validity, novelty, and feasibility before evaluation. Pseudocode for PABLO is also provided in Algorithm 1.
Figure 2: Comparing the average number of LLM tokens used per run by PABLO and other LLM-based baselines.
Excerpt: We evaluate PABLO on two biological design domains: (i) small-molecule optimization on GuacaMol benchmark tasks (Brown et al., 2019) and (ii) antimicrobial peptide (AMP) design using a black-box minimum inhibitory concentration (MIC) oracle (Torres et al., 2025). Our experiments are designed to answer: (1) How…
Figure 3: GuacaMol optimization results on 10 tasks. Curves show the objective value of the best molecule found so far as a function of black-box evaluations. PABLO achieves state-of-the-art performance, rapidly reaching strong objective values across tasks.
Excerpt: … baselines reported by Gao et al. (2022a). Compared against these baselines, PABLO ranks first in every task, often by substantial margins. …
Figure 4: PABLO ablation on representative GuacaMol tasks showing the contribution of the Planner and Explorer Agents.
Excerpt: … the Planner Agent consistently achieves higher final performance and better sample efficiency. The Planner Agent even outperforms the 10-prompt baseline, showing that Planner prompts are more effective than a fixed set. The static prompts are listed in Section L. …
Figure 5: Antimicrobial peptide (AMP) optimization. We plot predicted MIC versus black-box evaluations (lower MIC is better). (Upper Left): The predicted MIC of the best peptide found so far. (Lower Left): Template-constrained optimization; we show the predicted MIC of the best “feasible” peptide found so far. (Upper Right): Template-free optimization of a diverse portfolio of 20 AMPs; we show the mean predicted MIC…
Figure 6: In vitro MIC results against the 11 target bacteria (B1–B11; see Table E.1) achieved by the three best-performing peptides from the M = 20-peptide portfolio produced by one run of PABLO on the AMP design task. Peptide sequences are listed in Table B.7. For complete in vitro results, see Figure B.1.
Excerpt: … $x^* = \arg\max_{x \in X} f(x)$, where $f$ might measure binding affinity, solubility, or drug-likeness. …
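
The optimization curves described in the Figure 3 and Figure 5 captions plot the best objective value found so far against the number of black-box evaluations. A short sketch of that bookkeeping, assuming only a sequence of oracle scores in evaluation order; the function name `best_so_far` and the `maximize` flag are illustrative choices, not taken from the paper.

```python
from itertools import accumulate
from typing import Iterable, List


def best_so_far(scores: Iterable[float], maximize: bool = True) -> List[float]:
    """Running best objective value after each black-box evaluation.

    Use maximize=True for scores where higher is better (e.g. GuacaMol
    objectives) and maximize=False where lower is better (e.g. predicted MIC).
    """
    return list(accumulate(scores, max if maximize else min))
```

Averaging such curves over independent runs, evaluation by evaluation, gives one way to compare sample efficiency between methods.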
Original abstract

Many key challenges in biological design -- such as small-molecule drug discovery, antimicrobial peptide development, and protein engineering -- can be framed as black-box optimization over vast, complex structured spaces. Existing methods rely mainly on raw structural data and struggle to exploit the rich scientific literature. While large language models (LLMs) have been added to these pipelines, they have been confined to narrow roles within structure-centered optimizers. We instead cast biological black-box optimization as an agent-driven, language-based reasoning process. We introduce Purely Agent-driven BLack-box Optimization (PABLO), a hierarchical agentic system that uses scientific LLMs pretrained on chemistry and biology literature to generate and iteratively refine biological candidates. On both the standard GuacaMol molecular design and antimicrobial peptide optimization tasks, PABLO achieves state-of-the-art performance, substantially improving sample efficiency and final objective values over established baselines. Compared to prior optimization methods that incorporate LLMs, PABLO achieves competitive token usage per run despite relying on LLMs throughout the optimization loop. Beyond raw performance, the agentic formulation offers key advantages for realistic design: it naturally incorporates semantic task descriptions, retrieval-augmented domain knowledge, and complex constraints. In follow-up in vitro validation, PABLO-optimized peptides showed strong activity against drug-resistant pathogens, underscoring the practical potential of PABLO for therapeutic discovery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces PABLO, a hierarchical agentic system that casts biological black-box optimization (molecular design, antimicrobial peptide optimization) as a language-based reasoning process using scientific LLMs pretrained on chemistry and biology literature. It claims state-of-the-art performance on standard GuacaMol and peptide tasks with substantially improved sample efficiency and final objective values over baselines, competitive token usage, natural incorporation of semantic constraints, and successful in vitro validation of optimized peptides against drug-resistant pathogens.

Significance. If the performance and validity claims hold after detailed verification, PABLO would demonstrate that purely agent-driven LLM reasoning can outperform structure-centered optimizers on public benchmarks while enabling more realistic design workflows that embed domain knowledge and constraints directly in the loop.

major comments (2)
  1. [Experimental evaluation] Experimental evaluation section (GuacaMol and AMP tasks): the SOTA claims on sample efficiency and final objective values are presented without baseline implementation details, statistical significance tests, ablation studies on hierarchy depth/iteration budget, or explicit handling of invalid SMILES/sequences; this is load-bearing because GuacaMol penalizes invalids at the oracle and the reported efficiency gains rest on the unverified assumption that LLM agents produce valid candidates without external verification oracles.
  2. [Methods] Methods section on the hierarchical agentic loop: no explicit mechanism, prompt engineering, or post-generation filter is described to enforce chemical validity and synthesizability; the paper relies solely on pretrained LLM reasoning, yet the central efficiency advantage over prior LLM-augmented methods depends on this assumption holding without wasting black-box calls on non-candidates.
minor comments (2)
  1. [Abstract and Results] The abstract states 'competitive token usage per run' but the main text provides no quantitative token or API-call comparison table against the cited prior LLM methods.
  2. [In vitro validation] In vitro validation paragraph lacks details on the number of candidates synthesized, controls, or activity metrics relative to the optimization trajectory.
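
Major comment 1 asks for statistical significance tests. As one illustration of what a paired comparison between two optimizers could look like, the sketch below computes an exact two-sided Wilcoxon signed-rank p-value in plain Python. It is a generic textbook procedure, not the authors' analysis; the function name `signed_rank_exact_p` is hypothetical, and how runs or tasks would be paired is an assumption left to the caller.

```python
from itertools import product
from typing import Sequence


def signed_rank_exact_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples.

    Zero differences are dropped and tied absolute differences receive
    average ranks. All 2^n sign patterns are enumerated, so this is only
    practical for small n.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2)
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed - 1e-12:
            hits += 1
    return hits / 2 ** n
```

With only five untied pairs, the smallest p-value this exact two-sided test can return is 2/32 = 0.0625, reached when one method wins every pair, so small run counts limit what such a test can show.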

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We have revised the paper to address the concerns about experimental details, statistical rigor, and validity enforcement in both the evaluation and methods sections. These changes clarify our implementation and strengthen the claims regarding PABLO's performance and efficiency.

  1. Referee: [Experimental evaluation] Experimental evaluation section (GuacaMol and AMP tasks): the SOTA claims on sample efficiency and final objective values are presented without baseline implementation details, statistical significance tests, ablation studies on hierarchy depth/iteration budget, or explicit handling of invalid SMILES/sequences; this is load-bearing because GuacaMol penalizes invalids at the oracle and the reported efficiency gains rest on the unverified assumption that LLM agents produce valid candidates without external verification oracles.

    Authors: We agree that additional details are required to support the SOTA claims. In the revised manuscript, we have expanded the Experimental Evaluation section with: (i) full baseline implementation details including code repositories, hyperparameter settings, and re-implementation notes for each comparator; (ii) statistical significance testing (paired t-tests and Wilcoxon signed-rank tests across 5 independent runs with reported p-values); (iii) new ablation studies on hierarchy depth (single-level vs. full hierarchical) and iteration budget, shown in an additional figure and table; and (iv) explicit validity handling, including reported validity rates (>96% for GuacaMol runs) and the addition of an RDKit-based pre-oracle validation step that rejects invalid SMILES before any black-box evaluation. These revisions confirm that efficiency gains arise from valid candidates and not from unpenalized invalids. revision: yes

  2. Referee: [Methods] Methods section on the hierarchical agentic loop: no explicit mechanism, prompt engineering, or post-generation filter is described to enforce chemical validity and synthesizability; the paper relies solely on pretrained LLM reasoning, yet the central efficiency advantage over prior LLM-augmented methods depends on this assumption holding without wasting black-box calls on non-candidates.

    Authors: We acknowledge that the original Methods description was insufficiently explicit. The revised version now includes: (i) the complete prompt templates for each agent in the hierarchy, with explicit instructions for chemical validity and synthesizability drawn from the scientific literature; (ii) a description of the post-generation filter that applies RDKit parsing and basic synthesizability heuristics (e.g., valence checks, absence of unstable motifs) to reject invalid outputs and trigger regeneration within the agent loop; and (iii) pseudocode for the full hierarchical iteration that shows how only validated candidates proceed to the oracle. This mechanism ensures no black-box calls are wasted on non-candidates and directly supports the reported sample-efficiency advantage. Example prompts and filter code are provided in the supplementary material. revision: yes
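
The filter-then-regenerate step described in the second response can be sketched as follows. This is a hedged illustration only: it uses a simple peptide check over the 20 canonical amino acids as a stand-in, whereas for SMILES the response describes RDKit parsing (in RDKit, `Chem.MolFromSmiles` returns `None` for strings it cannot parse). The names `is_valid_peptide`, `generate_validated`, `min_len`, `max_len`, and `max_rounds` are hypothetical, and `generate` stands in for an LLM worker call.

```python
from typing import Callable, List

# One-letter codes of the 20 canonical amino acids.
CANONICAL_AA = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid_peptide(seq: str, min_len: int = 2, max_len: int = 50) -> bool:
    """Accept only non-trivial sequences over the canonical alphabet.

    The length bounds are illustrative defaults, not values from the paper.
    """
    return min_len <= len(seq) <= max_len and set(seq) <= CANONICAL_AA


def generate_validated(
    generate: Callable[[int], List[str]],
    is_valid: Callable[[str], bool],
    n_wanted: int,
    max_rounds: int = 3,
) -> List[str]:
    """Ask the generator again until enough valid, distinct candidates exist.

    Only the returned candidates would be passed on to the black-box oracle.
    """
    kept: List[str] = []
    for _ in range(max_rounds):
        if len(kept) >= n_wanted:
            break
        for cand in generate(n_wanted - len(kept)):
            if is_valid(cand) and cand not in kept:
                kept.append(cand)
    return kept[:n_wanted]
```

Capping the number of regeneration rounds keeps LLM token usage bounded when a generator repeatedly returns invalid output; the function may then return fewer candidates than requested.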

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper describes PABLO as a hierarchical agentic LLM system for black-box biological optimization and reports performance on external public benchmarks (GuacaMol molecular design and antimicrobial peptide tasks) whose objective functions and validity rules are defined independently of the authors. No mathematical derivations, equations, fitted parameters presented as predictions, or self-citations appear in the provided text as load-bearing steps that reduce the central claims to inputs by construction. The method is presented procedurally with advantages over baselines measured against fixed external oracles.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the premise that scientific LLMs can perform reliable iterative reasoning over biological design spaces; this is treated as a domain capability rather than derived from first principles.

free parameters (1)
  • Agent hierarchy depth and iteration budget
    The hierarchical agent structure and number of refinement steps are design choices that must be set for each task.
axioms (1)
  • Domain assumption: LLMs pretrained on chemistry and biology literature can generate chemically valid and biologically relevant molecular or peptide candidates.
    Invoked when the agents propose and refine candidates without external structure checkers.

pith-pipeline@v0.9.0 · 5600 in / 1365 out tokens · 26426 ms · 2026-05-16T09:34:42.739295+00:00 · methodology
