MolViBench is the first benchmark designed to evaluate LLMs on generating executable programs for molecular tasks in drug discovery.
4 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- MolViBench: Evaluating LLMs on Molecular Vibe Coding
  MolViBench is the first benchmark designed to evaluate LLMs on generating executable programs for molecular tasks in drug discovery.
- How Creative Are Large Language Models in Generating Molecules?
  Large language models exhibit distinct creative patterns in molecule generation, including higher constraint satisfaction when more constraints are added. The work is the first to reframe molecule generation ability as creativity.
- Mol-Debate: Multi-Agent Debate Improves Structural Reasoning in Molecular Design
  Mol-Debate applies multi-agent debate in an iterative loop with perspective orchestration to achieve state-of-the-art text-guided molecular design, scoring 59.82% exact match on ChEBI-20 and 50.52% weighted success on S2-Bench.
- Memory in the Age of AI Agents
  The paper maps agent memory research along three forms (token-level, parametric, latent), three functions (factual, experiential, working), and the dynamics of formation, evolution, and retrieval, and it also covers benchmarks and future directions.