A molecular multimodal foundation model associating molecule graphs with natural language
5 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- MoleCode unlocks structural intelligence in large language models
  MoleCode is a training-free, LLM-native representation that makes molecular graphs with explicit atoms, bonds, and topology directly readable and editable in language models, improving structural tasks over implicit string encodings.
- Speak-to-Structure: Evaluating LLMs in Open-domain Natural Language-Driven Molecule Generation
  S^2-Bench is a new one-to-many benchmark for natural language-driven molecule generation with three tasks, and OpenMolIns is an instruction dataset enabling Llama3.1-8B to outperform GPT-4o and Claude-3.5 on it.
- BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation
  BadGraph poisons training data with textual triggers to implant backdoors in latent diffusion models for text-guided graph generation, achieving 50% attack success rate at under 10% poisoning and over 80% at 24% poisoning with negligible clean performance loss.
- MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts
  MolReFlect introduces a teacher-student framework that automatically creates fine-grained molecule-text alignments to achieve SOTA results on molecule-caption translation.
- Bolek: A Multimodal Language Model for Molecular Reasoning
  Bolek injects Morgan fingerprint embeddings into an instruction-tuned text model, then fine-tunes on molecular alignment and synthetic chain-of-thought tasks to improve performance and grounding on 15 TDC binary classification endpoints while generalizing to unseen tasks.