MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts

Jiatong Li, Yunqing Liu, Wei Liu, Jingdi Le, Di Zhang, Wenqi Fan, Dongzhan Zhou, Yuqiang Li, Qing Li

classification cs.CL cs.LG q-bio.QM

keywords alignments, fine-grained, llms, molecules, molreflect, language, molecular, molecule

Molecule discovery is a pivotal research field, impacting everything from medicine to materials. Recently, Large Language Models (LLMs) have been widely adopted in molecular understanding and generation, serving as a bridge between the molecular space and the natural language space, yet the alignment between molecules and their corresponding captions remains a significant challenge. Previous endeavors typically treat molecules as monolithic inputs, lacking an intermediate reasoning process and sacrificing explainability. In this work, we define fine-grained alignments as the precise correspondence between a molecule's sub-structures and the textual phrases that explain their properties. These alignments are crucial for LLMs to understand molecules in a more accurate and explainable manner. Normally, such fine-grained alignments require expert annotation, which is both costly and time-consuming. To allow LLMs to automatically label and learn the fine-grained alignments, we propose MolReFlect, a novel teacher-student framework in which a teacher LLM first generates and refines mappings between caption phrases and SMILES substructures and then explicitly teaches these detailed alignments to a student LLM. Experimental results demonstrate that MolReFlect enables LLMs to significantly outperform previous baselines, achieving state-of-the-art performance on the molecule-caption translation task. Our code is available at https://github.com/phenixace/MolReFlect.
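
To make the notion of a fine-grained alignment concrete, the sketch below pairs SMILES fragments with caption phrases and renders them as context text. It is an illustration only and is not taken from the MolReFlect code: the names `Alignment`, `keep_grounded`, and `format_context` are hypothetical, the caption is made up for the example, and plain substring matching stands in for both real substructure matching and the teacher LLM's refinement step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    """One fine-grained alignment (hypothetical structure, not the paper's)."""
    substructure: str  # SMILES fragment
    phrase: str        # caption phrase describing that fragment


def keep_grounded(smiles: str, caption: str,
                  candidates: List[Alignment]) -> List[Alignment]:
    """Keep candidates whose fragment and phrase literally occur in the inputs.

    Substring matching is a crude assumption made for this sketch; it is not
    chemically aware and does not reproduce any LLM-based refinement.
    """
    return [
        a for a in candidates
        if a.substructure in smiles and a.phrase.lower() in caption.lower()
    ]


def format_context(smiles: str, alignments: List[Alignment]) -> str:
    """Render the molecule and its alignments as plain text for a prompt."""
    lines = [f"Molecule: {smiles}"]
    lines += [f"- {a.substructure} -> {a.phrase}" for a in alignments]
    return "\n".join(lines)


# Aspirin as a worked example; the caption is invented for illustration.
smiles = "CC(=O)Oc1ccccc1C(=O)O"
caption = ("The molecule is a benzoic acid derivative carrying an "
           "acetoxy group and a carboxylic acid group.")
candidates = [
    Alignment("CC(=O)O", "acetoxy group"),
    Alignment("C(=O)O", "carboxylic acid group"),
    Alignment("N", "amino group"),  # absent from both molecule and caption
]

grounded = keep_grounded(smiles, caption, candidates)
context = format_context(smiles, grounded)
print(context)
```

Only the two candidates that appear in both the SMILES string and the caption survive the filter. In a teacher-student setup of the kind the abstract describes, text like `context` is the sort of material a student model could be shown alongside the molecule, though the actual prompt format used by the authors is not specified here.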

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Mol-Debate: Multi-Agent Debate Improves Structural Reasoning in Molecular Design
cs.AI 2026-04 unverdicted novelty 6.0

Mol-Debate applies multi-agent debate in an iterative loop with perspective orchestration to achieve state-of-the-art text-guided molecular design, scoring 59.82% exact match on ChEBI-20 and 50.52% weighted success on...