LitMOF: An LLM Multi-Agent for Literature-Validated Metal-Organic Frameworks Database Correction and Expansion
Pith reviewed 2026-05-17 02:50 UTC · model grok-4.3
The pith
An LLM multi-agent system validates MOF structures against original papers, repairing 8,771 invalid database entries and adding 12,646 previously missing ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LitMOF is an LLM-driven multi-agent system that pulls crystallographic files and synthesis descriptions directly from the original literature, cross-validates them against database entries, and thereby repairs structural errors. Applied to the CSD MOF Subset, the system produces LitMOF-DB, a set of 186,773 computation-ready structures that includes the repair of 8,771 invalid entries (65.3 percent of the not-computation-ready MOFs in the latest CoRE MOF database). It also identifies 12,646 experimentally reported MOFs absent from prior resources. A direct-air-capture screening demonstration shows that these structural errors distort predicted adsorption energies and CO2/H2O selectivity, leading to misranking of materials.
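The repair count and the percentage together imply the size of the pool they refer to. The short check below is only a back-of-envelope sketch: it uses the two figures quoted above, and the variable names are illustrative. It suggests a not-computation-ready pool of roughly 13,400 entries, which is a derived estimate and not a number stated in the paper.

```python
# Back-of-envelope check of the reported repair fraction.
repaired = 8_771        # repaired invalid entries (quoted above)
reported_pct = 65.3     # reported share of not-computation-ready CoRE MOFs

# Implied size of the not-computation-ready pool.
implied_total = repaired / (reported_pct / 100)

# Range consistent with the percentage being rounded to one decimal place.
low = repaired / ((reported_pct + 0.05) / 100)
high = repaired / ((reported_pct - 0.05) / 100)

print(f"implied pool: about {implied_total:.0f} entries")
print(f"range consistent with rounding: {low:.0f} to {high:.0f}")
print(f"implied unrepaired remainder: about {implied_total - repaired:.0f}")
```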
What carries the argument
LitMOF, the LLM multi-agent framework that extracts and cross-validates crystallographic and synthesis information from literature to repair database entries.
If this is right
- Corrected MOF structures reduce distortion in predicted adsorption energies and CO2/H2O selectivity during high-throughput screening.
- Repairing 65.3 percent of invalid entries produces a larger pool of reliable computation-ready structures for materials discovery.
- Uncovering 12,646 previously absent experimental MOFs expands the searchable design space for new framework synthesis.
- Structural errors cause systematic misranking, false positives, and omission of high-performance candidates in screening workflows.
- The method supplies a scalable route for continuous, literature-driven curation of materials databases.
Where Pith is reading between the lines
- The same multi-agent extraction pattern could be tested on other classes of porous materials or on databases in related fields such as catalysis or battery materials.
- If run periodically on new publications, the system could keep MOF databases automatically updated without requiring repeated full manual reviews.
- Machine-learning models trained on the corrected LitMOF-DB might show improved accuracy in property prediction because they avoid learning from structurally flawed examples.
- Running the repair pipeline with different base LLMs or adding human-in-the-loop verification steps could quantify how much performance depends on the choice of language model.
Load-bearing premise
The multi-agent LLM system can reliably pull accurate crystallographic and synthesis details from papers without creating new extraction mistakes or missing information that would invalidate a repair.
What would settle it
Randomly select 100 of the repaired MOF entries and have a crystallographer compare the LitMOF-extracted structures and synthesis conditions against the original published papers to measure agreement rate.
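A minimal sketch of how such an audit could be scored, assuming a uniform random sample and a simple agree/disagree verdict per entry. The entry IDs and the agreement count are placeholders invented for illustration, not results from the paper, and the Wilson score interval is one common choice for a binomial proportion, not something the paper prescribes.

```python
import math
import random

def sample_entries(entry_ids, k=100, seed=0):
    """Draw k entries uniformly at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(entry_ids), k)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical IDs standing in for the 8,771 repaired entries.
repaired_ids = [f"repair-{i}" for i in range(8771)]
audit_set = sample_entries(repaired_ids, k=100)

# Placeholder outcome: suppose the crystallographer agrees on 92 of 100.
agreed = 92  # invented for illustration only
lower, upper = wilson_interval(agreed, len(audit_set))
print(f"agreement {agreed}/{len(audit_set)}, "
      f"95% interval {lower:.3f} to {upper:.3f}")
```

With only 100 entries the interval stays several percentage points wide, so a larger sample would be needed to pin down a small error rate.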
Original abstract
Metal-organic framework (MOF) databases have grown rapidly through experimental deposition and large-scale literature extraction, but recent analyses show that nearly half of their entries contain substantial structural errors. These inaccuracies propagate through high-throughput screening and machine-learning workflows, limiting the reliability of data-driven MOF discovery. Correcting such errors is exceptionally difficult because true repairs require integrating crystallographic files, synthesis descriptions, and contextual evidence scattered across the literature. Here we introduce LitMOF, a large language model-driven multi-agent framework that validates crystallographic information directly from the original literature and cross-validates it with database entries to repair structural errors. Applying LitMOF to the experimental MOF database (the CSD MOF Subset), we constructed LitMOF-DB, a curated set of 186,773 computation-ready structures, including the successful repair of 8,771 invalid entries, which accounts for 65.3% of the not-computation-ready MOFs in the latest CoRE MOF database. Additionally, the system uncovered 12,646 experimentally reported MOFs absent from existing resources, substantially expanding the known experimental design space. Using direct air capture screening as a case study, we demonstrate that structural errors severely distort predicted adsorption energies and CO2/H2O selectivity, leading to systematic misranking of materials, false positives, and the omission of high-performance candidates. This work establishes a scalable pathway toward self-correcting scientific databases and a generalizable paradigm for LLM-driven curation in materials science.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces LitMOF, an LLM-driven multi-agent framework that extracts and cross-validates crystallographic parameters, space groups, and synthesis details directly from original literature sources to repair structural errors in MOF databases. Applied to the CSD MOF Subset, it produces LitMOF-DB containing 186,773 computation-ready structures, including 8,771 repairs that cover 65.3% of the not-computation-ready entries in the latest CoRE MOF database, plus 12,646 newly identified experimentally reported MOFs absent from prior resources. A direct air capture screening case study shows that uncorrected structural errors distort adsorption energies and CO2/H2O selectivity rankings.
Significance. If the extraction accuracy holds, the work offers a scalable, literature-grounded approach to curating large materials databases, directly addressing documented error rates that undermine high-throughput screening and ML models in MOF discovery. The quantitative expansion of the experimental design space and the demonstrated impact on property rankings provide a concrete pathway toward self-correcting databases in materials science.
major comments (2)
- [Results] Results section (and abstract): The headline statistics (8,771 repairs accounting for 65.3% of CoRE not-computation-ready MOFs and 12,646 newly uncovered structures) rest entirely on the multi-agent LLM correctly parsing CIF coordinates, space groups, disorder, and literature-database mismatches. No precision/recall, inter-annotator agreement on a human-labeled subset, or error analysis for hallucinated parameters or overlooked synthesis constraints is reported, leaving the fraction of valid repairs versus newly introduced errors unquantified.
- [Methods] Methods section: The description of the multi-agent workflow lacks any ablation on prompt engineering, temperature settings, or fallback validation rules. Without these details or a reported human-expert agreement rate on a sample of repairs, it is impossible to bound the reliability of the cross-validation step that underpins all quantitative claims.
minor comments (2)
- [Figures/Tables] Figure captions and tables should explicitly state the exact CoRE MOF version and CSD release dates used for the baseline comparison to allow reproducibility.
- [Case Study] The case-study section would benefit from a supplementary table listing the top-10 misranked materials before and after correction, with the specific structural error (e.g., wrong space group or missing solvent) noted for each.
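The metrics requested in the first major comment can be computed from a small labeled subset. The sketch below is one way to do it, under the assumption that each audited entry gets a "valid" or "invalid" label from two experts and from the pipeline. The function names and the six toy labels are hypothetical and carry no information about LitMOF's actual accuracy.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

def precision_recall(predicted, truth, positive="valid"):
    """Precision and recall of pipeline labels against expert labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy labels, invented for illustration only.
expert_1 = ["valid", "valid", "invalid", "valid", "invalid", "valid"]
expert_2 = ["valid", "invalid", "invalid", "valid", "invalid", "valid"]
pipeline = ["valid"] * 6  # the pipeline treats every repair as valid

kappa = cohen_kappa(expert_1, expert_2)
precision, recall = precision_recall(pipeline, expert_1)
print(f"kappa={kappa:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Kappa is used here instead of raw agreement because it discounts the agreement two annotators would reach by chance when one label dominates.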
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. The comments highlight important aspects of validation and methodological transparency that we address below. We have revised the manuscript to incorporate additional validation results and expanded methods descriptions.
- Referee: [Results] Results section (and abstract): The headline statistics (8,771 repairs accounting for 65.3% of CoRE not-computation-ready MOFs and 12,646 newly uncovered structures) rest entirely on the multi-agent LLM correctly parsing CIF coordinates, space groups, disorder, and literature-database mismatches. No precision/recall, inter-annotator agreement on a human-labeled subset, or error analysis for hallucinated parameters or overlooked synthesis constraints is reported, leaving the fraction of valid repairs versus newly introduced errors unquantified.
  Authors: We agree that the absence of quantitative accuracy metrics leaves the reliability of the headline statistics incompletely characterized. To address this, we have performed a human validation study on a randomly selected subset of 300 structures (150 repairs and 150 new identifications). Two independent experts reviewed the extracted CIF parameters, space groups, and synthesis details against the source literature. The revised Results section will include a new subsection reporting the inter-annotator agreement, precision, recall, and a qualitative error analysis of common failure modes such as disorder handling and potential hallucinations. The multi-agent cross-validation step is shown to reduce but not eliminate such risks; these additions will allow readers to assess the fraction of valid versus erroneous entries.
  revision: yes
- Referee: [Methods] Methods section: The description of the multi-agent workflow lacks any ablation on prompt engineering, temperature settings, or fallback validation rules. Without these details or a reported human-expert agreement rate on a sample of repairs, it is impossible to bound the reliability of the cross-validation step that underpins all quantitative claims.
  Authors: We concur that greater methodological detail is required to evaluate reproducibility and robustness. The revised Methods section will be expanded to describe the prompt templates for each agent (now included in the Supplementary Information), the temperature setting selected for the LLM calls, and the explicit fallback rules (e.g., requiring agreement across agents or routing low-confidence cases to manual inspection). The human-expert agreement rates obtained from the validation study described above will also be reported in this section to bound the reliability of the cross-validation procedure.
  revision: yes
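The fallback rules mentioned in the second response (agreement across agents, with low-confidence cases routed to manual inspection) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `AgentVote` record, the agent names, the self-reported confidence score, and both thresholds are hypothetical and are not taken from the paper or its Supplementary Information.

```python
from dataclasses import dataclass

@dataclass
class AgentVote:
    agent: str          # hypothetical agent name
    space_group: str    # value this agent extracted from the paper
    confidence: float   # assumed self-reported score in [0, 1]

def decide(votes, min_agree=2, min_conf=0.8):
    """Accept a value only when enough confident agents agree unanimously.

    Any other outcome is escalated to manual inspection.
    """
    confident = [v for v in votes if v.confidence >= min_conf]
    values = {v.space_group for v in confident}
    if len(confident) >= min_agree and len(values) == 1:
        return ("accept", values.pop())
    return ("manual_review", None)

# Two confident agents agree; the low-confidence dissent is ignored.
agreeing = [
    AgentVote("agent_a", "P2_1/c", 0.95),
    AgentVote("agent_b", "P2_1/c", 0.90),
    AgentVote("agent_c", "C2/c", 0.40),
]
# Two confident agents disagree, so the case is escalated.
conflicting = [
    AgentVote("agent_a", "P2_1/c", 0.95),
    AgentVote("agent_b", "C2/c", 0.90),
]

print(decide(agreeing))
print(decide(conflicting))
```

Requiring unanimity among confident agents is a deliberately conservative choice: it sends more entries to manual review but avoids silently picking one side of a tie.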
Circularity Check
No significant circularity detected
The paper introduces LitMOF as a new LLM multi-agent system and applies it directly to the external CSD MOF Subset and original literature sources to generate LitMOF-DB, reporting empirical counts of repairs (8,771) and newly uncovered structures (12,646). These outcomes are produced by processing independent external data rather than reducing to internal definitions, fitted parameters, or self-citations by construction. No equations, ansatzes, uniqueness theorems, or renamings of known results are present that would make the headline numbers equivalent to the paper's own inputs. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Original literature papers contain extractable, accurate crystallographic information sufficient to validate or correct database entries.
- ad hoc to paper: LLM agents can reliably cross-validate extracted data against database records without introducing new errors at a rate that would undermine the reported repair statistics.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "LitMOF consists of a Supervisor and five specialized agents... Reference Builder constructs the expected structural motif... Inspector & Editor identifies and corrects inconsistencies"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "we constructed LitMOF-DB, a curated set of 118,464 computation-ready structures, including corrections of 69% (6,161 MOFs)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Hunting Structural Demons in Digital Reticular Chemistry: Lessons from Metal-Organic Frameworks
  Structural errors called 'structural demons' invalidate over half of top computational MOF screening candidates and can be reduced by keeping diffraction data with synthesis details and consistent curation.