An LLM-powered agent pipeline extracts ~9,000 structured concrete materials records from 278 publications with F1 scores up to 0.97, creating the largest open blended cement concrete database and demonstrating that larger, richer datasets improve ML prediction and generalization.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cond-mat.mtrl-sci (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers:
Large language model-enabled automated data extraction for concrete materials informatics