In Context Learning and Reasoning for Symbolic Regression with Large Language Models

2024 · cs.CL · arXiv 2410.17448

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Large Language Models (LLMs) are transformer-based machine learning models that have shown remarkable performance in tasks for which they were not explicitly trained. Here, we explore the potential of LLMs to perform symbolic regression, a machine-learning method for finding simple and accurate equations from datasets. We prompt GPT-4 and GPT-4o models to suggest expressions from data, which are then optimized and evaluated using external Python tools. These results are fed back to the LLMs, which propose improved expressions while optimizing for complexity and loss. Using chain-of-thought prompting, we instruct the models to analyze data, prior expressions, and the scientific context (expressed in natural language) for each problem before generating new expressions. We evaluated the workflow on rediscovering the Langmuir and dual-site Langmuir models for adsorption, along with Nikuradse's dataset on flow in rough pipes, which has no known target model equation. Both the GPT-4 and GPT-4o models successfully rediscovered equations, with better performance when using a scratchpad and considering scientific context. The GPT-4o model demonstrated improved reasoning about data patterns, particularly evident on the dual-site Langmuir and Nikuradse datasets. We demonstrate how strategic prompting improves the model's performance and how the natural language interface simplifies integrating theory with data. We also applied symbolic mathematical constraints, based on background knowledge of the data, via prompts and found that LLMs then generate meaningful equations more frequently. Although this approach does not outperform established SR programs where target equations are more complex, LLMs can nonetheless iterate toward improved solutions while following instructions and incorporating scientific context in natural language.
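
The propose, fit, and feed-back loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the LLM call is replaced by a hypothetical `propose` stub that draws from a fixed pool, the constant optimizer is a crude pattern search standing in for the external Python tools, the complexity values are hand-assigned, and the data are synthetic Langmuir-type points.

```python
# Minimal sketch of a propose -> fit -> feed back loop (all names hypothetical).
import math

# Synthetic Langmuir-type data, q = 2*p / (2/3 + p); illustrative, not from the paper.
P = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
Q = [2.0 * p / (2.0 / 3.0 + p) for p in P]

# Fixed pool standing in for LLM suggestions: (text, function, n_constants, complexity).
# Complexity values are hand-assigned node counts, an assumption of this sketch.
POOL = [
    ("c0*p", lambda p, c: c[0] * p, 1, 3),
    ("c0+c1*p", lambda p, c: c[0] + c[1] * p, 2, 5),
    ("c0*p/(c1+p)", lambda p, c: c[0] * p / (c[1] + p), 2, 7),
]


def mse(f, c):
    """Mean squared error of f on the data; infinite if the expression is undefined."""
    try:
        return sum((f(p, c) - q) ** 2 for p, q in zip(P, Q)) / len(P)
    except ZeroDivisionError:
        return math.inf


def fit_constants(f, n, iters=500):
    """Crude pattern search over the constants; a stand-in for a real optimizer."""
    c, step = [1.0] * n, 1.0
    best = mse(f, c)
    for _ in range(iters):
        improved = False
        for i in range(n):
            for delta in (step, -step):
                trial = c[:]
                trial[i] += delta
                loss = mse(f, trial)
                if loss < best:
                    c, best, improved = trial, loss, True
        if not improved:
            step *= 0.5  # no direction helped, so refine the step size
    return c, best


def propose(history):
    """Stand-in for the LLM call: return the next untried candidate, or None."""
    tried = {h["expr"] for h in history}
    for cand in POOL:
        if cand[0] not in tried:
            return cand
    return None


def run_loop():
    history = []  # the feedback that would be serialized into the next prompt
    while (cand := propose(history)) is not None:
        text, f, n, complexity = cand
        consts, loss = fit_constants(f, n)
        history.append({"expr": text, "constants": consts,
                        "complexity": complexity, "loss": loss})
    return history


history = run_loop()
best = min(history, key=lambda h: h["loss"])
print(best["expr"], best["complexity"], best["loss"])
```

In a real workflow, `propose` would send the accumulated `history` (expressions with their complexity and loss) together with the data and the natural-language scientific context to the model, and parse new expressions from the reply.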

representative citing papers

LLM-driven design of physics-constrained constitutive models: two agents are better than one

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

A Creator-Inspector multi-agent LLM pipeline for constitutive artificial neural networks increases the rate of models satisfying all nine physical constraints to 100% or 56% depending on the LLM backbone.

LLM-Guided Open Hypothesis Learning from Autonomous Scanning Probe Microscopy Experiments

cond-mat.mtrl-sci · 2026-05-07 · unverdicted · novelty 7.0

The framework uses symbolic regression to propose analytical expressions from piezoresponse force microscopy data and an LLM to rank them for physical plausibility, yielding voltage-time growth laws for ferroelectric domain switching.

Leveraging Mathematical Reasoning of LLMs for Efficient GPU Thread Mapping

cs.DC · 2026-04-12 · unverdicted · novelty 6.0

Large language models derive exact analytical GPU thread mappings for complex 2D/3D domains and fractals via in-context learning, outperforming symbolic regression and enabling up to thousands-fold speedups and energy reductions.
