Causal reasoning and large language models: Opening a new frontier for causality

KICIMAN, E · 2023 · arXiv 2305.00050

15 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PROMETHEUS: Automating Deep Causal Research Integrating Text, Data and Models

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

PROMETHEUS builds causal atlases from text and data using local predictive-state models and sheaf gluing to create navigable Topos World Models that expose evidence strength and coherence gaps.

TCD-Arena: Assessing Robustness of Time Series Causal Discovery Methods Against Assumption Violations

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods and shows ensembles can boost overall robustness.

PRCD-MAP: Learning How Much to Trust Imperfect Priors in Causal Discovery

stat.ML · 2026-05-03 · unverdicted · novelty 7.0

PRCD-MAP assigns per-edge trust to imperfect priors in causal discovery via empirical Bayes calibration and MLP propagation, delivering an ε-safety guarantee that vanishes at prior-quality extremes and empirical gains on CausalTime datasets.

Sequential Causal Discovery with Noisy Language Model Priors

cs.LG · 2025-06-19 · unverdicted · novelty 7.0

Proposes a sequential causal discovery framework integrating noisy LM priors with batch data via PAG representation and adaptive edge querying for improved structural accuracy.

MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data

cs.LG · 2024-06-15 · unverdicted · novelty 7.0

MALLM-GAN uses multi-agent LLMs to emulate GAN architecture for generating higher-quality synthetic tabular data from small samples than prior models, while preserving privacy.

CausalGuard: Conformal Inference under Graph Uncertainty

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

CausalGuard aggregates LLM-proposed and data-pruned DAGs to weight doubly robust pseudo-outcomes and applies conformal calibration to deliver finite-sample marginal coverage for conditional average treatment effects under graph uncertainty.

CIVeX: Causal Intervention Verification for Language Agents

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

CIVeX maps agent tool calls to structural causal queries, checks identifiability, and issues auditable verdicts to prevent false executions while preserving utility on confounded benchmarks.

Diagnosing and Mitigating Sycophancy and Skepticism in LLM Causal Judgment

cs.AI · 2026-01-13 · unverdicted · novelty 6.0

Introduces the CAUSALT3 benchmark for causal reasoning across Pearl's ladder and Regulated Causal Anchoring (RCA) to reduce sycophancy and skepticism in LLMs via inference-time verification.

CounterBench: Evaluating and Improving Counterfactual Reasoning in Large Language Models

cs.CL · 2025-02-16 · unverdicted · novelty 6.0

Introduces CounterBench benchmark and CoIn iterative reasoning method showing LLMs perform near random on formal counterfactual tasks but improve substantially with guided backtracking.

CausalSynth: Generating Structurally Sound Synthetic Data

cs.LG · 2026-05-17 · unverdicted · novelty 5.0

CausalSynth combines structural causal models with LLMs and iterative verification to produce synthetic data that respects given causal structures while remaining linguistically natural.

DeepImagine: Learning Biomedical Reasoning via Successive Counterfactual Imagining

cs.CL · 2026-04-24 · unverdicted · novelty 5.0

DeepImagine trains LLMs on counterfactual pairs from clinical trials using supervised fine-tuning and reinforcement learning to improve outcome prediction by approximating causal mechanisms.

Hume's Representational Conditions for Causal Judgment: What Bayesian Formalization Abstracted Away

cs.AI · 2026-04-03 · unverdicted · novelty 5.0

Hume's causal judgment requires experiential grounding, structured retrieval, and vivacity transfer, conditions that Bayesian formalizations abstract away while LLMs retain only statistical updating.

Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence

cs.CL · 2026-05-12 · unverdicted · novelty 4.0

The authors introduce a validation framework showing LLMs can pull causal links from disaster social media but require checks against post-event evidence to avoid relying on model priors.

Gemma 3 Technical Report

cs.CL · 2025-03-25 · accept · novelty 4.0

Gemma 3 introduces multimodal open models with architectural changes for efficient long context, trained via distillation and a new post-training recipe that makes the 4B version competitive with prior 27B models and the 27B version comparable to Gemini-1.5-Pro.

Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation

cs.AI · 2026-04-12 · unreviewed

citing papers explorer

Showing 15 of 15 citing papers.

PROMETHEUS: Automating Deep Causal Research Integrating Text, Data and Models · cs.AI · 2026-05-13 · unverdicted · none · ref 9
TCD-Arena: Assessing Robustness of Time Series Causal Discovery Methods Against Assumption Violations · cs.LG · 2026-05-04 · unverdicted · none · ref 266
PRCD-MAP: Learning How Much to Trust Imperfect Priors in Causal Discovery · stat.ML · 2026-05-03 · unverdicted · none · ref 19
Sequential Causal Discovery with Noisy Language Model Priors · cs.LG · 2025-06-19 · unverdicted · none · ref 14
MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data · cs.LG · 2024-06-15 · unverdicted · none · ref 18
CausalGuard: Conformal Inference under Graph Uncertainty · cs.LG · 2026-05-21 · unverdicted · none · ref 17
CIVeX: Causal Intervention Verification for Language Agents · cs.AI · 2026-05-09 · unverdicted · none · ref 8
Diagnosing and Mitigating Sycophancy and Skepticism in LLM Causal Judgment · cs.AI · 2026-01-13 · unverdicted · none · ref 1
CounterBench: Evaluating and Improving Counterfactual Reasoning in Large Language Models · cs.CL · 2025-02-16 · unverdicted · none · ref 17
CausalSynth: Generating Structurally Sound Synthetic Data · cs.LG · 2026-05-17 · unverdicted · none · ref 20
DeepImagine: Learning Biomedical Reasoning via Successive Counterfactual Imagining · cs.CL · 2026-04-24 · unverdicted · none · ref 11
Hume's Representational Conditions for Causal Judgment: What Bayesian Formalization Abstracted Away · cs.AI · 2026-04-03 · unverdicted · none · ref 6
Large Language Models for Causal Relations Extraction in Social Media: A Validation Framework for Disaster Intelligence · cs.CL · 2026-05-12 · unverdicted · none · ref 61
Gemma 3 Technical Report · cs.CL · 2025-03-25 · accept · none · ref 30
Thinking Fast, Thinking Wrong: Intuitiveness Modulates LLM Counterfactual Reasoning in Policy Evaluation · cs.AI · 2026-04-12 · unreviewed · ref 14
