Probes predicting future behaviors from intermediate steps enable Future Probe Controlled Generation for steering large reasoning models with minimal quality degradation.
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
Authors create a benchmark across discrete/continuous and static/dynamical systems and introduce the Causal Abstraction Error (CAE) metric that reliably distinguishes valid from invalid causal abstractions when it includes faithfulness testing.
A structured mapping translates EU AI Act requirements into implementable verification activities for high-risk AI systems.
citing papers explorer
- Predicting Future Behaviors in Reasoning Models Enables Better Steering
  Probes predicting future behaviors from intermediate steps enable Future Probe Controlled Generation for steering large reasoning models with minimal quality degradation.
- Validating Causal Abstraction Metrics on Simulated Complex Systems
  Authors create a benchmark across discrete/continuous and static/dynamical systems and introduce the Causal Abstraction Error (CAE) metric that reliably distinguishes valid from invalid causal abstractions when it includes faithfulness testing.