We sample 20’000 such trajectories, and use 10% as a holdout dataset for validation loss

To generate our target data, we employ a ground-truth optimizer of steepest descent with fixed step norm, set to0 · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?

cs.AI · 2026-01-30 · unverdicted · novelty 6.0

AI model failures on complex tasks become increasingly incoherent with longer reasoning chains, making consistent misalignment less likely than chaotic errors as capabilities scale.

citing papers explorer

The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity? cs.AI · 2026-01-30 · unverdicted · none · ref 11