
Position: General alignment has hit a ceiling; edge alignment must be taken seriously

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

General Alignment has improved average-case helpfulness and safety, but current alignment practice still rewards confident, single-turn responses. The problem is not only that models fail on edge cases; it is that current evaluation makes many of these failures hard to see. We take the position that alignment must move beyond average-case evaluation by making failures under value conflict, plural stakeholder disagreement, and epistemic ambiguity visible and actionable. Scalar rewards compress diverse values into a single number; data and evaluation regimes collapse, filter, or fail to elicit the cases where alignment is hardest; and governance often lacks mechanisms for adjudicating contested cases. These blind spots produce value flattening, representation loss, and uncertainty blindness. We use Edge alignment to name a detection, evaluation, and governance agenda for surfacing these failures and connecting them to appropriate interventions. Rather than a single training objective, Edge alignment defines the conditions under which standard alignment should yield to mechanisms that preserve multidimensional value structure, represent plural perspectives, and support uncertainty-aware interaction. A pilot diagnostic set of 91 edge cases and four contemporary models illustrates that ordinary helpfulness and safety readings can miss process failures that edge-aware evaluation exposes. We outline operational edge signals, process-aware evaluation criteria, and a three-phase process stack that reframes alignment as a lifecycle problem of dynamic normative governance.

citation summary

fields: cs.CL 1, cs.LG 1
years: 2026 2
verdicts: unverdicted 2
roles: background 1
polarities: background 1


representative citing papers

Quantifying and Mitigating Premature Closure in Frontier LLMs

cs.CL · 2026-05-14 · unverdicted · novelty 6.0

Frontier LLMs exhibit premature closure by selecting answers at high rates on medical tasks where the correct choice was removed and on open-ended queries, with safety prompting reducing but not eliminating the behavior.
