Structural transparency of societal AI alignment through Institutional Logics
read the original abstract
The field of AI alignment is increasingly concerned with the questions of how values are integrated into the design of generative AI systems and how their integration shapes the social consequences of AI. However, existing transparency frameworks focus on the informational aspects of AI models, data, and procedures, while the institutional and organizational forces that shape alignment decisions and their downstream effects remain underexamined in both research and practice. To address this gap, we develop a framework of \emph{structural transparency} for analyzing organizational and institutional decisions concerning AI alignment, drawing on the theoretical lens of Institutional Logics. We develop a categorization of organizational decisions that are present in the governance of AI alignment, and provide an explicit analytical approach to examining them. We operationalize the framework through five analytical components, each with an accompanying "analyst recipe" that collectively identify the primary institutional logics and their internal relationships, external disruptions to existing social orders, and finally, how the structural risks of each institutional logic are mapped to a catalogue of sociotechnical harms. The proposed concept of structural transparency enables analysts to complement existing approached based on informational transparency with macro-level analyses that capture the institutional dynamics and consequences of decisions regarding AI alignment.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
NodeSynth: Socially Aligned Synthetic Data for AI Evaluation
NodeSynth generates evidence-anchored synthetic queries that trigger up to five times higher failure rates in mainstream LLMs than human-authored benchmarks.
-
NodeSynth: Socially Aligned Synthetic Data for AI Evaluation
NodeSynth creates evidence-based synthetic queries via a taxonomy generator to evaluate LLMs, revealing up to 5x higher failure rates than human benchmarks and gaps in guard models.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.