Moral alignment in LLMs improves with model size according to the power law D ∝ S^{-0.10} (R²=0.50).
Hwang, Vered Shwartz, Maarten Sap, and Yejin Choi
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CY (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers:
Scaling Laws for Moral Machine Judgment in Large Language Models