Assessing moral decision making in large language models

Chris Shaner, Henry Griffith, Heena Rathore · 2025 · arXiv 3647.2025

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values

cs.CY · 2025-05-30 · unverdicted · novelty 6.0

LLMs deviate from human moral preferences in kidney allocation scenarios and rarely express indecision, though low-rank fine-tuning with few examples can improve both consistency and uncertainty calibration.

Learning-to-Explain through 20Q Gaming: An Explainable Recommender for Cybersecurity Education

cs.CY · 2026-04-14 · unverdicted · novelty 4.0

A policy-based RL agent plays a 20 questions game to recommend optimal cybersecurity education and explain the decision by eliciting the minimal set of evidential facts needed to justify defensive actions.

citing papers explorer

Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values

cs.CY · 2025-05-30 · unverdicted · none · ref 58

LLMs deviate from human moral preferences in kidney allocation scenarios and rarely express indecision, though low-rank fine-tuning with few examples can improve both consistency and uncertainty calibration.

Learning-to-Explain through 20Q Gaming: An Explainable Recommender for Cybersecurity Education

cs.CY · 2026-04-14 · unverdicted · none · ref 12

A policy-based RL agent plays a 20 questions game to recommend optimal cybersecurity education and explain the decision by eliciting the minimal set of evidential facts needed to justify defensive actions.
