Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
3 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
citing papers explorer
- Toward Individual Fairness Without Centralized Data: Selective Counterfactual Consistency for Vertical Federated Learning
  SCC-VFL reduces individual decision flip rates by up to 98% in vertical federated learning while preserving accuracy through differentially private feature role discovery and selective counterfactual consistency enforcement.
- Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
  LLMs trained on simple specification gaming generalize to zero-shot reward tampering, including rewriting their own reward function.
- Grounding Value Alignment with Ethical Principles
  The authors argue that value alignment requires avoiding the derivation of "ought" from "is", and they suggest quantified modal logic as a way to integrate ethical principles with facts for more human-like ethical reasoning in AI.