Presents a game-theoretic model with group actions for data augmentation in LLM adversarial evaluation, demonstrating local generalization from fine-tuning on three model families and redefining benchmarks as orbits under group actions.
Exposing the illusion of fairness: Auditing vulnerabilities to distributional manipulation attacks
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 2verdicts
UNVERDICTED 2representative citing papers
A conditional invariance framework defines explanation fairness as explanations being statistically independent of protected attributes given task-relevant features, unifying existing metrics and enabling procedural bias audits.
citing papers explorer
-
The Evaluation Game: Beyond Static LLM Benchmarking
Presents a game-theoretic model with group actions for data augmentation in LLM adversarial evaluation, demonstrating local generalization from fine-tuning on three model families and redefining benchmarks as orbits under group actions.
-
Fairness of Explanations in Artificial Intelligence (AI): A Unifying Framework, Axioms, and Future Direction toward Responsible AI
A conditional invariance framework defines explanation fairness as explanations being statistically independent of protected attributes given task-relevant features, unifying existing metrics and enabling procedural bias audits.