Generalized Priority-Aware Shapley Value
Pith reviewed 2026-06-30 21:30 UTC · model grok-4.3
The pith
The generalized priority-aware Shapley value extends random-order valuation to arbitrary directed weighted graphs by penalizing order violations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce the generalized priority-aware Shapley value (GPASV), a random order value defined on arbitrary directed weighted priority graphs, in which pairwise edges penalize rather than forbid order violations. GPASV covers a range of classical models as boundary cases. We establish GPASV through an axiomatic characterization, develop the associated computational methods, and introduce a priority sweeping diagnostic extending PASV's. We apply GPASV to LLM ensemble valuation on the cyclic Chatbot Arena preference graph, illustrating that priority-aware valuation is not a one-button operation: different balances of pairwise graph priority versus individual soft priority produce substantivel
What carries the argument
GPASV, defined as a random order value on directed weighted priority graphs where edges impose penalties on order violations.
If this is right
- GPASV applies to cyclic graphs from human preferences without requiring acyclicity.
- It recovers classical Shapley and priority-aware variants as special cases.
- Computational methods exist for calculating the value.
- A priority sweeping diagnostic helps analyze the impact of priorities.
- Different weightings of graph priority versus soft priority lead to different ensemble valuations.
Where Pith is reading between the lines
- This approach may extend valuation techniques to other domains with conflicting constraints, such as resource allocation with soft priorities.
- Testing GPASV on additional cyclic datasets could reveal how penalty magnitudes affect fairness in attributions.
- The method suggests that priority information in real data is inherently graded rather than absolute, which could influence how preferences are modeled in decision systems.
Load-bearing premise
That penalizing order violations within the random-order formulation correctly captures priority information on cyclic graphs without introducing structural inconsistencies or unintended biases in the resulting valuations.
What would settle it
Applying GPASV to the Chatbot Arena cyclic graph and checking whether the resulting LLM valuations shift substantially when the balance between graph priority and soft priority is altered would test whether the penalization mechanism accurately reflects the input priorities.
Figures
read the original abstract
Shapley value and its priority-aware extensions are widely used for valuation in machine learning, but existing methods require pairwise priority to be binary and acyclic, a restriction spectacularly violated in real-data examples such as aggregated human preferences and multi-criterion comparisons. We introduce the generalized priority-aware Shapley value (GPASV), a random order value defined on arbitrary directed weighted priority graphs, in which pairwise edges penalize rather than forbid order violations. GPASV covers a range of classical models as boundary cases. We establish GPASV through an axiomatic characterization, develop the associated computational methods, and introduce a priority sweeping diagnostic extending PASV's. We apply GPASV to LLM ensemble valuation on the cyclic Chatbot Arena preference graph, illustrating that priority-aware valuation is not a one-button operation: different balances of pairwise graph priority versus individual soft priority produce substantively different valuations of the same data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the generalized priority-aware Shapley value (GPASV) as a random-order value on arbitrary directed weighted priority graphs. Pairwise edges impose additive or multiplicative penalties on order violations rather than hard constraints. The work claims an axiomatic characterization that recovers classical Shapley and priority-aware models as boundary cases, supplies associated computational methods and a priority-sweeping diagnostic, and applies the construction to LLM ensemble valuation on the cyclic Chatbot Arena preference graph, demonstrating sensitivity to the balance parameter between graph priority and individual soft priority.
Significance. If the axiomatic characterization is rigorous and the penalization mechanism yields well-defined, cycle-consistent valuations without sampling-dependent bias, the result would materially extend priority-aware valuation beyond the binary acyclic restriction that currently limits applicability to aggregated preferences and multi-criterion data. The explicit demonstration that different balance parameters produce substantively different rankings on the same data set is a useful practical contribution. No machine-checked proofs or fully reproducible code artifacts are mentioned.
major comments (3)
- [§3] §3 (definition of GPASV): the random-order formulation with pairwise penalties is asserted to remain well-defined and axiomatically unique on graphs containing cycles and non-binary weights, yet the manuscript supplies neither an explicit expression for the penalized permutation probability nor a proof that the resulting value is independent of the sampling procedure used to approximate it. This is load-bearing for the central claim that GPASV extends classical models without introducing structural inconsistencies.
- [§4] §4 (axiomatic characterization): the uniqueness theorem is stated but the derivation is not reproduced; it is therefore impossible to verify whether the chosen axioms continue to pin down a unique value once the penalty function is permitted to be non-monotonic across cycles. A concrete counter-example or invariance check on a small cyclic graph would be required.
- [§5] §5 (computational methods): the Monte-Carlo estimator for the penalized random-order value is described at a high level; no variance bound or convergence rate is given for the cyclic case, leaving open the possibility that the reported LLM valuations are dominated by sampling noise rather than by the priority signal.
minor comments (2)
- The abstract refers to 'an axiomatic characterization' without naming the axioms; a one-sentence list in the abstract would improve readability.
- Figure captions for the Chatbot Arena results should explicitly state the numerical values of the balance parameter used in each panel.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and recommendation for major revision. We address each major comment below and will incorporate the requested clarifications and additions into the revised manuscript.
read point-by-point responses
-
Referee: [§3] §3 (definition of GPASV): the random-order formulation with pairwise penalties is asserted to remain well-defined and axiomatically unique on graphs containing cycles and non-binary weights, yet the manuscript supplies neither an explicit expression for the penalized permutation probability nor a proof that the resulting value is independent of the sampling procedure used to approximate it. This is load-bearing for the central claim that GPASV extends classical models without introducing structural inconsistencies.
Authors: We agree that an explicit expression for the penalized permutation probability and a proof of sampling independence are required. In the revision we will supply the closed-form expression (the probability of a permutation is proportional to the product over all pairs of the penalty factor raised to the indicator of an order violation) and prove that the resulting expectation is independent of any particular Monte-Carlo sampling scheme because it coincides with the unique solution of the axiomatic system. revision: yes
-
Referee: [§4] §4 (axiomatic characterization): the uniqueness theorem is stated but the derivation is not reproduced; it is therefore impossible to verify whether the chosen axioms continue to pin down a unique value once the penalty function is permitted to be non-monotonic across cycles. A concrete counter-example or invariance check on a small cyclic graph would be required.
Authors: The full derivation of the uniqueness theorem will be reproduced in an appendix. We will also add an explicit invariance verification on a directed 3-cycle with non-monotonic penalties, confirming that the axioms continue to characterize a unique value. revision: yes
-
Referee: [§5] §5 (computational methods): the Monte-Carlo estimator for the penalized random-order value is described at a high level; no variance bound or convergence rate is given for the cyclic case, leaving open the possibility that the reported LLM valuations are dominated by sampling noise rather than by the priority signal.
Authors: We will augment §5 with a variance bound (Hoeffding-type, using the boundedness of the penalty function) and a convergence-rate statement that applies directly to graphs containing cycles. This will quantify the number of samples needed to ensure the reported valuations are not dominated by Monte-Carlo noise. revision: yes
Circularity Check
No circularity: axiomatic characterization provides independent grounding for the new definition.
full rationale
The paper introduces GPASV via an explicit random-order construction on weighted directed graphs with penalization of violations, then separately establishes it through axiomatic characterization. No equations reduce the output to a fitted input or self-referential quantity by construction. No load-bearing self-citations or uniqueness theorems imported from the authors' prior work are referenced in the provided text. The coverage of classical models as boundary cases is presented as a consequence of the definition rather than a renaming or smuggling of an ansatz. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- balance parameter between graph priority and individual soft priority
axioms (1)
- domain assumption Standard Shapley axioms continue to hold under the penalization extension to weighted directed graphs
Reference graph
Works this paper leans on
-
[1]
J. F. Banzhaf III. Weighted voting doesn’t work: A mathematical analysis.Rutgers L. Rev., 19: 317, 1964
1964
-
[2]
Black et al
D. Black et al. The theory of committees and elections. 1958
1958
-
[3]
R. A. Bradley and M. E. Terry. Rank analysis of incomplete block designs: I. the method of paired comparisons.Biometrika, 39(3/4):324–345, 1952
1952
-
[4]
Bubley and M
R. Bubley and M. Dyer. Faster random generation of linear extensions.Discrete mathematics, 201(1-3):81–88, 1999
1999
-
[5]
G. T. Cantwell and C. Moore. Belief propagation for permutations, rankings, and partial orders. Physical Review E, 105(5):L052303, 2022
2022
-
[6]
Castro, D
J. Castro, D. Gómez, and J. Tejada. Polynomial calculation of the shapley value based on sampling.Computers & operations research, 36(5):1726–1730, 2009
2009
-
[7]
Chiang, L
W.-L. Chiang, L. Zheng, Y . Sheng, A. N. Angelopoulos, T. Li, D. Li, B. Zhu, H. Zhang, M. Jordan, J. E. Gonzalez, et al. Chatbot arena: An open platform for evaluating llms by human preference. InForty-first International Conference on Machine Learning, 2024
2024
-
[8]
J. A. M. N. C. Condorcet et al. Essai sur l’application de l’analyse à la probabilité des décisions, rendues à la pluralité des voix/| c par m. le marquis de condorcet..., 1785
-
[9]
J. Derks. A new proof for weber’s characterization of the random order values.Mathematical Social Sciences, 49(3):327–334, 2005
2005
-
[10]
Dubey, A
P. Dubey, A. Neyman, and R. J. Weber. Value theory without efficiency.Mathematics of Operations Research, 6(1):122–128, 1981
1981
-
[11]
Faigle and W
U. Faigle and W. Kern. The shapley value for cooperative games under precedence constraints. International Journal of Game Theory, 21(3):249–266, 1992
1992
-
[12]
P. C. Fishburn. Condorcet social choice functions.SIAM Journal on applied Mathematics, 33 (3):469–489, 1977
1977
-
[13]
C. Frye, C. Rowat, and I. Feige. Asymmetric shapley values: incorporating causal knowledge into model-agnostic explainability.Advances in neural information processing systems, 33: 1229–1239, 2020
2020
-
[14]
An Odd Estimator for Shapley Values
F. Fumagalli, L. Butler, J. S. Kang, K. Ramchandran, and R. T. Witter. An odd estimator for shapley values.arXiv preprint arXiv:2602.01399, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[15]
Fumagalli, R
F. Fumagalli, R. T. Witter, and C. Musco. PolySHAP: Extending kernelSHAP with interaction- informed polynomial regression. InThe Fourteenth International Conference on Learning Representations, 2026. URLhttps://openreview.net/forum?id=M19J8UGguq. 10
2026
-
[16]
Ghorbani and J
A. Ghorbani and J. Zou. Data shapley: Equitable valuation of data for machine learning. In International conference on machine learning, pages 2242–2251. PMLR, 2019
2019
-
[17]
M. E. Glickman. Parameter estimation in large dynamic paired comparison experiments.Journal of the Royal Statistical Society Series C: Applied Statistics, 48(3):377–394, 1999
1999
-
[18]
Hesterberg
T. Hesterberg. Weighted average importance sampling and defensive mixture distributions. Technometrics, 37(2):185–194, 1995
1995
-
[19]
A. B. Kahn. Topological sorting of large networks.Communications of the ACM, 5(11): 558–562, 1962
1962
-
[20]
Kalai and D
E. Kalai and D. Samet. On weighted Shapley values.International journal of game theory, 16 (3):205–222, 1987
1987
-
[21]
Kangas, T
K. Kangas, T. Hankala, T. M. Niinimäki, and M. Koivisto. Counting linear extensions of sparse posets. InIJCAI, volume 16, pages 603–609, 2016
2016
-
[22]
Karzanov and L
A. Karzanov and L. Khachiyan. On the conductance of order markov chains.Order, 8(1):7–15, 1991
1991
-
[23]
A. Kong, J. S. Liu, and W. H. Wong. Sequential imputations and bayesian missing data problems. Journal of the American statistical association, 89(425):278–288, 1994
1994
-
[24]
W. Kwon, Z. Li, S. Zhuang, Y . Sheng, L. Zheng, C. H. Yu, J. Gonzalez, H. Zhang, and I. Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the 29th symposium on operating systems principles, pages 611–626, 2023
2023
-
[25]
Y . Kwon and J. Zou. Beta shapley: a unified and noise-reduced data valuation framework for machine learning.arXiv preprint arXiv:2110.14049, 2021
- [26]
-
[27]
Li and Y
W. Li and Y . Yu. Robust data valuation with weighted banzhaf values.Advances in Neural Information Processing Systems, 36:60349–60383, 2023
2023
-
[28]
Li and Y
W. Li and Y . Yu. One sample fits all: Approximating all probabilistic values simultaneously and efficiently.Advances in Neural Information Processing Systems, 37:58309–58340, 2024
2024
-
[29]
K. Liu, Q. Long, Z. Shi, W. J. Su, and J. Xiao. Statistical impossibility and possibility of aligning llms with human preferences: From condorcet paradox to nash equilibrium.arXiv preprint arXiv:2503.10990, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[30]
Z. Liu, K. Lee, Y . Zhang, and W. Tang. First-order efficiency for probabilistic value estimation via a statistical viewpoint.arXiv preprint arXiv:2605.02827, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[31]
R. D. Luce et al.Individual choice behavior, volume 4. Wiley New York, 1959
1959
-
[32]
S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions.Advances in neural information processing systems, 30, 2017
2017
-
[33]
C. L. Mallows. Non-null ranking models. i.Biometrika, 44(1/2):114–130, 1957
1957
-
[34]
Neurips 2026 main track handbook, 2026
NeurIPS. Neurips 2026 main track handbook, 2026. URL https://neurips.cc/ Conferences/2026/MainTrackHandbook
2026
-
[35]
A. S. Nowak and T. Radzik. On axiomatizations of the weighted shapley values.Games and Economic Behavior, 8(2):389–405, 1995
1995
-
[36]
R. L. Plackett. The analysis of permutations.Journal of the Royal Statistical Society Series C: Applied Statistics, 24(2):193–202, 1975
1975
-
[37]
Qwen3.5: Towards native multimodal agents, February 2026
Qwen Team. Qwen3.5: Towards native multimodal agents, February 2026. URL https: //qwen.ai/blog?id=qwen3.5
2026
-
[38]
L. S. Shapley. A value for n-person games.Contributions to the Theory of Games, 2, 1953
1953
-
[39]
Talvitie, T
T. Talvitie, T. M. Niinimäki, and M. Koivisto. The mixing of markov chains on linear extensions in practice. InIJCAI, pages 524–530, 2017
2017
-
[40]
J. T. Wang and R. Jia. Data Banzhaf: A robust data valuation framework for machine learning. InInternational Conference on Artificial Intelligence and Statistics, pages 6388–6421. PMLR, 2023. 11
2023
-
[41]
R. J. Weber.Probabilistic values for games, page 101–120. Cambridge University Press, 1988
1988
-
[42]
D. B. Wilson. Mixing times of lozenge tiling and card shuffling markov chains.The Annals of Applied Probability, 14(1):274–325, 2004
2004
-
[43]
R. T. Witter, Y . Liu, and C. Musco. Regression-adjusted monte carlo estimators for shapley values and probabilistic values. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URLhttps://openreview.net/forum?id=Qabko39AS5
2025
- [44]
-
[45]
Zheng, W.-L
L. Zheng, W.-L. Chiang, Y . Sheng, S. Zhuang, Z. Wu, Y . Zhuang, Z. Lin, Z. Li, D. Li, E. Xing, et al. Judging llm-as-a-judge with mt-bench and chatbot arena.Advances in neural information processing systems, 36:46595–46623, 2023
2023
-
[46]
λπt exp{−β·V ω(πt;S t)}P k∈St λk exp{−β·V ω(k;S t)} · X k∈St exp{−β·V ω(k;S t)} # ∝ nY t=1
X. Zheng, Y . Huang, X. Chang, R. Jia, and Y . Tan. Rethinking data value: Asymmetric data Shapley for structure-aware valuation in data markets and machine learning pipelines.arXiv preprint arXiv:2511.12863, 2025. 12 Computer Code Computer code is available at:https://github.com/KiljaeL/GPASV A Limitations A.1 Utility Calls The main computational cost of...
-
[52]
After synthesizing, output ONLY the final aggregated answer, with no extra commentary
Two-turn consistency: if the prompt is multi-turn, treat the turn-1 aggregated answer as the assistant’s previous message when producing the turn-2 aggregated answer. After synthesizing, output ONLY the final aggregated answer, with no extra commentary. [Question] {x1} <|The Start of Candidates|> [The Start of Candidate 1 Answer] {candidate answer 1} [The...
-
[53]
Follow the user’s requested format, constraints, and style exactly
-
[54]
You may add minimal connective wording for readability
Use only information explicitly stated in the candidates. You may add minimal connective wording for readability
-
[55]
If candidates disagree and you cannot resolve it from the candidates, omit the claim or mark it as uncertain
-
[56]
Remove redundancy and irrelevant parts; choose the clearest phrasing among candidates
-
[57]
If JSON/code/strict format is required, keep it valid and do not introduce new APIs/libraries not present in candidates
-
[58]
[[rating ]]
Two-turn consistency: if the prompt is multi-turn, treat the turn-1 aggregated answer as the assistant’s previous message when producing the turn-2 aggregated answer. After synthesizing, output ONLY the final aggregated answer, with no extra commentary. <|The Start of Previous Conversation with User|> ### User: {x1} ### Assistant: {y1S} ### User: {x2} <|T...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.