Question the Questions: Auditing Representation in Online Deliberative Processes
Pith reviewed 2026-05-18 00:46 UTC · model grok-4.3
The pith
A new auditing framework uses justified representation to check whether a small set of questions fairly captures every participant's interests in deliberative processes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce an auditing framework for measuring the level of representation provided by a slate of questions, based on the social choice concept known as justified representation (JR). We present the first algorithms for auditing JR in the general utility setting, with our most efficient algorithm achieving a runtime of O(mn log n), where n is the number of participants and m is the number of proposed questions. We apply our auditing methods to historical deliberations, comparing the representativeness of the actual questions posed to the expert panel, participants' questions chosen via integer linear programming, and summary questions generated by large language models.
What carries the argument
The justified representation (JR) auditing framework, which verifies whether a selected slate of questions ensures that every sufficiently large group of participants has at least one question in the slate that they all value highly enough.
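A minimal sketch of such a check, assuming one natural general-utility reading of JR that this summary does not confirm: a slate of k questions fails JR if some proposed question is strictly preferred to every slate question by at least n/k participants. The function name `jr_violations`, the list-of-lists utility layout, and the strict-preference rule are illustrative assumptions, not the paper's definitions.

```python
from math import ceil

def jr_violations(utilities, slate):
    """Return indices of questions that witness a JR violation.

    utilities[i][q] is participant i's utility for question q (assumed layout).
    slate is a list of selected question indices.
    Assumed definition: question q witnesses a violation when at least
    ceil(n / k) participants strictly prefer q to their best slate question.
    """
    n = len(utilities)
    k = len(slate)
    if n == 0 or k == 0:
        raise ValueError("need at least one participant and one slate question")
    m = len(utilities[0])
    threshold = ceil(n / k)
    # Each participant's utility for their favourite question in the slate.
    best = [max(row[w] for w in slate) for row in utilities]
    witnesses = []
    for q in range(m):
        # Count participants who strictly prefer q to everything in the slate.
        supporters = sum(1 for i in range(n) if utilities[i][q] > best[i])
        if supporters >= threshold:
            witnesses.append(q)
    return witnesses

# Toy data (made up): 4 participants, 3 questions, slate size k = 2.
U = [
    [10, 0, 2],
    [9, 1, 0],
    [0, 10, 1],
    [1, 8, 0],
]
print(jr_violations(U, [0, 2]))  # question 1 is preferred by participants 2 and 3
print(jr_violations(U, [0, 1]))  # no witness: every group of 2 is served
```

Under this reading the check needs only one pass per question, so it runs in O(nk + mn) time; it answers a yes/no question and does not by itself produce a graded representation score.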
If this is right
- Practitioners can now quantify and compare the representativeness of different question selection methods in ongoing deliberations.
- Integer linear programming selections achieve higher justified representation scores than typical moderator choices in historical cases.
- Large language models can generate representative questions but currently fall short of optimized human or algorithmic selections in some audits.
- Integration into the online platform allows routine auditing across hundreds of deliberations in over 50 countries.
Where Pith is reading between the lines
- The auditing method could support real-time adjustments to question slates during live events rather than only post-hoc checks.
- Similar representation audits might extend to other selection tasks in participatory governance, such as choosing policy priorities or budget items.
- Repeated audits across many countries could surface systematic differences in how well current methods serve diverse cultural or demographic groups.
Load-bearing premise
The framework assumes that participants' utilities or preferences over the proposed questions can be accurately captured or elicited to apply the justified representation criteria.
What would settle it
Apply the auditing algorithm to participant utilities and a selected question slate from a real deliberation, then check whether the output correctly identifies a violation of justified representation that a manual review of the data also confirms.
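On small instances, the manual review described here can be approximated by exhaustively enumerating coalitions and comparing the result with a fast audit. The sketch below does that under an assumed JR reading (a coalition of at least n/k participants violates JR if all of its members strictly prefer some proposed question to every slate question). The function names and the random toy data are hypothetical, and the fast check is a simple per-question count, not the paper's algorithm.

```python
import random
from itertools import combinations
from math import ceil

def best_in_slate(utilities, slate):
    # Each participant's utility for their favourite slate question.
    return [max(row[w] for w in slate) for row in utilities]

def fast_has_violation(utilities, slate):
    """Per-question counting audit (assumed JR reading)."""
    n, m = len(utilities), len(utilities[0])
    threshold = ceil(n / len(slate))
    best = best_in_slate(utilities, slate)
    return any(
        sum(1 for i in range(n) if utilities[i][q] > best[i]) >= threshold
        for q in range(m)
    )

def brute_force_has_violation(utilities, slate):
    """Enumerate every coalition of size >= n/k; exponential, for tiny inputs only."""
    n, m = len(utilities), len(utilities[0])
    threshold = ceil(n / len(slate))
    best = best_in_slate(utilities, slate)
    for size in range(threshold, n + 1):
        for coalition in combinations(range(n), size):
            for q in range(m):
                if all(utilities[i][q] > best[i] for i in coalition):
                    return True
    return False

rng = random.Random(0)  # fixed seed so the comparison is repeatable
for _ in range(200):
    n = rng.randint(2, 6)
    m = rng.randint(2, 5)
    k = rng.randint(1, m)
    U = [[rng.randint(0, 3) for _ in range(m)] for _ in range(n)]
    slate = rng.sample(range(m), k)
    assert fast_has_violation(U, slate) == brute_force_has_violation(U, slate)
print("fast and brute-force audits agree on 200 random instances")
```

Agreement on synthetic data only shows that the two procedures implement the same definition; it says nothing about whether the elicited utilities themselves are accurate, which is the premise flagged above.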
Original abstract
A central feature of many deliberative processes, such as citizens' assemblies and deliberative polls, is the opportunity for participants to engage directly with experts. While participants are typically invited to propose questions for expert panels, only a limited number can be selected due to time constraints. This raises the challenge of how to choose a small set of questions that best represent the interests of all participants. We introduce an auditing framework for measuring the level of representation provided by a slate of questions, based on the social choice concept known as justified representation (JR). We present the first algorithms for auditing JR in the general utility setting, with our most efficient algorithm achieving a runtime of $O(mn\log n)$, where $n$ is the number of participants and $m$ is the number of proposed questions. We apply our auditing methods to historical deliberations, comparing the representativeness of (a) the actual questions posed to the expert panel (chosen by a moderator), (b) participants' questions chosen via integer linear programming, and (c) summary questions generated by large language models (LLMs). Our results highlight both the promise and current limitations of LLMs in supporting deliberative processes. By integrating our methods into an online deliberation platform that has been used for hundreds of deliberations across more than 50 countries, we make it easy for practitioners to audit and improve representation in future deliberations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces an auditing framework based on justified representation (JR) from social choice theory to evaluate how well a small slate of questions represents the interests of participants in deliberative processes such as citizens' assemblies. It presents the first algorithms for auditing JR under a general cardinal utility model, including an efficient O(mn log n) procedure, and applies the framework to historical deliberation data to compare moderator-selected questions, ILP-optimized selections, and LLM-generated summary questions. The methods are integrated into an online platform used across many countries.
Significance. If the algorithmic claims and empirical comparisons hold, the work provides a principled, computationally tractable method for auditing representation in real-world deliberation, bridging social choice theory with practical AI-supported processes. The efficient runtime bound and platform integration are concrete strengths that could enable practitioners to improve fairness. The comparisons between human and LLM approaches yield actionable insights into current limitations of generative models for this task.
major comments (2)
- [§4] §4 (Empirical Evaluation): The JR auditing algorithms require a cardinal utility matrix over all m questions for each of the n participants. The manuscript applies the framework to historical and LLM-generated slates but provides no description or independent validation of how these utilities are elicited or reconstructed (e.g., via direct ratings, proxies, demographics, or embeddings). Because any error in the utility values directly falsifies the JR verdict for a given slate, this omission undermines the reliability of the reported comparisons among moderator, ILP, and LLM slates.
- [Algorithm 2] Algorithm 2 (O(mn log n) procedure): The runtime claim rests on a sorting or greedy selection step that avoids enumerating all possible coalitions. The manuscript should include a short correctness argument showing that the chosen ordering identifies all potential violating coalitions of size at least n/k without false negatives; absent this, the efficiency result cannot be fully assessed.
minor comments (2)
- [§2] The definition of the general utility model in §2 could explicitly state whether utilities are assumed to be normalized or elicited on a common scale, to avoid ambiguity when comparing slates across deliberations.
- [Figure 3] Figure 3 (comparison of JR scores) would benefit from error bars or statistical tests to indicate whether differences between the three slate types are significant.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major comment below and outline the revisions we will make to improve the manuscript.
- Referee: [§4] §4 (Empirical Evaluation): The JR auditing algorithms require a cardinal utility matrix over all m questions for each of the n participants. The manuscript applies the framework to historical and LLM-generated slates but provides no description or independent validation of how these utilities are elicited or reconstructed (e.g., via direct ratings, proxies, demographics, or embeddings). Because any error in the utility values directly falsifies the JR verdict for a given slate, this omission undermines the reliability of the reported comparisons among moderator, ILP, and LLM slates.
  Authors: We agree that the current version of the manuscript does not adequately describe the construction of the cardinal utility matrix from the historical deliberation data. This is a substantive omission that affects the interpretability of the empirical results. In the revised manuscript we will add a dedicated subsection in §4 that details the utility reconstruction procedure (combining available direct ratings with demographic and response-based proxies) and includes a brief validation against a held-out subset of explicitly rated questions. We will also report a sensitivity analysis showing that the comparative findings between moderator, ILP, and LLM slates remain stable under small perturbations to the utility values.
  Revision: yes
- Referee: [Algorithm 2] Algorithm 2 (O(mn log n) procedure): The runtime claim rests on a sorting or greedy selection step that avoids enumerating all possible coalitions. The manuscript should include a short correctness argument showing that the chosen ordering identifies all potential violating coalitions of size at least n/k without false negatives; absent this, the efficiency result cannot be fully assessed.
  Authors: We thank the referee for requesting an explicit correctness argument. The O(mn log n) procedure first sorts questions by aggregate utility and then performs a greedy scan over participants ordered by their highest utility in the slate. We will insert a concise proof sketch immediately after the algorithm statement in the revised manuscript. The argument shows that any coalition of size at least n/k that violates JR must contain at least one participant whose utility for some question in the slate is high enough to appear early in the sorted order, ensuring the greedy check detects the violation and produces no false negatives.
  Revision: yes
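The exchange above does not spell out Algorithm 2, so the following is not a reconstruction of it. It is a sketch of one sorting-based audit that happens to run in O(mn log n) time, under an assumed graded notion: a slate is α-representative if no group of at least n/k participants all value some proposed question at more than α times their best slate question. The name `audit_alpha`, the ratio-based score, and the non-negative utility assumption are hypothetical.

```python
from math import ceil, inf

def audit_alpha(utilities, slate):
    """Smallest alpha for which no question has ceil(n/k) participants
    with utilities[i][q] > alpha * (their best slate utility).

    Assumes non-negative utilities; utilities[i][q] is an assumed layout.
    A result of at most 1 means no group strictly prefers an outside question.
    """
    n, m = len(utilities), len(utilities[0])
    t = ceil(n / len(slate))  # minimum size of a group that deserves a question
    best = [max(row[w] for w in slate) for row in utilities]
    worst = 0.0
    for q in range(m):
        ratios = []
        for i in range(n):
            u = utilities[i][q]
            if best[i] > 0:
                ratios.append(u / best[i])
            else:
                # Zero baseline: any positive utility beats it for every alpha.
                ratios.append(inf if u > 0 else 0.0)
        # Sorting is the O(n log n) step; the t-th largest ratio is the largest
        # alpha at which t participants would still all prefer q.
        ratios.sort(reverse=True)
        worst = max(worst, ratios[t - 1])
    return worst

# Toy data (made up): 4 participants, 3 questions, slate size k = 2.
U = [
    [10, 0, 2],
    [9, 1, 0],
    [0, 10, 1],
    [1, 8, 0],
]
print(audit_alpha(U, [0, 2]))  # 8.0: two participants value question 1 at >= 8x their best slate question
print(audit_alpha(U, [0, 1]))  # 1.0: no group does better outside the slate
```

The correctness argument for this particular sketch is short: for a fixed question, a group of size t whose members all exceed a ratio α exists exactly when the t-th largest ratio exceeds α, so no coalition enumeration is needed. Whether the paper's procedure follows the same argument is not stated in the text above.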
Circularity Check
No circularity: algorithms derived from established JR definition
The paper defines an auditing framework by directly importing the standard justified representation (JR) axiom from social choice theory and then supplies new algorithms to decide whether a given slate satisfies JR under cardinal utilities. The O(mn log n) runtime is obtained by standard sorting and greedy selection over the utility matrix; this is a computational reduction of the JR decision problem, not a renaming or self-referential fit. No load-bearing step relies on self-citation chains, fitted parameters renamed as predictions, or ansatzes smuggled from prior author work. The empirical applications to historical slates and LLM outputs are presented as separate evaluations that presuppose the utility matrix rather than deriving the auditing correctness from those data. The derivation chain is therefore self-contained against the external JR literature.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Justified representation provides a meaningful measure of representation for question slates in deliberative settings.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We present the first algorithms for auditing JR in the general utility setting, with our most efficient algorithm achieving a runtime of O(mn log n)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Explanation Systems for Approval-Based Multiwinner Voting
  Price systems explain approval-based multiwinner voting by modeling voter influence via budgets spent on approved candidates, supported by axioms and a polynomial-time continuous-influence rule that satisfies jointly ...