Recognition: unknown
Counting Worlds: Branching-Time Semantics for Post-hoc Bias Mitigation in Generative AI
Pith reviewed 2026-05-10 01:38 UTC · model grok-4.3
The pith
A new logic counts possible AI outputs as worlds in order to verify and repair bias in generated output series
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CTLF adopts a counting-worlds semantics in which each world represents a possible output at a given step of the generation process. Its modal operators make it possible to verify whether the current output series respects an intended probability distribution over a protected attribute, to predict the likelihood of remaining within acceptable bounds as new outputs are generated, and to determine how many outputs must be removed to restore fairness. The framework is illustrated on a toy example of biased image generation.
What carries the argument
Counting worlds semantics for branching-time logic, using modal operators to verify probability distributions, predict bound adherence, and compute removals for fairness restoration.
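The paper's operator definitions are not reproduced here. As a minimal illustration of what the three claimed capabilities amount to under an unweighted counting semantics — assuming a binary protected attribute, equiprobable continuations, and hypothetical function names not taken from the paper — a sketch might look like:

```python
from itertools import product

# Hypothetical sketch of the three CTLF capabilities over a series of
# outputs with a binary protected attribute (1 = attribute holds).
# Uniform counting semantics: every continuation world counts equally.

def verify(series, target, eps):
    """Does the observed frequency of the attribute stay within
    eps of the target proportion?"""
    freq = sum(series) / len(series)
    return abs(freq - target) <= eps

def predict(series, target, eps, horizon):
    """Fraction of equally weighted future branches (worlds) of the
    given horizon whose extended series still satisfies the bound."""
    branches = list(product([0, 1], repeat=horizon))
    ok = sum(1 for b in branches if verify(series + list(b), target, eps))
    return ok / len(branches)

def removals_to_restore(series, target, eps):
    """Minimal number of outputs to drop (greedily, from the
    over-represented class) so the remaining series is fair."""
    s = list(series)
    removed = 0
    while s and not verify(s, target, eps):
        over = 1 if sum(s) / len(s) > target else 0
        s.remove(over)  # drop one output from the over-represented class
        removed += 1
    return removed
```

For a series of six outputs with four attribute-positive generations and a 50% target with tolerance 0.1, `verify` fails and `removals_to_restore` reports that dropping a single output restores the bound. This is only a sketch of the uniform-measure reading; as the referee notes below, it does not track the model's actual probability measure.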
Load-bearing premise
The actual generative process can be modeled faithfully as a branching-time structure with countably many worlds at each step, and the modal operators accurately capture and enforce the intended fairness properties beyond toy examples.
What would settle it
An empirical test where CTLF is applied to outputs from a real generative model, and either the fairness verification fails to match observed data or the recommended removals do not achieve the target distribution.
Original abstract
Generative AI systems are known to amplify biases present in their training data. While several inference-time mitigation strategies have been proposed, they remain largely empirical and lack formal guarantees. In this paper we introduce CTLF, a branching-time logic designed to reason about bias in series of generative AI outputs. CTLF adopts a counting-worlds semantics where each world represents a possible output at a given step in the generation process, and introduces modal operators that allow us to verify whether the current output series respects an intended probability distribution over a protected attribute, to predict the likelihood of remaining within acceptable bounds as new outputs are generated, and to determine how many outputs must be removed in order to restore fairness. We illustrate the framework on a toy example of biased image generation, showing how CTLF formulas can express concrete fairness properties at different points in the output series.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CTLF, a branching-time logic employing counting-worlds semantics in which each world represents a possible generative output at a given step. It defines modal operators intended to verify whether an output series respects a target probability distribution over a protected attribute, to predict the likelihood of remaining within fairness bounds as generation continues, and to compute the number of outputs that must be removed to restore compliance. These capabilities are illustrated via a toy example of biased image generation.
Significance. If the semantics and operators can be shown to correctly encode the intended fairness properties with respect to the actual (non-uniform) generative distribution, the framework would supply the first formal, post-hoc verification tool for bias mitigation in sequential generative outputs, replacing purely empirical strategies with checkable guarantees.
major comments (2)
- [Abstract / Semantics] Abstract and semantics section: the counting-worlds semantics is described as using cardinality ratios to evaluate probability operators, yet generative models induce non-uniform measures (softmax, diffusion schedules, etc.). No reduction to the model's actual probability measure is indicated, so the operators cannot correctly verify or predict respect for an intended distribution; this is load-bearing for all three claimed capabilities.
- [Abstract] Abstract: the manuscript states that CTLF supplies modal operators with the listed verification, prediction, and removal properties but provides neither the syntax, the satisfaction relation, nor any proof that the operators achieve these properties. Without these definitions the central claims cannot be checked.
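The first major comment can be made concrete with a small numeric check (the worlds and probabilities below are invented for illustration, not taken from the paper):

```python
# Hypothetical numbers illustrating the referee's objection: a
# cardinality ratio over worlds need not match the model's measure.

# Three possible next outputs (worlds); the protected attribute
# holds in worlds 0 and 1.
worlds = [0, 1, 2]
attribute = {0, 1}

# Unweighted counting semantics: 2 of 3 worlds satisfy the attribute.
counting_ratio = len(attribute) / len(worlds)  # 2/3

# A softmax-like measure the generative model might actually induce
# over the same three worlds.
model_prob = {0: 0.1, 1: 0.1, 2: 0.8}
true_prob = sum(model_prob[w] for w in attribute)  # 0.2

# A bound such as P(attribute) >= 0.5 is verified under the counting
# semantics but violated under the model's measure.
print(counting_ratio, true_prob)
```

The gap between the two numbers is exactly what a reduction from the counting semantics to the model's probability measure would need to close.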
minor comments (1)
- [Toy example] The toy example would be clearer if it included explicit CTLF formulas together with the step-by-step evaluation under the counting-worlds semantics.
Simulated Author's Rebuttal
We thank the referee for the thorough review and for identifying key areas where the formal foundations of CTLF require clarification and strengthening. We address each major comment below and outline the revisions we will undertake.
Point-by-point responses
Referee: [Abstract / Semantics] Abstract and semantics section: the counting-worlds semantics is described as using cardinality ratios to evaluate probability operators, yet generative models induce non-uniform measures (softmax, diffusion schedules, etc.). No reduction to the model's actual probability measure is indicated, so the operators cannot correctly verify or predict respect for an intended distribution; this is load-bearing for all three claimed capabilities.
Authors: We agree that this distinction is essential. The submitted manuscript defines the counting-worlds semantics via unweighted cardinality ratios, which corresponds to a uniform measure. To support the non-uniform distributions produced by generative models, we will revise the semantics section to introduce a weighted counting-worlds model. Each world will be equipped with a probability weight taken directly from the generative process (e.g., token probabilities or diffusion-step likelihoods). The modal operators will then be re-defined with respect to this weighted measure, and we will supply a formal reduction together with a proof that the verification, prediction, and removal operators correctly encode the intended fairness properties under the actual distribution. The toy example will be updated to use non-uniform probabilities. These changes will be incorporated in the revised manuscript. revision: yes
Referee: [Abstract] Abstract: the manuscript states that CTLF supplies modal operators with the listed verification, prediction, and removal properties but provides neither the syntax, the satisfaction relation, nor any proof that the operators achieve these properties. Without these definitions the central claims cannot be checked.
Authors: We acknowledge that the submitted version does not present the syntax, satisfaction relation, or proofs in sufficient detail for the claims to be independently verified. In the revision we will add a dedicated subsection (or expand Section 3) that gives the full syntax of CTLF, defines the satisfaction relation for each modal operator, and includes complete proofs that the operators realize the stated verification, prediction, and removal capabilities under the (revised, weighted) semantics. A brief summary of the syntax and satisfaction clauses will also be added to the abstract to guide readers to the formal material. revision: yes
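The weighted counting-worlds revision promised in the first response could, under assumptions (the names, normalization step, and example weights below are not the paper's), replace cardinality ratios with weight sums:

```python
# Sketch of a weighted counting-worlds semantics: each world at a
# step carries a weight taken from the generative process (e.g. a
# token probability or diffusion-step likelihood).

def weighted_prob(worlds, weights, satisfies):
    """Measure of the worlds satisfying a formula: the sum of their
    normalized weights rather than a cardinality ratio."""
    total = sum(weights[w] for w in worlds)
    mass = sum(weights[w] for w in worlds if satisfies(w))
    return mass / total

def counting_prob(worlds, satisfies):
    """The original unweighted semantics, for comparison."""
    return sum(1 for w in worlds if satisfies(w)) / len(worlds)

worlds = ["a", "b", "c"]
weights = {"a": 0.7, "b": 0.2, "c": 0.1}
protected = lambda w: w in {"b", "c"}

# Uniform counting yields 2/3; the weighted measure yields 0.3.
```

When every weight is equal, `weighted_prob` reduces to `counting_prob`, which is the reduction the revised manuscript would need to state and prove.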
Circularity Check
No circularity: CTLF is a definitional semantics for fairness properties
full rationale
The paper introduces CTLF as a new branching-time logic whose counting-worlds semantics and modal operators are explicitly defined to capture verification of probability distributions over protected attributes, prediction of future compliance, and computation of the output removals needed for fairness. This is a standard definitional construction of a formal system, not a derivation that reduces to its own fitted inputs or to self-referential equations. No load-bearing self-citations, no parameters fitted to data and then renamed as predictions, and no uniqueness theorems imported from prior author work appear in the provided abstract or description. The toy example simply illustrates the defined operators. As a proposed logic, the framework stands on its own rather than against external benchmarks; any mismatch with non-uniform generative probabilities is a modeling assumption, not a circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: generative AI output series can be modeled as branching-time structures with countably many worlds at each step.
invented entities (1)
- CTLF logic with counting-worlds semantics and modal operators (no independent evidence)
Reference graph
Works this paper leans on
- [1] M. Antonelli, L. Ceragioli, A. Buda, and G. Primiero. A linear temporal logic of frequencies on series of events, 2026.
- [2] M. Antonelli, U. Dal Lago, and P. Pistone. On counting propositional logic and Wagner's hierarchy. Theoretical Computer Science, 966-967, 2023.
- [3] M. Antonelli, U. Dal Lago, and P. Pistone. Towards logical foundations for probabilistic computation. Annals of Pure and Applied Logic, 175(9), 2024.
- [4] S. G. Ayyamperumal and L. Ge. Current state of LLM risks and AI guardrails. 2024.
- [5] F. Bacchus. Representing and Reasoning with Probabilistic Knowledge: A Logical Approach to Probabilities. The MIT Press, 1990.
- [6] S. L. Blodgett et al. Bias and fairness in large language models: A survey. Computational Linguistics, 2024.
- [7] A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Bias amplification chains in ML-based systems with an application to credit scoring, Jan 2023.
- [8] A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Bias amplification chains in ML-based systems with an application to credit scoring. In G. Coraglia, F. A. D'Asaro, A. Dyoub, F. A. Lisi, and G. Primiero, editors, Proceedings of the 3rd Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming co-locate..., 2024.
- [9] A. G. Buda, G. Coraglia, F. A. Genco, C. Manganini, and G. Primiero. Assessing the risk of discrimination with BRIO: A use case from the financial sector. In D. Pedreschi, M. Milano, I. Tiddi, S. Russell, C. Boldrini, L. Pappalardo, A. Passerini, and S. Wang, editors, Proceedings of the 4th International Conference on Hybrid Human-Artificial Intelligence, ..., 2025.
- [10] A. G. Buda and G. Primiero. A logic for using information. Logique et Analyse, 265:59–103, 2025.
- [11] L. Ceragioli and G. Primiero. Trustworthiness preservation by copies of machine learning systems. Int. J. Approx. Reason., 192:109638, 2026.
- [12] T. Chen, G. Primiero, F. Raimondi, and N. Rungta. A computationally grounded, weighted doxastic logic. Studia Logica, 104(4):679–703, 2016.
- [13]
- [14] G. Coraglia, F. A. D'Asaro, F. A. Genco, D. Giannuzzi, D. Posillipo, G. Primiero, and C. Quaggio. BRIOxAlkemy: a bias detecting tool. In G. Boella, F. A. D'Asaro, A. Dyoub, L. Gorrieri, F. A. Lisi, C. Manganini, and G. Primiero, editors, Proceedings of the 2nd Workshop on Bias, Ethical AI, Explainability and the role of Logic and Logic Programming co-locat..., 2023.
- [15] F. A. D'Asaro, F. A. Genco, and G. Primiero. Checking trustworthiness of probabilistic computations in a typed natural deduction system. J. Log. Comput., 35(6), 2025.
- [16] R. Fagin, J. Halpern, and N. Megiddo. A logic for reasoning about probabilities. Information and Computation, 87(1-2):78–128, 1990.
- [17] Y. Guo, M. Guo, J. Su, Z. Yang, and M. Zhu. Bias in large language models: Origin, evaluation, and mitigation. arXiv preprint arXiv:2411.10915, 2024.
- [18] S. Han, S. Avestimehr, and C. He. Bridging the safety gap: A guardrail pipeline for trustworthy LLM inferences. 2025.
- [19] F. Haque, D. Xu, and X. Niu. A comprehensive survey on bias and fairness in large language models. In S. Yuan, F. Malliaros, and X. Zheng, editors, Trends and Applications in Knowledge Discovery and Data Mining, pages 83–101. Springer Nature Singapore, Singapore, 2025.
- [20] Y. Hirota, Y. Nakashima, and N. Garcia. Quantifying societal bias amplification in image captioning. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 13440–13449, 2022.
- [21] L. Hutschenreiter, C. Baier, and J. Klein. Parametric Markov chains: PCTL complexity and fraction-free Gaussian elimination. Electronic Proceedings in Theoretical Computer Science, 256:16–30, Sept. 2017.
- [22] N. Jansen, S. Junges, and J.-P. Katoen. Parameter Synthesis in Markov Models: A Gentle Survey, pages 407–437. Springer Nature Switzerland, Cham, 2022.
- [23] K. Kiashemshaki, M. J. Torkamani, N. Mahmoudi, and M. S. Bilehsavar. Simulating a bias mitigation scenario in large language models. 2025.
- [24] J. Kontinen. A logical characterization of the counting hierarchy. ACM Trans. Comput. Log., 10(1):7:1–7:21, 2009.
- [25] E. Kubyshkina and G. Primiero. A possible worlds semantics for trustworthy non-deterministic computations. Int. J. Approx. Reason., 172:109212, 2024.
- [26] B. Legastelois, M.-J. Lesot, and A. R. d'Allonnes. Typology of axioms for a weighted modal logic. International Journal of Approximate Reasoning, 90:341–358, 2017.
- [27] L. Lin, L. Wang, J. Guo, and K.-F. Wong. Investigating bias in LLM-based bias detection: Disparities between LLMs and human perception. In Proceedings of COLING, 2025.
- [28] M. Maaz and T. C. Y. Chan. Formal verification of Markov processes with learned parameters, 2025.
- [29] L. Monteiro Paes et al. Direct steering optimization for bias mitigation. arXiv preprint arXiv:2512.15926, 2025.
- [30] A. Mostowski. On a generalization of quantifiers. Fundamenta Mathematicae, 1957.
- [31] N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28(1):71–87, 1986.
- [32] P. Seshadri, S. Singh, and Y. Elazar. The bias amplification paradox in text-to-image generation, 2023.
- [33] Z. Siddique, I. Khalid, et al. Shifting perspectives: Steering vector ensembles for robust bias mitigation in LLMs. arXiv preprint arXiv:2503.05371, 2025.
- [34] A. A. Syed. Guardrails for large language models: A review of techniques and challenges. URF Journals, 2025.
- [35] K. W. Wagner. The complexity of combinatorial problems with succinct input representation. Acta Informatica, 23(3):325–356, 1986.
- [36] H. Zhou, Z. Feng, Z. Zhu, J. Qian, and K. Mao. UniBias: unveiling and mitigating LLM bias through internal attention and FFN manipulation. 2024.