
arxiv: 2605.18202 · v1 · pith:HQCEEOLJnew · submitted 2026-05-18 · 💻 cs.LG · cs.AI

Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based Models

Pith reviewed 2026-05-20 12:48 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords: conformal prediction · neuro-symbolic AI · concept-based models · logical consistency · uncertainty quantification · set-valued prediction

The pith

COCOCO produces conformal prediction sets for neuro-symbolic concept models that are logically consistent, cover the true output with a user-chosen probability, and stay small.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Neuro-symbolic concept-based models extract high-level concepts from data and then apply logical rules to reach a final label. Their predictions can be overconfident, so the paper adds conformal prediction to give distribution-free coverage guarantees. Existing conformal methods for these models violate at least one of three required properties: the sets must respect the logical constraints, must cover the true label with the stated probability, and must remain small. COCOCO conformalizes concepts and labels together and then reconciles the two sets with one deduction-abduction step. This single revision produces sets that meet all three properties at once while preserving the original coverage guarantee and working even when the supplied logical knowledge is imperfect.

Core claim

COCOCO is a post-hoc procedure that takes any trained neuro-symbolic concept-based model, produces conformal sets over both concepts and labels, and reconciles them through a single deduction-abduction revision step. The resulting sets are guaranteed to be consistent with the given logical constraints, to contain the true label with a user-specified probability, and to respect a user-specified size budget, all while retaining the distribution-free coverage property of the underlying conformal procedure.

What carries the argument

The single deduction-abduction revision step that jointly reconciles the conformal concept set and the conformal label set.
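
To make the revision step concrete, here is a minimal sketch on a toy digit-addition task. Everything in it is an assumption for illustration: the inference layer `g`, the function `revise`, and the example sets are hypothetical, and the paper's actual operator may differ, for instance in how it handles imperfect knowledge or size budgets. Abduction drops concept tuples that cannot explain any candidate label; deduction drops labels that no surviving concept tuple entails.

```python
from itertools import product

def g(concepts):
    """Hypothetical inference layer: the label is the sum of two digit concepts."""
    d1, d2 = concepts
    return d1 + d2

def revise(concept_set, label_set, infer=g):
    """One deduction-abduction pass (illustrative assumption, not the paper's operator).

    Abduction: keep concept tuples whose inferred label lies in the label set.
    Deduction: keep labels entailed by at least one surviving concept tuple.
    """
    revised_concepts = {c for c in concept_set if infer(c) in label_set}
    revised_labels = {infer(c) for c in revised_concepts}
    return revised_concepts, revised_labels

# Made-up conformal sets, as they might come out of separate calibration.
concept_set = set(product({2, 3}, {4, 9}))  # (2,4), (2,9), (3,4), (3,9)
label_set = {6, 7, 13}

concepts_rev, labels_rev = revise(concept_set, label_set)
print(sorted(concepts_rev))  # [(2, 4), (3, 4)]
print(sorted(labels_rev))    # [6, 7]
```

In this toy run the label 13 is removed because no candidate concept tuple sums to it, and the tuples (2, 9) and (3, 9) are removed because their sums are not candidate labels, so the two revised sets agree with each other under `g`.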

If this is right

  • The sets remain valid even when the provided logical knowledge is only partially correct.
  • Users can specify a set-size budget, and the method returns sets within that budget while keeping the other guarantees.
  • The same procedure works for any neuro-symbolic concept-based architecture that separates concept prediction from label inference under constraints.
  • Experiments across eight datasets show smaller sets and better task performance than prior conformal baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same joint-conformalization-plus-revision pattern could be applied to other hybrid systems that combine continuous perception with discrete rules.
  • Because the revision step is deterministic and cheap, the method can be inserted after any existing conformal predictor without retraining the underlying model.
  • If the logical constraints change over time, the coverage guarantee may need a fresh calibration set to remain valid.

Load-bearing premise

The logical constraints are fixed in advance and a single deduction-abduction step is enough to restore consistency without breaking the distribution-free coverage guarantee of the underlying conformal procedure.

What would settle it

Run COCOCO on a dataset with known ground-truth labels and check whether the produced sets contain the true label at least as often as the target coverage level; if the empirical coverage falls below the target on repeated trials, the coverage claim is falsified.
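
The check described above can be sketched as a small simulation. This is a stand-in under stated assumptions, not the paper's evaluation code: it uses plain split conformal prediction on synthetic scores rather than COCOCO, and the names `conformal_quantile` and `simulate_trial` are hypothetical. To test COCOCO itself, the prediction sets built inside `simulate_trial` would be replaced by its revised sets.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: include everything
    return sorted(scores)[k - 1]

def simulate_trial(rng, n_cal=200, n_test=200, n_labels=5, alpha=0.1):
    """Run one calibration/test split on synthetic data; return test coverage."""
    def sample():
        y = rng.randrange(n_labels)
        # Synthetic nonconformity scores: the true label tends to score lower.
        scores = [rng.random() + (0.0 if k == y else 0.5) for k in range(n_labels)]
        return y, scores

    calibration = [sample() for _ in range(n_cal)]
    q = conformal_quantile([s[y] for y, s in calibration], alpha)

    hits = 0
    for _ in range(n_test):
        y, s = sample()
        pred_set = {k for k in range(n_labels) if s[k] <= q}
        hits += y in pred_set
    return hits / n_test

rng = random.Random(0)
coverages = [simulate_trial(rng) for _ in range(200)]
mean_cov = sum(coverages) / len(coverages)
print(f"mean empirical coverage over trials: {mean_cov:.3f} (target 0.900)")
```

Individual trials can land below the target by chance, because the guarantee is marginal over the draw of calibration and test data. The quantity to compare with the target is therefore the average over many resamples, which is also how Figure 2 reports coverage.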

Figures

Figures reproduced from arXiv: 2605.18202 by Andrea Passerini, Andrea Pugnana, Emanuele Marconato, Samuele Bortolotti, Stefano Teso.

Figure 1. Left: the task and concept predictions of NeSy-CBMs can make confident mistakes (e.g., 6 and 13, in red), making it difficult to gauge the (un)reliability of their predictions. Right: COCOCO enables any NeSy-CBM to output task- and concept-level prediction sets with coverage guarantees for any desired size, while ensuring their mutual consistency via an abduction-deduction refinement step. E.g., in the pic…
Figure 2. COCOCO∗ on CHX with DPL under the fixed regime |Υrev(x)| ≤ 2 ∧ |Γrev(x)| ≤ 5, bootstrapped over 100 calibration resamples and averaged across 10 seeds. Empirical coverage (orange) exceeds the theoretical target (green), while sizes remain strictly below the imposed budgets.
… guarantees do not degrade with the number of concepts. We evaluate on CHX, fixing label and concept budgets to 2 and 5 – operationally me…
Figure 3. E-conformal coverage under single-sided constraints on…
Figure 4. Joint failure probabilities on CIF10 and RIV10. Even if low, this estimate contributes to tightening the lower bound in Proposition 4.2 and Proposition C.10.
Figure 5. Average prediction-set sizes across constraint regimes; error bars show…
Figure 6. Synthetic experiment showing (1 − α)δab − β for varying α (left) and β (right), starting at a nominal guarantee of 1 − α = 1 − β = 0.9. The bound decreases linearly depending on the quality of the inference layer g.
F.1 Estimating δab and δde
We estimate the soundness gaps empirically as
\[ \hat{\delta}_{de} = \frac{1}{|\mathcal{D}_{\mathrm{test}}|} \sum_{(x, c^*, y^*) \in \mathcal{D}_{\mathrm{test}}} \mathbb{1}\{ g^{\dagger}(c^*) = y^* \}, \tag{139} \]
\[ \hat{\delta}_{ab} = \frac{1}{|\mathcal{D}_{\mathrm{test}}|} \sum_{(x, c^*, y^*) \in \mathcal{D}_{\mathrm{test}}} \mathbb{1}\{ c^* \in g^{-\dagger}(y^*) \}. \tag{140} \]
whe…
Figure 7. COCOCO∗ on CHX with DPL under the fixed regime |Υrev(x)| ≤ 2 ∧ |Γrev(x)| ≤ 5, bootstrapped over 100 calibration resamples and averaged across 10 seeds. Empirical coverage (orange) exceeds the theoretical target (green), while sizes remain strictly below the imposed budgets. Results using product as e-value aggregator.
Figure 8. Average set sizes using COCOCO∗ and product as e-value aggregator.
F.3 The Full Spectrum of NeSy Conformal Prediction
In the main text we compared COCOCO only against RPB, the strongest competitor. Here we report the full ablation against the four natural conformalization strategies introduced in Section B: TO, TAb, CO, and CDe. The goal is threefold: (i) confirm that the two-sided joint revision of COCOCO…
Figure 9. Time overhead (%) relative to the DPL (left) and LTN (right) baselines (dashed line, 100%) for all methods (TO, TAb, CO, CDe, RPB, COCOCO and COCOCO∗), averaged across all 10 seeds and all datasets. Blue shows the case where concept supervision was absent, and orange shows the case where it was present. Error bars indicate standard deviation.
Figure 10. Time overhead (%) relative to the DPL and LTN baselines (dashed line, 100%) for all methods (TO, TAb, CO, CDe, RPB, COCOCO, and COCOCO∗), across all 8 datasets, averaged over 10 seeds. The top panel reports results with the DPL baseline, and the bottom panel with the LTN baseline. Blue indicates absence of concept supervision, while orange indicates its presence. Error bars denote standard deviation.
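
The soundness estimates quoted in the Figure 6 excerpt (Eqs. 139 and 140) are two frequency counts over the test set, which the sketch below spells out. It rests on assumptions: the definition that follows the equations is truncated in this page, so reading g† as a deduction map from concepts to a label and g−† as an abduction map from a label to its admissible concept vectors is a guess, and `estimate_soundness`, `deduce`, `abduce`, and the toy data are hypothetical.

```python
from itertools import product

def estimate_soundness(test_triples, deduce, abduce):
    """Empirical delta_de and delta_ab in the spirit of Eqs. (139)-(140).

    deduce(c) stands in for g†(c); abduce(y) stands in for g^{-†}(y) and
    returns the set of concept vectors the knowledge allows for label y.
    """
    n = len(test_triples)
    delta_de = sum(deduce(c) == y for _, c, y in test_triples) / n
    delta_ab = sum(c in abduce(y) for _, c, y in test_triples) / n
    return delta_de, delta_ab

# Toy knowledge (assumption): the label is the sum of two digits.
def deduce(c):
    return c[0] + c[1]

def abduce(y):
    return {c for c in product(range(10), repeat=2) if c[0] + c[1] == y}

test_triples = [
    ("x1", (2, 4), 6),
    ("x2", (3, 9), 12),
    ("x3", (1, 1), 2),
    ("x4", (5, 5), 9),  # the knowledge fails here: 5 + 5 != 9
]

delta_de, delta_ab = estimate_soundness(test_triples, deduce, abduce)

# Plug the estimate into the expression plotted in Figure 6.
alpha, beta = 0.1, 0.1
bound = (1 - alpha) * delta_ab - beta
print(delta_de, delta_ab, round(bound, 3))  # 0.75 0.75 0.575
```

With one faulty example out of four, both estimates are 0.75, and the Figure 6 expression drops from 0.8 (at δab = 1) to 0.575, which illustrates the linear dependence on knowledge quality mentioned in the caption.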
Original abstract

Neuro-Symbolic Concept-based Models (NeSy-CBMs) are a family of architectures that integrate neural networks with symbolic reasoning for enhanced reliability in high-stakes applications. They work by first extracting high-level concepts from the input and then inferring a task label from these compatibly with given logical constraints. Yet, their label and concept predictions can be overconfident, making it difficult for stakeholders to gauge when the model's decisions can be trusted. We address this issue by integrating ideas from Conformal Prediction (CP), a framework providing rigorous, distribution-free coverage guarantees. We formalize three desiderata -- consistency, coverage, and conciseness -- that any conformal method for NeSy-CBMs should satisfy, and show that existing approaches fall short of at least one. We then introduce COCOCO, a post-hoc framework that conformalizes concepts and labels jointly and reconciles them via a single deduction-abduction revision step. COCOCO satisfies all three desiderata, retains distribution-free coverage, is robust to imperfect knowledge and supports user-specified size budgets. Our experiments on 8 data sets highlight how COCOCO compares favorably against competitors and natural baselines in terms of performance and set size.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces COCOCO, a post-hoc framework for conformal prediction in Neuro-Symbolic Concept-Based Models (NeSy-CBMs). It jointly conformalizes concept and label predictions, then applies a single deduction-abduction revision step using fixed logical constraints to produce sets that are consistent, concise, and coverage-guaranteed. The authors formalize three desiderata (consistency, coverage, conciseness), argue that prior methods fail at least one, and claim that COCOCO satisfies all three while retaining distribution-free marginal coverage, remaining robust to imperfect constraints, and supporting user-specified size budgets. Experiments on eight datasets compare COCOCO favorably to baselines and competitors on set size and empirical coverage.

Significance. If the coverage preservation after revision is rigorously established, the work would meaningfully advance reliable uncertainty quantification for hybrid neural-symbolic systems in high-stakes settings. The joint conformalization plus logical reconciliation directly targets overconfidence while respecting domain constraints, and the support for size budgets adds practical utility. The formalization of desiderata and the post-hoc nature are clear strengths; empirical results on multiple datasets provide useful validation, though the theoretical guarantee is the primary source of potential impact.

major comments (2)
  1. [§4.2] §4.2 (Coverage Theorem): The central claim that the deduction-abduction revision preserves distribution-free marginal coverage is load-bearing, yet the argument only sketches that the revision is a deterministic map applied identically to calibration and test points. No explicit lemma shows that the operator maintains the p-value ordering or the coverage inequality when constraints are imperfect or when the revision involves abduction over multiple concepts; this gap directly affects whether the distribution-free guarantee survives the reconciliation step.
  2. [§3.1] §3.1 (Joint Conformalization): The nonconformity scores for concepts and labels are defined jointly, but the subsequent revision step is not shown to be non-adaptive with respect to the test label. If the abduction component can depend on the realized test prediction in a way that correlates with the nonconformity score, exchangeability between calibration and test points is at risk; a concrete counter-example or invariance proof is required.
minor comments (3)
  1. [§2] Notation for the final consistent sets (C_final) versus raw conformal sets is introduced without a clear table or diagram in §2; adding a small running example would improve readability.
  2. [Table 1] Table 1 (desiderata comparison): the row for 'robustness to imperfect knowledge' lists COCOCO as 'yes' but provides no quantitative measure of how coverage degrades with constraint error rate; a small sensitivity plot would clarify the claim.
  3. [§5] The abstract states 'supports user-specified size budgets' but the experimental section does not report the achieved sizes relative to the budget parameter; including these numbers would make the practical advantage concrete.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address the major points below regarding the coverage theorem and joint conformalization, and we will strengthen the manuscript with additional formal arguments as outlined.

Point-by-point responses
  1. Referee: [§4.2] §4.2 (Coverage Theorem): The central claim that the deduction-abduction revision preserves distribution-free marginal coverage is load-bearing, yet the argument only sketches that the revision is a deterministic map applied identically to calibration and test points. No explicit lemma shows that the operator maintains the p-value ordering or the coverage inequality when constraints are imperfect or when the revision involves abduction over multiple concepts; this gap directly affects whether the distribution-free guarantee survives the reconciliation step.

    Authors: We agree that an explicit lemma would improve rigor. The manuscript sketches preservation via the deterministic and identical application of the revision map to calibration and test points, which maintains exchangeability and marginal coverage. In revision we will add Lemma 4.1 formally proving that the deduction-abduction operator preserves the coverage inequality even under imperfect constraints and multi-concept abduction, by establishing that post-revision sets are a deterministic function of the pre-revision conformal sets that does not disrupt p-value ordering.
    Revision: yes

  2. Referee: [§3.1] §3.1 (Joint Conformalization): The nonconformity scores for concepts and labels are defined jointly, but the subsequent revision step is not shown to be non-adaptive with respect to the test label. If the abduction component can depend on the realized test prediction in a way that correlates with the nonconformity score, exchangeability between calibration and test points is at risk; a concrete counter-example or invariance proof is required.

    Authors: The revision operates on the joint conformal sets using only the fixed logical constraints; it does not condition on the realized test label in a way that correlates with nonconformity scores. The same deterministic process is applied symmetrically to calibration points. To address the concern we will add an invariance argument in the appendix showing the revision function is non-adaptive w.r.t. the test label and preserves exchangeability. We do not expect a counter-example under the stated setup.
    Revision: yes
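
The kind of lemma under discussion can be illustrated with a short union-bound calculation. It is a sketch under assumptions that are not taken from the paper: the revised concept set is taken to be $\Gamma^{\mathrm{rev}}(x) = \Gamma(x) \cap g^{-1}(\Upsilon(x))$, the knowledge is assumed sound ($g(c^*) = y^*$), the raw concept and label sets are assumed to have marginal coverage $1-\alpha$ and $1-\beta$ respectively, and $\Gamma$ and $\Upsilon$ are read as the concept and label sets.

```latex
\begin{align*}
c^* \notin \Gamma^{\mathrm{rev}}(x)
  &\;\Longrightarrow\; c^* \notin \Gamma(x) \ \text{ or } \ g(c^*) \notin \Upsilon(x) \\
  &\;\Longrightarrow\; c^* \notin \Gamma(x) \ \text{ or } \ y^* \notin \Upsilon(x)
     \qquad (\text{soundness: } g(c^*) = y^*), \\[4pt]
\Pr\bigl[c^* \in \Gamma^{\mathrm{rev}}(x)\bigr]
  &\;\ge\; 1 - \Pr\bigl[c^* \notin \Gamma(x)\bigr] - \Pr\bigl[y^* \notin \Upsilon(x)\bigr]
  \;\ge\; 1 - \alpha - \beta .
\end{align*}
```

Under these assumptions the revision costs at most the label-side miscoverage and needs no exchangeability argument beyond the one behind the two raw sets. The result agrees with the expression $(1-\alpha)\delta_{ab} - \beta$ from the Figure 6 caption when $\delta_{ab} = 1$; the imperfect-knowledge case is exactly where the referee's request for an explicit lemma applies.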

Circularity Check

0 steps flagged

No load-bearing circularity; coverage claim rests on unverified but non-circular preservation argument for revision step

Full rationale

The paper presents COCOCO as a post-hoc procedure that first produces joint conformal sets for concepts and labels, then applies a single fixed deduction-abduction revision using given logical constraints. The abstract and described method claim retention of distribution-free marginal coverage after this step. No equation or derivation reduces a fitted parameter to a prediction of the same quantity, nor does any central result collapse to a self-citation whose authors overlap and whose content is unverified. The three desiderata are formalized independently, and the claim that existing methods fail at least one is presented as an analysis rather than a definitional tautology. The potential issue that the revision map might break exchangeability is a question of proof completeness rather than circularity by construction. Hence the derivation chain remains self-contained against external conformal guarantees, warranting only a minor score for the implicit assumption that the revision operator is coverage-preserving.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Abstract-only review provides no explicit free parameters, axioms, or invented entities beyond the standard conformal prediction coverage guarantee and the assumption of given logical constraints.

axioms (1)
  • [standard math] Conformal prediction provides distribution-free marginal coverage guarantees for any fixed predictor, provided calibration and test points are exchangeable.
    Invoked when stating that COCOCO retains distribution-free coverage.
invented entities (1)
  • COCOCO framework (no independent evidence)
    purpose: Joint conformalization of concepts and labels with deduction-abduction revision
    New post-hoc procedure introduced in the paper.

pith-pipeline@v0.9.0 · 5757 in / 1389 out tokens · 32789 ms · 2026-05-20T12:48:23.785220+00:00 · methodology


Reference graph

Works this paper leans on

96 extracted references · 96 canonical work pages · 1 internal anchor

  1. [1]

    From statistical relational to neural-symbolic artificial intelligence

    Luc De Raedt, Sebastijan Dumanˇci´c, Robin Manhaeve, and Giuseppe Marra. From statistical relational to neural-symbolic artificial intelligence. InIJCAI, 2021

  2. [2]

    Neural-symbolic learning and reasoning: A survey and interpretation.Neuro-Symbolic Artificial Intelligence: The State of the Art, 342:1, 2022

    Artur d’Avila Garcez, Sebastian Bader, Howard Bowman, Luis C Lamb, Leo de Penning, BV Illuminoo, Hoifung Poon, and COPPE Gerson Zaverucha. Neural-symbolic learning and reasoning: A survey and interpretation.Neuro-Symbolic Artificial Intelligence: The State of the Art, 342:1, 2022

  3. [3]

    Deep learning with logical constraints

    Eleonora Giunchiglia, Mihaela Catalina Stoian, and Thomas Lukasiewicz. Deep learning with logical constraints. pages 5478–5485, 2022. doi: 10.24963/IJCAI.2022/767. URL https://doi.org/10. 24963/ijcai.2022/767

  4. [4]

    A review of some techniques for inclusion of domain-knowledge into deep neural networks.Scientific Reports, 2022

    Tirtharaj Dash, Sharad Chitlangia, Aditya Ahuja, and Ashwin Srinivasan. A review of some techniques for inclusion of domain-knowledge into deep neural networks.Scientific Reports, 2022

  5. [5]

    Symbol grounding in neuro-symbolic ai: A gentle introduction to reasoning shortcuts.arXiv preprint arXiv:2510.14538, 2025

    Emanuele Marconato, Samuele Bortolotti, Emile van Krieken, Paolo Morettin, Elena Umili, Antonio Vergari, Efthymia Tsamoura, Andrea Passerini, and Stefano Teso. Symbol grounding in neuro-symbolic ai: A gentle introduction to reasoning shortcuts.arXiv preprint arXiv:2510.14538, 2025

  6. [6]

    The mnist database of handwritten digits.http://yann

    Yann LeCun. The mnist database of handwritten digits.http://yann. lecun. com/exdb/mnist/, 1998

  7. [7]

    DeepProbLog: Neural Probabilistic Logic Programming.NeurIPS, 2018

    Robin Manhaeve, Sebastijan Dumancic, Angelika Kimmig, Thomas Demeester, and Luc De Raedt. DeepProbLog: Neural Probabilistic Logic Programming.NeurIPS, 2018

  8. [8]

    BEARS Make Neuro-Symbolic Models Aware of their Reasoning Shortcuts.Uncertainty in AI, 2024

    Emanuele Marconato, Samuele Bortolotti, Emile van Krieken, Antonio Vergari, Andrea Passerini, and Stefano Teso. BEARS Make Neuro-Symbolic Models Aware of their Reasoning Shortcuts.Uncertainty in AI, 2024

  9. [9]

    Neurosymbolic diffusion models

    Emile van Krieken, Pasquale Minervini, Edoardo Ponti, and Antonio Vergari. Neurosymbolic diffusion models. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  10. [10]

    Things machine learning models know that they don’t know

    Salvatore Ruggieri and Andrea Pugnana. Things machine learning models know that they don’t know. In AAAI, pages 28684–28693. AAAI Press, 2025

  11. [11]

    Deciding fast and slow: The role of cognitive biases in ai-assisted decision-making.Proceedings of the ACM on Human-computer Interaction, 6(CSCW1):1–22, 2022

    Charvi Rastogi, Yunfeng Zhang, Dennis Wei, Kush R Varshney, Amit Dhurandhar, and Richard Tomsett. Deciding fast and slow: The role of cognitive biases in ai-assisted decision-making.Proceedings of the ACM on Human-computer Interaction, 6(CSCW1):1–22, 2022

  12. [12]

    A tutorial on conformal prediction

    Vladimir V ovk, Alexander Gammerman, and Craig Saunders. A tutorial on conformal prediction. In Advances in Neural Information Processing Systems, 1999

  13. [13]

    John Wiley & Sons, 2019

    Glenn Shafer and Vladimir V ovk.Game-theoretic foundations for probability and finance. John Wiley & Sons, 2019

  14. [14]

    E-values: Calibration, combination and applications.The Annals of Statistics, 49(3):1736–1754, 2021

    Vladimir V ovk and Ruodu Wang. E-values: Calibration, combination and applications.The Annals of Statistics, 49(3):1736–1754, 2021

  15. [15]

    Safe testing

    Peter Grünwald, Rianne de Heide, and Wouter M Koolen. Safe testing. In2020 Information theory and applications workshop (ITA), pages 1–54. IEEE, 2020

  16. [16]

    Hypothesis testing with e-values.Foundations and Trends® in Statistics, 1(1-2):1–390, 2025

    Aaditya Ramdas and Ruodu Wang. Hypothesis testing with e-values.Foundations and Trends® in Statistics, 1(1-2):1–390, 2025

  17. [17]

    Conformal e-prediction.Pattern Recognition, 166:111674, 2025

    Vladimir V ovk. Conformal e-prediction.Pattern Recognition, 166:111674, 2025

  18. [18]

    E-values expand the scope of conformal prediction

    Etienne Gauthier, Francis Bach, and Michael I Jordan. E-values expand the scope of conformal prediction. arXiv preprint arXiv:2503.13050, 2025

  19. [19]

    Bach, and Michael I

    Etienne Gauthier, Francis R. Bach, and Michael I. Jordan. Backward conformal prediction. InNeurIPS, 2025

  20. [20]

    Bach, and Michael I

    Etienne Gauthier, Francis R. Bach, and Michael I. Jordan. Adaptive coverage policies in conformal prediction. InAISTATS, 2026

  21. [21]

    Neural probabilistic logic programming in deepproblog.Artificial Intelligence, 298:103504, 2021

    Robin Manhaeve, Sebastijan Duman ˇci´c, Angelika Kimmig, Thomas Demeester, and Luc De Raedt. Neural probabilistic logic programming in deepproblog.Artificial Intelligence, 298:103504, 2021

  22. [22]

    Shortcuts and identifiability in concept-based models from a neuro-symbolic lens

    Samuele Bortolotti, Emanuele Marconato, Paolo Morettin, Andrea Passerini, and Stefano Teso. Shortcuts and identifiability in concept-based models from a neuro-symbolic lens.CoRR, abs/2502.11245, February

  23. [23]

    URLhttps://doi.org/10.48550/arXiv.2502.11245

  24. [24]

    Not all neuro-symbolic concepts are created equal: Analysis and mitigation of reasoning shortcuts

    Emanuele Marconato, Stefano Teso, Antonio Vergari, and Andrea Passerini. Not all neuro-symbolic concepts are created equal: Analysis and mitigation of reasoning shortcuts. InNeurIPS, 2023. 10 Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based ModelsA PREPRINT

  25. [25]

    Probabilistic (logic) programming concepts.Machine Learning, 100:5–47, 2015

    Luc De Raedt and Angelika Kimmig. Probabilistic (logic) programming concepts.Machine Learning, 100:5–47, 2015

  26. [26]

    A knowledge compilation map.Journal of Artificial Intelligence Research, 17:229–264, 2002

    Adnan Darwiche and Pierre Marquis. A knowledge compilation map.Journal of Artificial Intelligence Research, 17:229–264, 2002

  27. [27]

    A compositional atlas of tractable circuit operations for probabilistic inference.NeurIPS, 2021

    Antonio Vergari, YooJung Choi, Anji Liu, Stefano Teso, and Guy Van den Broeck. A compositional atlas of tractable circuit operations for probabilistic inference.NeurIPS, 2021

  28. [28]

    Logic tensor networks

    Samy Badreddine, Artur d’Avila Garcez, Luciano Serafini, and Michael Spranger. Logic tensor networks. Artificial Intelligence, 303:103649, 2022

  29. [29]

    Logic tensor networks for semantic image interpretation

    Ivan Donadello, Luciano Serafini, and Artur D’Avila Garcez. Logic tensor networks for semantic image interpretation. InIJCAI, 2017

  30. [30]

    Analyzing Differentiable Fuzzy Implications

    Emile van Krieken, Erman Acar, and Frank van Harmelen. Analyzing Differentiable Fuzzy Implications. InProceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning, pages 893–903, 9 2020. doi: 10.24963/kr.2020/92. URL https://doi.org/10.24963/ kr.2020/92

  31. [31]

    On the independence assumption in neurosymbolic learning

    Emile Van Krieken, Pasquale Minervini, Edoardo Ponti, and Antonio Vergari. On the independence assumption in neurosymbolic learning. InInternational Conference on Machine Learning, pages 49078– 49097. PMLR, 2024

  32. [32]

    Towards human-ai complementarity with prediction sets

    Giovanni De Toni, Nastaran Okati, Suhas Thejaswi, Eleni Straitouri, and Manuel Gomez-Rodriguez. Towards human-ai complementarity with prediction sets. In A. Globerson, L. Mackey, D. Bel- grave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Informa- tion Processing Systems, volume 37, pages 31380–31409. Curran Associates, Inc., 20...

  33. [33]

    Designing decision support systems using counterfactual prediction sets

    Eleni Straitouri and Manuel Gomez Rodriguez. Designing decision support systems using counterfactual prediction sets. InInternational Conference on Machine Learning, pages 46722–46744. PMLR, 2024

  34. [34]

    Towards human-ai complementarity in matching tasks.arXiv preprint arXiv:2508.13285, 2025

    Adrian Arnaiz-Rodriguez, Nina Corvelo Benz, Suhas Thejaswi, Nuria Oliver, and Manuel Gomez- Rodriguez. Towards human-ai complementarity in matching tasks.arXiv preprint arXiv:2508.13285, 2025

  35. [35]

    Controlling counterfactual harm in decision support systems based on prediction sets

    Eleni Straitouri, Suhas Thejaswi, and Manuel Gomez Rodriguez. Controlling counterfactual harm in decision support systems based on prediction sets. In A. Globerson, L. Mackey, D. Bel- grave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Informa- tion Processing Systems, volume 37, pages 129443–129479. Curran Associates, Inc., 202...

  36. [36]

    Boger, Seyone Chithrananda, Anastasios N

    Ron S. Boger, Seyone Chithrananda, Anastasios N. Angelopoulos, Peter H. Yoon, Michael I. Jordan, and Jennifer A. Doudna. Functional protein mining with conformal guarantees.Nature Communications, 16 (1):85, Jan 2025. ISSN 2041-1723. doi: 10.1038/s41467-024-55676-y. URL https://doi.org/10. 1038/s41467-024-55676-y

  37. [37]

    Inductive confidence machines for regression

    Harris Papadopoulos, Kostas Proedrou, V olodya V ovk, and Alexander Gammerman. Inductive confidence machines for regression. InECML, volume 2430 ofLecture Notes in Computer Science, pages 345–356. Springer, 2002

  38. [38]

    Angelopoulos and Stephen Bates

    Anastasios N. Angelopoulos and Stephen Bates. Conformal prediction: A gentle introduction.Found. Trends Mach. Learn., 16(4):494–591, 2023

  39. [39]

    Machine learning with requirements: A manifesto.Neurosymbolic Artificial Intelligence, 1:NAI–240767, 2025

    Eleonora Giunchiglia, Fergus Imrie, Mihaela van der Schaar, and Thomas Lukasiewicz. Machine learning with requirements: A manifesto.Neurosymbolic Artificial Intelligence, 1:NAI–240767, 2025. doi: 10.3233/NAI-240767. URLhttps://doi.org/10.3233/NAI-240767

  40. [40]

    Road-r: the autonomous driving dataset with logical requirements.Machine Learning, 112(9):3261– 3291, 2023

    Eleonora Giunchiglia, Mihaela C˘at˘alina Stoian, Salman Khan, Fabio Cuzzolin, and Thomas Lukasiewicz. Road-r: the autonomous driving dataset with logical requirements.Machine Learning, 112(9):3261– 3291, 2023

  41. [41]

    Semantic Probabilistic Layers for Neuro-Symbolic Learning

    Kareem Ahmed, Stefano Teso, Kai-Wei Chang, Guy Van den Broeck, and Antonio Vergari. Semantic Probabilistic Layers for Neuro-Symbolic Learning. InNeurIPS, 2022

  42. [42]

    A probabilistic neuro-symbolic layer for algebraic constraint satisfaction

    Leander Kurscheidt, Paolo Morettin, Roberto Sebastiani, Andrea Passerini, and Antonio Vergari. A probabilistic neuro-symbolic layer for algebraic constraint satisfaction. In Silvia Chiappa and Sara 11 Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based ModelsA PREPRINT Magliacane, editors,Proceedings of the Forty-first Confere...

  43. [43]

    The theory and practice of map inference over non-convex constraints.arXiv preprint arXiv:2602.08681, 2026

    Leander Kurscheidt, Gabriele Masina, Roberto Sebastiani, and Antonio Vergari. The theory and practice of map inference over non-convex constraints.arXiv preprint arXiv:2602.08681, 2026

  44. [44]

    Ccn+: A neuro-symbolic framework for deep learning with requirements.International Journal of Approximate Reasoning, page 109124, 2024

    Eleonora Giunchiglia, Alex Tatomir, Mihaela C ˘at˘alina Stoian, and Thomas Lukasiewicz. Ccn+: A neuro-symbolic framework for deep learning with requirements.International Journal of Approximate Reasoning, page 109124, 2024

  45. [45]

    Concept bottleneck models

    Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, and Percy Liang. Concept bottleneck models. InInternational conference on machine learning, pages 5338–

  46. [46]

    Deep symbolic learning: discovering symbols and rules from perceptions

    Alessandro Daniele, Tommaso Campari, Sagar Malhotra, and Luciano Serafini. Deep symbolic learning: discovering symbols and rules from perceptions. InIJCAI, 2023

  47. [47]

    (alpha)ilp: thinking visual scenes as differentiable logic programs.Machine Learning, 112(5):1465–1497, 2023

    Hikaru Shindo, Viktor Pfanschilling, Devendra Singh Dhami, and Kristian Kersting. (alpha)ilp: thinking visual scenes as differentiable logic programs.Machine Learning, 112(5):1465–1497, 2023

  48. [48]

    From perception to programs: regularize, overparameterize, and amortize

    Hao Tang and Kevin Ellis. From perception to programs: regularize, overparameterize, and amortize. In ICML, 2023

  49. [49]

    Scaling multi-label conformal prediction with label interactions for a large number of labels

    Ghassan Najjar, Céline Berthou, and Héléna V orobieva. Scaling multi-label conformal prediction with label interactions for a large number of labels. In Rita P. Ribeiro, Bernhard Pfahringer, Nathalie Japkowicz, Pedro Larrañaga, Alípio M. Jorge, Carlos Soares, Pedro H. Abreu, and João Gama, editors, Machine Learning and Knowledge Discovery in Databases. Re...

  50. [50]

    ISBN 978-3-032-06109-6

    Springer Nature Switzerland. ISBN 978-3-032-06109-6

  51. [51]

    Uncertainty quantification for neurosymbolic programs via compositional conformal prediction.arXiv preprint arXiv:2405.15912, 2024

    Ramya Ramalingam, Sangdon Park, and Osbert Bastani. Uncertainty quantification for neurosymbolic programs via compositional conformal prediction.arXiv preprint arXiv:2405.15912, 2024

  52. [52]

    Abductive learning: towards bridging machine learning and logical reasoning.Science China

    Zhi-Hua Zhou. Abductive learning: towards bridging machine learning and logical reasoning.Science China. Information Sciences, 62(7):76101, 2019

  53. [53]

    Are training resources insufficient? predict first then explain! arXiv preprint arXiv:2110.02056, 2021

    Myeongjun Jang and Thomas Lukasiewicz. Are training resources insufficient? predict first then explain! arXiv preprint arXiv:2110.02056, 2021

  54. [54]

    Emile van Krieken, Thiviyan Thanapalasingam, Jakub M Tomczak, Frank van Harmelen, and Annette ten Teije. A-nesi: A scalable approximate method for probabilistic neurosymbolic inference. arXiv preprint arXiv:2212.12393, 2022

  55. [55]

    Arthur Ledaguenel, Céline Hudelot, and Mostepha Khouadjia. Neurosymbolic conformal classification. arXiv preprint arXiv:2409.13585, 2024

  56. [56]

    Jing Lei and Larry Wasserman. Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society Series B: Statistical Methodology, 76(1):71–96, 2014

  57. [57]

    Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J Tibshirani, and Larry Wasserman. Distribution-free predictive inference for regression. Journal of the American Statistical Association, 113(523):1094–1111, 2018

  58. [58]

    Rina Foygel Barber, Emmanuel J Candes, Aaditya Ramdas, and Ryan J Tibshirani. The limits of distribution-free conditional predictive inference. Information and Inference: A Journal of the IMA, 10(2):455–482, 2021

  59. [59]

    Isaac Gibbs, John J Cherian, and Emmanuel J Candès. Conformal prediction with conditional guarantees. Journal of the Royal Statistical Society Series B: Statistical Methodology, 87(4):1100–1126, 2025

  60. [60]

    David Stutz, Abhijit Guha Roy, Tatiana Matejovicova, Patricia Strachan, Ali Taylan Cemgil, and Arnaud Doucet. Conformal prediction under ambiguous ground truth. Trans. Mach. Learn. Res., 2023, 2023

  61. [61]

    Michele Caprio, David Stutz, Shuo Li, and Arnaud Doucet. Conformalized credal regions for classification with ambiguous ground truth. Trans. Mach. Learn. Res., 2025, 2025

  62. [62]

    Ryan J. Tibshirani, Rina Foygel Barber, Emmanuel J. Candès, and Aaditya Ramdas. Conformal prediction under covariate shift. In NeurIPS, pages 2526–2536, 2019

  63. [63]

    Stephen Mell, Botong Zhang, David Mell, Shuo Li, Ramya Ramalingam, Nathan Yu, Steve Zdancewic, and Osbert Bastani. A fast, reliable, and secure programming language for llm agents with code actions,

  64. [64]

    URL https://arxiv.org/abs/2506.12202.

  65. [65]

    Celeste Barnaby, Qiaochu Chen, Ramya Ramalingam, Osbert Bastani, and Işıl Dillig. Active learning for neurosymbolic program synthesis. Proc. ACM Program. Lang., 9(OOPSLA2), October 2025. doi: 10.1145/3763102. URL https://doi.org/10.1145/3763102

  66. [66]

    Jiani Huang, Ziyang Li, Binghong Chen, Karan Samel, Mayur Naik, Le Song, and Xujie Si. Scallop: From probabilistic deductive databases to scalable differentiable reasoning. NeurIPS, 2021

  67. [67]

    Wang-Zhou Dai and Stephen H Muggleton. Abductive knowledge induction from raw data. arXiv preprint arXiv:2010.03514, 2020

  68. [68]

    Jaron Maene and Luc De Raedt. Soft-unification in deep probabilistic logic. In Thirty-seventh Conference on Neural Information Processing Systems, 2023

  69. [69]

    Xiao-Wen Yang, Wen-Da Wei, Jie-Jing Shao, Yu-Feng Li, and Zhi-Hua Zhou. Analysis for abductive learning and neural-symbolic reasoning shortcuts. In ICML, 2024

  70. [70]

    Joseph Paul Cohen, Joseph D. Viviano, Paul Bertin, Paul Morrison, Parsa Torabian, Matteo Guarrera, Matthew P Lungren, Akshay Chaudhari, Rupert Brooks, Mohammad Hashir, and Hadrien Bertrand. Torchxrayvision: A library of chest x-ray datasets and models. In Ender Konukoglu, Bjoern Menze, Archana Venkataraman, Christian Baumgartner, Qi Dou, and Shadi Albarqo...

  71. [71]

    Eldar D Abraham, Karel D’Oosterlinck, Amir Feder, Yair Gat, Atticus Geiger, Christopher Potts, Roi Reichart, and Zhengxuan Wu. Cebab: Estimating the causal effects of real-world concepts on nlp model behavior. Advances in Neural Information Processing Systems, 35:17582–17596, 2022

  72. [72]

    Mazda Moayeri, Phillip Pope, Yogesh Balaji, and Soheil Feizi. A comprehensive study of image classification model sensitivity to foregrounds, backgrounds, and visual attributes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2022

  73. [73]

    Samuele Bortolotti et al. A benchmark suite for systematically evaluating reasoning shortcuts. In NeurIPS Datasets & Benchmarks Track, 2024

  74. [74]

    Yiran Xu, Xiaoyin Yang, Lihang Gong, Hsuan-Chu Lin, Tz-Ying Wu, Yunsheng Li, and Nuno Vasconcelos. Explainable object-induced action decision for autonomous vehicles. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

  75. [75]

    Y Choi, Antonio Vergari, and Guy Van den Broeck. Probabilistic circuits: A unifying framework for tractable probabilistic models. UCLA. URL: http://starai.cs.ucla.edu/papers/ProbCirc20.pdf, page 6, 2020

  76. [76]

    Jaron Maene, Vincent Derkinderen, and Pedro Zuidberg Dos Martires. Klay: Accelerating arithmetic circuits for neurosymbolic ai. In ICLR, 2025

  77. [77]

    Vincent Derkinderen, Robin Manhaeve, Rik Adriaensen, Lucas Van Praet, Lennert De Smet, Giuseppe Marra, and Luc De Raedt. The deeplog neurosymbolic machine, 2025. URL https://arxiv.org/abs/2508.13697

  78. [78]

    Emile Van Krieken, Samy Badreddine, Robin Manhaeve, and Eleonora Giunchiglia. Uller: A unified language for learning and reasoning. In International Conference on Neural-Symbolic Learning and Reasoning, pages 219–239. Springer, 2024

  79. [79]

    Shahnawaz Alam, Mohammed Mudassir Uddin, and Mohammed Kaif Pasha. Constraint-aware neurosymbolic uncertainty quantification with bayesian deep learning for scientific discovery. arXiv preprint arXiv:2601.12442, 2026

  80. [80]

    Ruodu Wang and Aaditya Ramdas. False discovery rate control with e-values. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(3):822–852, 2022

Showing first 80 references.