Binary Spiking Neural Networks as Causal Models

Aditya Kar (CNRS; CERCO UMR5549); Emiliano Lorini (CNRS; IRIT); Timothée Masquelier (CNRS

arxiv: 2604.27007 · v1 · submitted 2026-04-29 · 💻 cs.AI

Aditya Kar (CNRS, IRIT), Emiliano Lorini (CNRS), Timothée Masquelier (CNRS, CERCO UMR5549)

Pith reviewed 2026-05-07 11:43 UTC · model grok-4.3

classification 💻 cs.AI

keywords binary spiking neural networks · causal models · abductive explanations · SAT solver · SMT solver · explainable AI · MNIST dataset

The pith

Representing binary spiking neural networks as binary causal models allows SAT and SMT solvers to compute abductive explanations that exclude irrelevant features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the spiking activity inside binary spiking neural networks can be captured exactly as a binary causal model. From this model, standard logic solvers can derive abductive explanations for any classification the network produces. The explanations are constructed so that every included feature must have a causal path to the output, which prevents the inclusion of features that have no effect at all. A reader would care because the approach supplies a precise, checkable account of why a network reached its decision rather than an approximate ranking of feature importance.

Core claim

We formally define a binary spiking neural network and represent its spiking activity as a binary causal model. Thanks to this causal representation we can use a SAT solver as well as an SMT solver to compute abductive explanations of the network's output. When the method is applied to a network trained on MNIST, the resulting explanations are expressed in terms of pixel-level features and contain only features that participate in a causal chain to the classification.

What carries the argument

The binary causal model obtained by mapping each neuron's spike events to binary variables and their temporal dependencies to directed causal links, which turns the network's forward pass into a logical theory from which abduction can be performed.
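
As a rough illustration of this mapping, the Python sketch below writes a toy feedforward network as Boolean structural equations. The three-pixel network, its weights in {-1, 0, +1}, the thresholds, and the names `fires` and `evaluate` are all hypothetical. The sketch collapses time to a single step and is not the paper's BSNN definition.

```python
# Hypothetical toy network: 3 input pixels, 2 hidden neurons, 1 output neuron.
# Each neuron is one binary variable; each non-zero weight is one causal link.
W_HIDDEN = {
    "h0": {"x0": 1, "x1": 1, "x2": 0},
    "h1": {"x0": 0, "x1": 1, "x2": 0},
}
T_HIDDEN = {"h0": 2, "h1": 1}   # assumed firing thresholds
W_OUT = {"h0": 1, "h1": 1}
T_OUT = 2


def fires(weights, threshold, values):
    """Structural equation of one neuron: spike iff weighted input >= threshold."""
    return int(sum(w * values[src] for src, w in weights.items()) >= threshold)


def evaluate(inputs):
    """Solve the structural equations in topological order (inputs, hidden, output)."""
    values = dict(inputs)
    for name, weights in W_HIDDEN.items():
        values[name] = fires(weights, T_HIDDEN[name], values)
    values["out"] = fires(W_OUT, T_OUT, values)
    return values


if __name__ == "__main__":
    print(evaluate({"x0": 1, "x1": 1, "x2": 0}))
```

In this toy, `out` reduces to `x0 and x1`, and `x2` has no path to the output because both of its weights are zero, which is the kind of feature a causal reading is meant to rule out.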

If this is right

Abductive explanations computed from the causal model contain only causally relevant features and exclude any that have no effect on the output.
Both SAT and SMT solvers can be applied directly to the same causal model to produce such explanations.
The method yields pixel-level explanations for classifications on the MNIST dataset that satisfy the relevance guarantee.
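
To make the relevance property concrete, here is a minimal sketch of a deletion-based search for a subset-minimal abductive explanation. It swaps in exhaustive enumeration for the SAT/SMT calls the paper uses, so it only scales to toy inputs. The classifier `predict` and the helper names are hypothetical stand-ins, not the paper's network.

```python
from itertools import product


def predict(x):
    """Stand-in classifier: outputs 1 iff x[0] and x[1]; x[2] has no effect."""
    return int(x[0] and x[1])


def entails(instance, kept, target):
    """True if fixing the kept features forces the target for every completion."""
    free = [i for i in range(len(instance)) if i not in kept]
    for bits in product([0, 1], repeat=len(free)):
        x = list(instance)
        for i, b in zip(free, bits):
            x[i] = b
        if predict(x) != target:
            return False
    return True


def abductive_explanation(instance):
    """Drop features one at a time while the remaining ones still entail the output."""
    target = predict(instance)
    kept = set(range(len(instance)))
    for i in range(len(instance)):
        if entails(instance, kept - {i}, target):
            kept.discard(i)
    return sorted(kept)


if __name__ == "__main__":
    print(abductive_explanation((1, 1, 1)))  # indices of the retained features
```

On the instance `(1, 1, 1)` the search keeps features 0 and 1 and drops feature 2, since no value of `x[2]` can change the output.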

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same causal-model construction could be tried on binary spiking networks trained for tasks other than digit recognition to test whether the relevance guarantee still holds.
Because the explanations rest on explicit causal relations, they could be used to generate minimal intervention sets that would change a given classification.
The approach suggests a route for embedding formal verification inside spiking-network pipelines without leaving the domain of binary logic.
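
The intervention idea above can be sketched on a toy model as a smallest-first search over sets of input flips. This is an editorial illustration only: `predict` is a hypothetical stand-in classifier, and the brute-force search would need to be replaced by a solver query on anything of realistic size.

```python
from itertools import combinations


def predict(x):
    """Stand-in classifier: outputs 1 iff x[0] and x[1]; x[2] has no effect."""
    return int(x[0] and x[1])


def minimal_intervention(instance):
    """Return a smallest set of feature indices whose flipping changes the output."""
    target = predict(instance)
    n = len(instance)
    for size in range(1, n + 1):              # try small sets first
        for idxs in combinations(range(n), size):
            x = list(instance)
            for i in idxs:
                x[i] = 1 - x[i]               # intervene by flipping the bit
            if predict(x) != target:
                return list(idxs)
    return None                               # the output is constant


if __name__ == "__main__":
    print(minimal_intervention((0, 0, 1)))
```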

Load-bearing premise

The spiking activity of a trained binary spiking neural network can be faithfully captured by a binary causal model without omitting any behavior that affects the final classification.

What would settle it

A concrete input pattern for which the binary spiking neural network produces a classification that cannot be recovered from any assignment of values to the variables in the corresponding causal model.
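
On a small enough network, such a counterexample can be searched for exhaustively. The sketch below compares a simulated integrate-and-fire neuron against two hand-written Boolean encodings over three time steps. The neuron (unit weight, threshold 2, no leak, reset to 0 after a spike) and both encodings are assumptions made for illustration, not the paper's definitions.

```python
from itertools import product

THRESHOLD = 2  # hypothetical firing threshold


def simulate(spikes_in):
    """Integrate-and-fire with unit weight, no leak, reset to 0 after a spike."""
    potential, spikes_out = 0, []
    for s in spikes_in:
        potential += s
        if potential >= THRESHOLD:
            spikes_out.append(1)
            potential = 0
        else:
            spikes_out.append(0)
    return tuple(spikes_out)


def faithful_model(s):
    """Boolean equations that account for the reset after a spike at step 2."""
    s1, s2, s3 = s
    return (0, s1 & s2, s3 & (s1 ^ s2))


def lossy_model(s):
    """Boolean equations that ignore the reset, so they can over-predict spikes."""
    s1, s2, s3 = s
    return (0, s1 & s2, s3 & (s1 | s2))


def mismatches(model, steps=3):
    """All input spike trains on which the encoding disagrees with the simulation."""
    return [s for s in product([0, 1], repeat=steps) if model(s) != simulate(s)]


if __name__ == "__main__":
    print(mismatches(faithful_model), mismatches(lossy_model))
```

The encoding that forgets the reset disagrees with the simulation on the spike train `(1, 1, 1)`, which is exactly the sort of witness that would settle the question for a real network.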

Figures

Figures reproduced from arXiv: 2604.27007 by Aditya Kar (CNRS, CERCO UMR5549), Emiliano Lorini (CNRS, IRIT), Timothée Masquelier (CNRS.

**Figure 1.** Image of digit 5 (a) showing in green the input neurons/features being connected with …

**Figure 2.** Green features in the two figures are those having non-zero weight connections with the …

**Figure 1.** Visualization of abductive explanation for digit 1.

**Figure 2.** Visualization of abductive explanation for digit 9.

**Figure 3.** Visualization of abductive explanation for digit 5.

**Figure 4.** Visualization of abductive explanation for digits 2, 6 and 7.

Original abstract

We provide a causal analysis of Binary Spiking Neural Networks (BSNNs) to explain their behavior. We formally define a BSNN and represent its spiking activity as a binary causal model. Thanks to this causal representation, we are able to explain the output of the network by leveraging logic-based methods. In particular, we show that we can successfully use a SAT as well as an SMT solver to compute abductive explanations from this binary causal model. To illustrate our approach, we trained the BSNN on the standard MNIST dataset and applied our SAT-based and SMT-based methods to find abductive explanations of the network's classifications based on pixel-level features. We also compared the found explanations against SHAP, a popular method used in the area of explainable AI. We show that, unlike SHAP, our approach guarantees that a found explanation does not contain completely irrelevant features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper encodes BSNN spiking as a binary causal model for SAT/SMT abductive explanations with a relevance guarantee, but provides no evidence that the encoding preserves the network's actual behavior.

The core move here is turning a binary spiking neural network into a causal model so SAT and SMT solvers can generate abductive explanations for its outputs. They define the BSNN formally, represent spikes as binary variables in a causal graph, and extract explanations for MNIST classifications that are guaranteed to exclude completely irrelevant features. They also run a comparison to SHAP on the same task.

This is new as an application of causal abduction to BSNNs specifically, and the guarantee is a clear practical plus if the model is accurate. The formal setup and the solver usage are straightforward and well-motivated for logic-based XAI work. The MNIST example shows the pipeline can run end to end.

The soft spot is the missing check on whether the binary causal model actually captures the spiking dynamics. Spiking networks depend on membrane potential accumulation, refractory periods, and inter-layer timing; if the per-timestep binary variables drop any of those, the extracted explanations can be relevant inside the model yet incomplete for the real network. The abstract claims success but gives no quantitative results, sensitivity checks, or proof of semantic preservation. Without that, the central claim rests on an untested assumption.

The citation pattern looks standard and appropriate for the area. This is for readers working on explainable AI for spiking or recurrent nets who want logic-based methods with guarantees. A specialist in causal models for neural nets could extract the encoding idea and test it further. I would send it to peer review so the authors can supply the missing faithfulness evidence or experiments; the idea is worth a full look even if it needs strengthening.

Referee Report

3 major / 2 minor

Summary. The paper defines Binary Spiking Neural Networks (BSNNs) and maps their spiking activity to a binary causal model. It then uses SAT and SMT solvers to compute abductive explanations for classifications, illustrates the method on a BSNN trained on MNIST using pixel-level features, and compares the resulting explanations to SHAP, claiming that the logic-based approach guarantees explanations contain no completely irrelevant features.

Significance. If the mapping from BSNN dynamics to the binary causal model is faithful, the work would provide a formal, solver-based route to abductive explanations for spiking networks that avoids the irrelevant-feature problem of perturbation methods such as SHAP. The absence of quantitative metrics, error analysis, or a proof of semantic preservation, however, leaves the central claim unverified and reduces the immediate contribution to the literature on explainable AI for neuromorphic models.

major comments (3)

[Abstract] Abstract: the assertion that SAT/SMT solvers are 'successfully' used to compute abductive explanations and that the method was compared to SHAP is unsupported by any quantitative results, accuracy figures, runtime data, or statistical analysis of explanation quality. Without these, the claim that the approach 'guarantees' no irrelevant features cannot be evaluated against the baseline.
[Causal-model construction] The construction of the binary causal model from the BSNN (described after the formal definition of the network): the mapping encodes spiking activity via per-timestep binary variables. No argument or experiment is supplied showing that this encoding preserves membrane-potential accumulation, refractory dynamics, or inter-layer timing dependencies that affect the final classification. If any such behavior is omitted, abductive explanations extracted from the causal model may miss causally relevant features even while satisfying the 'no irrelevant features' property inside the model.
[MNIST experiments] MNIST illustration section: the paper reports that explanations were obtained for network classifications but supplies neither the number of test instances examined, nor any measure of agreement with the network's actual decision boundary, nor a direct comparison (e.g., feature overlap, fidelity, or stability) against SHAP. This leaves the superiority claim untested.

minor comments (2)

[Notation] The notation used for the causal variables (spike indicators, layer indices, time steps) should be introduced once in a single table or definition block to avoid repeated re-definition.
[Solver encoding] The manuscript would benefit from an explicit statement of the precise SAT/SMT encoding (variables, clauses, objective) used to extract the abductive explanations.
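
For orientation, one common shape such an encoding can take is sketched below: the model becomes a set of CNF clauses, and a candidate term is a sufficient explanation exactly when the clauses, the term, and the negated output are jointly unsatisfiable. The gate `OUT <-> (A and B)`, the variable numbering, and the brute-force `satisfiable` check (a stand-in for a real SAT solver call) are assumptions for illustration, not the manuscript's encoding.

```python
from itertools import product

# Variables are positive integers; literals are signed integers (DIMACS style).
A, B, C, OUT = 1, 2, 3, 4
NUM_VARS = 4

# Tseitin clauses for the hypothetical gate OUT <-> (A and B); C is unconstrained.
THEORY = [[-OUT, A], [-OUT, B], [OUT, -A, -B]]


def satisfiable(clauses, num_vars=NUM_VARS):
    """Brute-force stand-in for a SAT solver: try every assignment."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


def is_sufficient(term, output_literal):
    """term entails output_literal iff THEORY + term + not(output) is UNSAT."""
    query = THEORY + [[lit] for lit in term] + [[-output_literal]]
    return not satisfiable(query)


if __name__ == "__main__":
    print(is_sufficient([A, B], OUT), is_sufficient([A], OUT))
```

A minimal explanation is then obtained by removing literals from a sufficient term for as long as `is_sufficient` keeps returning true.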

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thorough review and valuable comments. We provide point-by-point responses to the major comments below. Where appropriate, we indicate revisions that will be incorporated into the next version of the manuscript to address the concerns raised.

Referee: [Abstract] Abstract: the assertion that SAT/SMT solvers are 'successfully' used to compute abductive explanations and that the method was compared to SHAP is unsupported by any quantitative results, accuracy figures, runtime data, or statistical analysis of explanation quality. Without these, the claim that the approach 'guarantees' no irrelevant features cannot be evaluated against the baseline.

Authors: We agree that the abstract could be more precise in describing the results. The manuscript presents the formal mapping and demonstrates the approach on MNIST examples, showing that the logic-based method produces explanations without irrelevant features by construction, in contrast to SHAP which can include them. However, we acknowledge the lack of quantitative metrics such as runtime or fidelity scores. In the revised manuscript, we will update the abstract to reflect the illustrative nature of the experiments and add quantitative analysis, including runtime data for the solvers and a comparison of explanation sizes or fidelity to the model's decisions. revision: yes
Referee: [Causal-model construction] The construction of the binary causal model from the BSNN (described after the formal definition of the network): the mapping encodes spiking activity via per-timestep binary variables. No argument or experiment is supplied showing that this encoding preserves membrane-potential accumulation, refractory dynamics, or inter-layer timing dependencies that affect the final classification. If any such behavior is omitted, abductive explanations extracted from the causal model may miss causally relevant features even while satisfying the 'no irrelevant features' property inside the model.

Authors: The binary causal model is derived directly from the BSNN definition by representing each neuron's spike at each time step as a binary variable, with clauses encoding the spiking rules based on the network's weights and thresholds. This is intended to be a faithful abstraction for the purpose of causal analysis of spike-based decisions. We concede that a formal proof of equivalence to the full continuous dynamics (including exact membrane potential accumulation) is not included. We will add a subsection providing a formal argument that the discrete binary encoding preserves the causal dependencies relevant to the classification output, focusing on the spike timings that determine the final class. revision: yes
Referee: [MNIST experiments] MNIST illustration section: the paper reports that explanations were obtained for network classifications but supplies neither the number of test instances examined, nor any measure of agreement with the network's actual decision boundary, nor a direct comparison (e.g., feature overlap, fidelity, or stability) against SHAP. This leaves the superiority claim untested.

Authors: The MNIST section is presented as an illustration of the method rather than a comprehensive empirical study. We report examples for specific classifications but did not include aggregate statistics. We agree this limits the ability to evaluate performance quantitatively. In the revision, we will specify the number of instances tested, provide measures such as the average size of explanations and their fidelity (how well they predict the output when features are set), and include a direct comparison table with SHAP regarding feature relevance and overlap. revision: yes
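
One way to read the fidelity measure promised here is as the fraction of completions of an explanation on which the model keeps its original output. The sketch below computes that fraction by enumeration on a toy classifier. `predict`, `fidelity`, and the metric's exact definition are assumptions for illustration, not the authors' stated procedure.

```python
from itertools import product


def predict(x):
    """Stand-in classifier: outputs 1 iff x[0] and x[1]; x[2] has no effect."""
    return int(x[0] and x[1])


def fidelity(instance, kept):
    """Share of completions of the kept features that preserve the prediction."""
    target = predict(instance)
    free = [i for i in range(len(instance)) if i not in kept]
    hits = total = 0
    for bits in product([0, 1], repeat=len(free)):
        x = list(instance)
        for i, b in zip(free, bits):
            x[i] = b
        hits += int(predict(x) == target)
        total += 1
    return hits / total


if __name__ == "__main__":
    print(fidelity((1, 1, 0), {0, 1}), fidelity((1, 1, 0), {0, 2}))
```

Under this reading, a formally sufficient explanation such as `{0, 1}` scores 1.0, while a feature set that includes the irrelevant `x[2]` but omits `x[1]` scores only 0.5.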

Circularity Check

0 steps flagged

No circularity: direct definitional mapping with independent solver application

The paper defines a BSNN, then directly encodes its spiking activity into a binary causal model to enable SAT/SMT computation of abductive explanations. This encoding is presented as a formal representation rather than a derived prediction or fitted result. No equations reduce to prior self-citations, no parameters are fit then renamed as predictions, and no uniqueness theorems or ansatzes are smuggled in. The MNIST comparison to SHAP is an external empirical check, and the 'no irrelevant features' property follows from standard abductive reasoning over the constructed model. The derivation chain is self-contained and does not collapse to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on one domain assumption: that BSNN dynamics admit an exact binary causal encoding.

axioms (1)

domain assumption: Spiking activity of a BSNN can be represented as a binary causal model.
Stated as the foundation for all subsequent SAT/SMT reasoning.
