
arxiv: 2103.07818 · v2 · submitted 2021-03-14 · 📊 stat.ME · stat.AP

Quantifying uncertainty in spikes estimated from calcium imaging data

Pith reviewed 2026-05-24 12:59 UTC · model grok-4.3

classification 📊 stat.ME stat.AP
keywords calcium imaging · spike estimation · selective inference · p-values · confidence intervals · Type I error · neuron activity

The pith

A selective inference algorithm produces finite-sample p-values and confidence intervals with correct coverage for spikes estimated from calcium imaging data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the problem of testing whether a neuron spiked at a timepoint where a spike has already been estimated from the same calcium imaging observations. Standard tests fail here because the estimation step uses the data, leading to inflated Type I error. The authors develop a selective inference procedure that conditions on the estimation event to restore valid error control. This yields an efficient algorithm for exact finite-sample p-values and intervals that achieve the desired selective coverage. The method is demonstrated on both simulated data and recordings from the spikefinder challenge.
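The inflation described above can be reproduced in a toy setting. The sketch below is not the paper's spike estimator: it assumes a single standard-normal observation that is "selected" only when it exceeds a threshold, and the function name and default values are hypothetical. It compares a naive one-sided p-value with one that conditions on the selection event.

```python
import random
from statistics import NormalDist


def toy_selection_experiment(n_trials=20000, threshold=1.0, alpha=0.05, seed=0):
    """Rejection rates of naive vs. selective tests when the null is true.

    Toy stand-in for "estimate, then test on the same data":
    an observation is tested only if it exceeds `threshold`.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Probability of being selected under the null, P(Y > threshold).
    tail_at_threshold = 1.0 - std_normal.cdf(threshold)

    n_selected = 0
    naive_rejections = 0
    selective_rejections = 0
    for _ in range(n_trials):
        y = rng.gauss(0.0, 1.0)  # the null (mean zero) holds in every trial
        if y <= threshold:
            continue  # not selected, so never tested
        n_selected += 1
        naive_p = 1.0 - std_normal.cdf(y)          # ignores the selection
        selective_p = naive_p / tail_at_threshold  # P(Y >= y | Y > threshold)
        if naive_p <= alpha:
            naive_rejections += 1
        if selective_p <= alpha:
            selective_rejections += 1
    return naive_rejections / n_selected, selective_rejections / n_selected


if __name__ == "__main__":
    naive_rate, selective_rate = toy_selection_experiment()
    print(f"naive rejection rate:     {naive_rate:.3f}")
    print(f"selective rejection rate: {selective_rate:.3f}")
```

In this toy setup the naive test rejects far more often than the nominal 5% among selected observations, while the conditional version stays close to 5%. The paper's selection event is more complex, but the logic of conditioning is the same.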

Core claim

We describe an efficient algorithm to compute finite-sample p-values that control selective Type I error, and confidence intervals with correct selective coverage, for spikes estimated using a recent proposal from the literature.
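For readers who want the two guarantees in symbols, one common way to write them is sketched below. The notation is an assumption made for illustration and is not quoted from the paper: $\hat{\tau}_j$ is an estimated spike time, $\mathcal{M}(Y)$ is the set of spike times estimated from the data $Y$, $\theta$ is the size of the calcium increase being tested, and $[L(Y), U(Y)]$ is the reported interval. The paper may condition on additional information beyond the event shown here.

```latex
\begin{align*}
\text{selective Type I error control:}\quad
  & \Pr_{H_0}\!\left(\text{reject } H_0 \text{ at level } \alpha
      \;\middle|\; \hat{\tau}_j \in \mathcal{M}(Y)\right) \le \alpha, \\
\text{selective coverage:}\quad
  & \Pr\!\left(\theta \in [L(Y), U(Y)]
      \;\middle|\; \hat{\tau}_j \in \mathcal{M}(Y)\right) = 1 - \alpha.
\end{align*}
```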

What carries the argument

A selective inference procedure that conditions on the data-dependent event of estimating a spike at a given timepoint, applied to the exponential-decay calcium model.

If this is right

  • Finite-sample p-values control selective Type I error when testing the null of no spike at an estimated time.
  • Confidence intervals achieve correct selective coverage for the calcium jump size at estimated spike times.
  • The algorithm runs efficiently enough to apply to real datasets such as those in the spikefinder challenge.
  • The approach directly corrects the invalidity of classical tests that ignore the selection step.
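The first two bullets rest on replacing an unconditional reference distribution with one truncated to the values consistent with the selection. A minimal Python sketch of that final step follows, assuming the test statistic is Gaussian with mean zero under the null and that the truncation set is already available as a list of intervals. The function name, the one-sided form, and the interval representation are illustrative assumptions; the paper's algorithm for finding the set is not reproduced here.

```python
import math
from statistics import NormalDist


def truncated_normal_upper_tail(observed, sd, intervals):
    """P(phi >= observed | phi in S) for phi ~ N(0, sd^2).

    `intervals` is a list of (low, high) pairs whose union is the
    truncation set S (hypothetical input; endpoints may be +/- math.inf).
    """
    dist = NormalDist(0.0, sd)
    mass_in_set = 0.0   # P(phi in S)
    mass_in_tail = 0.0  # P(phi in S and phi >= observed)
    for low, high in intervals:
        mass_in_set += dist.cdf(high) - dist.cdf(low)
        tail_low = max(low, observed)
        if tail_low < high:
            mass_in_tail += dist.cdf(high) - dist.cdf(tail_low)
    if mass_in_set <= 0.0:
        raise ValueError("truncation set has zero probability")
    return mass_in_tail / mass_in_set


if __name__ == "__main__":
    # No truncation: reduces to the usual one-sided Gaussian p-value.
    print(truncated_normal_upper_tail(2.0, 1.0, [(-math.inf, math.inf)]))
    # Truncation to (1, inf): the same observation is much less surprising.
    print(truncated_normal_upper_tail(2.0, 1.0, [(1.0, math.inf)]))
```

Differences of CDF values lose precision far in the tails, so a careful implementation would work with log-probabilities or survival functions; this sketch keeps the arithmetic plain for readability.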

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same conditioning strategy could be adapted to other event-detection settings where a point process is first estimated from noisy continuous observations.
  • If the exponential-decay assumption holds only approximately, one could examine how robust the coverage remains under small model perturbations.
  • The method highlights a general template for post-selection inference that may apply to spike sorting pipelines beyond calcium imaging.

Load-bearing premise

The calcium imaging observations are generated exactly by the stated model in which calcium decays exponentially between spikes and jumps instantaneously at each spike.
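To make the premise concrete, the sketch below simulates a trace in which calcium decays by a factor γ at each step, jumps instantaneously at spike times, and is observed with independent Gaussian noise. The recursion and the symbols γ, σ, and z_t follow the notation visible in the figure captions; the exact form of the paper's model (1.1) is not shown on this page, so treat the function and its arguments as assumptions.

```python
import random


def simulate_calcium(T, gamma, sigma, spikes, c0=0.0, seed=0):
    """Simulate calcium c_t = gamma * c_{t-1} + z_t and noisy y_t = c_t + eps_t.

    `spikes` maps a 0-based time index to the jump size z_t at that time
    (hypothetical interface); all other z_t are zero.
    """
    rng = random.Random(seed)
    calcium = []
    fluorescence = []
    previous = c0
    for t in range(T):
        current = gamma * previous + spikes.get(t, 0.0)  # decay, then jump
        calcium.append(current)
        fluorescence.append(current + rng.gauss(0.0, sigma))  # Gaussian noise
        previous = current
    return calcium, fluorescence


if __name__ == "__main__":
    # Settings loosely mirroring the Figure 2 caption: T = 80, one spike at t = 40.
    calcium, fluorescence = simulate_calcium(80, 0.98, 0.1, {40: 1.0})
    print(calcium[39], calcium[40], calcium[41])
```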

What would settle it

Generating calcium traces from a model with gradual spike rise times or non-exponential decay and checking whether the reported selective p-values and intervals retain their nominal coverage.
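One way to set up such a check is to generate noise-free calcium from a kernel with a finite rise time and then feed noisy copies to the inference procedure. The sketch below uses a difference-of-exponentials kernel, which is a common choice but is an assumption here, as are the function name and parameter values; it does not come from the paper.

```python
import math


def simulate_gradual_rise(T, spike_time, amplitude, tau_rise, tau_decay):
    """Noise-free calcium for one spike with a finite rise time.

    Uses amplitude * (exp(-dt / tau_decay) - exp(-dt / tau_rise)) for
    dt >= 0, so the trace starts at zero at the spike and peaks later,
    unlike the instantaneous jump of the assumed model.
    """
    trace = []
    for t in range(T):
        dt = t - spike_time
        if dt < 0:
            trace.append(0.0)
        else:
            rise = math.exp(-dt / tau_rise)
            decay = math.exp(-dt / tau_decay)
            trace.append(amplitude * (decay - rise))
    return trace


if __name__ == "__main__":
    trace = simulate_gradual_rise(T=80, spike_time=40, amplitude=1.0,
                                  tau_rise=2.0, tau_decay=50.0)
    peak_index = max(range(len(trace)), key=trace.__getitem__)
    print("spike at 40, peak at", peak_index)
```

A decay constant of about 50 time steps roughly matches γ = 0.98 per step, since exp(-1/50) ≈ 0.98, so only the rise differs from the assumed model.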

Figures

Figures reproduced from arXiv: 2103.07818 by Daniela M. Witten, Sean W. Jewell, Yiqun T. Chen.

Figure 1: (a): One simulation with $y_1, \ldots, y_{10{,}000}$ (grey dots) generated according to model (1.1) with $\gamma = 0.98$, $\sigma = 0.2$, and $z_t = 0$ for all $t$. The $\ell_0$ problem in (3.14) was solved with $\lambda = 0.1$, resulting in 47 estimated spikes with fluorescence increases. Estimated calcium is displayed in blue. We display one estimated spike at time $\hat{\tau} = 3{,}060$ with $y_{3{,}000}, \ldots, y_{3{,}100}$. (b): Quantile-quantile plot for the Wal…
Figure 2: Data generated according to (1.1), with $T = 80$, $\sigma = 0.1$, $\gamma = 0.98$, and one spike at $t = 40$. Solving the $\ell_0$ problem (3.14) with $\lambda = 0.75$ yields a single estimated spike at $t = 40$. (a): We plot the original data, which corresponds to $y'(\phi)$ with $\phi = \nu^\top y = 1.02$, where $\nu$ is constructed according to (2.7) with $\hat{\tau}_j = 40$ and $h = 40$. The estimated calcium concentration is displayed in blue. (b): The perturbed da…
Figure 3: (a): Quantile-quantile plot for the naive p-values defined in (5.30), which have inflated selective Type I error. (b): Quantile-quantile plot for p-values from our proposed selective test in (2.9), which controls selective Type I error. (c): Under the model (1.1), detection probability (5.32) is an increasing function of $1/\sigma$. (d): Conditional power (5.31) increases as a function of $1/\sigma$ for all $h$. For a giv…
Figure 4: (a): Selective confidence intervals achieve correct nominal coverage (95% coverage at level $\alpha = 0.05$) across all values of $h$ (defined in (2.7)) and $\sigma$ (defined in (1.1)). The mean (and standard deviation) over 500 simulated datasets are displayed. (b): Naive confidence intervals have poor coverage when $1/\sigma$ is small, for all values of $h$. (c): For $h = 1$, selective confidence intervals are on average wider tha…
Figure 5: Illustrative example for recording 29 from Chen…
Figure 6: Result for recordings from the Chen and others (2013) dataset. (a): The correlations between the true spike times and the spikes estimated from (6.37) are plotted in orange. The correlations between the true spike times and the subset of the spikes from (6.37) with p-value (2.9) below 0.05 are plotted in blue. For each recording, the black line represents the 2.5% and 97.5% quantiles of the resampling dist…
Figure 7: Plot of the contrast $\nu$ generated according to (2.7), with $T = 50$, $\gamma = 0.98$, $\hat{\tau}_j = 20$, and $h = 5$.
Figure 8: Running time of Algorithm 1 over 50 replicate datasets, as a fu…
Figure 9: Results for the Chen and others (2013) dataset. Details are as in…
Figure 10: Results for the Chen and others (2013) dataset. Details are as in…
Figure 11: Residuals, $y_t - \hat{c}_t$, for recordings from the Chen and others (2013) dataset, where $\hat{c}_t$ is the solution to (6.37).
Figure 12: (a): Quantile-quantile plot for selective p-values computed using estimated variance $\hat{\sigma}^2$ based on 100 simulations (2,988 hypothesis tests) under the global null. (b): Conditional power for selective p-values with estimated variance $\hat{\sigma}^2$. (c): Selective confidence intervals computed using estimated variance $\hat{\sigma}^2$ achieve correct nominal coverage (95% coverage at level $\alpha = 0.05$) across all values of $h$ and …
Original abstract

In recent years, a number of methods have been proposed to estimate the times at which a neuron spikes on the basis of calcium imaging data. However, quantifying the uncertainty associated with these estimated spikes remains an open problem. We consider a simple and well-studied model for calcium imaging data, which states that calcium decays exponentially in the absence of a spike, and instantaneously increases when a spike occurs. We wish to test the null hypothesis that the neuron did not spike -- i.e., that there was no increase in calcium -- at a particular timepoint at which a spike was estimated. In this setting, classical hypothesis tests lead to inflated Type I error, because the spike was estimated on the same data used for testing. To overcome this problem, we propose a selective inference approach. We describe an efficient algorithm to compute finite-sample p-values that control selective Type I error, and confidence intervals with correct selective coverage, for spikes estimated using a recent proposal from the literature. We apply our proposal in simulation and on calcium imaging data from the spikefinder challenge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes a selective inference procedure for testing the null hypothesis of no spike at times estimated from calcium imaging data under a model with exponential calcium decay between spikes and instantaneous jumps at spikes. It claims to provide an efficient algorithm that computes finite-sample p-values controlling selective Type I error and confidence intervals with exact selective coverage, with demonstrations on simulations and spikefinder challenge data.

Significance. If the algorithm and its selective error control are correctly derived, the work would fill an important gap by enabling rigorous, finite-sample uncertainty quantification for spike estimation in a well-studied generative model. The emphasis on exact selective coverage rather than asymptotic approximations is a notable strength for this application area.

major comments (1)
  1. [Model description and selective inference derivation] The selective Type I error control and coverage guarantees are derived under the exact generative model (exponential decay, instantaneous jumps, and the noise distribution used for the conditional law). This assumption is load-bearing for the central claim, as any mismatch (e.g., finite rise times, correlated noise, or baseline drift) alters the conditional distribution given the selection event and invalidates the reported p-values and intervals. The manuscript should include a dedicated discussion or simulation study of robustness to such misspecification.
minor comments (2)
  1. [Algorithm description] Clarify the precise definition of the selection event induced by the spike estimator and how the conditional distribution is computed in the algorithm; a short pseudocode or complexity statement would aid reproducibility.
  2. [Abstract] The abstract states that the method 'controls selective Type I error' but provides no equation or theorem reference; adding a pointer to the main result (e.g., Theorem X) would strengthen the summary.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive evaluation and recommendation of minor revision. We address the single major comment below.

Point-by-point responses
  1. Referee: [Model description and selective inference derivation] The selective Type I error control and coverage guarantees are derived under the exact generative model (exponential decay, instantaneous jumps, and the noise distribution used for the conditional law). This assumption is load-bearing for the central claim, as any mismatch (e.g., finite rise times, correlated noise, or baseline drift) alters the conditional distribution given the selection event and invalidates the reported p-values and intervals. The manuscript should include a dedicated discussion or simulation study of robustness to such misspecification.

    Authors: We agree that the exact finite-sample selective Type I error control and coverage properties are derived under the assumed generative model and that deviations such as finite rise times, temporally correlated noise, or baseline drift would alter the conditional distribution given the selection event, thereby invalidating the exact guarantees. In the revised manuscript we will add a dedicated subsection to the Discussion that (i) restates the modeling assumptions required for the conditional law, (ii) discusses the plausibility of these assumptions for typical calcium imaging data, and (iii) qualitatively describes how common forms of misspecification could affect the reported p-values and intervals. We will also note that the selective-inference framework can be extended to richer generative models provided the relevant conditional distribution can be characterized. Because a comprehensive simulation study of robustness under multiple misspecification regimes would substantially expand the scope of the work, we will not include such simulations; the added discussion will nevertheless give readers the necessary context for interpreting the results. revision: partial

Circularity Check

0 steps flagged

No circularity; selective p-values and CIs derived from conditional distribution under explicit generative model

full rationale

The paper applies the established selective inference framework to a fixed spike estimator under an explicit generative model (exponential decay + instantaneous jumps). The algorithm computes the exact conditional law given the selection event defined by the estimator; the resulting p-values and intervals therefore control selective error by the mathematics of conditioning, not by any data-dependent fitting or renaming. No equations reduce a reported quantity to its own inputs by construction, no self-citation chain is load-bearing for the central claim, and the method is externally falsifiable by simulation under the stated model. This matches the normal non-circular case (score 0-2).
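The falsifiability point can be made operational: under the stated model and a true null, valid p-values should look uniform on [0, 1], which is what the quantile-quantile panels in the figures assess visually. A minimal numeric counterpart is the Kolmogorov-Smirnov distance between the empirical distribution of the p-values and the uniform distribution. The helper below is an illustrative sketch with a hypothetical name and is not part of the paper.

```python
def ks_distance_from_uniform(p_values):
    """Largest gap between the empirical CDF of `p_values` and Uniform(0, 1)."""
    sorted_p = sorted(p_values)
    n = len(sorted_p)
    if n == 0:
        raise ValueError("need at least one p-value")
    distance = 0.0
    for i, p in enumerate(sorted_p):
        # The empirical CDF steps from i/n to (i+1)/n at p; check both sides.
        distance = max(distance, (i + 1) / n - p, p - i / n)
    return distance


if __name__ == "__main__":
    uniform_grid = [(i + 0.5) / 100 for i in range(100)]
    too_small = [p * p for p in uniform_grid]  # anti-conservative p-values
    print(ks_distance_from_uniform(uniform_grid))
    print(ks_distance_from_uniform(too_small))
```

A small distance for p-values simulated under the null is consistent with valid error control; a large one, as with the squared values in the usage example, signals that the p-values are systematically too small.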

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the correctness of the exponential-decay observation model and on the validity of the selective-inference theory applied to the particular spike estimator referenced in the abstract.

axioms (1)
  • domain assumption: Calcium concentration decays exponentially in the absence of a spike and increases instantaneously when a spike occurs.
    Explicitly stated in the abstract as the model under consideration.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
