arxiv: 2604.26055 · v3 · submitted 2026-04-28 · 📊 stat.ME · stat.AP

Extending Evidence Accumulation Models to Bounded Continuous Self-report Data

Pith reviewed 2026-05-12 03:11 UTC · model grok-4.3

classification 📊 stat.ME stat.AP
keywords evidence accumulation models · bounded continuous responses · diffusion models · self-report data · reaction times · amortized Bayesian inference · model comparison · affect ratings

The pith

Two diffusion models adapt evidence accumulation to bounded continuous self-report data like affect ratings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces the Half-Circular Diffusion Model and Beta Drift Diffusion Model to extend evidence accumulation ideas beyond binary choices to continuous responses on fixed scales with upper and lower bounds. Many psychological measures use such scales, so the work targets a common data type where standard diffusion models do not apply directly. The authors fit both models via amortized Bayesian methods on a dataset of 215 participants, recover parameters reliably, and match the observed joint patterns of ratings and reaction times. Model comparison yields a practical rule based on rating dispersion to select between the two.

Core claim

The Half-Circular Diffusion Model and Beta Drift Diffusion Model both accurately capture the joint distribution of responses and reaction times for bounded continuous self-reports, yield reliably recoverable and interpretable parameters, and can be chosen between using a simple diagnostic based on the dispersion of the observed rating distribution.

What carries the argument

The Half-Circular Diffusion Model, which restricts circular diffusion to a half-space, and the Beta Drift Diffusion Model, which employs a beta distribution for the response variable to enforce bounds.
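
As a rough illustration of the first mechanism, the sketch below simulates one trial of a two-dimensional diffusion confined to a half-disc. It is not the authors' implementation. The reflecting flat edge, the linear angle-to-rating map, and every name used here (`simulate_half_circle_trial`, `drift`, `radius`, `ndt`, `sigma`) are assumptions made for illustration only.

```python
import numpy as np


def simulate_half_circle_trial(drift, radius=1.0, ndt=0.3, sigma=1.0,
                               dt=1e-3, max_steps=20_000, rng=None):
    """Simulate one trial of a 2-D diffusion inside the upper half-disc.

    ASSUMPTIONS (not taken from the paper): the flat edge y = 0 reflects,
    the arc of radius `radius` absorbs, and the hitting angle in [0, pi]
    is mapped linearly onto a rating in [0, 1].
    Returns (rating, reaction_time), or (nan, nan) if no boundary is hit.
    """
    rng = np.random.default_rng() if rng is None else rng
    drift = np.asarray(drift, dtype=float)
    pos = np.zeros(2)
    for step in range(1, max_steps + 1):
        # Euler-Maruyama update of the evidence state.
        noise = sigma * np.sqrt(dt) * rng.standard_normal(2)
        pos = pos + drift * dt + noise
        # Reflect at the diameter so the state stays in the half-disc.
        pos[1] = abs(pos[1])
        if np.hypot(pos[0], pos[1]) >= radius:
            angle = np.arctan2(pos[1], pos[0])  # lies in [0, pi]
            rating = float(angle / np.pi)       # hypothetical linear map
            return rating, ndt + step * dt
    return float("nan"), float("nan")


# Example: drift pointing straight "up" favours mid-scale ratings.
rating, rt = simulate_half_circle_trial([0.0, 2.0],
                                        rng=np.random.default_rng(0))
```

Under these assumptions the drift direction governs where on the scale the rating lands, while the drift magnitude and `radius` trade off speed against precision, which is the kind of interpretation the summary attributes to the fitted parameters.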

If this is right

  • Researchers gain interpretable parameters for drift rates, thresholds, and noise in continuous rating tasks.
  • A dispersion-based rule allows quick selection between the two models without full comparison each time.
  • Full workflows including parameter recovery, calibration, and predictive checks become available for this data type.
  • Open code supports direct application to other bounded self-report measures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could help distinguish cognitive mechanisms when people give continuous rather than categorical judgments.
  • Applications may extend to clinical or educational contexts that rely on bounded rating scales for symptoms or knowledge.
  • Further tests on data generated from known non-diffusion processes could clarify when these models reflect true mechanisms.

Load-bearing premise

The models correctly represent the underlying cognitive evidence-accumulation process for bounded continuous responses rather than serving only as flexible statistical descriptions.

What would settle it

Simulations in which the models fail to recover the true generating parameters, or posterior predictive checks that show systematic mismatches with the observed joint distribution of ratings and reaction times.

Figures

Figures reproduced from arXiv: 2604.26055 by Agnes Moors, Francis Tuerlinckx, Tamás Szűcs, Yufei Wu.

Figure 1: Schematic illustration of the DDM, the CDM, and the SCDM. In all panels, …
Figure 2: Schematic illustration of the HCDM. The left panel illustrates a complete …
Figure 3: Schematic illustration of the valid region for drift rate vectors and the definition …
Figure 4: Schematic illustration of the drift rate and noise processes in BDDM. The left …
Figure 5: Participants chose from tiles hiding monetary outcomes (e.g., 3 with +£2, …
Figure 6: The parameter recovery of the HCDM. The x-axis stands for the ground truth …
Figure 7: Visualization of the posterior means of drift rate vectors and boundary for nine …
Figure 8: Correlation between HCDM parameters and empirical summary statistics. Each …
Figure 9: The parameter recovery of the BDDM. The x-axis stands for the truth para…
Figure 10: Visualization of BDDM estimates. We randomly sampled nine participants, …
Figure 11: Correlation coefficients between BDDM parameters/features and empirical …
Figure 12: The posterior predictive check of HCDM. The first row shows the observed …
Figure 13: The posterior predictive check of BDDM. The first row shows the observed …
Figure 14: Overestimation of the 0.1 quantile of rating using BDDM. Panels A-D show …
Figure 15: The PMP_HCDM and ∆GOF. The top panel shows the PMP_HCDM for each participant, ordered by their average PMP across 10 independently trained approximators. Participants on the left strongly favor the BDDM, those on the right strongly favor the HCDM, and those in the middle show mixed evidence. The lower panel shows ∆GOF ordered accordingly, with blue, yellow, and green segments indicating participants who st…
Figure 16: Histogram of rating standard deviations. The left panel shows datasets best …

Original abstract

Evidence accumulation models (EAMs) provide a powerful framework for inferring latent cognitive processes from choice and reaction time data. While EAMs are traditionally limited to binary choices, recent developments have extended them to rotationally symmetric continuous responses via the circular diffusion model (Smith, 2016) and the spatially continuous diffusion model (Ratcliff, 2018). Yet, such extensions are limited in scope, as many psychological constructs are measured on bounded non-rotational scales. In this paper, we bridge this gap by presenting and comparing two adaptations designed for bounded continuous data: the Half-Circular Diffusion Model (HCDM) and the Beta Drift Diffusion Model (BDDM). Because both models have intractable likelihoods, we fit them using Amortized Bayesian Inference (ABI) and compare them using Amortized Bayesian Model Comparison (ABMC). We demonstrate the complete workflow on an empirical affect dataset (N = 215), including parameter recovery, simulation-based calibration, posterior predictive checks, and model comparison. Both models accurately capture the joint distribution of responses and reaction times and yield interpretable parameters that can be reliably recovered. The model comparison further reveals a simple diagnostic for choosing between them: the dispersion of the rating distribution, with HCDM preferred for moderate spread and BDDM for highly concentrated or highly dispersed ratings. This work extends the EAM framework to a new application context, bounded continuous self-report data, and offers researchers a user-friendly toolkit for modeling the cognitive dynamics of continuous responses. We release fully documented Python code with both GPU and CPU implementations, along with example datasets.
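
The dispersion diagnostic mentioned at the end of the abstract could be applied along the following lines. This is a minimal sketch, not the released toolkit: the function names are invented here, and the abstract gives no numeric thresholds, so `low_cut` and `high_cut` are hypothetical values the user must supply (for instance after inspecting the histogram of rating standard deviations).

```python
import numpy as np


def rating_dispersion(ratings, lower=0.0, upper=1.0):
    """Sample standard deviation of ratings after rescaling to [0, 1]."""
    r = (np.asarray(ratings, dtype=float) - lower) / (upper - lower)
    return float(np.std(r, ddof=1))


def suggest_model(sd, low_cut, high_cut):
    """Apply the abstract's qualitative rule with user-supplied cutoffs.

    `low_cut` and `high_cut` are hypothetical: the source states only that
    HCDM is preferred for moderate spread and BDDM for highly concentrated
    or highly dispersed ratings.
    """
    if not low_cut < high_cut:
        raise ValueError("low_cut must be smaller than high_cut")
    return "HCDM" if low_cut <= sd <= high_cut else "BDDM"


# Example on a 0-100 scale with purely illustrative cutoffs.
sd = rating_dispersion([0, 50, 100], lower=0, upper=100)
choice = suggest_model(sd, low_cut=0.1, high_cut=0.3)
```

Rescaling to the unit interval before computing the standard deviation keeps any cutoff comparable across scales with different endpoints.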

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes two extensions of evidence accumulation models for bounded continuous self-report data: the Half-Circular Diffusion Model (HCDM) and the Beta Drift Diffusion Model (BDDM). Due to intractable likelihoods, both are fit via Amortized Bayesian Inference (ABI) and compared via Amortized Bayesian Model Comparison (ABMC). On an empirical affect dataset (N=215), the authors report parameter recovery, simulation-based calibration, posterior predictive checks, and a dispersion-based diagnostic for model selection, concluding that both models capture the joint response-RT distribution with recoverable parameters and that HCDM is preferred for moderate dispersion while BDDM suits highly concentrated or dispersed ratings. Fully documented Python code (GPU/CPU) is released.

Significance. If the empirical results hold, this work meaningfully extends the EAM framework to a common but previously underserved data type (bounded continuous self-reports), providing interpretable parameters and a practical selection rule. The explicit release of reproducible code, the use of ABI/ABMC for intractability, and the suite of validation checks (recovery, SBC, PPCs) are clear strengths that support independent verification and adoption.

major comments (2)
  1. [Abstract] Abstract: the central claim that 'both models accurately capture the joint distribution of responses and reaction times' is stated without any quantitative fit metric (e.g., mean absolute error on PPCs, log predictive density, or comparison to a non-cognitive baseline such as a simple beta regression with RT). This weakens the strength of the empirical demonstration even though the validation pipeline is otherwise comprehensive.
  2. [§4] §4 (model comparison): the dispersion-based diagnostic is presented as a simple rule, but no simulation study or analytic derivation shows its robustness when the true data-generating process lies between the two models or when boundary parameters vary; this is load-bearing for the practical recommendation.
minor comments (3)
  1. [Figures 12-13] Figure legends and axis labels in the posterior predictive check panels should explicitly state the quantitative discrepancy measure used (if any) rather than relying solely on visual inspection.
  2. [§2.2] Notation for the beta drift rate in the BDDM should be clarified relative to the standard DDM drift to avoid confusion with the circular case.
  3. [Abstract] The abstract would be strengthened by adding one sentence on the magnitude of the fit improvement or the recovery error rates.
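
One way to produce the kind of quantitative PPC summary requested in the comments above is a mean absolute error between observed and posterior-predictive quantiles. The sketch below is an assumption about how such a metric might be computed, not a metric reported in the paper; the function name and the default quantile grid are chosen here for illustration.

```python
import numpy as np


def ppc_quantile_mae(observed, simulated, probs=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Mean absolute error between observed and predicted quantiles.

    observed:  1-D array of observed values (ratings or reaction times).
    simulated: 2-D array with one posterior-predictive dataset per row.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    obs_q = np.quantile(observed, probs)
    # Quantiles of each simulated dataset: shape (len(probs), n_datasets).
    sim_q = np.quantile(simulated, probs, axis=1)
    # Average each quantile over the posterior-predictive datasets.
    pred_q = sim_q.mean(axis=1)
    return float(np.mean(np.abs(obs_q - pred_q)))


# Example: predictions shifted by 0.1 give an MAE of about 0.1.
obs = np.linspace(0.0, 1.0, 101)
sims = np.tile(obs, (5, 1)) + 0.1
mae = ppc_quantile_mae(obs, sims)
```

The same function can be applied separately to ratings and to reaction times, which gives one number per marginal; it does not by itself assess the joint distribution.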

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help strengthen the presentation of our empirical validation and the practical guidance on model selection. We address each major comment below, indicating revisions where appropriate.

point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that 'both models accurately capture the joint distribution of responses and reaction times' is stated without any quantitative fit metric (e.g., mean absolute error on PPCs, log predictive density, or comparison to a non-cognitive baseline such as a simple beta regression with RT). This weakens the strength of the empirical demonstration even though the validation pipeline is otherwise comprehensive.

    Authors: We agree that a quantitative anchor would make the abstract claim more precise. The main text already reports detailed posterior predictive checks (PPCs) with visual overlays of observed versus predicted response histograms and RT distributions, plus simulation-based calibration confirming recoverability. For the revision we will (i) insert a concise quantitative summary into the abstract (e.g., “PPCs yield mean absolute errors below 0.05 in binned response probabilities and RT quantiles”) and (ii) add the corresponding numeric values to the results section. A non-cognitive baseline comparison lies outside the paper’s scope, which focuses on extending the EAM framework; we will note this limitation explicitly. revision: yes

  2. Referee: [§4] §4 (model comparison): the dispersion-based diagnostic is presented as a simple rule, but no simulation study or analytic derivation shows its robustness when the true data-generating process lies between the two models or when boundary parameters vary; this is load-bearing for the practical recommendation.

    Authors: The diagnostic is an empirical pattern observed in the ABMC results on the affect dataset, where HCDM was favored at moderate dispersion and BDDM at the extremes. Parameter-recovery and SBC simulations already span a wide range of dispersion and boundary values, providing indirect support. We will revise §4 to present the rule explicitly as a data-driven heuristic rather than a general theorem, add a short caveat about intermediate or boundary-varying cases, and recommend dataset-specific validation. A dedicated robustness simulation study would be valuable but exceeds the scope of a minor revision; we therefore treat the change as partial. revision: partial

Circularity Check

0 steps flagged

No significant circularity; new models and empirical tests are independent

full rationale

The paper introduces HCDM and BDDM as explicit adaptations of prior diffusion models to bounded continuous responses, then fits them via standard amortized Bayesian inference (ABI) because the likelihoods are intractable. All central claims (joint distribution capture, parameter recoverability, dispersion-based model selection) are validated through parameter recovery simulations, simulation-based calibration, posterior predictive checks, and ABMC on an empirical dataset (N=215). No equation or result reduces a reported prediction to a fitted input by construction, and no self-citation chain is load-bearing for the new extensions or empirical findings. The derivation and validation chain is self-contained against external benchmarks.
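
Simulation-based calibration, one of the checks named in this rationale, can be illustrated on a toy problem where the exact posterior is known. The sketch below substitutes a conjugate normal model for the HCDM/BDDM and exact posterior draws for a neural approximator, so it shows only the rank-statistic logic; all names and settings are assumptions made for this illustration.

```python
import numpy as np


def sbc_ranks(n_sims=1000, n_obs=10, n_draws=99, rng=None):
    """Rank statistics for simulation-based calibration on a toy model.

    Toy model (an assumption, not the paper's): theta ~ N(0, 1) and
    x_i | theta ~ N(theta, 1), so the posterior is
    N(sum(x) / (n_obs + 1), 1 / (n_obs + 1)).
    """
    rng = np.random.default_rng() if rng is None else rng
    ranks = np.empty(n_sims, dtype=int)
    post_sd = np.sqrt(1.0 / (n_obs + 1))
    for i in range(n_sims):
        theta = rng.standard_normal()            # draw from the prior
        x = theta + rng.standard_normal(n_obs)   # simulate a dataset
        post_mean = x.sum() / (n_obs + 1)
        draws = post_mean + post_sd * rng.standard_normal(n_draws)
        # Rank of the true parameter among the posterior draws.
        ranks[i] = np.sum(draws < theta)
    return ranks


ranks = sbc_ranks(n_sims=2000, rng=np.random.default_rng(1))
# With a calibrated posterior the ranks are uniform on 0..n_draws.
counts = np.bincount(ranks, minlength=100)
```

If the posterior draws came from a miscalibrated approximator, the histogram in `counts` would show a systematic shape (for example a U or a slope) rather than being approximately flat.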

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The models inherit standard diffusion assumptions (Wiener process, constant drift, absorbing boundaries) and add new boundary-handling mechanisms whose validity is not independently verified outside the current dataset.

free parameters (3)
  • drift rate
    Core parameter of the diffusion process, fitted per condition or participant.
  • boundary separation
    Adapted for half-circle or beta support; fitted to data.
  • non-decision time
    Standard EAM parameter, fitted.
axioms (2)
  • domain assumption: Evidence accumulates as a Wiener process with constant drift until an absorbing boundary is reached.
    Inherited from classic EAM literature and applied without new justification to bounded continuous responses.
  • domain assumption: Amortized Bayesian inference networks can accurately approximate the posteriors of the new models, whose likelihoods are intractable.
    Relied upon for all fitting and model comparison steps.

pith-pipeline@v0.9.0 · 5603 in / 1337 out tokens · 42629 ms · 2026-05-12T03:11:15.370370+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

  1. [1]

    Guided Image Generation with Conditional Invertible Neural Networks

    Ardizzone, L., Lüth, C., Kruse, J., Rother, C., & Köthe, U. (2019). Guided Image Generation with Conditional Invertible Neural Networks. http://arxiv.org/abs/1907.02392. arXiv:1907.02392 [cs] Berkhof, J., Van Mechelen, I., & Hoijtink, H. (2000). Posterior predictive checks: Principles and discussion. Computational Statistics, 15(3), 337–354. Brown, S. D. ...

  2. [2]

    more emotional

    Cannon, P., Ward, D., & Schmon, S. M. (2022). Investigating the impact of model misspecification in neural simulation-based inference. arXiv preprint arXiv:2209.01845. Charness, G., Gneezy, U., & Imas, A. (2013). Experimental methods: Eliciting risk preferences. Journal of Economic Behavior & Organization, 87, 43–51. Clark, L. A. & Watson, D. (1995). Const...

  3. [3]

    Lee, J., Lee, Y., Kim, J., Kosiorek, A., Choi, S., & Teh, Y. W. (2019). Set transformer: A framework for attention-based permutation-invariant neural networks. International Conference on Machine Learning, 3744–3753. Lipman, Y., Chen, R. T., Ben-Hamu, H., Nickel, M., & Le, M. (2022). Flow matching for generative modeling. The Eleventh International Conferen...

  4. [4]

    Mauss, I

    Cambridge University Press. Mauss, I. B. & Robinson, M. D. (2009). Measures of emotion: A review. Cognition and Emotion, 23(2), 209–237. Molenaar, D., Cúri, M., & Bazán, J. L. (2022). Zero and one inflated item response theory models for bounded continuous data. Journal of Educational and Behavioral Statistics, 47(6), 693–735. Mulder, M. J., Wagenmakers, E....

  5. [5]

    Ratcliff, R. (2018). Decision making on spatially continuous scales. Psychological Review, 125(6),

  6. [6]

    Ratcliff, R., Gomez, P., & McKoon, G. (2004). A Diffusion Model Account of the Lexical Decision Task. Psychological Review, 111(1), 159–182. https://doi.org/10.1037/0033-295X.111.1.159 Ratcliff, R. & McKoon, G. (2024). Using diffusion models for symbolic numeracy tasks to examine aging effects. Journal of Experimental Psychology: Learning, Memory, and Co...

  7. [7]

    & Rouder, J

    Ratcliff, R. & Rouder, J. N. (1998). Modeling Response Times for Two-Choice Decisions. Psychological Science, 9(5), 347–356. https://doi.org/10.1111/1467-9280.00067 Rogers, L. C. G. & Williams, D. (2000). Diffusions, Markov processes, and martingales, volume

  8. [8]

    Säilynoja, T., Bürkner, P.-C., & Vehtari, A

    Cambridge university press. Säilynoja, T., Bürkner, P.-C., & Vehtari, A. (2022). Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison. Statistics and Computing, 32(2),

  9. [9]

    Smith, P. L. (2016). Diffusion theory of decision making in continuous report. Psychological Review, 123(4),

  10. [10]

    Szűcs, T., Wu, Y., Tuerlinckx, F., & Moors, A. (2026). A systematic examination of the determinants of affect in a drift-diffusion framework. Unpublished manuscript. Teoh, Y. Y., Cunningham, W. A., & Hutcherson, C. A. (2023). Framing Subjective Emotion Reports as Dynamic Affective Decisions. Affective Science, 4(3), 522–528. https:/...

  11. [11]

    T., & Voss, A

    Von Krause, M., Radev, S. T., & Voss, A. (2022). Mental speed is high until age 60 as revealed by analysis of over a million participants. Nature Human Behaviour, 6(5), 700–708. https://doi.org/10.1038/s41562-021-01282-7 Wu, Y., Radev, S. T., & Tuerlinckx, F. (2026). Testing and improving the robustness of amortized bayesian inference for cognitive model...

  12. [12]

    During the training phase, a prior $\theta \sim p(\theta)$ and a generative model $x \sim p(x \mid \theta)$ are defined, where $\theta \in \mathbb{R}^D$

    ABI workflow. A simple amortized workflow is depicted in Figure A1. During the training phase, a prior $\theta \sim p(\theta)$ and a generative model $x \sim p(x \mid \theta)$ are defined, where $\theta \in \mathbb{R}^D$. We assume that the functional or algorithmic form of the generative model is known and can be realized as a Monte Carlo simulation program. Subsequently, parameters are simulated from the prior ...

  13. [13]

    The summary and inference networks are jointly optimized to ensure that the approximate posterior corresponds as closely as possible to the true posterior

    learn an invertible transformation $f$ between the complex target distribution and a predefined simple latent distribution $z$ (e.g., a spherical Gaussian), such that sampling from $z$ and applying the inverse $f^{-1}$ yields samples from the approximate posterior: $\theta \sim q(\theta \mid s(x)) \iff \theta = f^{-1}(z; s(x))$ with $z \sim \mathcal{N}(0, I)$, where $f$ is an invertible function parameterized by a conditional...

  14. [14]

    Figure A1: A basic amortized Bayesian workflow with normalizing flow

    divergence between the true and the approximate posterior for any data set $x$ sampled from the prior predictive distribution $p(x)$ (for more details, please refer to Radev et al., 2020): $(f^*, s^*) = \arg\min_{f, s} \mathbb{E}_{p(x)} \left[ D_{\mathrm{KL}}\big( p(\theta \mid x) \,\|\, q(\theta \mid s(x)) \big) \right]$. Figure A1: A basic amortized Bayesian workflow with normalizing flow. Parameters and data are simulated from a prior an...

  15. [15]

    Upon convergence, the vector field $v_\phi$ is learned; drawing samples from the approximate posterior $q(\theta \mid s(x_{\mathrm{obs}}))$ then involves solving an ordinary differential equation (ODE). Given an observation $x_{\mathrm{obs}}$, we sample an initial state from the base distribution, $\theta_{t=0} = z \sim \mathcal{N}(0, I)$, and simulate the continuous-time dynamics governed by the learned vector field: $\frac{d\theta_t}{dt} = v_\phi$...

  16. [16]

    In fact, the neural network will output $\alpha$

    We can use the mean of the Dirichlet distribution, which is a vector of probabilities given by $\mathbb{E}_{p \sim \mathrm{Dir}(\alpha)}[p] = \frac{\alpha}{\alpha_0}$, where $\alpha_0 = \sum_{j=1}^{J} \alpha_j$, as an approximation to the posterior model probabilities. In fact, the neural network will output $\alpha$. It is important to note that this method implicitly favors simpler models. Data generated by a simpler model tend to be...
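
The Dirichlet-mean approximation in this last excerpt amounts to normalizing the concentration vector. A minimal sketch, assuming the network's output is available as an array of positive concentrations; the function name `posterior_model_probs` is hypothetical and not part of the released code.

```python
import numpy as np


def posterior_model_probs(alpha):
    """Mean of Dir(alpha): alpha / alpha_0 with alpha_0 = sum_j alpha_j.

    Accepts a single concentration vector or a batch (one row per dataset).
    """
    alpha = np.asarray(alpha, dtype=float)
    if np.any(alpha <= 0):
        raise ValueError("Dirichlet concentrations must be positive")
    return alpha / alpha.sum(axis=-1, keepdims=True)


# Example with two candidate models and illustrative concentrations.
pmp = posterior_model_probs([3.0, 1.0])
```

With two candidate models the first entry plays the role of the per-participant posterior model probability for one model, and the second entry is its complement.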