pith. machine review for the scientific record.

arxiv: 2604.15245 · v1 · submitted 2026-04-16 · 🧬 q-bio.NC

Recognition: unknown

Goxpyriment: A Go Framework for Behavioral and Cognitive Experiments

Pith reviewed 2026-05-10 08:36 UTC · model grok-4.3

classification 🧬 q-bio.NC
keywords: goxpyriment, experiments, framework, programming, audio, behavioral, cognitive, designed

The pith

A Go programming library produces self-contained experiment binaries with built-in stimuli, audio, and OS-level timing for psychology studies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors built a tool called Goxpyriment using the Go language instead of Python. Go programs can be turned into one single file that includes all the pictures, sounds, and instructions needed for an experiment. This file runs on its own without installing extra programs. The tool measures how fast people respond by using timestamps from the computer's operating system. It also includes many ready-made example experiments so users can see how it works and learn from them.

Core claim

Goxpyriment compiles entire experiments into single, self-contained executable binaries with zero runtime dependencies. This drastically simplifies distribution to collaborators and testing computers.

Load-bearing premise

That the Go implementation delivers the claimed timing reliability through OS-level event timestamps and disabled garbage collection, without any benchmarks or comparisons provided in the abstract.

Original abstract

We introduce 'Goxpyriment', a new open-source software framework for programming behavioral and cognitive experiments using the Go programming language. The library is designed to address some limitations of existing Python-based experiment tools, particularly the runtime environment complexity that frequently complicates deployment across laboratories. Because Go is a compiled language that can natively embed assets (e.g., graphics, audio files, and stimulus lists), Goxpyriment compiles entire experiments into single, self-contained executable binaries with zero runtime dependencies. This drastically simplifies distribution to collaborators and testing computers. The programming interface, inspired by Expyriment (Krause & Lindemann, 2014), was designed to be human friendly. The library includes an array of visual stimuli (text, shapes, images, Gabor patches, motion clouds, ...) and audio capabilities (WAV playback and tone generation). While developing Goxpyriment, we focused on timing reliability. Input events are timestamped by the operating system at hardware-interrupt time, so reaction times are computed by subtracting two OS-level timestamps rather than relying on continuous polling. Go's garbage collector can be disabled, greatly reducing the probability of unpredictable pauses that could corrupt stimulus timing. Finally, a set of over forty psychology experiments implemented in Goxpyriment are provided that promote not only learning by humans but also improve the ability of modern AI-assisted coding tools to help program experiments. The framework is released under the GNU General Public License v3 and is freely available at https://github.com/chrplr/goxpyriment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Goxpyriment, a new open-source Go framework for programming behavioral and cognitive experiments. It claims to address deployment complexities of Python-based tools (such as Expyriment) by compiling entire experiments—including assets—into single self-contained executable binaries with zero runtime dependencies. The library offers a human-friendly interface, an array of visual stimuli (text, shapes, images, Gabor patches, motion clouds) and audio capabilities (WAV playback, tone generation), and prioritizes timing reliability by using operating-system hardware-interrupt timestamps for input events (so reaction times are differences of two OS timestamps) together with the option to disable Go's garbage collector. Over forty psychology experiments are provided as examples.

Significance. If the timing-reliability and deployment-simplification claims hold, the framework could meaningfully reduce setup friction and improve reproducibility for laboratory experiments, particularly when distributing to collaborators or non-technical users. The provision of numerous worked examples also supports both human learning and AI-assisted coding of experiments.

major comments (2)
  1. [Timing reliability paragraph] In the paragraph beginning 'While developing Goxpyriment, we focused on timing reliability', the central practical advantage (lower jitter and more reliable reaction times via OS-level hardware-interrupt timestamps and an optional GC disable) is asserted without any benchmark data, jitter histograms, error measurements, load tests, or direct comparisons against polling-based Python tools or Expyriment. This leaves the key claim as an untested design hypothesis rather than a demonstrated property.
  2. [Abstract and introduction] The claims that the framework 'drastically simplifies distribution' and delivers 'timing reliability' are load-bearing for the paper's motivation, yet no quantitative evidence (benchmarks, timing error statistics, cross-OS tests, or comparisons) is supplied to support them.
minor comments (2)
  1. [References] The single reference to Krause & Lindemann (2014) is appropriate but could usefully be supplemented by a brief discussion of documented timing limitations in current Python experiment libraries.
  2. [Overall structure] The manuscript would benefit from a short dedicated section or appendix reporting any internal validation of the timing mechanisms (even if preliminary) to allow readers to assess the design choices.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for identifying areas where empirical support would strengthen the manuscript. We address each major comment below and outline the revisions we will make.

Point-by-point responses
  1. Referee: In the paragraph beginning 'While developing Goxpyriment, we focused on timing reliability', the central practical advantage (lower jitter and more reliable reaction times via OS-level hardware-interrupt timestamps and an optional GC disable) is asserted without any benchmark data, jitter histograms, error measurements, load tests, or direct comparisons against polling-based Python tools or Expyriment. This leaves the key claim as an untested design hypothesis rather than a demonstrated property.

    Authors: We agree that the timing-reliability claims rest on architectural choices without accompanying empirical data in the current manuscript. The use of OS hardware-interrupt timestamps and the option to disable Go's garbage collector are intended to reduce jitter relative to polling-based approaches, but these benefits are presented as design properties rather than measured outcomes. In the revised version we will add a dedicated subsection with benchmark results, including jitter histograms, timing-error statistics under varying CPU loads, and direct comparisons to Expyriment on the same hardware. revision: yes

  2. Referee: In the abstract and introduction, the claims that the framework 'drastically simplifies distribution' and delivers 'timing reliability' are load-bearing for the paper's motivation, yet no quantitative evidence (benchmarks, timing error statistics, cross-OS tests, or comparisons) is supplied to support them.

    Authors: The referee is correct that both the abstract and introduction advance these advantages without quantitative backing. For distribution, the benefit follows directly from Go's static linking and asset embedding, which produces a single executable with no runtime dependencies; we will add concrete metrics (binary sizes, deployment steps, and cross-platform tests) to illustrate the simplification. For timing reliability we refer to the new benchmark subsection described above. We will revise the abstract and introduction to reference these additions and moderate the language accordingly. revision: yes
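
The jitter histograms promised in the rebuttal could be produced along the following lines. This is a sketch with made-up error values, shown only to illustrate the binning; the `histogram` function and the sample data are hypothetical and do not come from the paper.

```go
package main

import (
	"fmt"
	"strings"
)

// histogram counts values into equal-width bins over [lo, hi).
// Values outside the range are clamped into the first or last bin.
// It returns nil if the arguments do not describe a valid binning.
func histogram(values []float64, lo, hi float64, bins int) []int {
	if bins <= 0 || hi <= lo {
		return nil
	}
	counts := make([]int, bins)
	width := (hi - lo) / float64(bins)
	for _, v := range values {
		i := int((v - lo) / width)
		if i < 0 {
			i = 0
		}
		if i >= bins {
			i = bins - 1
		}
		counts[i]++
	}
	return counts
}

func main() {
	// Illustrative timing errors in milliseconds (not measured data).
	errs := []float64{-0.4, 0.1, 0.2, 0.9, 5.0}
	lo, hi, bins := -1.0, 1.0, 4
	width := (hi - lo) / float64(bins)
	for i, c := range histogram(errs, lo, hi, bins) {
		left := lo + float64(i)*width
		fmt.Printf("[%5.2f, %5.2f) %s\n", left, left+width, strings.Repeat("#", c))
	}
}
```

Clamping outliers into the edge bins keeps rare long pauses visible in the plot, which matters here because occasional large delays are exactly what disabling the garbage collector is meant to reduce.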

Circularity Check

0 steps flagged

No circularity; paper is a direct software presentation with no derivations or fitted claims.

Full rationale

The manuscript describes a new Go framework for behavioral experiments, highlighting features such as single-binary compilation, OS-level event timestamps, and optional GC disabling. No equations, predictions, first-principles derivations, or parameter fits appear anywhere in the text. The sole citation (to Expyriment) is for interface inspiration and carries no load-bearing uniqueness theorem or ansatz. All claims are presented as design choices and implementation details rather than results that reduce to the paper's own inputs by construction. The work is therefore self-contained with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The contribution is a software implementation rather than a theoretical model, so there are no free parameters, axioms, or invented entities in the scientific sense.

pith-pipeline@v0.9.0 · 5582 in / 942 out tokens · 29256 ms · 2026-05-10T08:36:49.459904+00:00 · methodology


Reference graph

Works this paper leans on

4 extracted references · 3 canonical work pages

  1. [1]

    Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2021). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407. https://doi.org/10.3758/s13428-019-01237-x
    Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436. https://doi.org/10.1163/156856897X003...

  2. [2]

    https://doi.org/10.21105/joss.05351
    Donovan, A. A. A., & Kernighan, B. W. (2016). The Go programming language. Addison-Wesley.
    Henninger, F., Shevchenko, Y., Meijer, R. R., Inzlicht, M., & Hilbig, B. E. (2022). Lab.js: A free, open, online study builder. Behavior Research Methods, 54(2), 556–573. https://doi.org/10.3758/s13428-019-01283-5
    Jung, B. (2026...

  3. [3]

    Krause, F., & Lindemann, O. (2014). Expyriment: A Python library for cognitive and neuroscientific experiments. Behavior Research Methods, 46(2), 416–

  4. [4]

    https://doi.org/10.3758/s13428-013-0390-6
    Nature Portfolio. (2023). Tools and technologies: Artificial intelligence.
    Pallier, C. (2025, February). Bbtkv3 [computer software]. GitHub. https://github.com/chrplr/bbtkv3 [Accessed: 2026-04-01].
    Peirce, J. W., Gray, J. R., Simpson, S., MacAskill, M. R., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. K...