arxiv: 2603.22705 · v5 · submitted 2026-03-24 · 🧬 q-bio.NC · q-bio.PE


Detecting outliers of pursuit eye movements: a preliminary analysis of autism spectrum disorder

Emiko Shishido, Seiko Miyata, Tetsuya Yamamoto, Masaki Fukunaga, Ryota Hashimoto, Kenichiro Miura, Norio Ozaki


Pith reviewed 2026-05-15 01:14 UTC · model grok-4.3


keywords autism spectrum disorder · smooth pursuit eye movements · outlier analysis · Mahalanobis distance · oculomotor atypicality · heterogeneity · principal component analysis


The pith

An outlier score from smooth pursuit eye movements detects atypical patterns in 38.9 percent of ASD adults versus 5.1 percent of controls.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops an outlier analysis for smooth pursuit eye movements to identify individual deviations in autism spectrum disorder rather than relying on group averages. Researchers recorded eye movements in 18 ASD adults and 39 typically developed people during a Lissajous pursuit task. They created an outlier score using Mahalanobis distance on principal components of temporal lag and spatial deviation. The ASD group showed 38.9 percent outliers compared to 5.1 percent in controls, with higher mean scores. This highlights how focusing on extremes can reveal heterogeneity that averages obscure.

Core claim

The outlier analysis, based on Mahalanobis distance from PCA-optimized features of temporal lag and spatial deviation in smooth pursuit eye movements, reveals a significantly higher prevalence of outliers in the ASD group (38.9 percent, or 7 out of 18) than in the TD group (5.1 percent), along with an elevated mean outlier score (3.00 ± 2.62 versus 1.52 ± 0.80). These deviations persist even when conventional mean-based analyses show limited sensitivity, providing a metric for the idiosyncratic atypicality in ASD oculomotor control and a baseline for clinical subtypes.

What carries the argument

The outlier score: the Mahalanobis distance of a participant's feature vector (temporal lag and spatial deviation, after PCA optimization) from the TD normative distribution. A participant counts as an outlier when the score exceeds √10 (about 3.16).
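A minimal sketch of how such a score could be computed, in plain Python. The TD cloud here is synthetic, the subject is hypothetical, and the names fit_normative and outlier_score are illustrative. The PCA step is omitted on the assumption that both components are kept: Mahalanobis distance is unchanged by any invertible linear map of the features, so a full-rank rotation would not alter the score. The paper's actual feature extraction and PCA procedure are not reproduced here.

```python
import math
import random

def fit_normative(points):
    """Return the mean and inverse covariance of 2-D points [(dt, ds), ...]."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Unbiased sample covariance entries
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Closed-form inverse of a symmetric 2x2 matrix
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv

def outlier_score(point, mean, inv):
    """Mahalanobis distance of one point from the normative mean."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    d2 = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(max(d2, 0.0))

THRESHOLD = math.sqrt(10)  # cutoff stated in the paper

rng = random.Random(0)
# Synthetic stand-in for the 39 TD participants (assumption, not real data)
td = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(39)]
mean, inv = fit_normative(td)

subject = (4.0, -3.5)  # hypothetical (temporal lag, spatial deviation) pair
score = outlier_score(subject, mean, inv)
print(f"score = {score:.2f}, outlier = {score > THRESHOLD}")
```

Fitting the mean and covariance on the TD group alone is what makes the TD cloud the normative reference; every other participant is then scored against it.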

If this is right

The approach visualizes high idiosyncratic atypicality in ASD oculomotor control.
It shifts focus from group averages to individual deviations with greater sensitivity.
It supplies a potential baseline for identifying clinical subtypes within ASD.
Extreme deviations remain detectable even when mean-based comparisons lack sensitivity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If replicated, the method could help quantify biological heterogeneity of ASD at the single-subject level using a normative reference.
Applying the same outlier logic to other eye-movement tasks or modalities might expose additional atypical profiles.
The normative-distribution approach suggests a route toward individualized oculomotor markers for neurodevelopmental conditions.

Load-bearing premise

The two chosen features of temporal lag and spatial deviation after PCA plus the fixed threshold of square root of 10 adequately capture idiosyncratic atypicalities in ASD without missing other patterns or being misled by noise in the small TD sample.

What would settle it

A replication study with a larger typically developed reference group or additional eye-movement features that finds no difference in outlier rates between ASD and TD participants would falsify the higher prevalence claim.
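One way such a replication could compare the two outlier rates is an exact test on the 2x2 table of counts. The sketch below uses a one-sided Fisher exact test, which is a substitute chosen for illustration and not the binomial test reported in the abstract. The TD count of 2 is inferred from 5.1 percent of 39 and is an assumption, and fisher_one_sided is a hypothetical helper name.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the table [[a, b], [c, d]] under the hypergeometric null."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(total, col1)

asd_out, asd_n = 7, 18          # reported: 7 of 18 ASD participants
td_n = 39
td_out = round(0.051 * td_n)    # inferred from the reported 5.1% (assumption)

p = fisher_one_sided(asd_out, asd_n - asd_out, td_out, td_n - td_out)
print(f"ASD {asd_out}/{asd_n}, TD {td_out}/{td_n}, one-sided p = {p:.4f}")
```

A replication that found similar rates in both groups would yield a large p-value from the same function.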

Original abstract

Background: Autism spectrum disorder (ASD) is characterized by significant clinical and biological heterogeneity. Conventional group-mean analyses of eye movements often mask individual atypicalities, potentially overlooking critical pathological signatures. This study aimed to identify idiosyncratic oculomotor patterns in ASD using an "outlier analysis" of smooth pursuit eye movement (SPEM).

Methods: We recorded SPEM during a slow Lissajous pursuit task in 18 adults with ASD and 39 typically developed (TD) individuals. To quantify individual deviations, we derived an "outlier score" based on the Mahalanobis distance. This score was calculated from a feature vector, optimized via Principal Component Analysis (PCA), comprising the temporal lag ($\Delta t$) and the spatial deviation ($\Delta s$). An outlier was statistically defined as a score exceeding $\sqrt{10}$ (approximately $3.16\sigma$) relative to the TD normative distribution.

Results: While the TD group exhibited a low outlier rate of 5.1%, the ASD group demonstrated a significantly higher prevalence of 38.9% (7/18) (binomial P = 0.0034). Furthermore, the mean outlier score was significantly elevated in the ASD group (3.00 $\pm$ 2.62) compared to the TD group (1.52 $\pm$ 0.80; P = 0.002). Notably, these extreme deviations were captured even when conventional mean-based comparisons showed limited sensitivity.

Conclusions: Our outlier analysis successfully visualized the high degree of idiosyncratic atypicality in ASD oculomotor control. By shifting the focus from group averages to individual deviations, this approach provides a sensitive metric for capturing the inherent heterogeneity of ASD, offering a potential baseline for identifying clinical subtypes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper shows higher outlier rates in ASD eye movements via Mahalanobis distance on PCA features, but the small samples and untested covariance stability keep it preliminary.


This paper's core finding is that an outlier score, derived from Mahalanobis distance on two PCA-reduced features of smooth pursuit eye movements, flags a much higher fraction of ASD cases than typically developed (TD) controls, even when average performance looks similar. The numbers are 38.9% versus 5.1%, with a supporting difference in mean scores. The authors lay out the pipeline cleanly: record SPEM on a slow Lissajous task, extract temporal lag and spatial deviation, apply PCA, then measure distance to the TD distribution with a sqrt(10) cutoff. The stats are straightforward binomial and t-tests, and the point about capturing heterogeneity lands.

The main limitation is the modest sample, especially the 18 ASD participants, combined with no reported checks on how stable the TD covariance matrix is. At n=39 the estimate is usable but not bulletproof, and without bootstrap or threshold sensitivity analyses the reported prevalence could be inflated by sampling variation in the normative cloud. The threshold itself looks chosen rather than derived from the data.

Readers working on individual-level metrics for neurodevelopmental conditions would find this useful as a starting template. It is not ready for clinical claims, but it gives a concrete method that addresses a real gap in group-average eye movement studies. The work shows clear thinking on the problem and honest use of the data the authors have. I would bring it to a reading group for discussion on methods for heterogeneity. I would not cite it in my own papers until the robustness is addressed. It should go to peer review so the authors can add the missing sensitivity tests.

Referee Report

3 major / 2 minor

Summary. The paper claims that an outlier analysis of smooth pursuit eye movements (SPEM) during a Lissajous task reveals significantly higher idiosyncratic atypicality in ASD (n=18) than in TD controls (n=39). Using Mahalanobis distance on a 2D PCA-reduced feature space of temporal lag (Δt) and spatial deviation (Δs), with mean/covariance from the TD sample, they define outliers as scores > √10 (~3.16σ). This yields an ASD outlier rate of 38.9% (7/18, binomial p=0.0034) vs. 5.1% in TD, plus elevated mean scores (3.00±2.62 vs. 1.52±0.80, p=0.002), even when conventional mean comparisons lack sensitivity. The approach is positioned as a sensitive metric for ASD heterogeneity and potential subtype identification.

Significance. If the robustness concerns are addressed, the work offers a useful shift from group-mean to individual-deviation metrics for capturing ASD oculomotor heterogeneity. The multivariate Mahalanobis approach on PCA features provides a concrete, quantifiable baseline that could support clinical subtyping, addressing a recognized limitation of mean-based eye-movement studies in ASD.

major comments (3)

[Methods] Methods (Mahalanobis distance section): The covariance matrix for the Mahalanobis distance is estimated solely from the TD sample (n=39). No bootstrap, leave-one-out, or sensitivity analysis is reported to assess stability of this estimate or its effect on the fixed √10 threshold. With n=39 in 2D space, small perturbations in the TD cloud could move several ASD points across the boundary, undermining the reported 38.9% rate and p=0.0034.
[Methods] Methods (PCA and threshold): The choice of PCA reduction to two features (Δt, Δs) and the arbitrary threshold of √10 lack validation or justification. No cross-validation of feature selection, alternative dimensionality choices, or threshold sensitivity analysis (e.g., varying from 2.5σ to 3.5σ) is provided, raising the possibility that the group difference is sensitive to these specific decisions.
[Results] Results (statistical comparisons): The binomial test on 7/18 ASD outliers and the t-test on mean scores rest on a small ASD sample (n=18). No power analysis or assessment of how the outlier count changes with minor shifts in TD parameters is included, making the significance claims vulnerable to sampling variability.
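The stability check requested in the first major comment could look roughly like the sketch below: resample the TD group with replacement, refit the mean and covariance, and recount how many ASD points exceed √10. Both groups here are synthetic, and scores_against and bootstrap_outlier_counts are illustrative names, so the printed range says nothing about the paper's data.

```python
import math
import random

def scores_against(reference, targets):
    """Mahalanobis distances of 2-D targets from the reference mean/covariance."""
    n = len(reference)
    mx = sum(x for x, _ in reference) / n
    my = sum(y for _, y in reference) / n
    sxx = sum((x - mx) ** 2 for x, _ in reference) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in reference) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    out = []
    for x, y in targets:
        dx, dy = x - mx, y - my
        # Quadratic form using the closed-form inverse of the 2x2 covariance
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        out.append(math.sqrt(max(d2, 0.0)))
    return out

def bootstrap_outlier_counts(td, asd, n_boot=1000, threshold=math.sqrt(10), seed=0):
    """ASD outlier count for each bootstrap resample of the TD reference group."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_boot):
        resample = [rng.choice(td) for _ in td]  # draw TD with replacement
        counts.append(sum(s > threshold for s in scores_against(resample, asd)))
    return counts

rng = random.Random(1)
td = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(39)]   # synthetic TD
asd = [(rng.gauss(0, 2), rng.gauss(0, 2)) for _ in range(18)]  # synthetic ASD, wider spread

counts = sorted(bootstrap_outlier_counts(td, asd))
low, high = counts[24], counts[974]  # roughly the 2.5th and 97.5th percentiles
print(f"ASD outlier count across resamples: {low} to {high} of {len(asd)}")
```

A narrow range would suggest the reported count does not hinge on a few TD participants; a wide one would support the referee's concern.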

minor comments (2)

[Abstract] Abstract: 'typically developed' should read 'typically developing'.
[Methods] Methods: Provide explicit equations or pseudocode for computing Δt and Δs from the raw eye-tracking traces.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which have helped us strengthen the methodological rigor of our outlier analysis. We have revised the manuscript to incorporate sensitivity and stability analyses as suggested. These additions address concerns about the robustness of the Mahalanobis distance and PCA-based approach while acknowledging the preliminary nature of the small ASD sample.


Referee: [Methods] Methods (Mahalanobis distance section): The covariance matrix for the Mahalanobis distance is estimated solely from the TD sample (n=39). No bootstrap, leave-one-out, or sensitivity analysis is reported to assess stability of this estimate or its effect on the fixed √10 threshold. With n=39 in 2D space, small perturbations in the TD cloud could move several ASD points across the boundary, undermining the reported 38.9% rate and p=0.0034.

Authors: We agree that stability of the covariance estimate merits explicit validation. In the revised manuscript, we will add a bootstrap resampling procedure (1000 iterations) of the TD sample to recompute the mean and covariance matrix each time, then re-evaluate the outlier scores and classifications for all ASD participants. This will quantify the variability in the 38.9% rate and confirm that the reported significance is not driven by a single unstable estimate of the TD distribution.
revision: yes

Referee: [Methods] Methods (PCA and threshold): The choice of PCA reduction to two features (Δt, Δs) and the arbitrary threshold of √10 lack validation or justification. No cross-validation of feature selection, alternative dimensionality choices, or threshold sensitivity analysis (e.g., varying from 2.5σ to 3.5σ) is provided, raising the possibility that the group difference is sensitive to these specific decisions.

Authors: The two-component PCA solution was selected because the first two principal components together explain >85% of the variance in the (Δt, Δs) feature space; we will now report the exact variance explained and compare results obtained with the raw two-dimensional features versus the PCA-reduced space. We will also add a threshold sensitivity analysis in which the cutoff is varied from 2.5σ to 3.5σ and the resulting ASD outlier rates and p-values are tabulated, demonstrating that the group difference remains statistically significant across this range.
revision: yes

Referee: [Results] Results (statistical comparisons): The binomial test on 7/18 ASD outliers and the t-test on mean scores rest on a small ASD sample (n=18). No power analysis or assessment of how the outlier count changes with minor shifts in TD parameters is included, making the significance claims vulnerable to sampling variability.

Authors: We acknowledge that the modest ASD sample size (n=18) is a limitation typical of clinical eye-tracking studies. In the revision we will include a post-hoc power analysis for the observed effect sizes of both the binomial test and the two-sample t-test on outlier scores. We will additionally perform a leave-one-out sensitivity analysis on the TD sample to show how removal of any single control participant affects the normative mean/covariance and the resulting ASD outlier count. These analyses will be presented alongside the original results.
revision: partial
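The threshold sweep and leave-one-out analyses promised in these responses could be sketched as follows. The data are synthetic, count_outliers is an illustrative helper, and only outlier counts are tabulated; the p-values the authors mention would need a separate test on each count.

```python
import math
import random

def count_outliers(reference, targets, threshold):
    """Number of 2-D targets whose Mahalanobis distance from reference exceeds threshold."""
    n = len(reference)
    mx = sum(x for x, _ in reference) / n
    my = sum(y for _, y in reference) / n
    sxx = sum((x - mx) ** 2 for x, _ in reference) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in reference) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    count = 0
    for x, y in targets:
        dx, dy = x - mx, y - my
        # Squared distance via the closed-form inverse of the 2x2 covariance
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 > threshold ** 2:
            count += 1
    return count

rng = random.Random(2)
td = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(39)]   # synthetic TD
asd = [(rng.gauss(0, 2), rng.gauss(0, 2)) for _ in range(18)]  # synthetic ASD, wider spread

# Threshold sweep across the 2.5 to 3.5 range named in the referee report
thresholds = [2.5, 2.75, 3.0, math.sqrt(10), 3.25, 3.5]
sweep = {round(t, 2): count_outliers(td, asd, t) for t in thresholds}

# Leave-one-out: drop each TD participant in turn and recount ASD outliers
loo_counts = [
    count_outliers(td[:i] + td[i + 1:], asd, math.sqrt(10)) for i in range(len(td))
]

print("ASD outliers by threshold:", sweep)
print("leave-one-out range:", min(loo_counts), "to", max(loo_counts))
```

Because the reference fit is identical for every threshold, the sweep counts can only stay flat or fall as the cutoff rises; any instability shows up in the leave-one-out range instead.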

Circularity Check

0 steps flagged

No significant circularity; standard statistical outlier detection applied to held-out data


The paper computes an outlier score as Mahalanobis distance in the 2D PCA-reduced space of temporal lag (Δt) and spatial deviation (Δs), with mean and covariance estimated solely from the TD sample. Outliers are defined by a fixed threshold of √10. The reported results (38.9% prevalence in ASD, mean score difference) are direct empirical comparisons using binomial and t-tests on these independently computed scores. No equation reduces the central claim to a fitted parameter by construction, no self-citation is load-bearing, and no ansatz or uniqueness theorem is invoked. The derivation chain is self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The analysis rests on standard multivariate statistics and a conventional outlier threshold; no new entities are postulated.

free parameters (1)

outlier threshold = √10
Set to √10 (≈3.16σ) to define statistical outliers relative to TD distribution.

axioms (1)

domain assumption: The two-dimensional feature vector after PCA follows a multivariate normal distribution suitable for Mahalanobis distance.
Implicit in the use of Mahalanobis distance on the optimized feature vector.
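Under this normality assumption the threshold has a closed-form tail probability, which the short calculation below works out. It assumes the TD mean and covariance are known exactly; with both estimated from 39 participants the true tail differs somewhat, so the figures are an idealized reference and not a result from the paper.

```python
import math

threshold_sq = 10.0  # squared cutoff, since the score threshold is sqrt(10)

# For a bivariate normal with known mean and covariance, the squared
# Mahalanobis distance is chi-square with 2 degrees of freedom, whose
# upper tail probability at x is exp(-x / 2).
p_2d = math.exp(-threshold_sq / 2)

# For comparison: two-sided tail of a single standard normal at |z| = sqrt(10).
p_1d = math.erfc(math.sqrt(threshold_sq) / math.sqrt(2))

expected_td = 39 * p_2d  # expected number of TD participants beyond the cutoff
print(f"2-D tail: {p_2d:.4f}, 1-D tail: {p_1d:.4f}, expected among 39: {expected_td:.2f}")
```

The two-dimensional tail is about 0.0067, against about 0.0016 for a single normal variable at the same distance, so the "≈3.16σ" label should not be read as a univariate 3.16σ criterion.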



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear

Paper passage: outlier score based on the Mahalanobis distance... feature vector... PCA... temporal lag (Δt) and spatial deviation (Δs)... threshold √10

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction and recovery · relation: unclear

Paper passage: binomial parameter analysis... 38.9% (7/18) ... P=0.0034

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.