Cross-sectional topological anomaly scores and intraday return predictability in the S&P 500: A BallMapper, decoder-conditional VAE, and Function-on-Function regression approach

Krzysztof Ozimek

arxiv: 2606.08586 · v1 · pith:QKTHNLYSnew · submitted 2026-06-07 · 💱 q-fin.ST

Pith reviewed 2026-06-27 17:32 UTC · model grok-4.3

classification 💱 q-fin.ST

keywords topological anomaly detection · intraday return predictability · BallMapper · decoder-conditional VAE · function-on-function regression · Takens embedding · S&P 500 · co-movement structure

The pith

The history of stock-level topological anomaly scores carries predictive content for intraday return curves across S&P 500 assets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a stock-level topological anomaly score that incorporates both market-wide topological structure and cross-sectional peer information, rather than flagging unusual values in raw price series. It embeds intraday paths for ten stocks via Takens delay embedding, represents them as BallMapper graphs, and scores deviations using three decoder-conditional variational autoencoder variants. Penalised function-on-function regression then indicates that the recent history of these scores carries predictive content for subsequent return curves, with effects that build gradually, often reverse direction early, and draw more weight from recent observations. The paper reports this pattern for every tested asset, bar frequency, and scoring variant; the timing of the reversal shifts with market regime, while the evenness of predictive weight across the anomaly history varies with bar frequency.
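
The first step of the pipeline, Takens delay embedding, can be sketched as follows. This is a minimal illustration for a single series, not the paper's implementation: the function name `takens_embedding`, the toy input, and the values of `dimension` and `delay` are assumptions, since the abstract does not report the embedding parameters.

```python
import numpy as np

def takens_embedding(series, dimension, delay):
    """Return the delay-coordinate matrix of a 1-D series.

    Row t is (x[t], x[t + delay], ..., x[t + (dimension - 1) * delay]).
    """
    x = np.asarray(series, dtype=float)
    n_rows = x.size - (dimension - 1) * delay
    if n_rows <= 0:
        raise ValueError("series too short for this dimension and delay")
    # Column j holds the series shifted forward by j * delay samples.
    return np.column_stack(
        [x[j * delay : j * delay + n_rows] for j in range(dimension)]
    )

# Hypothetical toy input: 8 samples, 3 delay coordinates, delay of 2 samples.
toy = np.arange(8.0)
cloud = takens_embedding(toy, dimension=3, delay=2)
print(cloud.shape)  # (4, 3)
print(cloud[0])     # [0. 2. 4.]
```

Each row of `cloud` is one point of the reconstructed state space; a multi-stock version would presumably stack such coordinates across assets, but how the paper combines the ten series is not stated in the abstract.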

Core claim

The history of the stock-level topological anomaly score carries predictive content for return curves, confirmed across all assets, intraday bar frequencies, and scoring variants, revealing gradual accumulation of return impact, frequent early reversal, and predictive content weighted toward recent anomaly history. When the reversal occurs depends on market regime; how evenly the anomaly history contributes to prediction depends on bar frequency.

What carries the argument

Decoder-conditional variational autoencoder variants that score topologically misexpected persistent deviations in the latent structure of co-movement, derived from BallMapper graphs of Takens-embedded ten-stock intraday series.
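
For readers unfamiliar with BallMapper, the graph construction can be sketched in a few lines. This follows the standard greedy description of the algorithm (landmarks forming an epsilon-net, balls of fixed radius, edges between overlapping balls); the function name `ball_mapper`, the toy point cloud, and the radius are assumptions, and the paper's own settings are not given in the abstract.

```python
import numpy as np

def ball_mapper(points, radius):
    """Greedy BallMapper sketch: returns (landmark indices, cover, edges)."""
    pts = np.asarray(points, dtype=float)
    covered = np.zeros(len(pts), dtype=bool)
    landmarks = []
    # Greedy epsilon-net: the first uncovered point becomes a new landmark.
    for i in range(len(pts)):
        if not covered[i]:
            landmarks.append(i)
            covered |= np.linalg.norm(pts - pts[i], axis=1) <= radius
    # Each ball collects every point within `radius` of its landmark.
    cover = [
        set(np.flatnonzero(np.linalg.norm(pts - pts[c], axis=1) <= radius).tolist())
        for c in landmarks
    ]
    # Two balls are joined by an edge when they share at least one point.
    edges = [
        (a, b)
        for a in range(len(cover))
        for b in range(a + 1, len(cover))
        if cover[a] & cover[b]
    ]
    return landmarks, cover, edges

# Hypothetical toy cloud: five points on a line, radius 1.
toy = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
landmarks, cover, edges = ball_mapper(toy, radius=1.0)
print(landmarks)  # [0, 2, 4]
print(edges)      # [(0, 1)]
```

The isolated point at 10 ends up as its own disconnected node, which is the kind of structural feature a downstream scoring model could pick up. The result depends on the radius and on the order in which points are visited, which is one reason the referee asks for sensitivity checks.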

If this is right

Return impact from anomaly scores accumulates gradually rather than appearing immediately.
The direction of the impact frequently reverses early in the prediction window.
Predictive weight is distributed across the anomaly history but concentrated on more recent observations.
The timing of reversal shifts with market regime while the evenness of contribution depends on intraday bar frequency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the scores truly capture misexpected co-movement, combining them with standard factors could tighten intraday risk estimates.
The same construction might be applied to longer-horizon or cross-asset panels to test whether topological persistence signals operate at multiple scales.
Regime-dependent reversal timing suggests the method could be used to flag shifts between accumulation and correction phases in real time.

Load-bearing premise

The BallMapper embedding combined with decoder-conditional VAE variants isolates genuine topological deviations in latent co-movement structure rather than embedding artifacts, parameter choices, or statistical noise.

What would settle it

Finding that anomaly-score histories add no incremental explanatory power for return curves in a strict out-of-sample period or under a different market regime would falsify the central claim.
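
One way to make this test concrete is to compare hold-out squared error with and without the anomaly-score regressors. The sketch below uses ordinary least squares on synthetic data as a stand-in; the function name `oos_r2_gain`, the variables `vol`, `score`, and `ret`, and the data-generating process are all hypothetical and do not come from the paper.

```python
import numpy as np

def oos_r2_gain(x_base, x_extra, y, n_train):
    """Hold-out R^2 of the augmented model measured against the baseline.

    The first n_train observations (in time order) are used for fitting,
    the remainder for evaluation.
    """
    def fit_predict(x):
        design = np.column_stack([np.ones(len(x)), x])
        coef, *_ = np.linalg.lstsq(design[:n_train], y[:n_train], rcond=None)
        return design[n_train:] @ coef

    err_base = y[n_train:] - fit_predict(x_base)
    err_full = y[n_train:] - fit_predict(np.column_stack([x_base, x_extra]))
    # Positive values mean the extra regressors lower hold-out squared error.
    return 1.0 - np.sum(err_full**2) / np.sum(err_base**2)

# Synthetic example in which the extra regressor genuinely matters.
rng = np.random.default_rng(0)
n = 400
vol = rng.normal(size=n)     # stand-in for a conventional baseline feature
score = rng.normal(size=n)   # stand-in for an anomaly-score feature
ret = 0.5 * vol + 0.5 * score + 0.1 * rng.normal(size=n)
gain = oos_r2_gain(vol, score, ret, n_train=300)
```

A value of `gain` near zero or below on real data, in a period not used for fitting, would be the falsifying outcome described above.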

Original abstract

Anomaly detection methods in financial time series score statistically unusual observations in observable data, not topologically misexpected persistent deviations in the latent structure of co-movement. This study constructs a stock-level topological anomaly score jointly conditioned on market-level topological structure and cross-sectional peer context, and tests whether its history carries predictive content for return curves. Intraday data for ten liquid S&P 500 constituents (April 2025–March 2026) are embedded via Takens delay embedding, graphed by BallMapper, and scored by three decoder-conditional variational autoencoder variants. Predictive content is assessed by penalised function-on-function regression and confirmed across all assets, intraday bar frequencies, and scoring variants, revealing a consistent temporal fingerprint: gradual accumulation of return impact, a frequent early reversal of its direction, and broadly distributed predictive content weighted toward recent anomaly history. When the reversal occurs depends on market regime; how evenly the anomaly history contributes to prediction depends on bar frequency.
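
The penalised function-on-function regression mentioned in the abstract can be illustrated on a discretised grid. The sketch below assumes a model of the form y_k(h) = ∫ x_k(w) β(w, h) dw + noise, absorbs the grid spacing into the coefficient surface, and applies a second-difference roughness penalty along w only. The function name `fit_fof_ridge`, the penalty choice, and the toy surface are assumptions; the paper's actual basis and penalty are not described in the abstract.

```python
import numpy as np

def fit_fof_ridge(x, y, lam):
    """Estimate the coefficient surface B (W x H) in Y = X @ B + noise.

    x: (K, W) predictor curves sampled on a common grid over w.
    y: (K, H) response curves sampled on a common grid over h.
    lam: weight of a second-difference roughness penalty along w.
    """
    n_w = x.shape[1]
    # Second-difference operator: penalises curvature of B along w.
    d2 = np.diff(np.eye(n_w), n=2, axis=0)
    lhs = x.T @ x + lam * d2.T @ d2
    return np.linalg.solve(lhs, x.T @ y)

# Toy surface: weight grows toward recent w and changes sign along h.
rng = np.random.default_rng(0)
b_true = np.outer(np.linspace(0.0, 1.0, 6), np.linspace(1.0, -1.0, 4))
x = rng.normal(size=(200, 6))
y = x @ b_true + 0.01 * rng.normal(size=(200, 4))
b_hat = fit_fof_ridge(x, y, lam=1e-3)
```

The toy surface is deliberately shaped to resemble the fingerprint the abstract describes (more weight on recent history, a sign change along the horizon) so that the estimated `b_hat` is easy to inspect; it says nothing about the magnitudes found in the paper.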

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper builds a novel pipeline from Takens embeddings through BallMapper and decoder-conditional VAEs to produce cross-sectional topological anomaly scores, then tests their history in function-on-function regressions for intraday return prediction. The results, however, rest on unverified robustness to embedding choices, and no metrics or baselines are reported.

The core contribution here is a specific pipeline that turns intraday price series from ten S&P 500 names into topological anomaly scores via Takens delay embedding, BallMapper graphs, and three decoder-conditional VAE variants, then feeds the score histories into penalised function-on-function regression to predict return curves. The claimed pattern (gradual impact buildup, frequent early reversal, and heavier weight on recent history, with regime and frequency dependence) is the main empirical takeaway.

What the work does reasonably well is lay out a coherent sequence of steps that tries to isolate persistent deviations in latent co-movement structure rather than raw statistical outliers. The cross-sectional conditioning and the use of function-on-function regression to capture the full temporal shape of the predictive relationship are sensible choices for this setting. The consistency claim across assets, bar frequencies, and scoring variants is at least a start toward checking stability.

The soft spots are mostly around validation and identifiability. The abstract supplies no quantitative measures of predictive strength, no error bars, and no explicit out-of-sample protocol, so it is hard to judge whether the reported temporal fingerprint survives proper hold-out testing or simply reflects in-sample structure captured by the VAE. The stress-test point on embedding artifacts is fair: without reported checks on delay and dimension choices, BallMapper thresholds, or ablations against simpler lagged-correlation or volatility features, it remains possible that the scores are proxying conventional signals rather than genuinely topological ones. Ten stocks over one year is also a narrow base for claiming broad applicability in liquid names.

This paper is aimed at quantitative researchers already working with topological data analysis or high-frequency alternative features. It deserves a serious referee because the pipeline is new enough and the prediction task is well-defined enough that external readers can evaluate the methods and demand the missing robustness checks. I would send it to review with the expectation that the authors will need to add parameter sensitivity tables, baseline comparisons, and clearer out-of-sample results before it could be published.

Referee Report

2 major / 2 minor

Summary. The paper constructs stock-level topological anomaly scores for ten S&P 500 constituents using Takens delay embeddings of intraday price series, BallMapper graphs, and three decoder-conditional VAE variants. These scores are fed into penalised function-on-function regressions to test whether their history predicts return curves. The central claim is that predictive content is confirmed across all assets, bar frequencies, and scoring variants, exhibiting gradual accumulation of impact, frequent early reversal, and weighting toward recent history, with regime- and frequency-dependent features.

Significance. If the topological scores isolate genuine latent co-movement deviations rather than embedding artifacts or conventional statistical features, the work would provide evidence that topological data analysis can yield incremental intraday return predictability beyond standard cross-sectional signals. The multi-variant consistency and function-on-function framework are potentially valuable for capturing temporal fingerprints in high-frequency data.

major comments (2)

[Methods] Methods section on VAE training and regression setup: the decoder-conditional VAE anomaly scores are learned from the same intraday series whose returns are subsequently predicted; without explicit out-of-sample partitioning, rolling-window validation details, or ablation against non-topological baselines (e.g., lagged correlations or volatility), it is unclear whether the reported predictive content reflects genuine topological structure or in-sample recovery of persistent features.
[Results] Results on robustness (likely §4 or §5): no sensitivity analysis is described for Takens embedding parameters (delay, dimension), BallMapper covering radius, or graph-construction thresholds; given only ten assets, the function-on-function regressions could attribute power to any persistent cross-sectional signal captured incidentally by the pipeline rather than topology-specific deviations.
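
The temporal separation requested in the first major comment can be sketched as a walk-forward split, in which every training window ends strictly before its test window begins. The function name `walk_forward_splits` and the window sizes are hypothetical; the paper's validation protocol is not described in the abstract.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges with training strictly before testing."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll the window forward by one test block

# Toy example: 10 observations, 4 for training, 2 for testing per fold.
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
print(len(splits))  # 3
print(splits[0])    # ([0, 1, 2, 3], [4, 5])
```

Under such a scheme both the VAE that produces the anomaly scores and the regression that consumes them would be fitted on the training indices only, so that no score used for prediction has seen the returns it is meant to predict.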

minor comments (2)

[Abstract] Abstract and introduction: quantitative metrics (R², out-of-sample MSE, error bars) for the predictive regressions are not summarized, making it difficult to gauge effect sizes relative to the claimed consistency across frequencies and variants.
[Methods] Notation: the precise definition of the 'decoder-conditional' conditioning and how the anomaly score is extracted from the VAE latent space should be stated explicitly with an equation to avoid ambiguity in replication.
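
To illustrate the kind of statement the last comment asks for, one generic possibility is a reconstruction-probability score in the spirit of An and Cho (2015), with the context entering only through the decoder. This is an assumed form, not the paper's definition: the symbols $x$ (the observation being scored), $c$ (the conditioning context), $z$ (the latent variable), $q_\phi$, $p_\theta$, and the number of Monte Carlo draws $M$ are all introduced here for illustration.

```latex
s(x, c) \;=\; -\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z, c)\right]
\;\approx\; -\frac{1}{M}\sum_{m=1}^{M} \log p_\theta\!\left(x \mid z^{(m)}, c\right),
\qquad z^{(m)} \sim q_\phi(z \mid x).
```

Under this reading, a high score means the observation is poorly explained by the decoder given its context. Whether the paper's three variants differ in the choice of $c$, in the score functional, or in both is exactly what an explicit equation would settle.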

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below, indicating planned revisions to clarify methodology and strengthen robustness checks while preserving the core contribution on topological anomaly scores.

Referee: [Methods] Methods section on VAE training and regression setup: the decoder-conditional VAE anomaly scores are learned from the same intraday series whose returns are subsequently predicted; without explicit out-of-sample partitioning, rolling-window validation details, or ablation against non-topological baselines (e.g., lagged correlations or volatility), it is unclear whether the reported predictive content reflects genuine topological structure or in-sample recovery of persistent features.

Authors: We agree that explicit temporal separation and validation details are essential to establish that the predictive content arises from topological structure rather than in-sample recovery. The decoder-conditional VAE is designed to condition on market-level topological structure and cross-sectional context, but the current draft does not sufficiently document the training window relative to the prediction horizon. In revision we will add a dedicated subsection detailing rolling-window procedures for VAE training and the subsequent function-on-function regressions, ensuring that anomaly scores used for prediction are computed only on data preceding the return curves being forecast. We will also include ablation experiments against non-topological baselines (lagged correlations, realized volatility, and simple PCA-based scores) to quantify incremental explanatory power attributable to the topological pipeline.
Revision: yes

Referee: [Results] Results on robustness (likely §4 or §5): no sensitivity analysis is described for Takens embedding parameters (delay, dimension), BallMapper covering radius, or graph-construction thresholds; given only ten assets, the function-on-function regressions could attribute power to any persistent cross-sectional signal captured incidentally by the pipeline rather than topology-specific deviations.

Authors: We accept that the absence of hyperparameter sensitivity checks leaves open the possibility that results are driven by incidental persistent signals. We will insert a new robustness subsection that systematically varies Takens delay and dimension, BallMapper covering radius, and edge-threshold values, reporting stability of the reported temporal fingerprint (gradual accumulation, early reversal, recent-history weighting). With respect to the sample of ten assets, this selection was deliberate to ensure high-quality, synchronized intraday data across liquid names; the uniformity of findings across all ten constituents and three bar frequencies offers some protection against idiosyncratic artifacts. Nevertheless, we will expand the limitations discussion to acknowledge the modest cross-section and will note that future work should scale the approach to larger universes. The planned baseline ablations will further help isolate topology-specific effects from generic cross-sectional persistence.
Revision: yes
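
The robustness subsection promised in the second response amounts to re-running the pipeline over a grid of settings and tabulating a summary of the fingerprint at each one. A minimal sketch, assuming a hypothetical callable `run_pipeline(delay, dimension, radius)` that returns one summary value; the toy stand-in and the parameter values below are made up for illustration.

```python
from itertools import product

def sensitivity_grid(run_pipeline, delays, dimensions, radii):
    """Evaluate a pipeline summary at every parameter combination."""
    results = {}
    for delay, dimension, radius in product(delays, dimensions, radii):
        results[(delay, dimension, radius)] = run_pipeline(delay, dimension, radius)
    return results

# Hypothetical stand-in for the full pipeline: returns a made-up scalar.
def toy_pipeline(delay, dimension, radius):
    return delay * dimension * radius

table = sensitivity_grid(
    toy_pipeline, delays=[1, 2], dimensions=[3, 4], radii=[0.5]
)
print(len(table))          # 4
print(table[(2, 4, 0.5)])  # 4.0
```

In practice the stored value would be something like the reversal horizon or the share of predictive weight on recent history, so that stability across the grid can be read off directly.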

Circularity Check

0 steps flagged

No circularity: pipeline steps remain distinct and externally verifiable

The abstract and description outline a sequential pipeline (Takens embedding → BallMapper graph → decoder-conditional VAE anomaly scoring → penalised function-on-function regression on returns) with no quoted equations or text showing self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. Anomaly scores are constructed as derived features from the data before being used as regressors; this separation does not reduce the prediction step to the input by construction. No uniqueness theorems, ansatzes smuggled via citation, or renamings of known results are present in the supplied text. The central claim therefore retains independent content and is not forced by its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on unstated assumptions about the validity of Takens embedding, BallMapper graph construction, and VAE latent-space anomaly definition, none of which are detailed here.


Reference graph

Works this paper leans on

5 extracted references · 2 canonical work pages · 1 internal anchor

[1]

10 Cyril Bénézet and Stéphane Crépey

Amihud Y. (2002), Illiquidity and stock returns: cross-section and time-series effects, Journal of Financial Markets, 5(1), 31–56, DOI: 10.1016/S1386-4181(01)00024-6. An J., Cho S. (2015), Variational autoencoder based anomaly detection using reconstruction probability, Special Lecture on IE, 2(1), 1–18. Carlsson G. (2009), Topology and data, Bulletin ...

work page doi:10.1016/s1386-4181(01)00024-6 2002
[2]

Auto-Encoding Variational Bayes

, arXiv:1312.6114, DOI: 10.48550/arXiv.1312.6114. Mehrotra K.G., Mohan C.K., Huang H. (2017), Anomaly Detection Principles and Algorithms, Springer, DOI: 10.1007/978-3-319-67526-8. Ogasawara E., Salles R., Porto F., Pacitti E. (2025), Event Detection in Time Series, Synthesis Lectures on Data Management, Springer Nature, DOI: 10.1007/978-3-031-75941-3. P...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1312.6114 2017
[3]

illiquidity ratio over the sample period (2025-04-01 to 2026-03-31). For stock $i$ on day $t$ the daily Amihud ILLIQ is $\mathrm{ILLIQ}_{i,t} = \dfrac{|r_{i,t}|}{P_{i,t} \cdot V_{i,t}}$, with $r_{i,t} = \dfrac{\mathrm{Close}_{i,t}}{\mathrm{Open}_{i,t}} - 1$, where $P_{i,t}$ is the closing price and $V_{i,t}$ the share volume (dollar volume is expressed in millions). The per-stock measure is the mean over all valid trading days with at least 80%...

2025
[4]

▪ Spread across large ℎ: impact is persistent - the anomaly predicts returns far into the future

Table 5. Notation for the anchored function-on-function regression model
Symbol | Domain | Role
$k$ | $1, \dots, K$ | Replication (anchor) index, one per snapshot
$h$ | $[1, H]$ | Horizon from anchor; response domain, identical for all replications
$w$ | $[0, W]$ | Integration variable; predictor domain, with $w = 0$ the oldest observation in the window and $w = W$ the anchor snapshot
$r_i(a_k, a_k + h)$ | Scalar ...

2006
[5]

+ 0.5)/$n$ where $n$ is the total number of observations ($n = 480$ in the full design).
† Before applying logit, $h_{0.50}/H$ is clamped to $[10^{-4}, 1 - 10^{-4}]$ to prevent $\pm\infty$ at exact boundary values; no observed values fall outside this interval.
Table 7. Mixed-effects model p-values for Measures 1–4
Term | $\delta(H/4)$ | $h_{0.50}$ | $\bar{w}$ | $U$
Intercept | < .001 | 0.216 | < .001 | < .001
Quarter 0...

2025
