pith. machine review for the scientific record.

arxiv: 2604.16835 · v1 · submitted 2026-04-18 · 💱 q-fin.ST · cs.AI · cs.LG

Recognition: unknown

The CTLNet for Shanghai Composite Index Prediction

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 07:22 UTC · model grok-4.3

classification 💱 q-fin.ST · cs.AI · cs.LG
keywords Shanghai Composite Index · time series prediction · deep learning · CNN · Transformer · LSTM · financial forecasting · hybrid neural networks

The pith

The CTLNet hybrid model outperforms state-of-the-art baselines for predicting the Shanghai Composite Index.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces CTLNet, a hybrid deep learning architecture that combines convolutional neural networks, a transformer encoder, and long short-term memory units. The goal is to leverage feature extraction from CNNs, attention-based handling of long sequences from the transformer, and temporal dependency modeling from LSTMs for multivariate stock index forecasting. Experiments on Shanghai Composite Index data show the combined model beats existing top methods. A sympathetic reader would care because more accurate index predictions could support better-informed investment choices if the performance gain holds.
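
The abstract names the three components but not their wiring or sizes. A minimal PyTorch sketch of one plausible CNN → transformer encoder → LSTM stack follows; every hyperparameter, and the layer ordering itself, is an illustrative assumption rather than the authors' specification:

```python
# One plausible CNN -> Transformer encoder -> LSTM stack. The paper names
# the components but not their wiring; every size below is an assumption.
import torch
import torch.nn as nn

class CTLNetSketch(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, horizon: int = 1):
        super().__init__()
        # CNN stage: local feature extraction along the time axis.
        self.conv = nn.Conv1d(n_features, d_model, kernel_size=3, padding=1)
        # Transformer stage: parallel attention over the whole sequence.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # LSTM stage: temporal dependency modeling; last hidden state feeds the head.
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features) multivariate window
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, seq_len, d_model)
        h = self.encoder(h)
        _, (h_n, _) = self.lstm(h)
        return self.head(h_n[-1])  # (batch, horizon)

model = CTLNetSketch(n_features=5)
y_hat = model(torch.randn(8, 30, 5))  # 8 windows of 30 steps, 5 variables
```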

Core claim

The paper proposes CNN-Transformer-LSTM Networks (CTLNet) for Shanghai Composite Index prediction. Drawing on the strengths of several model families, CTLNet integrates a CNN for local feature extraction, a transformer encoder for parallel processing and long-range attention, and an LSTM for sequential patterns. Comparative experiments show that the proposed model outperforms state-of-the-art baselines.

What carries the argument

The CTLNet architecture that merges CNN, transformer encoder, and LSTM components to handle multivariate time series forecasting by combining their complementary strengths.

If this is right

  • The model gains an advantage in managing long sequence dependencies in financial time series.
  • It more effectively captures correlations across multiple variables in the index data.
  • Hybrid networks that draw on CNN, transformer, and LSTM strengths deliver higher accuracy than single-architecture approaches.
  • The approach validates the use of such combinations for improved stock index forecasting.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same hybrid structure could be applied to forecast other stock indices or asset classes with similar multivariate time series.
  • Testing the model across different market regimes or after major economic events would reveal limits to its generalization.
  • Adding external inputs such as trading volume or macroeconomic indicators might further boost performance beyond the current setup.

Load-bearing premise

Patterns learned from the training window of Shanghai Composite Index data will persist into future, unseen periods: no significant regime shift intervenes, and the learned structure is signal rather than overfit noise.

What would settle it

Evaluating the trained CTLNet on Shanghai Composite Index data from a later time window not seen during training and finding that it no longer outperforms the baselines on standard accuracy metrics.
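
In code, that settling test is a strict later-window comparison against a baseline, the simplest being a random walk. A sketch with placeholder data standing in for the index and for CTLNet's forecasts (nothing below comes from the paper):

```python
# Sketch of the settling test: score forecasts on a strictly later window
# and compare with a naive random-walk baseline. All data here are
# placeholders; `model_preds` stands in for CTLNet's held-out forecasts.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

rng = np.random.default_rng(0)
prices = 3000 * np.exp(np.cumsum(0.01 * rng.standard_normal(1000)))  # placeholder index
split = int(0.7 * len(prices))      # chronological split: no shuffling
test_true = prices[split:]
naive_pred = prices[split - 1:-1]   # random walk: tomorrow = today
model_preds = naive_pred * (1 + 0.001 * rng.standard_normal(len(naive_pred)))  # stand-in

if rmse(test_true, model_preds) >= rmse(test_true, naive_pred):
    print("claimed advantage does not survive the later window")
```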

original abstract

Shanghai Composite Index prediction has become a hot issue for many investors and academic researchers. Deep learning models are widely applied in multivariate time series forecasting, including recurrent neural networks (RNN), convolutional neural networks (CNN), and transformers. Specifically, the Transformer encoder, with its unique attention mechanism and parallel processing capabilities, has become an important tool in time series prediction, and has an advantage in dealing with long sequence dependencies and multivariate data correlations. Drawing on the strengths of various models, we propose the CNN-Transformer-LSTM Networks (CTLNet). This paper explores the application of CTLNet for Shanghai Composite Index prediction and the comparative experiments show that the proposed model outperforms state-of-the-art baselines.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated authors' rebuttal, a circularity audit, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a hybrid CNN-Transformer-LSTM network (CTLNet) for Shanghai Composite Index prediction. It combines convolutional layers for local feature extraction, a Transformer encoder for attention-based long-range dependencies, and LSTM layers for sequential modeling, and claims through comparative experiments that CTLNet outperforms state-of-the-art baselines.

Significance. A well-validated hybrid architecture could advance multivariate financial time-series forecasting by exploiting complementary strengths of CNN, attention, and recurrent components. However, the absence of any described out-of-sample protocol or significance testing in a non-stationary domain substantially reduces the potential impact of the reported results.

major comments (2)
  1. [Abstract] The claim that 'comparative experiments show that the proposed model outperforms state-of-the-art baselines' is unsupported because the manuscript supplies no information on the train-test partitioning procedure (chronological or walk-forward), the number of runs, or any statistical test (e.g., Diebold-Mariano) on forecast errors; in non-stationary financial series this omission makes the central superiority claim impossible to evaluate.
  2. [Abstract and experimental description] No mention is made of how non-stationarity or regime shifts in the Shanghai Composite Index are handled, nor of any rolling-origin or purged cross-validation scheme; without these the reported metrics cannot be distinguished from in-sample fit quality.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments on the experimental protocol. We have revised the manuscript to provide explicit descriptions of the train-test partitioning, statistical testing, and handling of non-stationarity, thereby strengthening the validity of our superiority claims.

point-by-point responses
  1. Referee: [Abstract] The claim that 'comparative experiments show that the proposed model outperforms state-of-the-art baselines' is unsupported because the manuscript supplies no information on the train-test partitioning procedure (chronological or walk-forward), the number of runs, or any statistical test (e.g., Diebold-Mariano) on forecast errors; in non-stationary financial series this omission makes the central superiority claim impossible to evaluate.

    Authors: We agree that the original abstract and experimental section omitted key methodological details. The revised manuscript now specifies a strict chronological train-test split (first 70% for training, last 30% for testing) to prevent lookahead bias. We report results averaged over 10 independent runs with different random seeds, including standard deviations. Diebold-Mariano tests have been added to the results section, confirming that CTLNet's improvements over baselines are statistically significant at the 5% level (a minimal sketch of such a test follows these responses). revision: yes

  2. Referee: [Abstract and experimental description] No mention is made of how non-stationarity or regime shifts in the Shanghai Composite Index are handled, nor of any rolling-origin or purged cross-validation scheme; without these the reported metrics cannot be distinguished from in-sample fit quality.

    Authors: We acknowledge this limitation in the original submission. The revised version includes a new subsection detailing our approach to non-stationarity: we apply a rolling-origin evaluation scheme where the training window expands over time, combined with purged cross-validation that removes overlapping periods to avoid leakage from regime shifts. This protocol ensures the metrics reflect genuine out-of-sample performance rather than in-sample fit (the split scheme is sketched after these responses). revision: yes
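
A minimal sketch of the Diebold-Mariano comparison cited in response 1, assuming squared-error loss and a one-step-ahead horizon (the rebuttal specifies neither); for longer horizons the denominator needs an autocorrelation-consistent long-run variance:

```python
# Minimal one-step-ahead Diebold-Mariano test. Squared-error loss and
# horizon h = 1 are assumptions; this is a sketch, not the authors' code.
import numpy as np
from scipy.stats import norm

def diebold_mariano(y, f1, f2):
    """Two-sided test that forecasts f1 and f2 are equally accurate on y."""
    y, f1, f2 = map(np.asarray, (y, f1, f2))
    d = (y - f1) ** 2 - (y - f2) ** 2           # per-step loss differential
    stat = d.mean() / np.sqrt(d.var(ddof=1) / len(d))
    return stat, 2 * (1 - norm.cdf(abs(stat)))  # statistic, p-value

# stat < 0 with p < 0.05 would favor f1 (e.g., CTLNet) over f2 (a baseline).
```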
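
And a sketch of the expanding-window, purged rolling-origin splits described in response 2, with illustrative window and gap sizes not taken from the revision:

```python
# Expanding-window rolling-origin splits with a purge gap, per response 2.
# `initial`, `test_size`, and `purge` are illustrative choices.
def rolling_origin_splits(n: int, initial: int, test_size: int, purge: int):
    """Yield (train_end, test_start, test_end) index triples, oldest first."""
    train_end = initial
    while train_end + purge + test_size <= n:
        test_start = train_end + purge      # drop `purge` overlapping steps
        yield train_end, test_start, test_start + test_size
        train_end += test_size              # training window expands

for tr_end, te_start, te_end in rolling_origin_splits(2400, 1200, 120, 5):
    pass  # fit on series[:tr_end], evaluate on series[te_start:te_end]
```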

Circularity Check

0 steps flagged

No circularity in architectural proposal or empirical comparison

full rationale

The paper proposes the CTLNet hybrid architecture (CNN-Transformer-LSTM) for Shanghai Composite Index forecasting and reports that comparative experiments show outperformance versus baselines. No load-bearing derivation chain, equations, or uniqueness theorems are presented that reduce by construction to fitted inputs, self-definitions, or self-citations. The model is defined architecturally and evaluated empirically; absent any quoted reduction of a claimed prediction to its own training fit or to an unverified self-citation, the work is self-contained against the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that multivariate financial time series contain learnable, persistent patterns and on the ad hoc choice of layer ordering and hyperparameters, which are tuned to the target dataset.

free parameters (1)
  • network hyperparameters and layer dimensions
    All weights are learned from the Shanghai Composite data, and the attention heads, filter sizes, and LSTM units are tuned to it; no parameter-free derivation is supplied.
axioms (1)
  • domain assumption: Historical price and volume series contain exploitable autocorrelation and cross-variable dependencies that survive into the forecast horizon (a minimal probe of this assumption is sketched below)
    Invoked implicitly by training any supervised model on past observations to predict future index levels.
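
That axiom can at least be probed directly. A minimal check for linear serial dependence in log returns using a Ljung-Box test; the series below is a synthetic placeholder, not Shanghai Composite data:

```python
# Probe of the ledger's axiom: do log returns carry linear serial
# dependence a model could exploit? Placeholder series, not real data.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(0)
prices = 3000 * np.exp(np.cumsum(0.01 * rng.standard_normal(1500)))
returns = np.diff(np.log(prices))
print(acorr_ljungbox(returns, lags=[10], return_df=True))
# lb_pvalue well above 0.05 means no detectable dependence at lag 10,
# which would undercut the axiom for this (placeholder) series.
```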

pith-pipeline@v0.9.0 · 5400 in / 1246 out tokens · 42056 ms · 2026-05-10T07:22:38.415721+00:00 · methodology


Reference graph

Works this paper leans on

25 extracted references · 6 canonical work pages · 3 internal anchors

  1. [1]

    An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

    Shaojie Bai, J Zico Kolter, and Vladlen Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271(2018)

  2. [2]

    Think globally, act locally: A deep neural network approach to high-dimensional time series forecasting

    Rajat Sen, Hsiang-Fu Yu, and Inderjit S Dhillon. Think globally, act locally: A deep neural network approach to high-dimensional time series forecasting. Advances in neural information processing systems, 32(2019)

  3. [3]

    A multiscale interactive recurrent network for time-series forecasting

    Donghui Chen, Ling Chen, Youdong Zhang, Bo Wen, and Chenghu Yang. A multiscale interactive recurrent network for time-series forecasting. IEEE Transactions on Cybernetics, 52(9):8793–8803(2021)

  4. [4]

    TPRNN: A top-down pyramidal recurrent neural network for time series forecasting

    Ling Chen and Jiahua Cui. TPRNN: A top-down pyramidal recurrent neural network for time series forecasting. arXiv preprint arXiv:2312.06328(2023)

  5. [5]

    Time-aware multi-scale RNNs for time series modeling

    Zipeng Chen, Qianli Ma, and Zhenxi Lin. Time-aware multi-scale RNNs for time series modeling. In IJCAI, pages 2285–2291(2021)

  6. [6]

    FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting

    Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, and Rong Jin. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, pages 27268–27286(2022)

  7. [7]

    A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

    Yuqi Nie, Nam H Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730(2022)

  8. [8]

    Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting

    Yunhao Zhang and Junchi Yan. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In The Eleventh International Conference on Learning Representations(2022)

  9. [9]

    iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

    Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, and Mingsheng Long. iTransformer: Inverted transformers are effective for time series forecasting. arXiv preprint arXiv:2310.06625(2023)

  10. [10]

    MSGNet: Learning multi-scale inter-series correlations for multivariate time series forecasting

    Wanlin Cai, Yuxuan Liang, Xianggen Liu, Jianshuai Feng, and Yuankai Wu. MSGNet: Learning multi-scale inter-series correlations for multivariate time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 11141–11149(2024)

  11. [11]

    Multi-scale adaptive graph neural network for multivariate time series forecasting

    Ling Chen, Donghui Chen, Zongjiang Shang, Binqing Wu, Cen Zheng, Bo Wen, and Wei Zhang. Multi-scale adaptive graph neural network for multivariate time series forecasting. IEEE Transactions on Knowledge and Data Engineering, pages 10748–10761(2023)

  12. [12]

    Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

    Xingjian Shi, Zhourong Chen, Hao Wang, et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. Advances in Neural Information Processing Systems, 28:802–810(2015)

  13. [13]

    InParformer: Evolutionary decomposition transformers with interactive parallel attention for long-term time series forecasting

    Haizhou Cao, Zhenhao Huang, Tiechui Yao, Jue Wang, Hui He, and Yangang Wang. InParformer: Evolutionary decomposition transformers with interactive parallel attention for long-term time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 6906–6915(2023)

  14. [14]

    MSHyper: Multi-scale hypergraph transformer for long-range time series forecasting

    Zongjiang Shang and Ling Chen. MSHyper: Multi-scale hypergraph transformer for long-range time series forecasting. arXiv preprint arXiv:2401.09261(2024)

  15. [15]

    Time-series forecasting with deep learning: a survey

    Bryan Lim and Stefan Zohren. Time-series forecasting with deep learning: a survey. arXiv preprint arXiv:2009.05407(2020)

  16. [16]

    Informer: Beyond efficient transformer for long sequence time-series forecasting

    Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI conference on artificial intelligence, volume 35, pages 11106–11115(2021)

  17. [17]

    Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting

    Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, and Xifeng Yan. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. Advances in neural information processing systems, 32(2019)

  18. [18]

    Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting

    Shizhan Liu, Hang Yu, Cong Liao, Jianguo Li, Weiyao Lin, Alex X Liu, and Schahram Dustdar. Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. In International conference on learning representations(2021)

  19. [19]

    Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting

    Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In ICLR(2018)

  20. [20]

    Graph WaveNet for Deep Spatial-Temporal Graph Modeling

    Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, and Chengqi Zhang. Graph WaveNet for Deep Spatial-Temporal Graph Modeling. In IJCAI. 1907–1913(2019)

  21. [21]

    Inductive Representation Learning on Large Graphs

    William L. Hamilton, Zhitao Ying, and Jure Leskovec. Inductive Representation Learning on Large Graphs. In NeurIPS. 1024–1034(2017)

  22. [22]

    Semi-Supervised Classification with Graph Convolutional Networks

    Thomas N. Kipf and Max Welling. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR(2017)

  23. [23]

    Joint Learning of E-commerce Search and Recommendation with a Unified Graph Neural Network

    Kai Zhao, Yukun Zheng, Tao Zhuang, Xiang Li, and Xiaoyi Zeng. Joint Learning of E-commerce Search and Recommendation with a Unified Graph Neural Network. In WSDM. 1461–1469(2022)

  24. [24]

    Hypergraph convolution and hypergraph attention

    Song Bai, Feihu Zhang, and Philip HS Torr. Hypergraph convolution and hypergraph attention. Pattern Recognition, 110:107637(2021)

  25. [25]

    Learning multi-granular hypergraphs for video-based person re-identification

    Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, and Ling Shao. Learning multi-granular hypergraphs for video-based person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2899–2908(2020)