Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis
Pith reviewed 2026-05-21 11:21 UTC · model grok-4.3
The pith
A node transformer on a market graph fused with BERT sentiment reaches 0.80 percent MAPE for one-day stock forecasts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper represents the stock market as a graph in which stocks are nodes and edges encode sectoral affiliations, correlated price movements, and supply chain connections. A node transformer processes historical data over this graph, and sentiment extracted by a fine-tuned BERT model is fused in through attention mechanisms. The paper claims that this combination yields superior one-day-ahead forecasts. Experiments on 20 S&P 500 stocks from January 1982 to March 2025 produce a mean absolute percentage error of 0.80 percent, compared with 1.20 percent for ARIMA and 1.00 percent for LSTM. Sentiment analysis accounts for a 10 percent overall error reduction and 25 percent during earnings announcements, and the graph architecture adds a further 15 percent improvement.
What carries the argument
Node transformer architecture applied to a graph of stocks whose edges encode sectoral, correlation, and supply-chain relationships, with attention-based fusion of BERT sentiment features from social media posts.
If this is right
- The model achieves a mean absolute percentage error of 0.80 percent for one-day-ahead predictions on the 20 tested S&P 500 stocks.
- Sentiment integration reduces overall prediction error by 10 percent and by 25 percent during earnings announcements.
- The graph-based architecture contributes an additional 15 percent improvement by capturing inter-stock dependencies.
- Directional accuracy reaches 65 percent and error stays lower than baselines during high-volatility periods.
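The two headline metrics above can be made concrete with a small sketch. It assumes the standard definitions of MAPE and directional accuracy; the paper's exact formulas are not quoted on this page, and the price series below is hypothetical, not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def directional_accuracy(previous, actual, predicted):
    """Percent of forecasts whose predicted move from the previous close
    has the same sign as the realized move."""
    hits = sum(
        1
        for prev, a, p in zip(previous, actual, predicted)
        if (a - prev) * (p - prev) > 0
    )
    return 100.0 * hits / len(actual)


# Hypothetical closing prices, for illustration only.
previous = [100.0, 102.0, 101.0, 103.0]
actual = [102.0, 101.0, 103.0, 104.0]
predicted = [101.0, 102.5, 102.0, 102.0]

print(round(mape(actual, predicted), 2))                   # 1.34
print(directional_accuracy(previous, actual, predicted))   # 50.0
```

The example shows that the two metrics can diverge: a forecast can sit close to the realized price and still point in the wrong direction, which is why the 0.80 percent MAPE and the 65 percent directional accuracy are reported separately.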
Where Pith is reading between the lines
- The same graph-plus-sentiment structure could be redefined for other asset classes such as commodities by substituting appropriate relationship edges.
- Real-time updating of the graph with fresh supply-chain data might support live trading systems that react to changing company linkages.
- Testing the model on multi-day horizons would show how long the relational and sentiment signals continue to provide value.
Load-bearing premise
The constructed graph edges and the social media sentiment scores supply genuine forward-looking predictive information rather than reflecting past patterns or introducing data leakage.
What would settle it
Re-training and testing the model on data from April 2025 onward to verify whether the 10 percent error reduction from sentiment and the 15 percent gain from the graph structure still appear on an out-of-sample period with no overlap to the original 1982-2025 dataset.
Original abstract
Stock market prediction presents considerable challenges for investors, financial institutions, and policymakers operating in complex market environments characterized by noise, non-stationarity, and behavioral dynamics. Traditional forecasting methods, including fundamental analysis and technical indicators, often fail to capture the intricate patterns and cross-sectional dependencies inherent in financial markets. This paper presents an integrated framework combining a node transformer architecture with BERT-based sentiment analysis for stock price forecasting. The proposed model represents the stock market as a graph structure where individual stocks form nodes and edges capture relationships including sectoral affiliations, correlated price movements, and supply chain connections. A fine-tuned BERT model extracts sentiment information from social media posts and combines it with quantitative market features through attention-based fusion mechanisms. The node transformer processes historical market data while capturing both temporal evolution and cross-sectional dependencies among stocks. Experiments conducted on 20 S&P 500 stocks spanning January 1982 to March 2025 demonstrate that the integrated model achieves a mean absolute percentage error (MAPE) of 0.80% for one-day-ahead predictions, compared to 1.20% for ARIMA and 1.00% for LSTM. The inclusion of sentiment analysis reduces prediction error by 10% overall and 25% during earnings announcements, while the graph-based architecture contributes an additional 15% improvement by capturing inter-stock dependencies. Directional accuracy reaches 65% for one-day forecasts. Statistical validation through paired t-tests confirms the significance of these improvements (p < 0.05 for all comparisons). The model maintains lower error during high-volatility periods, achieving MAPE of 1.50% while baseline models range from 1.60% to 2.10%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an integrated node transformer architecture combined with BERT-based sentiment analysis for one-day-ahead stock price prediction. Stocks are represented as nodes in a graph with edges encoding sectoral affiliations, correlated price movements, and supply chain connections. A fine-tuned BERT extracts sentiment from social media, fused via attention mechanisms with quantitative features. On 20 S&P 500 stocks from January 1982 to March 2025, the model reports a MAPE of 0.80%, outperforming ARIMA (1.20%) and LSTM (1.00%), with sentiment reducing error by 10% overall (25% during earnings) and the graph contributing an additional 15% improvement; directional accuracy is 65% and paired t-tests show p < 0.05 significance.
Significance. If the central results hold after addressing potential leakage, the work would offer a concrete demonstration of how graph transformers can capture cross-sectional dependencies while incorporating behavioral signals from NLP, which could inform more robust forecasting approaches in noisy, non-stationary financial settings.
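The paired t-test mentioned in the summary can be sketched as follows. This is a minimal illustration that assumes the test is applied to per-forecast absolute percentage errors of two models on the same days; the page does not say what the paper actually paired, and the error values below are hypothetical.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(errors_a) != len(errors_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical absolute percentage errors on the same five forecast days.
baseline_errors = [1.2, 0.9, 1.5, 1.1, 1.3]
model_errors = [0.8, 0.7, 1.0, 0.9, 1.1]

t_stat, dof = paired_t_statistic(baseline_errors, model_errors)
print(round(t_stat, 3), dof)
```

With 4 degrees of freedom the two-sided 5 percent critical value is about 2.776, so a statistic above that would be reported as p < 0.05. A paired t-test treats the error differences as independent. Forecast errors are often serially correlated, and the Diebold-Mariano test is designed to handle that case.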
major comments (2)
- Abstract: The description of graph edges that 'capture relationships including ... correlated price movements' provides no detail on whether correlations are computed over the full 1982-2025 window or via rolling windows using only data available at each forecast origin. If the former, future price information contaminates the graph for earlier dates, directly violating the one-day-ahead setup and rendering the reported 15% graph contribution and overall MAPE claims unreliable.
- Abstract: No information is given on training/validation/test splits, hyperparameter tuning, or procedures to control for look-ahead bias in either the graph construction or the sentiment data alignment. In non-stationary financial series, these omissions make it impossible to determine whether the 0.80% MAPE, the 10%/25% sentiment reductions, or the t-test results reflect out-of-sample predictive power rather than in-sample fitting.
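The leakage concern in the first major comment can be illustrated with a sketch of correlation edges built from a trailing window only. The function names, the window length, and the threshold are assumptions for illustration; the paper's actual graph construction is not specified on this page.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant series: treat as uncorrelated
    return cov / math.sqrt(vx * vy)


def correlation_edges(returns, t, window, threshold):
    """Edges for the forecast made at index t, from returns[t - window:t] only.

    `returns` maps ticker -> list of daily returns. Index t and everything
    after it are never read, so no future information enters the graph.
    """
    if t < window:
        raise ValueError("not enough history for the requested window")
    tickers = sorted(returns)
    edges = {}
    for i, a in enumerate(tickers):
        for b in tickers[i + 1:]:
            rho = pearson(returns[a][t - window:t], returns[b][t - window:t])
            if abs(rho) >= threshold:
                edges[(a, b)] = rho
    return edges


# Hypothetical tickers and returns, for illustration only.
returns = {
    "AAA": [0.01, -0.02, 0.015, 0.005, -0.01, 0.03],
    "BBB": [0.02, -0.04, 0.03, 0.01, -0.02, -0.05],
    "CCC": [0.01, 0.01, -0.02, 0.0, 0.01, 0.0],
}
print(correlation_edges(returns, t=5, window=5, threshold=0.8))
```

A full-sample alternative would compute `pearson` once over every available date and reuse that graph for all forecast origins, which is exactly the contamination the referee describes. A simple check for the trailing-window version is that changing any value at index `t` or later leaves the edge set unchanged.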
minor comments (2)
- Abstract: The fusion mechanism is described only at a high level ('attention-based fusion mechanisms'); adding a brief equation or diagram would clarify how sentiment embeddings are combined with node features.
- The abstract reports results on 20 stocks but does not specify selection criteria or whether results generalize beyond this small sample; a sensitivity table across different stock subsets would strengthen the claims.
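For the first minor comment, one generic form that an attention-based fusion could take is cross-attention from node features to sentiment embeddings, using standard scaled dot-product attention with a residual connection. This is an illustrative assumption, not the paper's equation: every symbol below is introduced here for the sketch.

```latex
% Illustrative cross-attention fusion. All symbols are assumptions,
% not the paper's notation.
%   H \in \mathbb{R}^{N \times d}   : market features for N stock nodes
%   S \in \mathbb{R}^{N \times d}   : BERT sentiment embeddings aligned to the same stocks
%   W_Q, W_K \in \mathbb{R}^{d \times d_k}, \; W_V \in \mathbb{R}^{d \times d_v}, \; W_O \in \mathbb{R}^{d_v \times d}
\begin{aligned}
Q &= H W_Q, \qquad K = S W_K, \qquad V = S W_V, \\
A &= \operatorname{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} \right), \\
Z &= H + A V W_O .
\end{aligned}
```

In this form each row of $A$ gives the weights that one stock places on the sentiment signals of all stocks, and $Z$ is the fused node representation. An equation at this level of detail would let a reader see whether sentiment enters per stock, across stocks, or both.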
Simulated Author's Rebuttal
We thank the referee for their valuable comments, which help improve the clarity and rigor of our manuscript regarding potential data leakage and experimental details. We address each concern below and will incorporate the necessary revisions.
- Referee: Abstract: The description of graph edges that 'capture relationships including ... correlated price movements' provides no detail on whether correlations are computed over the full 1982-2025 window or via rolling windows using only data available at each forecast origin. If the former, future price information contaminates the graph for earlier dates, directly violating the one-day-ahead setup and rendering the reported 15% graph contribution and overall MAPE claims unreliable.
  Authors: We appreciate this observation on a potential source of look-ahead bias. Upon review, the original manuscript lacked explicit specification of the correlation calculation window. In the revised version, we will clarify that edge correlations are derived from rolling windows using only past data up to the forecast date. This ensures compliance with the one-day-ahead protocol and preserves the validity of the reported improvements from the graph structure. Additional details and justification will be added to the methods section.
  Revision: yes
- Referee: Abstract: No information is given on training/validation/test splits, hyperparameter tuning, or procedures to control for look-ahead bias in either the graph construction or the sentiment data alignment. In non-stationary financial series, these omissions make it impossible to determine whether the 0.80% MAPE, the 10%/25% sentiment reductions, or the t-test results reflect out-of-sample predictive power rather than in-sample fitting.
  Authors: We agree that these procedural details are crucial for assessing the robustness of results in financial forecasting. We will revise the paper to include a dedicated subsection on the experimental protocol, specifying the chronological train-validation-test split, the use of walk-forward validation for hyperparameter selection, and safeguards against look-ahead bias in graph and sentiment feature construction. This will demonstrate that our evaluations are strictly out-of-sample.
  Revision: yes
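The chronological split and walk-forward validation promised in the rebuttal can be sketched as an expanding-window fold generator. The function name, fold sizes, and the expanding-window choice are assumptions for illustration; the rebuttal does not give the actual protocol.

```python
def walk_forward_splits(n_obs, initial_train, val_size, test_size, step=None):
    """Yield (train, val, test) index ranges in strict chronological order.

    Expanding-window scheme: each fold trains on everything before its
    validation block, tunes hyperparameters on validation, and is scored
    on the test block that immediately follows.
    """
    step = step or test_size
    train_end = initial_train
    while train_end + val_size + test_size <= n_obs:
        val_end = train_end + val_size
        test_end = val_end + test_size
        yield range(0, train_end), range(train_end, val_end), range(val_end, test_end)
        train_end += step


# Toy example with 10 observations; real folds would span years of trading days.
for train, val, test in walk_forward_splits(n_obs=10, initial_train=4, val_size=2, test_size=2):
    print(list(train), list(val), list(test))
```

Within every fold the training indices precede the validation indices, which precede the test indices, so nothing used for fitting or tuning postdates the scored forecasts. The same discipline has to apply to feature construction: graph edges and sentiment scores for a given day should be computed only from data available before that day.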
Circularity Check
No significant circularity detected
The paper is an empirical ML study that trains a node transformer plus BERT model on historical stock data and reports test-set performance metrics such as MAPE and percentage improvements. No mathematical derivation chain, equations, or first-principles results are present that reduce to self-definition or to fitted inputs by construction. The graph construction and sentiment fusion are described as model components whose outputs are evaluated on held-out forecasts; these steps do not collapse into the input data by the paper's own statements. Standard self-citations, if any, are not load-bearing for the central empirical claims. The reported results therefore remain self-contained experimental outcomes rather than tautological restatements of the inputs.
Axiom & Free-Parameter Ledger
free parameters (2)
- Transformer attention weights and BERT fine-tuning parameters
- Graph edge weights for inter-stock relationships
axioms (2)
- domain assumption: Historical price movements and social media sentiment contain exploitable patterns for future price prediction
- domain assumption: The defined graph structure accurately represents relevant dependencies between stocks
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "node transformer architecture... graph structure where individual stocks form nodes and edges capture relationships including sectoral affiliations, correlated price movements, and supply chain connections... BERT-based sentiment analysis"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.