arxiv: 2606.12210 · v1 · pith:3ZZNZETHnew · submitted 2026-06-10 · 💻 cs.CL

Can News Predict the Market? Limits of Zero-Shot Financial NLP and the Role of Explainable AI

Pith reviewed 2026-06-27 09:41 UTC · model grok-4.3

classification 💻 cs.CL
keywords zero-shot learning · financial NLP · stock prediction · explainable AI · natural language inference · sentiment analysis · market forecasting

The pith

Zero-shot natural language inference fails to extract reliable signals from financial news for predicting short-term stock movements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates whether financial news can predict short-term stock price changes using zero-shot language models without domain-specific training. The authors build a pipeline that uses natural language inference to assess each article's sentiment toward a stock, then aggregates these assessments over time, weighting each article by its recency and by how long its type of event is expected to affect prices. Across multiple models and prediction horizons, these zero-shot methods do not surpass simple baselines, and they perform especially poorly when forecasting price drops. The findings point to structural challenges in mapping news content directly to immediate market behavior. At the same time, the explainability signals reliably separate trustworthy predictions from unreliable ones, which offers a way to use the system cautiously even when overall accuracy is low.
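The article-level step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hypothesis templates, the `direction_score` function, and the `toy_entail_prob` stand-in are all assumptions introduced here, and a real setup would replace the stand-in with an actual NLI model's entailment probability.

```python
from typing import Callable, Dict

# Hypothetical hypothesis templates; the paper's actual prompts are not given here.
HYPOTHESES: Dict[str, str] = {
    "up": "This news is good for the stock price of {ticker}.",
    "down": "This news is bad for the stock price of {ticker}.",
}


def direction_score(
    article: str,
    ticker: str,
    entail_prob: Callable[[str, str], float],
) -> float:
    """Signed score in [-1, 1]: P(entails 'up') minus P(entails 'down')."""
    p_up = entail_prob(article, HYPOTHESES["up"].format(ticker=ticker))
    p_down = entail_prob(article, HYPOTHESES["down"].format(ticker=ticker))
    return p_up - p_down


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model (assumption): keyword matching only."""
    positive = "beats" in premise.lower()
    wants_good = "good" in hypothesis
    return 0.9 if positive == wants_good else 0.1


score = direction_score("ACME beats earnings estimates", "ACME", toy_entail_prob)
```

Passing the scorer in as a function keeps the sketch independent of any particular model, which also makes it easy to swap in the alternative zero-shot formulations the referee report asks about.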

Core claim

Zero-shot approaches consistently fail to outperform simple baselines, with particularly weak performance on negative movements, suggesting deeper structural limitations in mapping news sentiment to short-term price dynamics. However, explainability signals reliably distinguish between trustworthy and unreliable predictions, offering practical value even when accuracy is limited. These findings highlight the limits of zero-shot financial NLP and motivate a shift toward decision-support systems that prioritise transparency and uncertainty awareness.

What carries the argument

A zero-shot natural language inference pipeline combined with temporal aggregation of news articles that models recency and event-dependent impact horizons, paired with a multi-layered explainability framework that links predictions to token-level, article-level, and aggregate evidence with grounded natural language rationales.
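One way to picture the temporal aggregation is a recency-weighted average whose decay rate depends on the event type. The sketch below assumes exponential decay with a per-event half-life; the decay form, the half-life values, the event labels, and all names (`ScoredArticle`, `recency_weight`, `aggregate`) are illustrative assumptions, since the summary does not state the paper's actual weighting scheme.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Assumed half-lives in days per event type; illustrative values only.
HALF_LIFE_DAYS: Dict[str, float] = {"earnings": 3.0, "recall": 10.0}
DEFAULT_HALF_LIFE = 1.0


@dataclass
class ScoredArticle:
    score: float      # signed article-level score in [-1, 1]
    age_days: float   # days between publication and prediction time
    event_type: str   # hypothetical event label


def recency_weight(age_days: float, half_life: float) -> float:
    """Exponential decay: the weight halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)


def aggregate(articles: Iterable[ScoredArticle]) -> float:
    """Weighted mean of article scores; returns 0.0 when there is no news."""
    weighted_sum = 0.0
    total_weight = 0.0
    for a in articles:
        half_life = HALF_LIFE_DAYS.get(a.event_type, DEFAULT_HALF_LIFE)
        w = recency_weight(a.age_days, half_life)
        weighted_sum += w * a.score
        total_weight += w
    return weighted_sum / total_weight if total_weight > 0 else 0.0


signal = aggregate([
    ScoredArticle(score=1.0, age_days=0.0, event_type="earnings"),
    ScoredArticle(score=-1.0, age_days=3.0, event_type="earnings"),
])
```

In this toy case the fresh positive article has weight 1 and the three-day-old negative article has weight 0.5, so the aggregate leans positive.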

If this is right

  • Zero-shot financial NLP approaches have structural limits when applied to short-term price prediction from news.
  • Multi-layered explainability can add practical value by distinguishing trustworthy predictions from unreliable ones even when overall accuracy remains low.
  • Financial applications should prioritize transparency and uncertainty awareness in system design over attempts at high-accuracy zero-shot prediction.
  • The consistent underperformance on negative movements indicates that certain directions of market response may be especially difficult to capture without additional training.
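The second point above can be made concrete by splitting predictions on a reliability score and comparing accuracy in each bucket. This is a sketch under assumptions: the summary does not say how the paper derives its explainability signals, so `reliability` here is a hypothetical per-prediction number in [0, 1], and the threshold and data are toy values.

```python
from typing import List, Sequence, Tuple


def accuracy(pairs: Sequence[Tuple[int, int]]) -> float:
    """Fraction of (prediction, label) pairs that match; NaN if empty."""
    if not pairs:
        return float("nan")
    return sum(p == y for p, y in pairs) / len(pairs)


def split_by_reliability(
    preds: Sequence[int],
    labels: Sequence[int],
    reliability: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (accuracy on trusted predictions, accuracy on flagged ones)."""
    trusted: List[Tuple[int, int]] = []
    flagged: List[Tuple[int, int]] = []
    for p, y, r in zip(preds, labels, reliability):
        (trusted if r >= threshold else flagged).append((p, y))
    return accuracy(trusted), accuracy(flagged)


# Toy data: +1 = up, -1 = down.
trusted_acc, flagged_acc = split_by_reliability(
    preds=[1, 1, -1, -1],
    labels=[1, 1, 1, -1],
    reliability=[0.9, 0.8, 0.2, 0.3],
    threshold=0.5,
)
```

A useful reliability signal shows a clear gap between the two numbers even when pooled accuracy is unimpressive.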

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The failure may indicate that short-term price movements are driven more by non-news factors than by sentiment signals accessible through zero-shot inference.
  • Similar structural limits could appear when zero-shot methods are applied to predict quantitative outcomes from text in other domains without domain-specific adaptation.
  • Applying the temporal aggregation approach to longer prediction horizons could test whether news effects become more detectable when allowed to accumulate over multiple days.

Load-bearing premise

Financial news contains extractable predictive signals for short-term price changes that a zero-shot NLI pipeline with temporal aggregation can access.

What would settle it

A demonstration that the described zero-shot NLI pipeline with temporal aggregation achieves higher accuracy than simple baselines on a held-out dataset of news articles and corresponding stock price movements, particularly for negative changes.
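A check of this kind needs at least a majority-class baseline and a per-class recall, so that weak performance on down moves is not hidden by overall accuracy. The following is a minimal sketch on toy data; the function names and labels are assumptions and the numbers are not results from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_baseline_accuracy(labels: Sequence[int]) -> float:
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def model_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def recall(preds: Sequence[int], labels: Sequence[int], cls: int) -> float:
    """Share of true `cls` cases that were predicted as `cls`."""
    relevant = [p for p, y in zip(preds, labels) if y == cls]
    if not relevant:
        return float("nan")
    return sum(p == cls for p in relevant) / len(relevant)


# Toy held-out set: +1 = up, -1 = down.
labels = [1, 1, 1, -1, -1]
preds = [1, 1, 1, 1, -1]

baseline_acc = majority_baseline_accuracy(labels)
acc = model_accuracy(preds, labels)
down_recall = recall(preds, labels, cls=-1)
```

Reporting `down_recall` next to `acc` and `baseline_acc` makes the comparison sensitive to exactly the failure mode the paper highlights.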

Original abstract

Can financial news reliably predict short-term stock movements? Despite advances in large language models, this question remains unresolved. We revisit this problem using a zero-shot natural language processing framework, investigating whether models can extract actionable signals from financial news without domain-specific training. We design a structured pipeline that combines zero-shot natural language inference with temporal aggregation, explicitly modelling recency and event-dependent impact horizons when integrating information across articles. To address the need for transparency in high-stakes settings, we introduce a multi-layered explainability framework that links predictions to token-level, article-level, and aggregate evidence, and produces grounded natural language rationales. Across multiple models and prediction horizons, we find that zero-shot approaches consistently fail to outperform simple baselines, with particularly weak performance on negative movements, suggesting deeper structural limitations in mapping news sentiment to short-term price dynamics. However, explainability signals reliably distinguish between trustworthy and unreliable predictions, offering practical value even when accuracy is limited. These findings highlight the limits of zero-shot financial NLP and motivate a shift toward decision-support systems that prioritise transparency and uncertainty awareness. Code: https://github.com/alimert05/zero-shot-stock-xai

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines whether zero-shot natural language inference (NLI) combined with temporal aggregation of financial news can predict short-term stock price movements. It reports that such zero-shot pipelines consistently fail to outperform simple baselines across models and horizons, with especially weak results on negative movements, and introduces a multi-layered explainability framework (token-, article-, and aggregate-level) that produces grounded rationales and distinguishes trustworthy from unreliable predictions. The work concludes that these results indicate structural limits in mapping news sentiment to prices and advocates shifting toward transparent decision-support systems.

Significance. If the empirical results hold after addressing the noted concerns, the paper would provide concrete evidence that zero-shot NLI approaches have limited utility for short-term financial prediction tasks, while demonstrating practical value for explainability techniques even when predictive accuracy is low. This could usefully redirect research emphasis in financial NLP toward uncertainty-aware and interpretable systems rather than pure accuracy maximization.

major comments (2)
  1. [Abstract] Abstract: the central inference that zero-shot NLI + recency/event-horizon aggregation reveals 'deeper structural limitations in mapping news sentiment to short-term price dynamics' requires that the tested pipeline would have surfaced usable signals if they existed. No ablations on prompt variants, alternative zero-shot formulations (e.g., direct classification vs. NLI), or head-to-head comparisons with supervised models on the same data are described to establish this sensitivity; without them the observed failure is compatible with either absent signals or an insensitive probe.
  2. [Abstract] Abstract and evaluation description: the claim that zero-shot approaches 'consistently fail to outperform simple baselines' is presented without any quantitative metrics, dataset sizes, statistical tests, error bars, or baseline definitions. This absence prevents assessment of effect sizes or reliability and is load-bearing for the headline result.
minor comments (2)
  1. The GitHub link is provided but the manuscript does not indicate whether the released code includes the exact prompts, aggregation logic, and evaluation scripts used for the reported experiments.
  2. Notation for temporal aggregation and event horizons could be clarified with a small diagram or pseudocode to make the pipeline reproducible from the text alone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of robustness and presentation. We address each major point below and have revised the manuscript accordingly where the concerns are valid.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central inference that zero-shot NLI + recency/event-horizon aggregation reveals 'deeper structural limitations in mapping news sentiment to short-term price dynamics' requires that the tested pipeline would have surfaced usable signals if they existed. No ablations on prompt variants, alternative zero-shot formulations (e.g., direct classification vs. NLI), or head-to-head comparisons with supervised models on the same data are described to establish this sensitivity; without them the observed failure is compatible with either absent signals or an insensitive probe.

    Authors: We agree that stronger evidence of probe sensitivity would bolster the structural-limit claim. Our design already varies models, horizons, and aggregation methods, but we accept that explicit ablations on prompt variants and zero-shot formulations (e.g., direct classification) were not reported. In revision we add a dedicated ablation subsection covering these variants. Head-to-head supervised comparisons lie outside the zero-shot scope stated in the title and introduction; we have added a limitations paragraph noting this boundary and suggesting it as future work. We have also softened the abstract language to tie the inference explicitly to the zero-shot NLI setting tested. revision: partial

  2. Referee: [Abstract] Abstract and evaluation description: the claim that zero-shot approaches 'consistently fail to outperform simple baselines' is presented without any quantitative metrics, dataset sizes, statistical tests, error bars, or baseline definitions. This absence prevents assessment of effect sizes or reliability and is load-bearing for the headline result.

    Authors: The referee is correct that the abstract as originally submitted omitted these details. The body of the manuscript reports dataset sizes (approximately 120k news articles), baseline definitions (random, majority-class, and lagged-price baselines), accuracy/F1 scores, and statistical tests with error bars. We have revised the abstract to include the key quantitative results, dataset scale, and reference to statistical testing so that the headline claim is self-contained and verifiable. revision: yes
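For readers wondering what error bars on a model-versus-baseline comparison might look like, a paired bootstrap over test examples is one common option. The sketch below is an assumption for illustration only: the rebuttal does not say which statistical tests the manuscript uses, and the function name, resample count, and seed are hypothetical.

```python
import random
from typing import Sequence, Tuple


def bootstrap_accuracy_gap(
    model_preds: Sequence[int],
    baseline_preds: Sequence[int],
    labels: Sequence[int],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile interval for (model accuracy - baseline accuracy).

    The same resampled indices are used for both systems, so the
    comparison is paired on test examples.
    """
    rng = random.Random(seed)
    n = len(labels)
    gaps = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        model_acc = sum(model_preds[i] == labels[i] for i in idx) / n
        base_acc = sum(baseline_preds[i] == labels[i] for i in idx) / n
        gaps.append(model_acc - base_acc)
    gaps.sort()
    lower = gaps[int((alpha / 2) * n_resamples)]
    upper = gaps[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

An interval that includes zero is consistent with the headline finding that the zero-shot pipeline does not beat the baseline; an interval entirely below zero would indicate it does worse.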

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation

The paper reports experimental results from applying zero-shot NLI pipelines with temporal aggregation to financial news for stock movement prediction, comparing against baselines. No equations, parameter fitting, derivations, or self-citations are invoked as load-bearing premises for any claimed result. Performance observations (failure to outperform baselines, especially on negative moves) and explainability findings are presented as direct experimental outcomes. The manuscript is self-contained against external benchmarks with no reduction of claims to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations, fitted parameters, background axioms, or new postulated entities are described; the work is an empirical evaluation of an existing modeling approach.
