
arxiv: 2506.11512 · v2 · submitted 2025-06-13 · 💻 cs.LG · cs.AI

From Time Series Analysis to Question Answering: A Survey in the LLM Era

Pith reviewed 2026-05-19 09:29 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords time series analysis · large language models · question answering · alignment paradigms · time series question answering · survey · temporal data tasks

The pith

Time series analysis is evolving into flexible question answering by shifting from external to internal alignment with large language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that traditional time series tasks like forecasting are giving way to a broader question-answering setup where users pose natural language queries about temporal data. This change matters because it moves control from specialized experts using fixed pipelines to everyday users who can explore data more freely. The authors group existing approaches into three alignment types based on whether the connection between language models and time series happens mostly outside the model, across a bridge, or inside the model itself. If the taxonomy holds, it supplies a practical map for picking methods that balance flexibility, cost, and applicability across fields such as finance and sensor data. Readers should care because it unifies scattered research under one evolutionary story and points to concrete ways to close the gap between language model strengths and temporal data needs.

Core claim

TSA is evolving toward TSQA, shifting from expert-driven and task-specific analysis to user-driven and task-unified question answering, organized into Injective Alignment, Bridging Alignment, and Internal Alignment paradigms driven by a shift from external to internal alignment.

What carries the argument

The three alignment paradigms (Injective Alignment, Bridging Alignment, and Internal Alignment) that organize literature by the degree of external versus internal integration between large language models and time series data.

If this is right

  • Practitioners gain concrete criteria for picking alignment methods that suit their data scale and compute budget.
  • Dataset creators should prioritize formats that support open-ended questions rather than single-task labels.
  • Model developers can focus design effort on internal alignment techniques that reduce the need for separate preprocessing steps.
  • Cross-domain applications become easier once the same alignment choice works for both short sensor streams and long financial series.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same external-to-internal lens could be applied to other data types such as graphs or spatial data to create similar unified frameworks.
  • Internal alignment may eventually allow single models to handle mixed temporal and textual queries without task-specific fine-tuning.
  • Testing the taxonomy on private industry datasets would reveal whether the guidance remains generalizable beyond public benchmarks.

Load-bearing premise

The proposed division into external-to-internal alignment stages correctly mirrors how the field has actually progressed and gives reliable advice for choosing methods in new settings.

What would settle it

A new survey or set of case studies that finds most current work still relies on external tools and does not show a measurable trend toward internal alignment methods.

Figures

Figures reproduced from arXiv: 2506.11512 by Dan Pei, Wei Li, Xiaofeng Meng, Xinli Hao, Yunyao Cheng, Yuxuan Liang, Zhe Xie.

Figure 1. Activating time-series reasoning capabilities within LLMs prioritizes …
Figure 2. The generalist perspective prioritizes data alignment over debugging and training models.
Figure 3. LLMs are more aligned with time-series reasoning tasks than with classical analysis tasks.
Figure 4. Taxonomy of alignment paradigms grounded in time-series primitives. We categorize relevant literature into three alignment paradigms according to the most salient design principle observed in each work, and arrange them chronologically: Injective Alignment (grounded in Domain Primitive), Bridging Alignment (grounded in Characteristic Primitive), and Internal Alignment (grounded in Representation Primitive)…
Figure 5. Boundaries of three proposed alignment paradigms.
Figure 6. Figure 6 illustrates the alignment process f_θ(X) → Y under the three proposed alignment paradigms for time-series reasoning with LLMs: (1) Injective Alignment directly injects the numerical representation into the textual input, yielding X′_Text. This paradigm interacts with LLMs externally through instruction tuning, prefix prompting, or other techniques, formally defined as h_γ(X_Number, X_Text), parameterized by …
Figure 7. Alignment advantages and disadvantages for novel time-series reasoning tasks.
Figure 8. Literature taxonomy of prioritizing alignment paradigms over task-specific model customization in time-series LLMs.
Figure 9. Human interaction with DeepSeek-R1 and ChatGPT-o1 using …
Figure 10. Human interaction with DeepSeek-R1 and ChatGPT-o1 by …
Figure 11. Human interaction with DeepSeek-R1 and ChatGPT-o1 by …

Original abstract

Recently, Large Language Models (LLMs) have introduced a novel paradigm in Time Series Analysis (TSA), leveraging strong language capabilities to support tasks such as forecasting and anomaly detection. However, these analysis tasks cannot adequately cover temporal language tasks, such as interpretation and captioning. A fundamental gap remains between TSA and LLMs: LLMs are pre-trained to optimize natural language relevance for question answering rather than objectives specialized for TSA. To bridge this gap, TSA is evolving toward Time Series Question Answering (TSQA), shifting from expert-driven and task-specific analysis to user-driven and task-unified question answering. TSQA depends on flexible exploration rather than predefined TSA pipelines. In this survey, we first propose a taxonomy that reflects the evolution from TSA to TSQA, driven by a shift from external to internal alignment. We then organize existing literature into three alignment paradigms: Injective Alignment, Bridging Alignment, and Internal Alignment, and provide practical guidance for flexible, economical, and generalizable selection of alignment paradigms. We finally analyze datasets across domains and characteristics, identify challenges, and highlight future research directions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript is a survey on the integration of Large Language Models with Time Series Analysis (TSA). It claims that TSA is evolving toward Time Series Question Answering (TSQA), shifting from expert-driven, task-specific methods to user-driven, task-unified question answering. The central contribution is a taxonomy organizing the literature into three alignment paradigms—Injective Alignment, Bridging Alignment, and Internal Alignment—driven by a progression from external to internal alignment mechanisms. The paper reviews existing works under this taxonomy, supplies practical guidance for paradigm selection, analyzes datasets across domains, and identifies challenges plus future directions.

Significance. If the taxonomy is shown to be reproducible and the literature coverage is comprehensive, the survey would provide a useful organizing lens for a fast-moving interdisciplinary area. It synthesizes disparate TSA+LLM efforts, highlights the move toward flexible question-answering interfaces, and supplies dataset overviews that could aid new researchers. Explicit credit is due for attempting to move beyond task-specific pipelines toward unified, user-facing temporal reasoning.

major comments (1)
  1. [§3] §3 (Taxonomy of Alignment Paradigms): Explicit classification criteria, decision rules, or boundary examples are not supplied for assigning works to Injective, Bridging, or Internal Alignment. Without these, or coverage statistics showing how the surveyed papers partition, it remains unclear whether the external-to-internal shift accurately reflects the literature or functions mainly as a post-hoc organizing lens, which directly affects the defensibility of the practical guidance for paradigm selection.
minor comments (2)
  1. [Abstract] Abstract and §1: The scope of the literature search (keywords, time window, venues) is not stated, making it hard to assess completeness.
  2. [§5] §5 (Datasets): A summary table comparing domain, size, task types, and alignment paradigm coverage would improve readability and allow readers to quickly locate relevant resources.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive and detailed feedback on our survey. We have reviewed the major comment carefully and provide a point-by-point response below, including planned revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [§3] §3 (Taxonomy of Alignment Paradigms): Explicit classification criteria, decision rules, or boundary examples are not supplied for assigning works to Injective, Bridging, or Internal Alignment. Without these, or coverage statistics showing how the surveyed papers partition, it remains unclear whether the external-to-internal shift accurately reflects the literature or functions mainly as a post-hoc organizing lens, which directly affects the defensibility of the practical guidance for paradigm selection.

    Authors: We thank the referee for this important observation. Section 3 defines the paradigms according to the primary alignment mechanism: Injective Alignment directly projects time-series representations into the LLM input space (e.g., via linear or convolutional adapters without intermediate modules); Bridging Alignment introduces auxiliary components such as separate time-series encoders, retrieval modules, or adapters that mediate between modalities; and Internal Alignment modifies the LLM itself through continued pre-training, architectural changes, or parameter-efficient fine-tuning to internalize temporal reasoning. These distinctions are illustrated with representative works, but we agree that explicit decision rules and boundary cases are needed for reproducibility. In the revised manuscript we will add a new subsection (3.4) containing (i) a decision flowchart with concrete criteria (e.g., “if the method uses an external encoder whose output is concatenated to the LLM prompt, classify as Bridging; if the encoder is removed at inference and the LLM weights are updated on time-series objectives, classify as Internal”), (ii) three boundary examples per category with justification, and (iii) a coverage table reporting the number and percentage of surveyed papers assigned to each paradigm. These additions will make the taxonomy falsifiable and will directly support the practical guidance for paradigm selection. revision: yes
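
The decision rule quoted in the rebuttal, together with the promised coverage table, can be sketched in a few lines. This is a minimal illustration and not the authors' flowchart: the `Method` fields, the rule order, and the fallback to Injective Alignment are assumptions introduced here, and the entries in the usage example are placeholders rather than surveyed papers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical per-paper annotations; field names are illustrative only."""
    name: str
    external_encoder_at_inference: bool  # encoder output is concatenated to the LLM prompt
    llm_updated_on_ts_objectives: bool   # LLM weights are trained on time-series objectives


def classify(m: Method) -> str:
    # Quoted rule: an external encoder feeding the prompt means Bridging.
    if m.external_encoder_at_inference:
        return "Bridging"
    # Quoted rule: no encoder at inference and updated LLM weights means Internal.
    if m.llm_updated_on_ts_objectives:
        return "Internal"
    # Assumed fallback (not stated in the rebuttal): everything else is Injective.
    return "Injective"


def coverage(methods):
    """Return {paradigm: (count, percentage)} over the given methods."""
    total = len(methods)
    if total == 0:
        return {}
    counts = Counter(classify(m) for m in methods)
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


if __name__ == "__main__":
    # Placeholder entries, not real papers.
    sample = [
        Method("placeholder_a", True, False),
        Method("placeholder_b", False, True),
        Method("placeholder_c", False, False),
    ]
    for label, (n, pct) in coverage(sample).items():
        print(f"{label}: {n} ({pct}%)")
```

Because the rules are checked in order, a method that both keeps an external encoder at inference and updates the LLM weights falls under Bridging in this sketch. That is exactly the kind of boundary case the referee asks the authors to spell out.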

Circularity Check

0 steps flagged

Survey taxonomy organizes external literature without self-referential reduction

full rationale

This is a survey paper that reviews and organizes existing TSA+LLM literature into three alignment paradigms (Injective, Bridging, Internal) based on a proposed external-to-internal shift. The central taxonomy is presented as an organizing framework derived from cited external works rather than any fitted parameters, equations, or self-definitional loops. No load-bearing self-citations, ansatzes smuggled via prior author work, or predictions that reduce to inputs by construction appear in the provided abstract or structure. The derivation chain is self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 4 invented entities

The survey introduces conceptual classifications without new empirical support; it relies on domain assumptions about LLM pretraining objectives and the existence of a gap between TSA and QA.

axioms (1)
  • domain assumption: LLMs are pre-trained to optimize natural language relevance for question answering rather than objectives specialized for TSA
    Stated directly in the abstract as the fundamental gap motivating the shift to TSQA.
invented entities (4)
  • Time Series Question Answering (TSQA) · no independent evidence
    purpose: Unifying concept for user-driven temporal language tasks
    New term introduced to describe the evolution beyond traditional TSA tasks.
  • Injective Alignment · no independent evidence
    purpose: One of three alignment paradigms in the taxonomy
    New classification label for literature organization.
  • Bridging Alignment · no independent evidence
    purpose: One of three alignment paradigms in the taxonomy
    New classification label for literature organization.
  • Internal Alignment · no independent evidence
    purpose: One of three alignment paradigms in the taxonomy
    New classification label for literature organization.


