Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

Cheng Jin; Musheng Li; Yuantao Gu; Ziying Zhang

arxiv: 2605.28866 · v1 · pith:K3LVUG3Lnew · submitted 2026-05-22 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-30 16:32 UTC · model grok-4.3

keywords: time series analysis · large language models · token embeddings · continuity · ordinality · geometric constraints · TS-LLMs

The pith

Preserving continuity and ordinality in time series token embeddings improves token-based TS-LLM performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that token-based time series large language models underperform because their embeddings ignore the continuous nature of values and their natural ordering. It proposes COM, a method that adds geometric constraints at the embedding initialization stage and throughout training to enforce these properties. Results on multiple benchmarks show consistent gains and broad applicability. A sympathetic reader would care because the fix targets a basic property of the data rather than requiring new architectures or data sources.

Core claim

Preserving continuity and ordinality in time series token embeddings is crucial for the effectiveness of token-based TS-LLMs. COM achieves this by integrating geometric constraints into both the initialization and training stages, leading to consistent performance improvements on multiple benchmarks.

What carries the argument

COM, the continuity- and ordinality-aware strategy that adds geometric constraints during embedding initialization and training to enforce order and smooth variation in the token space.
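As an illustration of what a hard constraint at initialization could look like, the sketch below places token embeddings at equal angular steps along an arc between two anchor vectors using spherical linear interpolation (Slerp). Slerp is named in the paper's Figure 6 caption, but the anchors, the function names `slerp` and `init_ordered_embeddings`, and the equal-step spacing are assumptions made here for illustration, not the authors' implementation.

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between unit vectors a and b at fraction t in [0, 1]."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)          # angle between the two anchors
    if omega < 1e-8:                # nearly parallel anchors: nothing to interpolate
        return list(a)
    wa = math.sin((1.0 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

def init_ordered_embeddings(num_tokens, low_anchor, high_anchor):
    """Place num_tokens embeddings at equal angular steps from low_anchor to high_anchor."""
    return [slerp(low_anchor, high_anchor, k / (num_tokens - 1))
            for k in range(num_tokens)]

# Hypothetical unit-norm anchors for the lowest and highest value bins.
low = [1.0, 0.0, 0.0]
high = [0.0, 1.0, 0.0]
table = init_ordered_embeddings(5, low, high)
```

With this construction, neighbouring value bins start out as neighbours on the arc, and the token order matches the order along the arc. The sketch assumes at least two tokens and anchors that are not antipodal, since the interpolation is undefined when the anchors point in exactly opposite directions.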

If this is right

Token embeddings must reflect gradual value changes to support accurate time series reasoning.
Preserving value order in the embedding space prevents models from treating nearby values as unrelated.
Constraints applied at both initialization and training produce better results than initialization alone.
The approach yields competitive accuracy on several standard time series tasks with little added complexity.
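The points above can be made concrete with a small sketch of two soft penalties on a token embedding table whose rows are ordered by value bin. This page does not spell out the paper's regularization losses, so both `continuity_penalty` and `ordinality_penalty` are hypothetical stand-ins: the first discourages large jumps between adjacent bins, the second penalizes cases where a bin sits closer to its second neighbour than to its first.

```python
import math

def continuity_penalty(emb):
    """Mean squared distance between embeddings of adjacent value bins."""
    steps = [math.dist(emb[i], emb[i + 1]) ** 2 for i in range(len(emb) - 1)]
    return sum(steps) / len(steps)

def ordinality_penalty(emb, margin=0.0):
    """Hinge penalty whenever bin i is closer to bin i+2 than to bin i+1."""
    terms = [
        max(0.0, math.dist(emb[i], emb[i + 1]) - math.dist(emb[i], emb[i + 2]) + margin)
        for i in range(len(emb) - 2)
    ]
    return sum(terms) / len(terms)

# Toy tables (illustrative values only): rows are indexed by value bin.
ordered = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
shuffled = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [2.0, 0.0]]

ordered_scores = (continuity_penalty(ordered), ordinality_penalty(ordered))
shuffled_scores = (continuity_penalty(shuffled), ordinality_penalty(shuffled))
```

In a real training loop such terms would be written in an autodiff framework and added to the task loss with small weights. The plain-Python version here only shows what the penalties measure: the ordered table incurs no ordinality penalty, while the shuffled one does.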

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same geometric idea might help sequence models that process other ordered data such as audio waveforms or sensor streams.
One could test whether removing the constraints after training still keeps most of the gained performance.
The method might combine with existing techniques like patching or frequency decomposition to produce further gains.

Load-bearing premise

The argument rests on two assumptions: that the main shortcoming of earlier token-based TS-LLMs is the absence of continuity and ordinality in their embeddings, and that geometric constraints can restore these properties and raise performance without other major changes.

What would settle it

A replication study that applies the same geometric constraints to a standard token-based TS-LLM baseline and measures no accuracy gain, or even a drop, across the reported time series benchmarks.

Figures

Figures reproduced from arXiv: 2605.28866 by Cheng Jin, Musheng Li, Yuantao Gu, Ziying Zhang.

**Figure 1.** Overall illustration of our work. (a) Our approach follows the token-based TS-LLM paradigm, consisting of a TS-Processor, a TS-Tokenizer, and an LLM backbone. (b) To preserve the continuity and ordinality of TS tokens, we introduce the COM (Continuity and Ordinality Matter) strategy, which combines hard constraints through embedding initialization with soft constraints through dedicated regularization loss…

**Figure 2.** 2D PCA visualizations of TS embeddings un…

**Figure 3.** The figure illustrates the accuracy trajectories of different initialized model variants (descriptions pro…

**Figure 4.** Ablation study of four regularization loss…

**Figure 5.** Linear regression analysis between model…

**Figure 6.** Training loss curves under Default and Slerp…

**Figure 7.** PCA-based 3D visualizations of TS embeddings under different embedding initialization schemes. The…

**Figure 8.** Template and example of TS-Processor for time-series processing. The template consists of statistical…

**Figure 9.** The PCA visualization of TS-Token embed…

**Figure 10.** Visualization of our model’s forecasting…

Original abstract

Token-based time series large language models (TS-LLMs) have emerged as a promising direction for time series analysis and reasoning. However, prior studies largely overlook the inherent continuity and ordinality of time series tokens, which substantially limits model performance. In this paper, we argue that preserving these properties in time series token embeddings is crucial for the effectiveness of token-based TS-LLMs. To this end, we propose COM (Continuity and Ordinality Matter), a continuity- and ordinality-aware strategy that integrates geometric constraints into both the initialization and training stages. Empirical results on multiple time series analysis benchmarks demonstrate that COM consistently improves the performance of token-based TS-LLMs, achieving competitive results and strong generalizability. Code is available at https://anonymous.4open.science/r/COM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

COM adds geometric constraints at init and training to preserve continuity and ordinality in time series token embeddings and claims consistent benchmark gains, but provides no direct measurements confirming the embeddings actually gain those properties.

The core idea here is straightforward: prior token-based TS-LLMs ignored that time series values are both ordered and continuous, so the authors add geometric constraints during embedding initialization and training to enforce those traits. They report that this COM approach lifts performance across several benchmarks and models.

What is actually new is the specific two-stage geometric constraint method. Earlier work on TS tokenization focused on other aspects like patching or quantization, so targeting continuity and ordinality through explicit geometric terms is a distinct angle. The paper does a clean job explaining why these properties should matter for numerical sequences and shows the constraints are simple enough to add without major architecture changes.

The main soft spot is the missing verification step. There are no reported checks—such as distance histograms between embeddings of nearby time points, rank correlations between scalar values and embedding coordinates, or distortion measures—showing that the learned embeddings are measurably more continuous or ordinal than the baselines. Performance tables alone do not isolate whether the gains come from the intended mechanism or from the side effects of extra loss terms and hyperparameter adjustments. The abstract also gives little on experimental controls, so it is hard to judge robustness.
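One of the checks mentioned here, a rank correlation between scalar values and an embedding coordinate, can be sketched in a few lines. The sketch assumes that the rows of `emb` are ordered by bin value and that a useful coordinate is the projection onto the axis running from the lowest bin's embedding to the highest bin's. Both are choices made for illustration, and `ordinality_score` is a hypothetical name rather than a metric from the paper.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def ordinality_score(bin_values, emb):
    """Spearman correlation between bin values and projections on the low-to-high axis."""
    axis = [h - l for l, h in zip(emb[0], emb[-1])]
    proj = [sum(a * e for a, e in zip(axis, row)) for row in emb]
    return spearman(bin_values, proj)

# Toy data (illustrative only): five value bins and two candidate embedding tables.
values = [-1.0, -0.5, 0.0, 0.5, 1.0]
monotone = [[v, v * v] for v in values]
scrambled = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [0.5, 0.25], [-0.5, 0.25]]

score_monotone = ordinality_score(values, monotone)
score_scrambled = ordinality_score(values, scrambled)
```

A score near 1 indicates that the chosen coordinate rises with the underlying value, and lower scores indicate that the order has been disturbed. Reporting this score for COM and for baseline embeddings would be one way to test the mechanism directly instead of inferring it from accuracy tables.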

This is aimed at people already working on LLM adaptations for time series in applied settings like forecasting or anomaly detection. A reader looking for incremental tokenization tweaks would find it useful to try, especially since code is released.

It deserves peer review because the proposal is concrete, the claimed improvements are consistent, and the idea can be tested directly. Reviewers would likely ask for the embedding-property diagnostics, but the work is coherent enough on its own terms to warrant that step.

Referee Report

2 major / 0 minor

Summary. The paper claims that prior token-based time series large language models (TS-LLMs) overlook the inherent continuity and ordinality of time series tokens, which limits model performance. It proposes COM (Continuity and Ordinality Matter), a strategy that integrates geometric constraints into both the initialization and training stages of token embeddings. Empirical results on multiple time series analysis benchmarks are reported to show that COM consistently improves performance of token-based TS-LLMs, achieving competitive results with strong generalizability. Code is made available.

Significance. If the result holds, the work identifies an overlooked aspect of token embedding design for time series data and provides a targeted, relatively lightweight intervention via geometric constraints. The availability of code is a strength that supports reproducibility. This could influence future TS-LLM designs by emphasizing preservation of time-series-specific properties in embeddings.

major comments (2)

Abstract: The abstract asserts empirical improvements from COM but provides no details on experimental design, baselines, statistical significance, data splits, or controls. This prevents verification that the data supports the central claim of consistent performance gains.
Method (initialization and training stages): The central claim requires that COM's geometric constraints enforce continuity and ordinality and that this enforcement drives the reported gains. No quantitative check is provided (e.g., embedding-distance histograms for nearby time-series values, Spearman rank correlation between scalar value and embedding coordinate, or distortion metrics) showing that the learned token embeddings actually satisfy these properties better than prior token-based TS-LLMs. Performance tables alone cannot distinguish the intended mechanism from incidental effects of the added loss terms.
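A continuity check of the kind requested in the second comment could be as simple as asking how often a bin's nearest neighbour in embedding space is also its neighbour in value order. The sketch below is one hypothetical summary statistic, not a metric taken from the paper; `neighbor_adjacency_rate` and the toy tables are illustrative, and rows are assumed to be ordered by bin value.

```python
import math

def neighbor_adjacency_rate(emb):
    """Fraction of bins whose nearest embedding neighbour is an adjacent bin (index +/- 1)."""
    hits = 0
    for i, row in enumerate(emb):
        # Nearest other row by Euclidean distance; ties resolve to the lowest index.
        nearest = min(
            (j for j in range(len(emb)) if j != i),
            key=lambda j: math.dist(row, emb[j]),
        )
        hits += int(abs(nearest - i) == 1)
    return hits / len(emb)

# Toy one-dimensional tables (illustrative values only).
smooth = [[0.0], [1.0], [2.1], [3.3]]
jumbled = [[0.0], [3.0], [1.2], [2.0]]

rate_smooth = neighbor_adjacency_rate(smooth)
rate_jumbled = neighbor_adjacency_rate(jumbled)
```

A rate close to 1 means nearby values map to nearby embeddings. Comparing this rate between a constrained and an unconstrained model gives a single number that complements the distance histograms the referee asks for.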

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and will revise the manuscript to improve clarity and provide stronger supporting evidence.

Referee: Abstract: The abstract asserts empirical improvements from COM but provides no details on experimental design, baselines, statistical significance, data splits, or controls. This prevents verification that the data supports the central claim of consistent performance gains.

Authors: We agree that the abstract would benefit from additional context on the experimental setup. In the revision we will expand the abstract to briefly specify the benchmarks, the token-based TS-LLM baselines compared, and that reported gains are consistent across datasets (with statistical significance noted where computed).
Revision: yes

Referee: Method (initialization and training stages): The central claim requires that COM's geometric constraints enforce continuity and ordinality and that this enforcement drives the reported gains. No quantitative check is provided (e.g., embedding-distance histograms for nearby time-series values, Spearman rank correlation between scalar value and embedding coordinate, or distortion metrics) showing that the learned token embeddings actually satisfy these properties better than prior token-based TS-LLMs. Performance tables alone cannot distinguish the intended mechanism from incidental effects of the added loss terms.

Authors: We acknowledge the value of direct quantitative verification of the embedding properties. Although the benchmark improvements are consistent with the intended effect of the constraints, performance tables alone do not isolate the mechanism. In the revised manuscript we will add quantitative checks, including embedding-distance histograms and rank-correlation metrics, comparing COM embeddings against prior token-based TS-LLMs to demonstrate improved preservation of continuity and ordinality.
Revision: yes

Circularity Check

0 steps flagged

No circularity: method and claims are independent of inputs

The paper proposes COM as a new strategy that adds geometric constraints during token embedding initialization and training to enforce continuity and ordinality. Performance gains are shown via external benchmark experiments rather than any self-definitional reduction, fitted-parameter-as-prediction, or load-bearing self-citation. No equations or claims reduce the result to its own inputs by construction; the central argument rests on the introduced constraints plus independent evaluation data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no details on free parameters, axioms, or invented entities; the contribution is described at a high level without specifying any fitted values or new postulated constructs.
