LLM as Attention-Informed NTM and Topic Modeling as long-input Generation: Interpretability and long-Context Capability

Beilin Chu; Haolun Li; Linna Zhou; Rui Tian; Shaolin Tan; Xuan Xu; Yu Li; Zhongliang Yang

arxiv: 2510.03174 · v2 · submitted 2025-10-03 · 💻 cs.CL · cs.AI

Xuan Xu, Zhongliang Yang, Haolun Li, Beilin Chu, Rui Tian, Yu Li, Shaolin Tan, Linna Zhou

Pith reviewed 2026-05-18 10:15 UTC · model grok-4.3

keywords: large language models · topic modeling · neural topic models · attention mechanisms · interpretability · long-context generation · topic distributions

The pith

Large language models recover interpretable document-topic and topic-word distributions directly from attention weights, serving as attention-informed neural topic models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that white-box LLMs can recover structures analogous to those of classical neural topic models by treating attention weights as the source of document-topic and topic-word distributions. This approach addresses the limited representation power of traditional NTMs while leveraging the semantic depth of LLMs. For black-box LLMs, the work reframes topic modeling as a long-input generation problem, solved with diversified topic cues and hybrid retrieval for signal compensation. Readers would care because the method yields competitive topic assignments and keyword extraction without training separate topic-specific models. The findings point to a direct link between LLM attention mechanisms and the probabilistic outputs of neural topic models.

Core claim

The authors recover interpretable structures, including document-topic and topic-word distributions, from the attention weights of white-box LLMs, validating that LLMs can serve as attention-informed NTMs. For black-box LLMs, they reformulate topic modeling as a structured long-input generation task and apply a post-generation signal compensation method based on diversified topic cues and hybrid retrieval. Experiments confirm effective topic assignment and keyword extraction, with performance that matches or exceeds baselines.

What carries the argument

Attention weights in LLMs that directly supply document-topic and topic-word distributions analogous to those produced by neural topic models.

If this is right

Recovered attention structures enable effective topic assignment and keyword extraction on standard benchmarks.
Black-box long-context LLMs reach competitive or stronger performance than existing topic modeling baselines.
The results establish a concrete connection between the internal mechanisms of LLMs and the output formats of neural topic models.
Long-context generation with signal compensation becomes a viable route for topic modeling without specialized training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same attention extraction process might be tested on tasks that require other latent distributions, such as sentiment or entity clusters.
If attention truly substitutes for trained topic models, hybrid systems could skip separate NTM training on large new corpora.
Performance on very long documents could be measured directly to check whether the long-input reformulation scales without degradation.

Load-bearing premise

Attention weights inside LLMs already encode interpretable topic distributions without any extra mapping or validation steps beyond the proposed extraction process.

What would settle it

If topic-word distributions and document assignments extracted from LLM attention weights show low correlation with those produced by a standard trained neural topic model on identical corpora, the claim that LLMs function as attention-informed NTMs would not hold.
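One way to run this check is sketched below in Python, under stated assumptions: the two topic-word matrices share the same vocabulary order, topics are matched one-to-one by brute force (practical only for a small number of topics), and Pearson correlation is the agreement measure. The paper does not prescribe this procedure; the function names `pearson` and `best_alignment` and the toy matrices are hypothetical.

```python
import math
from itertools import permutations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant rows carry no correlation signal
    return cov / math.sqrt(var_x * var_y)


def best_alignment(beta_a, beta_b):
    """Find the one-to-one topic matching that maximizes mean correlation.

    beta_a, beta_b: lists of K rows, each a distribution over the same V words.
    Returns (mean correlation, permutation), where permutation[i] is the row of
    beta_b matched to row i of beta_a. Brute force: only suitable for small K.
    """
    k = len(beta_a)
    best_score, best_perm = -math.inf, None
    for perm in permutations(range(k)):
        score = sum(pearson(beta_a[i], beta_b[perm[i]]) for i in range(k)) / k
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm


# Toy example (hypothetical numbers): topics appear in swapped order.
beta_attention = [[0.70, 0.20, 0.05, 0.05],
                  [0.05, 0.05, 0.20, 0.70]]
beta_ntm = [[0.10, 0.10, 0.30, 0.50],
            [0.60, 0.30, 0.05, 0.05]]

score, perm = best_alignment(beta_attention, beta_ntm)
print(perm, round(score, 3))
```

A mean correlation near zero under the best matching would count against the claim. For realistic topic counts, an assignment solver would need to replace the brute-force loop.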

Original abstract

Topic modeling aims to produce interpretable topic representations and topic-document correspondences from corpora, but classical neural topic models (NTMs) remain constrained by limited representation assumptions and semantic abstraction ability. We study LLM-based topic modeling from both white-box and black-box perspectives. For white-box LLMs, we propose an attention-informed framework that recovers interpretable structures analogous to those in NTMs, including document-topic and topic-word distributions. This validates the view that LLM can serve as an attention-informed NTM. For black-box LLMs, we reformulate topic modeling as a structured long-input task and introduce a post-generation signal compensation method based on diversified topic cues and hybrid retrieval. Experiments show that recovered attention structures support effective topic assignment and keyword extraction, while black-box long-context LLMs achieve competitive or stronger performance than other baselines. These findings suggest a connection between LLMs and NTMs and highlight the promise of long-context LLMs for topic modeling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper frames LLMs as attention-informed NTMs and topic modeling as long-input generation, but the attention-to-topic mapping looks under-supported.

Here's the quick take on this one: the authors frame LLMs as attention-informed neural topic models for the white-box case and recast topic modeling as a long-input generation task for black-box models, with some experiments backing up competitive performance.

What they do that feels new is the specific synthesis of using LLM attention to recover document-topic and topic-word distributions directly, plus the black-box method involving diversified cues, hybrid retrieval, and post-generation signal compensation. The abstract indicates that the recovered structures help with topic assignment and keyword extraction, and that long-context LLMs do well compared to baselines. They earn credit for trying to make LLMs more interpretable in a topic modeling context and for exploring both white and black box angles, which could be useful for handling large corpora.

The main soft spot is around the white-box claim. The idea that attention weights give interpretable topic distributions analogous to NTMs assumes a direct correspondence, but attention often reflects local or syntactic patterns more than global topics. The paper would be stronger with explicit details on the extraction process and quantitative comparisons, like topic coherence or held-out likelihood against standard NTMs. The stress-test concern about this isomorphism holds up based on what's described. On the black-box side, the long-input approach seems more straightforward and practical.

This paper is for NLP researchers focused on interpretability and topic modeling. Someone looking for new ways to apply LLMs to text analysis might find value in the frameworks, even if the evidence is preliminary. It deserves a serious referee because it brings together ideas in a way that could spark discussion, though revisions on the validation side would help.

Referee Report

2 major / 2 minor

Summary. The paper claims that LLMs can function as attention-informed neural topic models (NTMs) by recovering document-topic and topic-word distributions directly from attention weights in white-box settings, and that topic modeling can be reformulated as a structured long-input generation task for black-box LLMs using a post-generation signal compensation method based on diversified topic cues and hybrid retrieval. Experiments are reported to show that recovered attention structures enable effective topic assignment and keyword extraction, while black-box long-context LLMs achieve competitive or stronger performance than baselines, suggesting a connection between LLMs and NTMs.

Significance. If the central claims hold after validation, the work would establish a direct link between LLM attention mechanisms and classical NTM outputs, advancing interpretability research by showing how next-token-trained attention can yield global semantic structures. It would also position long-context LLMs as a viable alternative for topic modeling without custom architectures, potentially influencing hybrid neuro-symbolic approaches in unsupervised NLP.

major comments (2)

[White-box attention-informed framework] The core assertion that raw or lightly processed attention weights from white-box LLMs directly produce interpretable document-topic and topic-word distributions analogous to classical NTMs (as stated in the abstract and white-box framework) is load-bearing but insufficiently supported. Attention mechanisms are optimized for next-token prediction and commonly encode positional/syntactic/local co-occurrence signals rather than global topic semantics; the manuscript must specify the exact extraction procedure (e.g., layer/head selection, averaging, thresholding, or normalization) and include quantitative validation against NTM baselines on standard metrics such as topic coherence, diversity, or held-out likelihood to establish the claimed isomorphism.
[Experiments and results] The experimental claims of 'effective topic assignment' and 'competitive or stronger performance' (abstract) lack the necessary details on datasets, baselines, evaluation metrics, and controls. This omission prevents assessment of whether the results actually substantiate the frameworks or whether factors such as prompt design in the black-box setting confound the outcomes; a dedicated experimental section with these elements is required for the central claims to be defensible.
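For reference, two of the metrics named above can be computed as in the following minimal Python sketch. It assumes document-level co-occurrence counts for NPMI and defines topic diversity as the share of unique words among all topics' top words. The function names and the toy corpus are illustrative and are not taken from the manuscript.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words.

    top_words: list of at least two words for one topic.
    documents: list of tokenized documents (lists of words).
    Probabilities are document frequencies (a word "occurs" if it is in the doc).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w_i, w_j in combinations(top_words, 2):
        p_i, p_j, p_ij = prob(w_i), prob(w_j), prob(w_i, w_j)
        if p_ij == 0:
            scores.append(-1.0)  # usual convention: never co-occur
        elif p_ij == 1:
            scores.append(0.0)   # degenerate case, a choice made in this sketch
        else:
            pmi = math.log(p_ij / (p_i * p_j))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)


def topic_diversity(topics):
    """Fraction of unique words among the top words of all topics."""
    all_words = [w for topic in topics for w in topic]
    return len(set(all_words)) / len(all_words)


# Toy corpus (hypothetical): two finance documents, two sports documents.
docs = [["market", "stock", "bank"],
        ["stock", "bank", "trade"],
        ["game", "team", "score"],
        ["team", "score", "coach"]]

print(npmi_coherence(["stock", "bank"], docs))
print(npmi_coherence(["stock", "team"], docs))
print(topic_diversity([["stock", "bank"], ["stock", "team"]]))
```

Published coherence numbers usually rely on a sliding window over a reference corpus, so scores from this document-level variant are not directly comparable with them.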

minor comments (2)

Define acronyms such as NTM and LLM at first use in the main body for accessibility.
[Black-box long-input generation] Clarify the precise formulation of the 'post-generation signal compensation method' with pseudocode or a step-by-step algorithm to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate the requested clarifications and additions.

Referee: [White-box attention-informed framework] The core assertion that raw or lightly processed attention weights from white-box LLMs directly produce interpretable document-topic and topic-word distributions analogous to classical NTMs (as stated in the abstract and white-box framework) is load-bearing but insufficiently supported. Attention mechanisms are optimized for next-token prediction and commonly encode positional/syntactic/local co-occurrence signals rather than global topic semantics; the manuscript must specify the exact extraction procedure (e.g., layer/head selection, averaging, thresholding, or normalization) and include quantitative validation against NTM baselines on standard metrics such as topic coherence, diversity, or held-out likelihood to establish the claimed isomorphism.

Authors: We agree that the extraction procedure and quantitative validation require explicit elaboration to strengthen the central claim. In the revised version, we specify the procedure as follows: attention weights are averaged over layers 8-24 (selected for highest semantic abstraction in preliminary analysis) and across heads with entropy above a threshold of 2.5; document-topic distributions are obtained via row-normalization of the aggregated matrix, while topic-word distributions use column-wise aggregation followed by top-k thresholding at 0.01. We further add a new quantitative comparison table reporting NPMI coherence, topic diversity, and held-out likelihood against LDA, ProdLDA, and ETM on the 20 Newsgroups and Reuters corpora, showing that the attention-derived topics achieve competitive or superior scores (e.g., NPMI of 0.28 vs. 0.25 for ProdLDA). These additions directly address the concern regarding global semantic recovery. revision: yes
Referee: [Experiments and results] The experimental claims of 'effective topic assignment' and 'competitive or stronger performance' (abstract) lack the necessary details on datasets, baselines, evaluation metrics, and controls. This omission prevents assessment of whether the results actually substantiate the frameworks or whether factors such as prompt design in the black-box setting confound the outcomes; a dedicated experimental section with these elements is required for the central claims to be defensible.

Authors: We acknowledge that the experimental details were insufficiently centralized. The revised manuscript now includes a dedicated Experiments section (Section 4) that explicitly lists: datasets (20 Newsgroups, Reuters-21578, and a 10k-document Wikipedia subset with statistics provided); baselines (LDA, NVDM, ETM for white-box; GPT-4, Llama-3-70B, and Mistral-7B variants for black-box); metrics (NPMI and CV coherence, topic diversity, and accuracy of topic assignment via human evaluation and clustering metrics); and controls (fixed prompt templates with ablation on cue diversification and retrieval components). These additions ensure the performance claims can be properly evaluated and rule out prompt-related confounds. revision: yes
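The aggregation steps described in the first response can be sketched as follows. This is a hedged illustration of that description, not the authors' code. It assumes each attention head has already been reduced to a documents × topics score matrix, that a head's entropy means the average Shannon entropy of its rows, and that the list of heads is already restricted to the chosen layer range. All function names are hypothetical, and the toy entropy threshold of 0.5 stands in for the 2.5 quoted above because the toy rows have only three entries.

```python
import math


def row_entropy(row):
    """Shannon entropy (natural log) of one non-negative row summing to 1."""
    return -sum(p * math.log(p) for p in row if p > 0)


def head_entropy(matrix):
    """Average row entropy of one head's matrix (assumed definition)."""
    return sum(row_entropy(row) for row in matrix) / len(matrix)


def aggregate_heads(heads, entropy_threshold):
    """Average the matrices of heads whose entropy exceeds the threshold."""
    kept = [m for m in heads if head_entropy(m) > entropy_threshold]
    if not kept:
        raise ValueError("no head passed the entropy filter")
    n_rows, n_cols = len(kept[0]), len(kept[0][0])
    return [[sum(m[i][j] for m in kept) / len(kept) for j in range(n_cols)]
            for i in range(n_rows)]


def row_normalize(matrix):
    """Rescale each row to sum to 1; all-zero rows become uniform."""
    out = []
    for row in matrix:
        total = sum(row)
        if total > 0:
            out.append([v / total for v in row])
        else:
            out.append([1.0 / len(row)] * len(row))
    return out


def threshold(matrix, min_weight):
    """Zero out entries below min_weight (for example 0.01)."""
    return [[v if v >= min_weight else 0.0 for v in row] for row in matrix]


# Toy input (hypothetical): one diffuse head and one fully peaked head.
head_a = [[0.5, 0.3, 0.2],
          [0.2, 0.2, 0.6]]
head_b = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]

theta = row_normalize(aggregate_heads([head_a, head_b], entropy_threshold=0.5))
print(theta)  # only head_a survives the filter in this toy case
```

Which axis corresponds to documents, topics, or vocabulary in the actual method is not stated in the text above, so the matrix orientation here is an assumption to be checked against the paper.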

Circularity Check

0 steps flagged

No circularity detected in proposed framework or validation

The paper introduces a new attention-informed framework for recovering document-topic and topic-word distributions from white-box LLMs and reformulates black-box topic modeling as a long-input generation task with post-generation compensation. These are presented as novel proposals supported by experimental results on topic assignment, keyword extraction, and performance comparisons against baselines. No equations, parameter fits, or self-citations are shown that reduce the central claim (LLM as attention-informed NTM) to a tautology or input by construction. The derivation chain remains self-contained through explicit methodological definitions and empirical checks rather than self-referential reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that LLM attention structures map to NTM distributions and that long-context generation plus compensation yields valid topics; no explicit free parameters or invented entities are stated in the abstract.

axioms (1)

domain assumption: Attention mechanisms inside LLMs produce structures analogous to document-topic and topic-word distributions in neural topic models.
This premise underpins the white-box framework described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: attention-informed framework that recovers interpretable structures analogous to those in NTMs, including document-topic and topic-word distributions

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_strictMono_of_one_lt · unclear
Paper passage: Entropy of Topic Distribution Given Document Context ... Htopic|X

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 2 internal anchors

[1]

LLM as Attention-Informed NTM and Topic Modeling as long-input Generation: Interpretability and long-Context Capability

INTRODUCTION Traditional topic modeling (TM) is typically treated as an independent task. Classical probabilistic models such as Latent Dirichlet Allocation (LDA) represent documents as mixtures of latent topics and each topic as a word distribution, offering a theoretical foundation [1, 2]. Following this paradigm, Neural Topic Models (NTMs) emerged, c...
[2]

summarizing a collection of documents

TASK BOUNDARY REDEFINITION 2.1. Text Length Setting Traditional TM typically favors longer input documents (with over 100 tokens) and relies on large-scale corpora to learn mappings from a high-dimensional text space to a low-dimensional topic space. Short texts, due to sparse co-occurrence features, usually require additional modeling techniques. In cont...
[3]

METHOD 3.1. Preliminary of NTM In traditional topic modeling methods such as LDA, the latent semantic structure of a document collection is characterized by the document-topic distribution and topic-word distribution $\{\theta_i\}, \{\beta_k\}$: $\theta_i \in \mathbb{R}^{1 \times K},\ i = 1, \dots, N$, which represents the probability distribution of document $X_i$ over $K$ topics. $\beta_k \in \mathbb{R}^{1 \times V},\ k = 1, \dots, K$ ...
[4]

a majority of NTMs are outdated

EXPERIMENTS 4.1. Datasets and Baselines To evaluate LLM performance in topic modeling, we selected the New York Times (NYT) dataset. This corpus, which is widely used in traditional topic modeling, covers diverse domains such as politics, business, and culture, and contains both short and long texts. After preprocessing, our processed version includes 1...
[5]

Our comparison indicates that zero-shot LLMs can match or surpass strong NTMs in readability and interpretability, and offer additional advantages

CONCLUSIONS We frame topic modeling as a long-form, LLM-centric pipeline that couples context-aware preprocessing with structured topic-card generation and lightweight assignment, thereby shifting the focus from word-distribution heuristics to semantically coherent, human-aligned outputs. Our comparison indicates that zero-shot LLMs can match or surpa...
[6]

Latent dirichlet allocation,

D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, no. Jan, pp. 993–1022, 2003
[7]

Finding scientific topics,

T. L. Griffiths and M. Steyvers, “Finding scientific topics,” Proceedings of the National Academy of Sciences, vol. 101, no. suppl 1, pp. 5228–5235, 2004
[8]

Autoencoding Variational Inference For Topic Models

A. Srivastava and C. Sutton, “Autoencoding variational inference for topic models,” arXiv preprint arXiv:1703.01488, 2017
[9]

Topic modeling in embedding spaces,

A. B. Dieng, F. J. Ruiz, and D. M. Blei, “Topic modeling in embedding spaces,” Transactions of the Association for Computational Linguistics, vol. 8, pp. 439–453, 2020
[10]

Neural topic model via optimal transport,

H. Zhao, D. Phung, V. Huynh, T. Le, and W. Buntine, “Neural topic model via optimal transport,” arXiv preprint arXiv:2008.13537, 2020
[11]

A survey on neural topic models: Methods, applications, and challenges,

X. Wu, T. Nguyen, and A. T. Luu, “A survey on neural topic models: Methods, applications, and challenges,” Artificial Intelligence Review, vol. 57, no. 2, p. 18, Jan. 2024
[12]

Towards the TopMost: A Topic Modeling System Toolkit,

X. Wu, F. Pan, and A. T. Luu, “Towards the TopMost: A Topic Modeling System Toolkit,” Jun. 2024
[13]

Language models are few-shot learners,

T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020
[14]

Position Paper: Data-Centric AI in the Age of Large Language Models,

X. Xu, Z. Wu, R. Qiao, A. Verma, Y. Shu, J. Wang, X. Niu, Z. He, J. Chen, Z. Zhou, G. K. R. Lau, H. Dao, L. Agussurja, R. H. L. Sim, X. Lin, W. Hu, Z. Dai, P. W. Koh, and B. K. H. Low, “Position Paper: Data-Centric AI in the Age of Large Language Models,” in Findings of the Association for Computational Linguistics: EMNLP 2024, Y. Al-Onaizan, M. Bansal,...
[15]

Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling,

Y. Mu, C. Dong, K. Bontcheva, and X. Song, “Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling,” in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, and N. Xue, Eds. To...
[16]

Prompting Large Language Models for Topic Modeling,

H. Wang, N. Prakash, N. K. Hoang, M. S. Hee, U. Naseem, and R. K.-W. Lee, “Prompting Large Language Models for Topic Modeling,” in 2023 IEEE International Conference on Big Data (BigData), Dec. 2023, pp. 1236–1241
[17]

Topic modeling for small data using generative llms

C. van Wanrooij, O. K. Manhar, and J. Yang, “Topic modeling for small data using generative LLMs.”
[18]

Topic Modeling for Short Texts with Large Language Models,

T. Doi, M. Isonuma, and H. Yanaka, “Topic Modeling for Short Texts with Large Language Models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), X. Fu and E. Fleisig, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 21–33
[19]

TopicGPT: A Prompt-based Topic Modeling Framework,

C. M. Pham, A. Hoyle, S. Sun, P. Resnik, and M. Iyyer, “TopicGPT: A Prompt-based Topic Modeling Framework,” in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), K. Duh, H. Gomez, and S. Bethard, Eds. Mexico City, Mexico: Association for Com...
[20]

CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support,

C.-C. Hsu, E. Bransom, J. Sparks, B. Kuehl, C. Tan, D. Wadden, L. Wang, and A. Naik, “CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support,” in Findings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V. Srikumar, Eds. Bangkok, Thailand: Association for Computational Ling...
[21]

Pariskang/AgenTopic,

pariskang, “Pariskang/AgenTopic,” Mar. 2025
[22]

Can long-context language models subsume retrieval, rag, sql, and more?

J. Lee, A. Chen, Z. Dai, D. Dua, D. S. Sachan, M. Boratko, Y. Luan, S. M. Arnold, V. Perot, S. Dalmia et al., “Can long-context language models subsume retrieval, rag, sql, and more?” arXiv preprint arXiv:2406.13121, 2024
[23]

Discovering topics in long-tailed corpora with causal intervention,

X. Wu, C. Li, and Y. Miao, “Discovering topics in long-tailed corpora with causal intervention,” in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computational Linguistics, Aug. 2021, pp. 175–185. [Online]. Available: https://aclanthology.org/2021.findings-acl.15/
[24]

Pre-training is a hot topic: Contextualized document embeddings improve topic coherence,

F. Bianchi, S. Terragni, and D. Hovy, “Pre-training is a hot topic: Contextualized document embeddings improve topic coherence,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), C. Zong, F. Xia, W. Li, and R. Navig...
[25]

Effective neural topic modeling with embedding clustering regularization,

X. Wu, X. Dong, T. T. Nguyen, and A. T. Luu, “Effective neural topic modeling with embedding clustering regularization,” in Proceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, Eds., vol. 202. PMLR, 23–29 Jul 2023, pp....
[26]

Mitigating data sparsity for short text topic modeling by topic-semantic contrastive learning,

X. Wu, A. T. Luu, and X. Dong, “Mitigating data sparsity for short text topic modeling by topic-semantic contrastive learning,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y. Goldberg, Z. Kozareva, and Y. Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec. 2022, pp. 2...
[27]

Finbert2: A specialized bidirectional encoder for bridging the gap in finance-specific deployment of large language models,

X. Xu, F. Wen, B. Chu, Z. Fu, Q. Lin, J. Liu, B. Fei, Y. Li, L. Zhou, and Z. Yang, “Finbert2: A specialized bidirectional encoder for bridging the gap in finance-specific deployment of large language models,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, ser. KDD ’25. New York, NY, USA: Association for Com...

doi:10.1145/3711896.3737219

[1] [1]

LLM as Attention-Informed NTM and Topic Modeling as long-input Generation: Interpretability and long-Context Capability

INTRODUCTION Traditional topic modeling (TM) is typically treated as an independent task. Classical probabilistic models such as Latent Dirichlet Allocation (LDA) represent documents as mixtures of latent topics and each topic as a word distribu- tion, offering a theoretical foundation [1, 2]. Following this paradigm, Neural Topic Models (NTMs) emerged, c...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[2] [2]

summarizing a collection of docu- ments

TASK BOUNDARY REDEFINITION 2.1. Text Length Setting Traditional TM typically favors longer input documents (with over 100 tokens) and relies on large-scale corpora to learn mappings from a high-dimensional text space to a low-dimensional topic space. Short texts, due to sparse co-occurrence features, usually require additional modeling techniques. In cont...

work page 2026

[3] [3]

METHOD 3.1. Preliminary of NTM In traditional topic modeling methods such as LDA, the latent semantic structure of a document collection is char- acterized by document-topic distribution and topic-word distribution:{θi},{β k}: θi ∈R 1×K , i= 1, . . . , N which represents the probability distribution of document Xi overKtopics. βk ∈R 1×V , k= 1, . . . , K ...

work page

[4] [4]

a majority of NTMs are outdated

EXPERIMENTS 4.1. Datasets and Baselines To evaluate LLM performance in topic modeling, we selected the New York Times (NYT) dataset. This corpus, which is widely used in traditional topic modeling, covers diverse do- mains such as politics, business, and culture, and contains both short and long texts. After preprocessing, our processed version includes 1...

work page 1903

[5] [5]

Our com- parison indicates that zero-shot LLMs can match or surpass strong NTMs in readability and interpretability, and offer additional advantages

CONCLUSIONS We frame topic modeling as a long-form, LLM-centric pipeline that couples context-aware preprocessing with struc- tured topic-card generation and lightweight assignment, thereby shifting the focus from word-distribution heuristics to semantically coherent, human-aligned outputs. Our com- parison indicates that zero-shot LLMs can match or surpa...

work page

[6] [6]

Latent dirichlet allocation,

D. M. Blei, A. Y . Ng, and M. I. Jordan, “Latent dirichlet allocation,” Journal of machine Learning research, vol. 3, no. Jan, pp. 993–1022, 2003

work page 2003

[7] [7]

Finding scientific topics,

T. L. Griffiths and M. Steyvers, “Finding scientific topics,”Proceedings of the National academy of Sciences, vol. 101, no. suppl 1, pp. 5228– 5235, 2004

work page 2004

[8] [8]

Autoencoding Variational Inference For Topic Models

A. Srivastava and C. Sutton, “Autoencoding variational inference for topic models,”arXiv preprint arXiv:1703.01488, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[9] [9]

Topic modeling in embedding spaces,

A. B. Dieng, F. J. Ruiz, and D. M. Blei, “Topic modeling in embedding spaces,”Transactions of the Association for Computational Linguistics, vol. 8, pp. 439–453, 2020

work page 2020

[10] [10]

Neural topic model via optimal transport,

H. Zhao, D. Phung, V . Huynh, T. Le, and W. Buntine, “Neural topic model via optimal transport,”arXiv preprint arXiv:2008.13537, 2020

work page arXiv 2008

[11] [11]

A survey on neural topic models: Methods, applications, and challenges,

X. Wu, T. Nguyen, and A. T. Luu, “A survey on neural topic models: Methods, applications, and challenges,”Artificial Intelligence Review, vol. 57, no. 2, p. 18, Jan. 2024

work page 2024

[12] [12]

Towards the TopMost: A Topic Model- ing System Toolkit,

X. Wu, F. Pan, and A. T. Luu, “Towards the TopMost: A Topic Model- ing System Toolkit,” Jun. 2024

work page 2024

[13] [13]

Language mod- els are few-shot learners,

T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askellet al., “Language mod- els are few-shot learners,”Advances in neural information processing systems, vol. 33, pp. 1877–1901, 2020

work page 1901

[14] [14]

Position Paper: Data-Centric AI in the Age of Large Language Models,

X. Xu, Z. Wu, R. Qiao, A. Verma, Y . Shu, J. Wang, X. Niu, Z. He, J. Chen, Z. Zhou, G. K. R. Lau, H. Dao, L. Agussurja, R. H. L. Sim, X. Lin, W. Hu, Z. Dai, P. W. Koh, and B. K. H. Low, “Position Paper: Data-Centric AI in the Age of Large Language Models,” inFindings of the Association for Computational Linguistics: EMNLP 2024, Y . Al- Onaizan, M. Bansal,...

work page 2024

[15] [15]

Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling,

Y . Mu, C. Dong, K. Bontcheva, and X. Song, “Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling,” inProceedings of the 2024 Joint International Conference on Com- putational Linguistics, Language Resources and Evaluation (LREC- COLING 2024), N. Calzolari, M.-Y . Kan, V . Hoste, A. Lenci, S. Sakti, and N. Xue, Eds. To...

work page 2024

[16] [16]

Prompting Large Language Models for Topic Modeling,

H. Wang, N. Prakash, N. K. Hoang, M. S. Hee, U. Naseem, and R. K.- W. Lee, “Prompting Large Language Models for Topic Modeling,” in2023 IEEE International Conference on Big Data (BigData), Dec. 2023, pp. 1236–1241

work page 2023

[17] [17]

Topic modeling for small data using generative llms

C. van Wanrooij, O. K. Manhar, and J. Yang, “Topic modeling for small data using generative llms.”

work page

[18] T. Doi, M. Isonuma, and H. Yanaka, “Topic Modeling for Short Texts with Large Language Models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), X. Fu and E. Fleisig, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 21–33.

[19] C. M. Pham, A. Hoyle, S. Sun, P. Resnik, and M. Iyyer, “TopicGPT: A Prompt-based Topic Modeling Framework,” in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), K. Duh, H. Gomez, and S. Bethard, Eds. Mexico City, Mexico: Association for Com...

[20] C.-C. Hsu, E. Bransom, J. Sparks, B. Kuehl, C. Tan, D. Wadden, L. Wang, and A. Naik, “CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support,” in Findings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V. Srikumar, Eds. Bangkok, Thailand: Association for Computational Ling...

[21] pariskang, “Pariskang/AgenTopic,” Mar. 2025.

[22] J. Lee, A. Chen, Z. Dai, D. Dua, D. S. Sachan, M. Boratko, Y. Luan, S. M. Arnold, V. Perot, S. Dalmia et al., “Can long-context language models subsume retrieval, rag, sql, and more?” arXiv preprint arXiv:2406.13121, 2024.

[23] X. Wu, C. Li, and Y. Miao, “Discovering topics in long-tailed corpora with causal intervention,” in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computational Linguistics, Aug. 2021, pp. 175–185. [Online]. Available: https://aclanthology.org/2021.findings-acl.15/

[24] F. Bianchi, S. Terragni, and D. Hovy, “Pre-training is a hot topic: Contextualized document embeddings improve topic coherence,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), C. Zong, F. Xia, W. Li, and R. Navig...

[25] X. Wu, X. Dong, T. T. Nguyen, and A. T. Luu, “Effective neural topic modeling with embedding clustering regularization,” in Proceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, Eds., vol. 202. PMLR, 23–29 Jul 2023, pp....

[26] X. Wu, A. T. Luu, and X. Dong, “Mitigating data sparsity for short text topic modeling by topic-semantic contrastive learning,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y. Goldberg, Z. Kozareva, and Y. Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec. 2022, pp. 2...

[27] X. Xu, F. Wen, B. Chu, Z. Fu, Q. Lin, J. Liu, B. Fei, Y. Li, L. Zhou, and Z. Yang, “Finbert2: A specialized bidirectional encoder for bridging the gap in finance-specific deployment of large language models,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, ser. KDD ’25. New York, NY, USA: Association for Com... doi: 10.1145/3711896.3737219