Sparse Autoencoders are Topic Models

Leander Girrbach; Zeynep Akata

arXiv: 2511.16309 · v2 · pith:VQNMFQ6Anew · submitted 2025-11-20 · cs.CV · cs.LG

Pith reviewed 2026-05-21 19:39 UTC · model grok-4.3

classification cs.CV · cs.LG

keywords sparse autoencoders · topic models · continuous topic model · embedding spaces · thematic analysis · image datasets · text datasets · topic coherence

The pith

Sparse autoencoders function as topic models by deriving their objective from a continuous topic model on embeddings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that sparse autoencoders used on embedding spaces can be understood directly as topic models. The authors define a continuous topic model for embeddings that draws from the structure of Latent Dirichlet Allocation. They prove that the standard sparse autoencoder loss corresponds exactly to maximum a posteriori estimation under this generative model. This reframes the features an SAE learns as thematic topic components rather than steerable directions. The result lets researchers apply SAEs to discover and track themes across large text and image collections by training reusable topic atoms once and combining them flexibly afterward.

Core claim

Sparse autoencoders are topic models because their training objective is the maximum a posteriori estimator for a continuous topic model on embedding spaces, in which each embedding arises as a sparse mixture of thematic basis vectors under a suitable prior and likelihood, so that the SAE decoder directions recover the topic distributions of the model.

What carries the argument

Continuous topic model (CTM) for embedding spaces, under which the SAE reconstruction-plus-sparsity objective is derived as maximum a posteriori estimation.
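
The shape of this argument can be sketched with a textbook sparse-coding calculation. The block below is not the paper's exact CTM, whose prior and likelihood are not reproduced on this page. It assumes an isotropic Gaussian likelihood with variance $\sigma^2$, an independent exponential prior with rate $\beta$ on nonnegative activations $z$, and a decoder matrix $W$ with bias $b$; all of these symbols are illustrative assumptions.

```latex
\begin{aligned}
p(x \mid z) &= \mathcal{N}\!\left(x;\, W z + b,\, \sigma^2 I\right), \qquad
p(z) = \prod_{j} \beta\, e^{-\beta z_j}, \quad z_j \ge 0, \\
-\log p(z \mid x) &= \frac{1}{2\sigma^2}\,\lVert x - W z - b \rVert_2^2
  + \beta\, \lVert z \rVert_1 + \text{const}, \\
\hat{z}_{\mathrm{MAP}} &= \arg\min_{z \ge 0}\; \lVert x - W z - b \rVert_2^2
  + \lambda\, \lVert z \rVert_1, \qquad \lambda = 2\sigma^2 \beta .
\end{aligned}
```

Under these assumptions the squared-error term comes from the likelihood and the L1 term from the prior, so the sparsity coefficient $\lambda$ is tied to the noise variance and the prior rate rather than being a free design choice.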

If this is right

SAE features act as reusable thematic components that admit direct interpretation as word or patch distributions.
The SAE-TM procedure learns topic atoms in one training run and merges them into any desired number of topics for new data without retraining.
Topics extracted this way show higher coherence scores and comparable diversity to strong baselines on both text and image collections.
Thematic structure and its changes over time become measurable in image datasets such as historical print series.
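
The merging step in the list above can be pictured with a small sketch. This page does not spell out the merging rule, so the block swaps in a stand-in technique: spherical k-means over decoder directions, followed by pooling the per-atom word weights inside each cluster. The function `merge_atoms` and its inputs `decoder` and `atom_word` are hypothetical names chosen for illustration, not part of the SAE-TM code.

```python
import numpy as np

def merge_atoms(decoder, atom_word, n_topics, n_iter=50, seed=0):
    """Group SAE atoms into n_topics clusters and pool their word weights.

    decoder:   (n_atoms, dim) decoder directions (hypothetical input).
    atom_word: (n_atoms, vocab) nonnegative word weights per atom (hypothetical).
    Returns (labels, topic_word); nonempty topic rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    # Compare atoms by direction only, so scale each one to unit length.
    unit = decoder / np.linalg.norm(decoder, axis=1, keepdims=True)
    centers = unit[rng.choice(len(unit), size=n_topics, replace=False)]
    for _ in range(n_iter):
        labels = np.argmax(unit @ centers.T, axis=1)  # closest center by cosine
        for k in range(n_topics):
            members = unit[labels == k]
            if len(members):  # keep the old center if a cluster empties
                total = members.sum(axis=0)
                norm = np.linalg.norm(total)
                if norm > 0:
                    centers[k] = total / norm
    labels = np.argmax(unit @ centers.T, axis=1)
    # Pool word weights of all atoms assigned to the same topic.
    topic_word = np.zeros((n_topics, atom_word.shape[1]))
    np.add.at(topic_word, labels, atom_word)
    totals = topic_word.sum(axis=1, keepdims=True)
    topic_word = np.divide(topic_word, totals,
                           out=np.zeros_like(topic_word), where=totals > 0)
    return labels, topic_word
```

Because the atoms and their word weights are fixed inputs, calling the function again with a different `n_topics` regroups them without touching the trained autoencoder, which is the property the list item describes.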

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Topic-model evaluation metrics such as coherence could be applied directly to SAE features to quantify their interpretability.
The derivation suggests hybrid training schemes that add explicit topic-model regularizers to SAE objectives for domain-specific data.
The same framing could be tested on embeddings from audio or video models to extract thematic patterns in those modalities.

Load-bearing premise

Embedding spaces are generated according to the continuous topic model with its chosen prior on mixtures and likelihood on observed vectors.

What would settle it

If SAE features recovered from real embeddings cannot be interpreted as coherent distributions over words or visual elements on held-out data while dedicated topic models can, the claimed equivalence does not hold in practice.

Figures

Figures reproduced from arXiv: 2511.16309 by Leander Girrbach, Zeynep Akata.

**Figure 2.** Overview of our SAE topic model (SAE-TM): (a) pretrain foundational SAEs on large text or vision datasets to learn transferable …

**Figure 3.** Statistics of top 10 topics with the highest variance across four popular image datasets. Values indicate the proportion of images …

**Figure 4.** Statistics of top 10 topics with the highest variance in Japanese woodblock prints from different artistic periods. Changes in topic …

Abstract

Sparse autoencoders (SAEs) are used to analyze embeddings, but their role and practical value are debated. We propose a new perspective on SAEs by demonstrating that they can be naturally understood as topic models. We propose a continuous topic model (CTM) inspired by Latent Dirichlet Allocation (LDA) for embedding spaces and derive the SAE objective as a maximum a posteriori estimator under this model. This view implies SAE features are thematic components rather than steerable directions. To confirm our theoretical findings, we introduce SAE-TM, a topic modeling framework that: (1) trains an SAE to learn reusable topic atoms, (2) interprets them as word distributions on downstream data, and (3) merges them into any number of topics without retraining. SAE-TM yields more coherent topics than strong baselines on text and image datasets while maintaining diversity. Finally, we analyze thematic structure in image datasets and trace topic changes over time in Japanese woodblock prints. Our work positions SAEs as effective tools for large-scale thematic analysis across modalities. Code is available at https://github.com/ExplainableML/SAE-TM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes viewing sparse autoencoders (SAEs) as topic models through a continuous topic model (CTM) inspired by LDA for embedding spaces. It derives the SAE objective (reconstruction loss plus L1 sparsity) as the maximum a posteriori estimator under this CTM. This leads to the SAE-TM framework that trains SAEs for reusable topic atoms, interprets them as word distributions, and merges them into topics. SAE-TM is shown to yield more coherent topics than baselines on text and image datasets, with applications to thematic analysis in images and temporal topic tracing in Japanese woodblock prints.

Significance. If the proposed CTM provides a good description of real embedding spaces, the work establishes a theoretical link between SAEs and topic models, suggesting SAE features capture thematic components. The SAE-TM approach offers a flexible, reusable way to perform topic modeling without retraining for different topic counts. The availability of code at the provided GitHub repository enhances reproducibility. This perspective could be significant for interpretability research in computer vision and multimodal learning.

major comments (2)

[Derivation of SAE objective as MAP estimator under CTM] The manuscript constructs the CTM with a specific prior and likelihood chosen to make the SAE loss the MAP objective. While the algebraic equivalence holds within the model, the claim that this implies SAEs are topic models for real embeddings requires evidence that the CTM's induced distribution matches real data statistics. A concrete test, such as matching the distribution of pairwise cosine similarities or the sparsity patterns in activations, should be included to support the transfer of the interpretation.
[SAE-TM empirical evaluation] The coherence metrics used to compare SAE-TM to baselines have known sensitivity to post-processing choices such as the merging procedure or threshold for interpreting atoms as word distributions. The manuscript should quantify this sensitivity, perhaps via ablation on the merging step or alternative coherence measures, to strengthen the claim of superior performance.
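
The test proposed in the first major comment could look roughly like the sketch below. It assumes two arrays are available, one of real embeddings and one of samples drawn from a fitted generative model, and it compares their pairwise cosine-similarity distributions with a two-sample Kolmogorov-Smirnov statistic. The helper names `pairwise_cosines` and `ks_statistic` are hypothetical, and the choice of the KS statistic is an assumption; the report does not name a specific distance.

```python
import numpy as np

def pairwise_cosines(x):
    """Cosine similarity for every unordered pair of rows in x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit @ unit.T
    return sims[np.triu_indices(len(x), k=1)]  # upper triangle, no diagonal

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])  # the gap can only peak at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))
```

A value near 0 from `ks_statistic(pairwise_cosines(real), pairwise_cosines(synthetic))` would indicate similar similarity distributions, while a value near 1 would indicate that the model's samples have a very different geometry from the real embeddings.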

minor comments (2)

[Notation and model definition] The definition of the continuous topic model could benefit from an explicit equation contrasting it with standard LDA to highlight the adaptations for continuous embeddings.
[Related work] Additional citations to prior work on using autoencoders or sparse representations for topic modeling would help contextualize the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below and indicate planned revisions to strengthen the manuscript.

Referee: [Derivation of SAE objective as MAP estimator under CTM] The manuscript constructs the CTM with a specific prior and likelihood chosen to make the SAE loss the MAP objective. While the algebraic equivalence holds within the model, the claim that this implies SAEs are topic models for real embeddings requires evidence that the CTM's induced distribution matches real data statistics. A concrete test, such as matching the distribution of pairwise cosine similarities or the sparsity patterns in activations, should be included to support the transfer of the interpretation.

Authors: We agree that the algebraic derivation alone does not automatically establish that the CTM describes real embedding spaces. In the revised manuscript we will add experiments that directly compare the distribution of pairwise cosine similarities and the sparsity patterns of activations between samples drawn from the fitted CTM and the actual embeddings used in our text and image experiments. These tests will provide the requested empirical support for transferring the topic-model interpretation to real data.

Revision: yes

Referee: [SAE-TM empirical evaluation] The coherence metrics used to compare SAE-TM to baselines have known sensitivity to post-processing choices such as the merging procedure or threshold for interpreting atoms as word distributions. The manuscript should quantify this sensitivity, perhaps via ablation on the merging step or alternative coherence measures, to strengthen the claim of superior performance.

Authors: We acknowledge that coherence scores can be sensitive to post-processing decisions. In the revision we will include an ablation study that varies the merging threshold and procedure, and we will additionally report results under alternative coherence measures (e.g., NPMI with different window sizes). These additions will quantify the sensitivity and demonstrate that the performance advantage of SAE-TM remains consistent across reasonable post-processing choices.

Revision: yes
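
For readers unfamiliar with the NPMI measure mentioned in the response, a minimal sketch follows. It assumes documents are given as lists of tokens and uses whole-document co-occurrence, which amounts to a window as large as each document; a sliding-window variant would count co-occurrences inside shorter spans instead. The function name `npmi_coherence` and the handling of the two boundary cases are assumptions made for illustration, not the evaluation code used in the paper.

```python
import math
from itertools import combinations

def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of top words (needs at least two words)."""
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for a, b in combinations(top_words, 2):
        p_a, p_b, p_ab = prob(a), prob(b), prob(a, b)
        if p_ab == 0:
            scores.append(-1.0)  # convention: pairs that never co-occur
        elif p_ab == 1:
            scores.append(0.0)   # NPMI is 0/0 here; treated as 0 in this sketch
        else:
            pmi = math.log(p_ab / (p_a * p_b))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores)
```

Scores range from -1 (words never appear together) to 1 (words always appear together), so re-running the function on topics produced under different merging settings gives a simple way to see how much the coherence number moves.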

Circularity Check

1 step flagged

CTM prior/likelihood chosen so MAP recovers SAE loss exactly, making equivalence hold by model construction

specific steps

self-definitional [Abstract; derivation of SAE as MAP under CTM]
"We propose a continuous topic model (CTM) inspired by Latent Dirichlet Allocation (LDA) for embedding spaces and derive the SAE objective as a maximum a posteriori estimator under this model."

The CTM prior and likelihood are defined such that the MAP estimator under the model is exactly the SAE training objective (reconstruction error plus L1 sparsity). The claimed interpretation that SAE features are thematic components therefore follows tautologically from the choice of generative model rather than from any external property of embedding spaces.

full rationale

The paper's central derivation proposes a continuous topic model whose generative assumptions (prior and likelihood) are selected to make the standard SAE reconstruction-plus-L1 objective its MAP estimator. This algebraic equivalence is true inside the assumed model but transfers to real embeddings only if those embeddings are generated by the CTM; no independent validation (posterior-predictive checks, moment matching, or marginal likelihood comparison) is reported. The step therefore reduces to a self-definitional construction rather than an independent justification that SAEs are topic models on observed data.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the CTM generative story being a reasonable model for embeddings and on the SAE loss being exactly the MAP objective under that story; no new entities are introduced.

free parameters (1)

sparsity penalty coefficient
Standard SAE hyperparameter that controls feature activation rate; its value is chosen to match the topic sparsity implicit in the CTM prior.
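
How the penalty coefficient controls the activation rate can be seen in a one-coordinate toy case. The sketch below is not an SAE encoder; it uses the closed-form minimizer of `0.5*(z - u)**2 + lam*z` over `z >= 0`, which is the nonnegative soft-thresholding operator, and the pre-activation values in `u` are made up for illustration.

```python
import numpy as np

def nonneg_soft_threshold(u, lam):
    """Elementwise minimizer of 0.5*(z - u)**2 + lam*z subject to z >= 0."""
    return np.maximum(u - lam, 0.0)

def activation_rate(u, lam):
    """Fraction of entries that remain nonzero after thresholding."""
    return float(np.mean(nonneg_soft_threshold(u, lam) > 0))

u = np.array([0.05, 0.2, 0.5, 1.0, -0.3, 0.8])  # made-up pre-activations
rates = [activation_rate(u, lam) for lam in (0.0, 0.1, 0.6, 1.5)]
```

Each increase in `lam` zeroes out every entry at or below the new threshold, so the activation rate can only stay the same or fall as the coefficient grows.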

axioms (1)

Domain assumption: embeddings are generated as mixtures of latent topic distributions with a specific prior that induces sparsity.
This generative assumption is introduced to derive the SAE objective; it is not a standard result from embedding literature.


work page 2023

[14] [14]

Decoupling spar- sity and smoothness in the dirichlet variational autoencoder topic model

Sophie Burkhardt and Stefan Kramer. Decoupling spar- sity and smoothness in the dirichlet variational autoencoder topic model. InJMLR, 2019. 2, 5, 6

work page 2019

[15] [15]

Batchtopk sparse autoencoders

Bart Bussmann, Patrick Leask, and Neel Nanda. Batchtopk sparse autoencoders. InNeurIPS Workshop on Scientific Methods for Understanding Deep Learning, 2024. 2, 3, 5, 6, 13

work page 2024

[16] [16]

Neural mod- els for documents with metadata

Dallas Card, Chenhao Tan, and Noah A Smith. Neural mod- els for documents with metadata. InACL, 2018. 2

work page 2018

[17] [17]

Reading tea leaves: How humans interpret topic models

Jonathan Chang, Sean Gerrish, Chong Wang, Jordan Boyd- Graber, and David Blei. Reading tea leaves: How humans interpret topic models. InNeurIPS, 2009. 5

work page 2009

[18] [18]

Conceptual 12m: Pushing web-scale image-text pre-training to recognize long-tail visual concepts

Soravit Changpinyo, Piyush Sharma, Nan Ding, and Radu Soricut. Conceptual 12m: Pushing web-scale image-text pre-training to recognize long-tail visual concepts. In CVPR, 2021. 6

work page 2021

[19] [19]

You are where you tweet: a content-based approach to geo-locating twitter users

Zhiyuan Cheng, James Caverlee, and Kyumin Lee. You are where you tweet: a content-based approach to geo-locating twitter users. InCIKM, 2010. 5

work page 2010

[20] [20]

From flat to hierarchical: Ex- tracting sparse representations with matching pursuit

Val ´erie Costa, Thomas Fel, Ekdeep Singh Lubana, Bahareh Tolooshams, and Demba Ba. From flat to hierarchical: Ex- tracting sparse representations with matching pursuit. In arXiv, 2025. 2

work page 2025

[21] [21]

Sparse autoencoders find highly interpretable features in language models

Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, and Lee Sharkey. Sparse autoencoders find highly interpretable features in language models. InarXiv, 2023. 2

work page 2023

[22] [22]

Imagenet: A large-scale hierarchical im- age database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical im- age database. InCVPR, 2009. 6

work page 2009

[23] [23]

Topic modeling in embedding spaces

Adji B Dieng, Francisco JR Ruiz, and David M Blei. Topic modeling in embedding spaces. InTACL, 2020. 2, 5, 6

work page 2020

[24] [24]

Toy models of superposition

Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield- Dodds, Robert Lasenby, Dawn Drain, Carol Chen, et al. Toy models of superposition. InarXiv, 2022. 2

work page 2022

[25] [25]

Not all language model features are one-dimensionally linear

Joshua Engels, Eric J Michaud, Isaac Liao, Wes Gurnee, and Max Tegmark. Not all language model features are one-dimensionally linear. InICLR, 2025. 2

work page 2025

[26] [26]

Decomposing the dark matter of sparse autoencoders

Joshua Engels, Logan Riggs Smith, and Max Tegmark. Decomposing the dark matter of sparse autoencoders. In TMLR, 2025. 2

work page 2025

[27] [27]

Prince, Matthew Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba E

Thomas Fel, Ekdeep Singh Lubana, Jacob S. Prince, Matthew Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba E. Ba, and Talia Konkle. Archetypal SAE: Adaptive and stable dictionary learning for concept extraction in large vision models. In FICML, 2025. 2

work page 2025

[28] [28]

Scaling and evaluating sparse autoencoders

Leo Gao, Tom Dupre la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, and Jeffrey Wu. Scaling and evaluating sparse autoencoders. In ICLR, 2025. 2, 3, 13

work page 2025

[29] [29]

Uncurated image-text datasets: Shedding light on demographic bias

Noa Garcia, Yusuke Hirota, Yankun Wu, and Yuta Nakashima. Uncurated image-text datasets: Shedding light on demographic bias. InCVPR, 2023. 6

work page 2023

[30] [30]

Sparse-coding variational autoencoders

Victor Geadah, Gabriel Barello, Daniel Greenidge, Adam S Charles, and Jonathan W Pillow. Sparse-coding variational autoencoders. InNeural computation, 2024. 2

work page 2024

[31] [31]

Bertopic: Neural topic modeling with a class-based tf-idf procedure

Maarten Grootendorst. Bertopic: Neural topic modeling with a class-based tf-idf procedure. InarXiv, 2022. 2 9

work page 2022

[32] [32]

Representing mixtures of word embeddings with mixtures of topic embeddings

Dan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou, et al. Representing mixtures of word embeddings with mixtures of topic embeddings. In ICLR, 2022. 2

work page 2022

[33] [33]

Apples to apples: A systematic evaluation of topic models

Ismail Harrando, Pasquale Lisena, and Raphael Troncy. Apples to apples: A systematic evaluation of topic models. InRANLP, 2021. 5

work page 2021

[34] [34]

Teaching machines to read and comprehend

Karl Moritz Hermann, Tomas Kocisky, Edward Grefen- stette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. Teaching machines to read and comprehend. InNeurIPS, 2015. 5

work page 2015

[35] [35]

Projecting assumptions: The dual- ity between sparse autoencoders and concept geometry

Sai Sumedh R Hindupur, Ekdeep Singh Lubana, Thomas Fel, and Demba Ba. Projecting assumptions: The dual- ity between sparse autoencoders and concept geometry. In arXiv, 2025. 2

work page 2025

[36] [36]

Online learning for latent dirichlet allocation

Matthew Hoffman, Francis Bach, and David Blei. Online learning for latent dirichlet allocation. InNeurIPS, 2010. 2

work page 2010

[37] [37]

Probabilistic latent semantic indexing

Thomas Hofmann. Probabilistic latent semantic indexing. InSIGIR, 1999. 2

work page 1999

[38] [38]

The curious case of neural text degeneration

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. The curious case of neural text degeneration. In ICLR, 2020. 5

work page 2020

[39] [39]

Is au- tomated topic model evaluation broken? the incoherence of coherence

Alexander Hoyle, Pranav Goel, Andrew Hian-Cheong, De- nis Peskov, Jordan Boyd-Graber, and Philip Resnik. Is au- tomated topic model evaluation broken? the incoherence of coherence. InNeurIPS, 2021. 2, 5

work page 2021

[40] [40]

Open-set image tagging with multi-grained text su- pervision

Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, and Lei Zhang. Open-set image tagging with multi-grained text su- pervision. InarXiv, 2023. 7

work page 2023

[41] [41]

Sparse autoencoders find highly interpretable features in language models

Robert Huben, Hoagy Cunningham, Logan Riggs Smith, Aidan Ewart, and Lee Sharkey. Sparse autoencoders find highly interpretable features in language models. InICLR,

work page

[42] [42]

Openclip

Gabriel Ilharco, Mitchell Wortsman, Ross Wightman, Cade Gordon, Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar, Hongseok Namkoong, John Miller, Han- naneh Hajishirzi, and Ludwig Farhadi, Ali an Schmidt. Openclip. InGitHub, 2021. 6

work page 2021

[43] [43]

Brave: Broadening the visual encoding of vision-language models

O ˘guzhan Fatih Kar, Alessio Tonioni, Petra Poklukar, Achin Kulshrestha, Amir Zamir, and Federico Tombari. Brave: Broadening the visual encoding of vision-language models. InECCV, 2024. 6

work page 2024

[44] [44]

SAEBench: A comprehensive benchmark for sparse autoencoders in language model in- terpretability

Adam Karvonen, Can Rager, Johnny Lin, Curt Tigges, Joseph Isaac Bloom, David Chanin, Yeu-Tong Lau, Eoin Farrell, Callum Stuart McDougall, Kola Ayonrinde, Demian Till, Matthew Wearden, Arthur Conmy, Samuel Marks, and Neel Nanda. SAEBench: A comprehensive benchmark for sparse autoencoders in language model in- terpretability. InICML, 2025. 2

work page 2025

[45] [45]

Stylistic multi-task analysis of ukiyo-e woodblock prints

Selina Khan and Nanne van Noord. Stylistic multi-task analysis of ukiyo-e woodblock prints. InBMVC, 2021. 7

work page 2021

[46] [46]

Interpret- ing vision transformers via residual replacement model

Jinyeong Kim, Junhyeok Kim, Yumin Shim, Joohyeok Kim, Sunyoung Jung, and Seong Jae Hwang. Interpret- ing vision transformers via residual replacement model. In arXiv, 2025. 1

work page 2025

[47] [47]

Auto-encoding vari- ational bayes

Diederik P Kingma and Max Welling. Auto-encoding vari- ational bayes. InICLR, 2014. 2

work page 2014

[48] [48]

From superposition to sparse codes: interpretable representations in neural networks

David Klindt, Charles O’Neill, Patrik Reizinger, Harald Maurer, and Nina Miolane. From superposition to sparse codes: interpretable representations in neural networks. In arXiv, 2025. 2

work page 2025

[49] [49]

Learning multiple layers of features from tiny images

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. InTech Report, 2009. 6

work page 2009

[50] [50]

From word embeddings to document distances

Matt Kusner, Yu Sun, Nicholas Kolkin, and Kilian Wein- berger. From word embeddings to document distances. In ICML, 2015. 5

work page 2015

[51] [51]

Ma- chine reading tea leaves: Automatically evaluating topic co- herence and topic model quality

Jey Han Lau, David Newman, and Timothy Baldwin. Ma- chine reading tea leaves: Automatically evaluating topic co- herence and topic model quality. InEACL, 2014. 5

work page 2014

[52] [52]

Unbiased region- language alignment for open-vocabulary dense prediction

Yunheng Li, Yuxuan Li, Quan-Sheng Zeng, Wenhai Wang, Qibin Hou, and Ming-Ming Cheng. Unbiased region- language alignment for open-vocabulary dense prediction. InCVPR, 2025. 6

work page 2025

[53] [53]

Sparsemax and re- laxed wasserstein for topic sparsity

Tianyi Lin, Zhiyue Hu, and Xin Guo. Sparsemax and re- laxed wasserstein for topic sparsity. InWDSM, 2019. 2

work page 2019

[54] [54]

Sparse autoencoders, again? InICML, 2025

Yin Lu, Xuening Zhu, Tong He, and David Wipf. Sparse autoencoders, again? InICML, 2025. 2

work page 2025

[55] [55]

Learning word vectors for sentiment analysis

Andrew Maas, Raymond E Daly, Peter T Pham, Dan Huang, Andrew Y Ng, and Christopher Potts. Learning word vectors for sentiment analysis. InACL-HLT, 2011. 5

work page 2011

[56] [56]

K-sparse autoen- coders

Alireza Makhzani and Brendan Frey. K-sparse autoen- coders. InICLR, 2014. 2

work page 2014

[57] [57]

From softmax to sparsemax: A sparse model of attention and multi-label classification

Andre Martins and Ramon Astudillo. From softmax to sparsemax: A sparse model of attention and multi-label classification. InICML, 2016. 2

work page 2016

[58] [58]

Neural variational inference for text processing

Yishu Miao, Lei Yu, and Phil Blunsom. Neural variational inference for text processing. InICML, 2016. 2

work page 2016

[59] [59]

Dis- covering discrete latent topics with neural variational infer- ence

Yishu Miao, Edward Grefenstette, and Phil Blunsom. Dis- covering discrete latent topics with neural variational infer- ence. InICML, 2017. 2

work page 2017

[60] [60]

Efficient estimation of word representations in vector space

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. InarXiv, 2013. 5

work page 2013

[61] [61]

Mitchell.Machine Learning

Tom M. Mitchell.Machine Learning. McGraw-Hill, 1997. 5

work page 1997

[62] [62]

Incorporating hierarchical semantics in sparse au- toencoder architectures

Mark Muchane, Sean Richardson, Kiho Park, and Victor Veitch. Incorporating hierarchical semantics in sparse au- toencoder architectures. InarXiv, 2025. 2

work page 2025

[63] [63]

Matryoshka sparse autoencoders

Noa Nabeshima. Matryoshka sparse autoencoders. InLess- Wrong AI Alignment Forum, 2024. 2

work page 2024

[64] [64]

Topic modeling with wasserstein autoencoders

Feng Nan, Ran Ding, Ramesh Nallapati, and Bing Xiang. Topic modeling with wasserstein autoencoders. InACL,

work page

[65] [65]

Automatic evaluation of topic coherence

David Newman, Jey Han Lau, Karl Grieser, and Timothy Baldwin. Automatic evaluation of topic coherence. In NAACL-HLT, 2010. 5

work page 2010

[66] [66]

Contrastive learning for neural topic model

Thong Nguyen and Anh Tuan Luu. Contrastive learning for neural topic model. InNeurIPS, 2021. 2

work page 2021

[67] [67]

Emergence of simple-cell receptive field properties by learning a sparse code for natural images

Bruno A Olshausen and David J Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. InNature, 1996. 2

work page 1996

[68] [68]

The lin- ear representation hypothesis and the geometry of large lan- guage models

Kiho Park, Yo Joong Choe, and Victor Veitch. The lin- ear representation hypothesis and the geometry of large lan- guage models. InICML, 2024. 3 10

work page 2024

[69] [69]

Use sparse autoencoders to discover un- known concepts, not to act on known concepts

Kenny Peng, Rajiv Movva, Jon Kleinberg, Emma Pierson, and Nikhil Garg. Use sparse autoencoders to discover un- known concepts, not to act on known concepts. InarXiv,

work page

[70] [70]

Glove: Global vectors for word representation

Jeffrey Pennington, Richard Socher, and Christopher D Manning. Glove: Global vectors for word representation. InEMNLP, 2014. 5

work page 2014

[71] [71]

Topicgpt: A prompt-based topic modeling framework

Chau Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, and Mohit Iyyer. Topicgpt: A prompt-based topic modeling framework. InNAACL, 2024. 2

work page 2024

[72] [72]

Exploring the limits of transfer learning with a unified text-to-text transformer

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. InJMLR, 2020. 5

work page 2020

[73] [73]

Contex- tualized topic coherence metrics

Hamed Rahimi, David Mimno, Jacob Hoover Vigly, Hubert Naacke, Camelia Constantin, and Bernd Amann. Contex- tualized topic coherence metrics. InEACL Findings, 2024. 5

work page 2024

[74] [74]

Jumping ahead: Improving reconstruction fi- delity with jumprelu sparse autoencoders

Senthooran Rajamanoharan, Tom Lieberum, Nicolas Son- nerat, Arthur Conmy, Vikrant Varma, J ´anos Kram ´ar, and Neel Nanda. Jumping ahead: Improving reconstruction fi- delity with jumprelu sparse autoencoders. InarXiv, 2024. 2

work page 2024

[75] [75]

Efficient learning of sparse repre- sentations with an energy-based model

Marc’Aurelio Ranzato, Christopher Poultney, Sumit Chopra, and Yann Cun. Efficient learning of sparse repre- sentations with an energy-based model. InNeurIPS, 2006. 2

work page 2006

[76] [76]

Sparse feature learning for deep belief networks

Marc’Aurelio Ranzato, Y-Lan Boureau, Yann Cun, et al. Sparse feature learning for deep belief networks. In NeurIPS, 2007. 2

work page 2007

[77] [77]

Discover-then-name: Task-agnostic concept bot- tlenecks via automated concept discovery

Sukrut Rao, Sweta Mahajan, Moritz B ¨ohle, and Bernt Schiele. Discover-then-name: Task-agnostic concept bot- tlenecks via automated concept discovery. InECCV, 2024. 5

work page 2024

[78] [78]

Laion- 400m: Open dataset of clip-filtered 400 million image-text pairs

Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, and Aran Komatsuzaki. Laion- 400m: Open dataset of clip-filtered 400 million image-text pairs. InarXiv, 2021. 6

work page 2021

[79] [79]

Large scale vari- ational inference and experimental design for sparse gener- alized linear models

Matthias W Seeger and Hannes Nickisch. Large scale vari- ational inference and experimental design for sparse gener- alized linear models. InarXiv, 2008. 2

work page 2008

[80] [80]

Conceptual captions: A cleaned, hypernymed, im- age alt-text dataset for automatic image captioning

Piyush Sharma, Nan Ding, Sebastian Goodman, and Radu Soricut. Conceptual captions: A cleaned, hypernymed, im- age alt-text dataset for automatic image captioning. InACL,

work page