
arxiv: 2511.01680 · v3 · submitted 2025-11-03 · econ.EM · cs.LG

Making Interpretable Discoveries from Unstructured Data: A High-Dimensional Multiple Hypothesis Testing Approach

Pith reviewed 2026-05-18 01:54 UTC · model grok-4.3

classification econ.EM · cs.LG
keywords unstructured data · selective inference · concept embeddings · multiple hypothesis testing · AI interpretability · discovery · high-dimensional statistics · empirical economics

The pith

A framework maps unstructured data to concept embeddings and uses selective inference to produce statistically valid interpretable discoveries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Social scientists increasingly analyze unstructured data such as text, audio, or video to uncover new empirical patterns, but they often cannot pre-specify every relevant aspect in advance. This paper offers a framework that converts each data point into high-dimensional sparse concept embeddings drawn from AI interpretability methods, then tests hypotheses about individual concepts. Selective inference algorithms, supported by new high-dimensional central limit theory results, select reliable discoveries while guarding against the effects of examining many possible patterns at once. The same pipeline produces and evaluates plain-language descriptions of the selected findings. The method limits researcher degrees of freedom and supports rapid sensitivity checks and replication.

Core claim

The paper presents a general framework for unsupervised discovery from unstructured data. It maps data points to high-dimensional, sparse, and interpretable concept embeddings using recent AI interpretability techniques, computes statistics for testing interpretable concept-by-concept hypotheses, performs selective inference on these hypotheses with algorithms validated by new high-dimensional central limit theory results to yield a selected set of discoveries, and generates and evaluates human-interpretable natural language descriptions of the discoveries. The framework features few researcher degrees of freedom and is robust to data snooping and post-selection inference concerns.
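The statistics step can be illustrated with a minimal sketch. It assumes a binary grouping variable `d` (for example a treatment indicator) and a sparse, nonnegative embedding matrix `Z` with one column per concept; the function name `concept_t_stats` and the choice of a Welch two-sample t-statistic are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def concept_t_stats(Z, d):
    """Welch t-statistic per concept column, comparing rows with d == 1 to d == 0."""
    Z = np.asarray(Z, dtype=float)
    d = np.asarray(d).astype(bool)
    z1, z0 = Z[d], Z[~d]
    n1, n0 = z1.shape[0], z0.shape[0]
    diff = z1.mean(axis=0) - z0.mean(axis=0)
    se = np.sqrt(z1.var(axis=0, ddof=1) / n1 + z0.var(axis=0, ddof=1) / n0)
    # Concepts that never vary have zero standard error; report t = 0 for them.
    return np.divide(diff, se, out=np.zeros_like(diff), where=se > 0)

rng = np.random.default_rng(0)
n, p = 400, 50
d = rng.integers(0, 2, size=n)
# Hypothetical sparse activations: roughly 90% of entries are exactly zero.
Z = rng.exponential(1.0, size=(n, p)) * (rng.random((n, p)) < 0.1)
Z[:, 0] += 2.0 * d  # plant a group difference on concept 0
t = concept_t_stats(Z, d)
print("largest |t| at concept", int(np.argmax(np.abs(t))))
```

Each entry of `t` corresponds to one concept-level hypothesis; the selective inference step then decides which of these to report.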

What carries the argument

concept embeddings: high-dimensional, sparse, and interpretable representations of unstructured data points produced by AI interpretability methods, which enable the computation of per-concept test statistics for subsequent selective inference.

Load-bearing premise

The selective inference procedures remain valid when applied to statistics derived from AI-generated concept embeddings rather than from pre-specified variables, which requires the new high-dimensional central limit theory results to hold for the dependence structure induced by the embedding step.
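For orientation, high-dimensional central limit theorems in the literature this paper builds on (for example Chernozhukov, Chetverikov, and Kato, listed in the reference graph) are usually stated in roughly the following form. This is a sketch of that standard statement for independent observations, not the paper's own theorem; the symbols $X_i$, $S_n$, $G$, and $\mathcal{R}$ are notation introduced here.

```latex
Let $X_1,\dots,X_n \in \mathbb{R}^p$ be independent, mean-zero random vectors, and define
\[
S_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i, \qquad
G \sim N(0, \Sigma), \qquad
\Sigma = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\!\left[ X_i X_i^{\top} \right].
\]
A high-dimensional CLT controls the distance over the class $\mathcal{R}$ of
hyperrectangles in $\mathbb{R}^p$:
\[
\rho_n = \sup_{A \in \mathcal{R}}
\left| \Pr(S_n \in A) - \Pr(G \in A) \right| \longrightarrow 0,
\]
which can hold even when $p \gg n$, provided $\log p$ grows slowly relative to $n$
and suitable moment and non-degeneracy conditions hold.
```

Events such as $\{\max_j |S_{n,j}| \le c\}$ are hyperrectangles, so a result of this type is what justifies critical values for maximum-type statistics. The premise above concerns whether such conditions hold when the $X_i$ are built from concept embeddings.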

What would settle it

Apply the framework to a simulated dataset in which known concept effects are planted at known rates and check whether the recovered discoveries match the planted effects at the nominal false-discovery rate.
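A check of this kind might look like the following sketch. It plants mean shifts in a known subset of simulated concepts, selects discoveries with the Benjamini-Hochberg procedure, and compares the average false discovery proportion to the nominal level. The Gaussian data, the effect size, and the use of Benjamini-Hochberg are all assumptions made for illustration; the paper's selective inference algorithms may differ.

```python
import numpy as np
from math import erfc, sqrt

def bh_reject(pvals, q=0.1):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    pvals = np.asarray(pvals)
    m = pvals.size
    order = np.argsort(pvals)
    below = pvals[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index passing its threshold
        reject[order[:k + 1]] = True
    return reject

def simulate_once(rng, n=200, p=500, n_planted=25, effect=0.5, q=0.1):
    """One replication: returns (false discovery proportion, share of planted effects found)."""
    truth = np.zeros(p, dtype=bool)
    truth[:n_planted] = True
    X = rng.standard_normal((n, p)) + effect * truth  # shift planted columns only
    t = sqrt(n) * X.mean(axis=0) / X.std(axis=0, ddof=1)
    pvals = np.array([erfc(abs(v) / sqrt(2)) for v in t])  # two-sided normal p-values
    rej = bh_reject(pvals, q)
    fdp = np.sum(rej & ~truth) / max(rej.sum(), 1)
    power = np.sum(rej & truth) / n_planted
    return fdp, power

rng = np.random.default_rng(1)
results = np.array([simulate_once(rng) for _ in range(200)])
fdr_hat, power_hat = results.mean(axis=0)
print(f"empirical FDR {fdr_hat:.3f} (nominal 0.100), power {power_hat:.3f}")
```

In this sketch the simulated columns are independent, which is the easy case. The more informative version of the test would generate the data through an actual embedding step so that the induced dependence is present.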

Figures

Figures reproduced from arXiv: 2511.01680 by Jacob Carlson.

Figure 1: Discoveries for Experiment 2 of Bursztyn et al. (2023)
Figure 2: Ranked Discoveries for “High inflation is caused by...” in Stantcheva (2024)
Original abstract

Social scientists are increasingly turning to unstructured datasets to unlock new empirical insights, e.g., estimating descriptive statistics of or causal effects on quantitative measures derived from text, audio, or video data. In many such settings, unsupervised analysis is of primary interest, in that the researcher does not want to (or cannot) manually pre-specify all important aspects of the unstructured data to measure; they are interested in "discovery." This paper proposes a general and flexible framework for pursuing such discovery from unstructured data in a statistically principled way. The framework leverages recent methods from the literature on AI interpretability to map unstructured data points to high-dimensional, sparse, and interpretable "concept embeddings"; computes statistics from these concept embeddings for testing interpretable, concept-by-concept hypotheses; performs selective inference on these hypotheses using algorithms validated by new results in high-dimensional central limit theory, producing a selected set ("discoveries"); and both generates and evaluates human-interpretable natural language descriptions of these discoveries. The proposed framework has few researcher degrees of freedom, is robust to data snooping and other post-selection inference concerns, and facilitates fast and inexpensive sensitivity analysis and replication. Applications to recent descriptive and causal analyses of unstructured data in empirical economics are explored. Open source code is provided for researchers to implement the framework in their own projects.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a general framework for interpretable discovery from unstructured data (text, audio, video) in empirical economics. It maps data points to high-dimensional sparse concept embeddings via AI interpretability methods, computes concept-by-concept test statistics, performs selective inference on these hypotheses using algorithms justified by new high-dimensional central limit theory results, selects a set of discoveries, and generates/evaluates human-interpretable natural language descriptions. The approach is claimed to have few researcher degrees of freedom, robustness to data snooping, and support for sensitivity analysis and replication; applications to recent descriptive and causal analyses are explored, with open-source code provided.

Significance. If the selective inference remains valid when applied to statistics derived from AI-generated concept embeddings, the framework could meaningfully advance principled discovery in social science applications of unstructured data by reducing post-selection concerns and researcher degrees of freedom while producing interpretable outputs. The open-source code and emphasis on fast sensitivity analysis are concrete strengths that would aid adoption and reproducibility if the core validity claims hold.

major comments (2)
  1. [Framework description and theoretical results] The selective inference step relies on new high-dimensional central limit theory results to guarantee validity for test statistics computed from concept embeddings. The abstract and framework description do not specify how the dependence structure induced by shared AI model parameters, nonlinear feature extraction, and sparsity patterns is incorporated into the CLT assumptions (e.g., weak dependence or covariance decay conditions). This is load-bearing for the post-selection coverage guarantees.
  2. [High-dimensional CLT results and selective inference algorithms] No derivation, simulation evidence, or explicit verification is provided in the abstract or framework overview showing that the CLT theorems apply under the embedding-induced dependence rather than pre-specified variables. Without this, the claim that the procedures produce valid selected discoveries cannot be assessed.
minor comments (2)
  1. [Section 2] Clarify the precise definition and dimensionality of 'concept embeddings' early in the manuscript to avoid ambiguity when discussing the mapping from unstructured data.
  2. [Applications and evaluation] The description of how natural language descriptions of discoveries are generated and evaluated could include more detail on the evaluation metrics or human validation procedure.
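The dependence concern in the major comments can be made concrete with a sketch of a Gaussian multiplier bootstrap for the maximum absolute t-statistic, the kind of procedure studied in the high-dimensional bootstrap literature the paper cites. The function name, the single-step familywise cutoff, and the simulated equicorrelated data are assumptions for illustration; the paper's algorithms may use a stepdown or other refinement. The sketch also assumes every column has positive variance.

```python
import numpy as np

def max_t_multiplier_bootstrap(X, alpha=0.05, n_boot=1000, seed=0):
    """Single-step max-|t| selection with a Gaussian multiplier bootstrap cutoff."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)  # assumed strictly positive for every column
    t = np.sqrt(n) * mean / sd
    # Centered, studentized rows keep the cross-concept dependence intact.
    S = (X - mean) / sd
    e = rng.standard_normal((n_boot, n))  # one Gaussian multiplier per observation
    boot_max = np.abs(e @ S / np.sqrt(n)).max(axis=1)
    crit = np.quantile(boot_max, 1 - alpha)
    return t, crit, np.abs(t) > crit

rng = np.random.default_rng(2)
n, p = 300, 200
factor = rng.standard_normal((n, 1))
X = 0.7 * factor + rng.standard_normal((n, p))  # shared factor correlates all concepts
X[:, :3] += 0.5                                 # plant effects on concepts 0, 1, 2
t, crit, selected = max_t_multiplier_bootstrap(X)
print("critical value:", round(float(crit), 2))
print("selected concepts:", np.nonzero(selected)[0])
```

Because the multipliers are applied to whole rows, the bootstrap cutoff adapts to correlation between concepts without modeling it. Whether that adaptation is valid for embedding-derived statistics is exactly what the referee asks the CLT conditions to establish.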

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. The points raised regarding the specification of dependence structures in our high-dimensional central limit theory and the need for explicit verification in the framework overview are well taken. We address each major comment below and will revise the manuscript accordingly to improve clarity without altering the core claims.

Point-by-point responses
  1. Referee: [Framework description and theoretical results] The selective inference step relies on new high-dimensional central limit theory results to guarantee validity for test statistics computed from concept embeddings. The abstract and framework description do not specify how the dependence structure induced by shared AI model parameters, nonlinear feature extraction, and sparsity patterns is incorporated into the CLT assumptions (e.g., weak dependence or covariance decay conditions). This is load-bearing for the post-selection coverage guarantees.

    Authors: We agree that the framework overview would benefit from explicit discussion of how AI-induced dependence is accommodated. The CLT theorems (Section 4) are derived under general conditions permitting weak dependence and covariance decay, which are satisfied by the sparse concept embeddings even when derived from models with shared parameters and nonlinear extraction; the sparsity limits effective dependence. We will revise Section 2 to state these conditions and their applicability to typical AI embeddings, thereby clarifying support for the coverage guarantees.

    Revision: yes

  2. Referee: [High-dimensional CLT results and selective inference algorithms] No derivation, simulation evidence, or explicit verification is provided in the abstract or framework overview showing that the CLT theorems apply under the embedding-induced dependence rather than pre-specified variables. Without this, the claim that the procedures produce valid selected discoveries cannot be assessed.

    Authors: The derivations appear in the appendix, and simulation evidence verifying the CLT under dependent embedding-like structures is in Section 5.2. We acknowledge that the framework overview lacks a direct summary of this applicability. In revision we will add a brief paragraph to Section 2 summarizing the relevant CLT conditions and noting that the simulations support validity for embedding-induced dependence, enabling readers to assess the selected discoveries.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; framework relies on external methods and independent new theory

Full rationale

The paper introduces a framework that maps unstructured data to concept embeddings via external AI interpretability literature, derives test statistics from those embeddings, and applies selective inference whose validity is supported by newly stated high-dimensional central limit theory results developed in the paper itself. No quoted equation or step reduces a claimed prediction or discovery to a fitted parameter or self-referential definition by construction. The new CLT results are presented as general theoretical contributions addressing high-dimensional dependence, not as tautological restatements of the target empirical discoveries or as load-bearing self-citations. The derivation chain therefore remains self-contained against external benchmarks from AI methods and the paper's own theory sections, with no reduction of the central claims to their inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the validity of newly invoked high-dimensional central limit theory results for the dependence structure created by AI embeddings, plus the assumption that the embeddings themselves are sufficiently sparse and interpretable for concept-by-concept testing. No explicit free parameters are named in the abstract, but the choice of embedding model and any regularization thresholds function as researcher choices that affect downstream discoveries.
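One way to see how such researcher choices propagate is to rerun selection across a grid of settings and compare the selected sets. The sketch below binarizes simulated activations at several thresholds, selects concepts with a Bonferroni cutoff, and reports Jaccard overlap with a baseline threshold. The thresholding rule, the Bonferroni selection, the helper names, and the simulated data are all hypothetical stand-ins, not the paper's pipeline.

```python
import numpy as np
from math import erfc, sqrt

def bonferroni_select(Z, d, alpha=0.05):
    """Indices of concepts whose two-sample |t| passes a Bonferroni cutoff."""
    d = np.asarray(d).astype(bool)
    z1, z0 = Z[d], Z[~d]
    se = np.sqrt(z1.var(axis=0, ddof=1) / len(z1) + z0.var(axis=0, ddof=1) / len(z0))
    diff = z1.mean(axis=0) - z0.mean(axis=0)
    t = np.divide(diff, se, out=np.zeros_like(diff), where=se > 0)
    pvals = np.array([erfc(abs(v) / sqrt(2)) for v in t])  # two-sided normal p-values
    return set(np.nonzero(pvals <= alpha / Z.shape[1])[0].tolist())

def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

rng = np.random.default_rng(3)
n, p = 1000, 100
d = rng.integers(0, 2, size=n)
A = rng.exponential(1.0, size=(n, p))   # hypothetical raw activations
A[:, :5] += 1.0 * d[:, None]            # plant effects on the first five concepts

baseline = bonferroni_select((A > 1.0).astype(float), d)
overlap = {}
for thr in (0.5, 1.0, 1.5, 2.0):
    sel = bonferroni_select((A > thr).astype(float), d)
    overlap[thr] = (sel, jaccard(sel, baseline))
    print(thr, sorted(sel), round(overlap[thr][1], 2))
```

An overlap near 1 across the grid suggests the discoveries are not an artifact of one threshold; a sharp drop points to a setting that deserves reporting.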

axioms (1)
  • Domain assumption: New results in high-dimensional central limit theory validate the selective inference algorithms when applied to statistics computed from concept embeddings.
    The abstract states that selective inference uses 'algorithms validated by new results in high-dimensional central limit theory'.
invented entities (1)
  • Concept embeddings (no independent evidence)
    Purpose: Map unstructured data to high-dimensional, sparse, interpretable vectors for hypothesis testing.
    Introduced as the bridge between raw unstructured data and testable concepts; independent evidence would be external validation that these embeddings recover human-interpretable features without supervision.




Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. In your own words: computationally identifying interpretable themes in free-text survey data

    cs.CY · 2026-03 · unverdicted · novelty 6.0

    A computational framework identifies more coherent themes in free-text survey data on race, gender, and sexual orientation than previous methods, with applications for survey design, explaining variation, and detectin...

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages · cited by 1 Pith paper · 7 internal anchors

  1. [1]

    The Voice of Monetary Policy

    Yuriy Gorodnichenko, Tho Pham, and Oleksandr Talavera. The Voice of Monetary Policy. American Economic Review, 113(2): 548-584, February 2023. ISSN 0002-8282. doi:10.1257/aer.20220129. URL https://pubs.aeaweb.org/doi/10.1257/aer.20220129

  2. [2]

    Persuading Investors: A Video-Based Study

    Allen Hu and Song Ma. Persuading Investors: A Video-Based Study. The Journal of Finance, 80(5): 2639-2688, October 2025. ISSN 0022-1082, 1540-6261. doi:10.1111/jofi.13471. URL https://onlinelibrary.wiley.com/doi/10.1111/jofi.13471

  3. [3]

    Understanding Economic Behavior Using Open-ended Survey Data

    Ingar Haaland, Christopher Roth, Stefanie Stantcheva, and Johannes Wohlfart. Understanding Economic Behavior Using Open-ended Survey Data. Technical Report w32421, National Bureau of Economic Research, Cambridge, MA, May 2024. URL http://www.nber.org/papers/w32421.pdf

  4. [4]

    Machine Learning as a Tool for Hypothesis Generation

    Jens Ludwig and Sendhil Mullainathan. Machine Learning as a Tool for Hypothesis Generation. The Quarterly Journal of Economics, 139(2): 751-827, March 2024. ISSN 0033-5533, 1531-4650. doi:10.1093/qje/qjad055. URL https://academic.oup.com/qje/article/139/2/751/7515309

  5. [5]

    State of the Art: Economic Development Through the Lens of Paintings

    Clément Gorin, Stephan Heblich, and Yanos Zylberberg. State of the Art: Economic Development Through the Lens of Paintings. Technical Report w33976, National Bureau of Economic Research, Cambridge, MA, June 2025. URL http://www.nber.org/papers/w33976.pdf

  6. [6]

    American Life Histories

    David Lagakos, Stelios Michalopoulos, and Hans-Joachim Voth. American Life Histories. Technical Report w33373, National Bureau of Economic Research, Cambridge, MA, January 2025. URL http://www.nber.org/papers/w33373.pdf

  7. [7]

    Creating Moves to Opportunity: Experimental Evidence on Barriers to Neighborhood Choice

    Peter Bergman, Raj Chetty, Stefanie DeLuca, Nathaniel Hendren, Lawrence F. Katz, and Christopher Palmer. Creating Moves to Opportunity: Experimental Evidence on Barriers to Neighborhood Choice. American Economic Review, 114(5): 1281-1337, May 2024. ISSN 0002-8282. doi:10.1257/aer.20200407. URL https://pubs.aeaweb.org/doi/10.1257/aer.20200407

  8. [8]

    The Impact of Unconditional Cash Transfers on Parenting and Children

    Patrick Krause, Elizabeth Rhodes, Sarah Miller, Alexander Bartik, David Broockman, and Eva Vivalt. The Impact of Unconditional Cash Transfers on Parenting and Children. Technical Report w34040, National Bureau of Economic Research, Cambridge, MA, July 2025. URL http://www.nber.org/papers/w34040.pdf

  9. [9]

    Prediction-powered inference

    Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, and Tijana Zrnic. Prediction-powered inference. Science, 382(6671): 669-674, November 2023. doi:10.1126/science.adi6000. URL https://www.science.org/doi/10.1126/science.adi6000. Publisher: American Association for the Advancement of Science

  10. [10]

    Large Language Models: An Applied Econometric Framework

    Jens Ludwig, Sendhil Mullainathan, and Ashesh Rambachan. Large Language Models: An Applied Econometric Framework, December 2024. URL http://arxiv.org/abs/2412.07031. arXiv:2412.07031 [econ]

  11. [11]

    A Unifying Framework for Robust and Efficient Inference with Unstructured Data

    Jacob Carlson and Melissa Dell. A Unifying Framework for Robust and Efficient Inference with Unstructured Data, 2025. URL https://arxiv.org/abs/2505.00282. Version Number: 2

  12. [12]

    Justifying Dissent

    Leonardo Bursztyn, Georgy Egorov, Ingar Haaland, Aakaash Rao, and Christopher Roth. Justifying Dissent. The Quarterly Journal of Economics, 138(3): 1403-1451, June 2023. ISSN 0033-5533, 1531-4650. doi:10.1093/qje/qjad007. URL https://academic.oup.com/qje/article/138/3/1403/7000850

  13. [13]

    Why Do We Dislike Inflation?

    Stefanie Stantcheva. Why Do We Dislike Inflation? Technical Report w32300, National Bureau of Economic Research, Cambridge, MA, April 2024. URL http://www.nber.org/papers/w32300.pdf

  14. [14]

    Towards monosemanticity: Decomposing language models with dictionary learning

    Trenton Bricken, Adly Templeton, Joshua Batson, Brian Chen, Adam Jermyn, Tom Conerly, Nick Turner, Cem Anil, Carson Denison, Amanda Askell, Robert Lasenby, Yifan Wu, Shauna Kravec, Nicholas Schiefer, Tim Maxwell, Nicholas Joseph, Zac Hatfield-Dodds, Alex Tamkin, Karina Nguyen, Brayden McLean, Josiah E Burke, Tristan Hume, Shan Carter, Tom Henighan, and Ch...

  15. [15]

    Control of generalized error rates in multiple testing

    Joseph P. Romano and Michael Wolf. Control of generalized error rates in multiple testing. The Annals of Statistics, 35(4), August 2007. ISSN 0090-5364. doi:10.1214/009053606000001622. URL https://projecteuclid.org/journals/annals-of-statistics/volume-35/issue-4/Control-of-generalized-error-rates-in-multiple-testing/10.1214/009053606000001622.full

  16. [16]

    Towards universality: Studying mechanistic similarity across language model architectures

    Junxuan Wang, Xuyang Ge, Wentao Shu, Qiong Tang, Yunhua Zhou, Zhengfu He, and Xipeng Qiu. Towards universality: Studying mechanistic similarity across language model architectures. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=2J18i8T0oI

  17. [17]

    Machine Learning: An Applied Econometric Approach

    Sendhil Mullainathan and Jann Spiess. Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2): 87-106, May 2017. ISSN 0895-3309. doi:10.1257/jep.31.2.87. URL https://pubs.aeaweb.org/doi/10.1257/jep.31.2.87

  18. [18]

    Text as Data

    Matthew Gentzkow, Bryan Kelly, and Matt Taddy. Text as Data. Journal of Economic Literature, 57(3): 535-574, September 2019a. ISSN 0022-0515. doi:10.1257/jel.20181020. URL https://pubs.aeaweb.org/doi/10.1257/jel.20181020

  19. [19]

    Text Algorithms in Economics

    Elliott Ash and Stephen Hansen. Text Algorithms in Economics. Annual Review of Economics, 15(1): 659-688, September 2023. ISSN 1941-1383, 1941-1391. doi:10.1146/annurev-economics-082222-074352. URL https://www.annualreviews.org/content/journals/10.1146/annurev-economics-082222-074352

  20. [20]

    Deep Learning for Economists

    Melissa Dell. Deep Learning for Economists. Journal of Economic Literature, 63(1): 5-58, March 2025. ISSN 0022-0515, 2328-8175. doi:10.1257/jel.20241733. URL https://pubs.aeaweb.org/doi/10.1257/jel.20241733

  21. [21]

    Program Evaluation with Remotely Sensed Outcomes

    Ashesh Rambachan, Rahul Singh, and Davide Viviano. Program Evaluation with Remotely Sensed Outcomes, 2024. URL https://arxiv.org/abs/2411.10959. Version Number: 2

  22. [22]

    Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech

    Matthew Gentzkow, Jesse M. Shapiro, and Matt Taddy. Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech. Econometrica, 87(4): 1307-1340, 2019b. ISSN 0012-9682. doi:10.3982/ECTA16566. URL https://www.econometricsociety.org/doi/10.3982/ECTA16566

  23. [23]

    Inference for Regression with Variables Generated by AI or Machine Learning

    Laura Battaglia, Timothy Christensen, Stephen Hansen, and Szymon Sacher. Inference for Regression with Variables Generated by AI or Machine Learning, 2024. URL https://arxiv.org/abs/2402.15585. Version Number: 5

  24. [24]

    Causal Inference on Outcomes Learned from Text

    Iman Modarressi, Jann Spiess, and Amar Venugopal. Causal Inference on Outcomes Learned from Text, 2025. URL https://arxiv.org/abs/2503.00725. Version Number: 1

  25. [25]

    Machine-Learning Tests for Effects on Multiple Outcomes

    Jens Ludwig, Sendhil Mullainathan, and Jann Spiess. Machine-Learning Tests for Effects on Multiple Outcomes, 2017. URL https://arxiv.org/abs/1707.01473. Version Number: 2

  26. [26]

    Hypothesis Testing in Econometrics

    Joseph P. Romano, Azeem M. Shaikh, and Michael Wolf. Hypothesis Testing in Econometrics. Annual Review of Economics, 2(1): 75-104, September 2010. ISSN 1941-1383, 1941-1391. doi:10.1146/annurev.economics.102308.124342. URL https://www.annualreviews.org/doi/10.1146/annurev.economics.102308.124342

  27. [27]

    Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

    Yoav Benjamini and Yosef Hochberg. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1): 289-300, 1995. ISSN 00359246. URL http://www.jstor.org/stable/2346101. Publisher: [Royal Statistical Society, Oxford University Press]

  28. [28]

    The Control of the False Discovery Rate in Multiple Testing under Dependency

    Yoav Benjamini and Daniel Yekutieli. The Control of the False Discovery Rate in Multiple Testing under Dependency. The Annals of Statistics, 29(4): 1165-1188, 2001. ISSN 00905364, 21688966. URL http://www.jstor.org/stable/2674075. Publisher: Institute of Mathematical Statistics

  29. [29]

    Improved central limit theorem and bootstrap approximations in high dimensions

    Victor Chernozhukov, Denis Chetverikov, Kengo Kato, and Yuta Koike. Improved central limit theorem and bootstrap approximations in high dimensions. The Annals of Statistics, 50(5), October 2022. ISSN 0090-5364. doi:10.1214/22-AOS2193. URL https://projecteuclid.org/journals/annals-of-statistics/volume-50/issue-5/Improved-central-limit-theorem-and-boots...

  30. [30]

    Gaussian Multiplier Bootstrap Procedure for the k-th Largest Coordinate of High-Dimensional Statistics

    Yixi Ding, Qizhai Li, Yuke Shi, and Liuquan Sun. Gaussian Multiplier Bootstrap Procedure for the k-th Largest Coordinate of High-Dimensional Statistics, 2025. URL https://arxiv.org/abs/2508.14400. Version Number: 1

  31. [31]

    High-Dimensional Econometrics and Regularized GMM

    Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Christian Hansen, and Kengo Kato. High-Dimensional Econometrics and Regularized GMM, 2018. URL https://arxiv.org/abs/1806.01888. Version Number: 2

  32. [32]

    Phase transition and regularized bootstrap in large-scale t-tests with false discovery rate control

    Weidong Liu and Qi-Man Shao. Phase transition and regularized bootstrap in large-scale t-tests with false discovery rate control. The Annals of Statistics, 42(5), October 2014. ISSN 0090-5364. doi:10.1214/14-AOS1249. URL https://projecteuclid.org/journals/annals-of-statistics/volume-42/issue-5/Phase-transition-and-regularized-bootstrap-in-large-sca...

  33. [33]

    Sparse Autoencoders for Hypothesis Generation

    Rajiv Movva, Kenny Peng, Nikhil Garg, Jon Kleinberg, and Emma Pierson. Sparse Autoencoders for Hypothesis Generation, 2025. URL https://arxiv.org/abs/2502.04382. Version Number: 3

  34. [34]

    Towards A Rigorous Science of Interpretable Machine Learning

    Finale Doshi-Velez and Been Kim. Towards A Rigorous Science of Interpretable Machine Learning, 2017. URL https://arxiv.org/abs/1702.08608. Version Number: 2

  35. [35]

    Scaling monosema...

    Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Trenton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L Turner, Callum McDougall, Monte MacDiarmid, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermyn, Shan Carter, Chris Olah, and Tom Henighan. Scaling monosema...

  36. [36]

    Sparse Autoencoders Do Not Find Canonical Units of Analysis

    Patrick Leask, Bart Bussmann, Michael Pearce, Joseph Bloom, Curt Tigges, Noura Al Moubayed, Lee Sharkey, and Neel Nanda. Sparse Autoencoders Do Not Find Canonical Units of Analysis, February 2025. URL http://arxiv.org/abs/2502.04878. arXiv:2502.04878 [cs]

  37. [37]

    Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts

    Kenny Peng, Rajiv Movva, Jon Kleinberg, Emma Pierson, and Nikhil Garg. Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts, 2025. URL https://arxiv.org/abs/2506.23845. Version Number: 1

  38. [38]

    Transcoders Beat Sparse Autoencoders for Interpretability

    Gonçalo Paulo, Stepan Shabalin, and Nora Belrose. Transcoders Beat Sparse Autoencoders for Interpretability, 2025. URL https://arxiv.org/abs/2501.18823. Version Number: 2

  39. [39]

    Language models can explain neurons in language models

    Steven Bills, Nick Cammarata, Dan Mossing, Henk Tillman, Leo Gao, Gabriel Goh, Ilya Sutskever, Jan Leike, Jeff Wu, and William Saunders. Language models can explain neurons in language models. https://openaipublic.blob.core.windows.net/neuron-explainer/paper/index.html, 2023

  40. [40]

    A Multimodal Automated Interpretability Agent

    Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang, Achyuta Rajaram, Evan Hernandez, Jacob Andreas, and Antonio Torralba. A Multimodal Automated Interpretability Agent, 2024. URL https://arxiv.org/abs/2404.14394. Version Number: 2

  41. [41]

    Automatically Interpreting Millions of Features in Large Language Models

    Gonçalo Paulo, Alex Mallen, Caden Juang, and Nora Belrose. Automatically Interpreting Millions of Features in Large Language Models, 2024. URL https://arxiv.org/abs/2410.13928. Version Number: 3

  42. [42]

    Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

    Senthooran Rajamanoharan, Tom Lieberum, Nicolas Sonnerat, Arthur Conmy, Vikrant Varma, János Kramár, and Neel Nanda. Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders, 2024. URL https://arxiv.org/abs/2407.14435. Version Number: 3

  43. [43]

    An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

    Ahmed Abdulaal, Hugo Fry, Nina Montaña-Brown, Ayodeji Ijishakin, Jack Gao, Stephanie Hyland, Daniel C. Alexander, and Daniel C. Castro. An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation, 2024. URL https://arxiv.org/abs/2410.03334. Version Number: 1

  44. [44]

    Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

    Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio P. Calmon, and Himabindu Lakkaraju. Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE), 2024. URL https://arxiv.org/abs/2402.10376. Version Number: 2

  45. [45]

    Towards multimodal interpretability: Learning sparse interpretable features in vision transformers

    Hugo Fry. Towards multimodal interpretability: Learning sparse interpretable features in vision transformers, Apr 2024. URL https://www.lesswrong.com/posts/iYFuZo9BMvr6GgMs5/case-study-interpreting-manipulating-and-controlling-clip. Accessed: 2024-05-16

  46. [46]

    Case study: Interpreting, manipulating, and controlling clip with sparse autoencoders

    Gytis Daujotas. Case study: Interpreting, manipulating, and controlling clip with sparse autoencoders, Aug 2024. URL https://www.lesswrong.com/posts/iYFuZo9BMvr6GgMs5/case-study-interpreting-manipulating-and-controlling-clip. Accessed: 2025-10-03

  47. [47]

    Daniel Pluth, Yu Zhou, and Vijay K. Gurbani. Sparse Autoencoder Insights on Voice Embeddings, 2025. URL https://arxiv.org/abs/2502.00127. Version Number: 1

  48. [48]

    Central limit theorems and bootstrap in high dimensions

    Victor Chernozhukov, Denis Chetverikov, and Kengo Kato. Central limit theorems and bootstrap in high dimensions. The Annals of Probability, 45(4), July 2017. ISSN 0091-1798. doi:10.1214/16-AOP1113. URL https://projecteuclid.org/journals/annals-of-probability/volume-45/issue-4/Central-limit-theorems-and-bootstrap-in-high-dimensions/10.1214/16-AOP1113.full

  49. [49]

    High-Dimensional Data Bootstrap

    Victor Chernozhukov, Denis Chetverikov, Kengo Kato, and Yuta Koike. High-Dimensional Data Bootstrap. Annual Review of Statistics and Its Application, 10(1): 427-449, March 2023. ISSN 2326-8298, 2326-831X. doi:10.1146/annurev-statistics-040120-022239. URL https://www.annualreviews.org/doi/10.1146/annurev-statistics-040120-022239

  50. [50]

    Inference on Winners

    Isaiah Andrews, Toru Kitagawa, and Adam McCloskey. Inference on Winners. The Quarterly Journal of Economics, 139(1): 305-358, January 2024. ISSN 0033-5533, 1531-4650. doi:10.1093/qje/qjad043. URL https://academic.oup.com/qje/article/139/1/305/7276491

  51. [51]

    Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations

    Evan Miller. Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations, 2024. URL https://arxiv.org/abs/2411.00640. Version Number: 1

  52. [52]

    Matthew Gentzkow and Jesse M. Shapiro. What Drives Media Slant? Evidence From U.S. Daily Newspapers. Econometrica, 78(1): 35-71, January 2010. ISSN 0012-9682. doi:10.3982/ECTA7195. URL https://doi.org/10.3982/ECTA7195. Publisher: John Wiley & Sons, Ltd

  53. [53]

    Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

    Tom Lieberum, Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Nicolas Sonnerat, Vikrant Varma, János Kramár, Anca Dragan, Rohin Shah, and Neel Nanda. Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2, 2024. URL https://arxiv.org/abs/2408.05147. Version Number: 2

  54. [54]

    gpt-oss-120b & gpt-oss-20b Model Card

    OpenAI. gpt-oss-120b & gpt-oss-20b model card, 2025. URL https://arxiv.org/abs/2508.10925

  55. [55]

    Eliciting People’s First-Order Concerns: Text Analysis of Open-Ended Survey Questions

    Beatrice Ferrario and Stefanie Stantcheva. Eliciting People’s First-Order Concerns: Text Analysis of Open-Ended Survey Questions. AEA Papers and Proceedings, 112: 163-169, May 2022. ISSN 2574-0768, 2574-0776. doi:10.1257/pandp.20221071. URL https://pubs.aeaweb.org/doi/10.1257/pandp.20221071

  56. [56]

    High-Dimensional Probability: An Introduction with Applications in Data Science

    Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press, 1 edition, September 2018. ISBN 978-1-108-23159-6 978-1-108-41519-4. doi:10.1017/9781108231596. URL https://www.cambridge.org/core/product/identifier/9781108231596/type/book

  57. [57]

    Moving beyond sub-Gaussianity in high-dimensional statistics: applications in covariance estimation and linear regression

    Arun Kumar Kuchibhotla and Abhishek Chakrabortty. Moving beyond sub-Gaussianity in high-dimensional statistics: applications in covariance estimation and linear regression. Information and Inference: A Journal of the IMA, 11(4): 1389-1456, December 2022. ISSN 2049-8772. doi:10.1093/imaiai/iaac012. URL https://academic.oup.com/imaiai/article/11/4/13...