Pith Number

pith:QXWMAQZU

pith:2024:QXWMAQZURQIIK5MLWJWKSA4UXL

not attested · not anchored · not stored · refs resolved

Improving Dictionary Learning with Gated Sparse Autoencoders

Arthur Conmy, János Kramár, Lewis Smith, Neel Nanda, Rohin Shah, Senthooran Rajamanoharan, Tom Lieberum, Vikrant Varma

Gated Sparse Autoencoders separate feature selection from magnitude estimation to eliminate L1-induced shrinkage in language model dictionary learning.

arxiv:2404.16014 v2 · 2024-04-24 · cs.LG · cs.AI

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
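The merge algorithm itself is not specified on this page. As a rough illustration of what "deterministic" means here, the sketch below folds events into a state in an order derived only from the events' own content, so any mirror holding the same events reaches the same result. The event fields (`timestamp`, `field`, `value`) and the last-writer-wins rule are hypothetical and are not taken from the Pith schema.

```python
import hashlib
import json


def event_key(event: dict) -> str:
    """Content hash of an event, used as a tie-breaker and for de-duplication."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def merge(record: dict, events: list) -> dict:
    """Fold events into a copy of the record in a content-derived order.

    Hypothetical rule: later timestamps win; ties are broken by content hash.
    """
    state = dict(record)
    seen = set()
    for event in sorted(events, key=lambda e: (e["timestamp"], event_key(e))):
        key = event_key(event)
        if key in seen:  # the same event received twice is applied once
            continue
        seen.add(key)
        state[event["field"]] = event["value"]
    return state


# Hypothetical data: the same two events delivered in different orders,
# once with duplicates.
record = {"id": "example"}
events = [
    {"timestamp": "2026-01-01T00:00:00Z", "field": "status", "value": "draft"},
    {"timestamp": "2026-01-02T00:00:00Z", "field": "status", "value": "final"},
]
state_a = merge(record, events)
state_b = merge(record, list(reversed(events)) + events)
print(state_a == state_b)
```

Because the sort key depends only on event content, delivery order and repeated delivery do not affect the merged state.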

Claims

C1 strongest claim

Through training SAEs on LMs of up to 7B parameters we find that, in typical hyper-parameter ranges, Gated SAEs solve shrinkage, are similarly interpretable, and require half as many firing features to achieve comparable reconstruction fidelity.
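Shrinkage can be seen in a one-dimensional toy case. If a single non-negative activation a is chosen to minimise (x - a)^2 + λ·a, the optimum is a = max(x - λ/2, 0), which underestimates x whenever the feature fires. The sketch below contrasts that with an estimate whose magnitude is left unpenalised once a separate on/off decision has been made. It illustrates the general effect only and is not the paper's experimental setup; the function names and the choice of gate threshold are assumptions.

```python
def l1_penalised_activation(x: float, l1_coeff: float) -> float:
    """Minimiser over a >= 0 of (x - a)**2 + l1_coeff * a (soft thresholding)."""
    return max(x - l1_coeff / 2.0, 0.0)


def gated_activation(x: float, l1_coeff: float) -> float:
    """Same on/off decision as above, but the magnitude is not penalised."""
    is_active = x > l1_coeff / 2.0
    return x if is_active else 0.0


true_value = 2.0
l1_coeff = 1.0
shrunk = l1_penalised_activation(true_value, l1_coeff)  # biased towards zero
unshrunk = gated_activation(true_value, l1_coeff)       # recovers the input
print(shrunk, unshrunk)
```

With these values the penalised estimate is 1.5 while the gated estimate stays at 2.0; both are zero for inputs below the threshold.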

C2 weakest assumption

That restricting the L1 penalty to the gating branch does not introduce new biases or degrade feature quality in dimensions not measured by the reported reconstruction and interpretability metrics.

C3 one line summary

Gated SAEs decouple which features to use from how large their activations should be, applying the L1 penalty only to selection and thereby eliminating shrinkage while halving the number of firing features needed for good fidelity.
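The summary above does not give the equations. The sketch below is one plausible reading of it: a gate path makes a binary decision per feature, a magnitude path estimates the activation size with a ReLU, and the L1 term is computed from the gate pre-activations only. Parameter names, shapes, and the omission of any auxiliary training terms are assumptions of this sketch, not the authors' implementation.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Plain matrix-vector product for small examples."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def relu(vector: Vector) -> Vector:
    return [max(0.0, v) for v in vector]


def gated_encode(x: Vector, w_gate: Matrix, b_gate: Vector,
                 w_mag: Matrix, b_mag: Vector, b_dec: Vector) -> Tuple[Vector, Vector]:
    """Return (feature activations, gate pre-activations)."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pi_gate = [p + b for p, b in zip(matvec(w_gate, centered), b_gate)]
    gate = [1.0 if p > 0 else 0.0 for p in pi_gate]                    # which features fire
    mag = relu([m + b for m, b in zip(matvec(w_mag, centered), b_mag)])  # how large they are
    return [g * m for g, m in zip(gate, mag)], pi_gate


def decode(acts: Vector, w_dec: Matrix, b_dec: Vector) -> Vector:
    """w_dec has shape (input_dim, num_features)."""
    return [r + b for r, b in zip(matvec(w_dec, acts), b_dec)]


def loss_terms(x: Vector, x_hat: Vector, pi_gate: Vector, l1_coeff: float) -> Tuple[float, float]:
    """Reconstruction error, plus an L1 penalty applied to the gate path only."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = l1_coeff * sum(relu(pi_gate))
    return recon, sparsity


# Toy example with identity weights and zero biases (hypothetical values).
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
x = [3.0, -1.0]
acts, pi_gate = gated_encode(x, identity, zeros, identity, zeros, zeros)
x_hat = decode(acts, identity, zeros)
recon, sparsity = loss_terms(x, x_hat, pi_gate, l1_coeff=0.1)
print(acts, x_hat, recon, sparsity)
```

Here the first feature fires at its full magnitude of 3.0, and the penalty depends only on `pi_gate`, so the magnitude path is never pushed towards zero by the sparsity term.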

References

255 extracted · 255 resolved · 5 Pith anchors

[1] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11): 4311-4322, 2006. doi:10.1109/tsp.2006.881199

[2] Introducing the next generation of Claude, 2024

[3] J. Batson, B. Chen, A. Jones, A. Templeton, T. Conerly, J. Marcus, T. Henighan, N. L. Turner, and A. Pearce. Circuits Updates - March 2024. Transformer Circuits Thread, 2024. URL https://transformer- 2024

[4] Y. Bengio. Deep learning of representations: Looking forward, 2013

[5] S. Biderman, H. Schoelkopf, Q. G. Anthony, H. Bradley, K. O’Brien, E. Hallahan, M. A. Khan, S. Purohit, U. S. Prashanth, E. Raff, et al. Pythia: A suite for analyzing large language models across trai 2023

Formal links

2 machine-checked theorem links

Cited by

18 papers in Pith

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

WriteSAE: Sparse Autoencoders for Recurrent State

Sparse Autoencoders as a Steering Basis for Phase Synchronization in Graph-Based CFD Surrogates

Receipt and verification

First computed	2026-05-17T23:38:13.270899Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

85ecc043348c1085758bb26ca90394baf62ecde2b4a65f0faa10053019f8335c

Aliases

arxiv: 2404.16014 · arxiv_version: 2404.16014v2 · doi: 10.48550/arxiv.2404.16014 · pith_short_12: QXWMAQZURQII · pith_short_16: QXWMAQZURQIIK5ML · pith_short_8: QXWMAQZU
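The identifiers on this page are consistent with a simple derivation: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the `pith_short_*` aliases are shorter prefixes of the same string. The page does not document this rule, so treat the sketch below as an observation about these particular values rather than a specification; `base32_prefix` is a hypothetical helper name.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "85ecc043348c1085758bb26ca90394baf62ecde2b4a65f0faa10053019f8335c"


def base32_prefix(hex_digest: str, length: int) -> str:
    """First `length` characters of the Base32 encoding of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full_id = base32_prefix(CANONICAL_HASH, 26)
aliases = {n: base32_prefix(CANONICAL_HASH, n) for n in (8, 12, 16)}
print(full_id)
print(aliases)
```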

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/QXWMAQZURQIIK5MLWJWKSA4UXL \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 85ecc043348c1085758bb26ca90394baf62ecde2b4a65f0faa10053019f8335c

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "a5a5978bca297540afaf137cdc1c11e59dd3aa7ff92132d2dba627675ae9dca9",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-04-24T17:47:22Z",
    "title_canon_sha256": "de78f0873097f3b9f45e65322afe73347a1488e485186984f4ef162891cec806"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2404.16014",
    "kind": "arxiv",
    "version": 2
  }
}
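
The same canonicalisation can be applied offline to the record shown above. The sketch below mirrors the inline Python from the verification command (sorted keys, compact separators, UTF-8, SHA-256). Whether the result equals the published canonical hash depends on the served record being identical to the JSON displayed here, which this sketch assumes and does not check against the network; `canonical_sha256` is a hypothetical helper name.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "a5a5978bca297540afaf137cdc1c11e59dd3aa7ff92132d2dba627675ae9dca9",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-04-24T17:47:22Z",
    "title_canon_sha256": "de78f0873097f3b9f45e65322afe73347a1488e485186984f4ef162891cec806"
  },
  "schema_version": "1.0",
  "source": {"id": "2404.16014", "kind": "arxiv", "version": 2}
}
"""


def canonical_sha256(record: dict) -> str:
    """SHA-256 over JSON with sorted keys and no insignificant whitespace."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this page
```

Sorting keys and fixing the separators makes the digest independent of how the JSON was formatted or ordered when it was served.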