Diab, Virginia Smith, and Kun Zhang

2 Pith papers cite this work. Polarity classification is still indexing.

Fields: cs.LG (2)

Years: 2026 (2)

representative citing papers

Are Sparse Autoencoder Benchmarks Reliable?

cs.LG · 2026-05-18 · unverdicted · novelty 6.0

An audit of SAEBench reveals that Targeted Probe Perturbation and Spurious Correlation Removal metrics fail reliability tests and should not be used to evaluate sparse autoencoders.
