Martin, Tongsu (Serena) Peng, and Michael W

Charles H · 2021 · DOI 10.1038/s41467-021-24025-8

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

open at publisher browse 6 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

From Mechanistic to Compositional Interpretability

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

The paper introduces compositional interpretability as a category-theoretic framework that casts mechanistic explanations as commuting syntactic-semantic mappings optimized under faithfulness and complexity constraints derived from minimum description length.

SMA-DP: Spectral Memory-Aware Differential Privacy for Deep Learning

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

SMA-DP-SGD augments DP-SGD with a spectral memory-aware fractional branch from prior privatized updates to improve accuracy on CIFAR and MNIST while preserving conditional differential privacy.

Weight Decay Regimes in Grokking Transformers: Cheap Online Diagnostics

cs.LG · 2026-05-19 · conditional · novelty 6.0

Weight decay controls distinct learning regimes in grokking transformers on modular arithmetic, tracked by new cheap attention-based diagnostics with empirical critical value and exponent fits.

Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory

cs.LG · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

Random Matrix Theory detects overfitting via growing Correlation Traps in weight spectra during the anti-grokking phase of neural network training.

Pruning Deep Neural Networks via the Marchenko--Pastur Distribution

cs.LG · 2026-05-23 · unverdicted · novelty 5.0

Marchenko-Pastur random-matrix pruning of DNNs yields theoretical certificates for accuracy preservation under small fine-tuning and empirical ImageNet results with 50-60% MAC reduction and sub-2pp accuracy drops on ViT and CNN models.

When Does Removing LayerNorm Help? Activation Bounding as a Regime-Dependent Implicit Regularizer

cs.LG · 2026-04-25 · unverdicted · novelty 5.0

DyT improves validation loss 27% at 64M params/1M tokens but worsens it 19% at 118M tokens, with saturation levels predicting the sign of the effect.

citing papers explorer

Showing 5 of 5 citing papers after filters.

From Mechanistic to Compositional Interpretability cs.LG · 2026-05-09 · unverdicted · none · ref 50
The paper introduces compositional interpretability as a category-theoretic framework that casts mechanistic explanations as commuting syntactic-semantic mappings optimized under faithfulness and complexity constraints derived from minimum description length.
SMA-DP: Spectral Memory-Aware Differential Privacy for Deep Learning cs.LG · 2026-05-19 · unverdicted · none · ref 10
SMA-DP-SGD augments DP-SGD with a spectral memory-aware fractional branch from prior privatized updates to improve accuracy on CIFAR and MNIST while preserving conditional differential privacy.
Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory cs.LG · 2026-05-12 · unverdicted · none · ref 16 · 2 links
Random Matrix Theory detects overfitting via growing Correlation Traps in weight spectra during the anti-grokking phase of neural network training.
Pruning Deep Neural Networks via the Marchenko--Pastur Distribution cs.LG · 2026-05-23 · unverdicted · none · ref 19
Marchenko-Pastur random-matrix pruning of DNNs yields theoretical certificates for accuracy preservation under small fine-tuning and empirical ImageNet results with 50-60% MAC reduction and sub-2pp accuracy drops on ViT and CNN models.
When Does Removing LayerNorm Help? Activation Bounding as a Regime-Dependent Implicit Regularizer cs.LG · 2026-04-25 · unverdicted · none · ref 19
DyT improves validation loss 27% at 64M params/1M tokens but worsens it 19% at 118M tokens, with saturation levels predicting the sign of the effect.

Martin, Tongsu (Serena) Peng, and Michael W

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer