hub

BERT Rediscovers the Classical NLP Pipeline , publisher =

BERT rediscovers the classical NLP pipeline , author= · 1905 · arXiv 1905.05950

19 Pith papers cite this work. Polarity classification is still indexing.

19 Pith papers citing it

read on arXiv browse 19 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

cs.LG · 2022-11-01 · conditional · novelty 8.0

GPT-2 small solves indirect object identification via a circuit of 26 attention heads organized into seven functional classes discovered through causal interventions.

Mechanistic Interpretability of ASR models using Sparse Autoencoders

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

Sparse autoencoders applied to Whisper ASR reveal monosemantic features across linguistic boundaries and demonstrate cross-lingual feature steering.

Latent Space Probing for Adult Content Detection in Video Generative Models

cs.CV · 2026-04-25 · unverdicted · novelty 7.0

Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.

Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment

cs.CL · 2026-04-12 · unverdicted · novelty 7.0

Lesioning a shared core in multilingual LLMs drops whole-brain fMRI encoding correlation by 60.32%, while language-specific lesions selectively weaken predictions only for the matched native language.

The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior

cs.LG · 2026-03-30 · unverdicted · novelty 7.0

The grokking delay in encoder-decoder models on one-step Collatz prediction stems from decoder inability to use early-learned encoder representations of parity and residue structure, with numeral base acting as a strong inductive bias that can raise accuracy from failure to 99.8%.

Neuroprobe: Evaluating Intracranial Brain Responses to Naturalistic Stimuli

cs.LG · 2025-09-25 · accept · novelty 7.0

Neuroprobe is a new suite of decoding tasks on the BrainTreebank iEEG dataset for evaluating multi-modal language processing in the brain during naturalistic movie viewing.

Massive Activations in Large Language Models

cs.CL · 2024-02-27 · unverdicted · novelty 7.0

Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.

OPT: Open Pre-trained Transformer Language Models

cs.CL · 2022-05-02 · unverdicted · novelty 7.0

OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.

Ensemble Monitoring for AI Control: Diverse Signals Outweigh More Compute

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

Diverse ensembles of prompted and fine-tuned GPT-4.1-Mini monitors achieve 2.4x better detection of flawed code solutions than homogeneous ensembles on adversarial inputs.

Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

A composite Collapse Index based on incremental discrete Morse homology provides low-latency early warning of representational collapse during neural network training.

LLM Safety From Within: Detecting Harmful Content with Internal Representations

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

SIREN identifies safety neurons via linear probing on internal LLM layers and combines them with adaptive weighting to detect harm, outperforming prior guard models with 250x fewer parameters.

AIM: Asymmetric Information Masking for Visual Question Answering Continual Learning

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

AIM applies modality-specific masks to balance stability and plasticity in asymmetric VLMs, achieving SOTA average performance and reduced forgetting on continual VQA v2 and GQA while preserving generalization to novel compositions.

Geometric Routing Enables Causal Expert Control in Mixture of Experts

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

Cosine-similarity routing in low-dimensional space makes MoE experts monosemantic by construction and enables direct causal control via centroid interventions.

A Layer-wise Analysis of Supervised Fine-Tuning

cs.LG · 2026-04-12 · unverdicted · novelty 6.0

Middle layers (20-80%) remain stable during SFT while final layers are sensitive, enabling Mid-Block Efficient Tuning that outperforms LoRA by up to 10.2% on GSM8K with reduced parameter count.

How Do Language Models Compose Functions?

cs.CL · 2025-10-02 · conditional · novelty 6.0

LLMs solve compositional factual recall either by computing intermediates or directly, with mechanism choice correlated to translation geometry in embedding spaces.

AIM-CoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning

cs.CV · 2025-09-30 · unverdicted · novelty 6.0

AIM-CoT enhances interleaved multimodal chain-of-thought reasoning by adding context-enhanced attention generation, active visual probing via information foraging, and dynamic attention-shift triggering.

VisualBERT: A Simple and Performant Baseline for Vision and Language

cs.CV · 2019-08-09 · conditional · novelty 6.0

VisualBERT is a Transformer model that implicitly aligns text and image regions through self-attention and achieves competitive or superior results on VQA, VCR, NLVR2, and Flickr30K after pre-training on captions.

TAPIOCA: Why Task- Aware Pruning Improves OOD model Capability

cs.LG · 2026-05-14 · unverdicted · novelty 5.0

Task-aware pruning improves OOD performance by removing layers that distort task-adapted representation profiles, realigning OOD inputs with the geometry observed on ID data.

HyperLens: Quantifying Cognitive Effort in LLMs with Fine-grained Confidence Trajectory

cs.AI · 2026-05-07 · unverdicted · novelty 5.0

HyperLens reveals that deeper transformer layers magnify small confidence changes into fine-grained trajectories, allowing quantification of cognitive effort where complex tasks demand more and standard SFT can reduce it.

citing papers explorer

Showing 19 of 19 citing papers.

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small cs.LG · 2022-11-01 · conditional · none · ref 25
GPT-2 small solves indirect object identification via a circuit of 26 attention heads organized into seven functional classes discovered through causal interventions.
Mechanistic Interpretability of ASR models using Sparse Autoencoders cs.CL · 2026-05-12 · unverdicted · none · ref 16
Sparse autoencoders applied to Whisper ASR reveal monosemantic features across linguistic boundaries and demonstrate cross-lingual feature steering.
Latent Space Probing for Adult Content Detection in Video Generative Models cs.CV · 2026-04-25 · unverdicted · none · ref 34
Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.
Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment cs.CL · 2026-04-12 · unverdicted · none · ref 41
Lesioning a shared core in multilingual LLMs drops whole-brain fMRI encoding correlation by 60.32%, while language-specific lesions selectively weaken predictions only for the matched native language.
The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior cs.LG · 2026-03-30 · unverdicted · none · ref 27
The grokking delay in encoder-decoder models on one-step Collatz prediction stems from decoder inability to use early-learned encoder representations of parity and residue structure, with numeral base acting as a strong inductive bias that can raise accuracy from failure to 99.8%.
Neuroprobe: Evaluating Intracranial Brain Responses to Naturalistic Stimuli cs.LG · 2025-09-25 · accept · none · ref 21
Neuroprobe is a new suite of decoding tasks on the BrainTreebank iEEG dataset for evaluating multi-modal language processing in the brain during naturalistic movie viewing.
Massive Activations in Large Language Models cs.CL · 2024-02-27 · unverdicted · none · ref 84
Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.
OPT: Open Pre-trained Transformer Language Models cs.CL · 2022-05-02 · unverdicted · none · ref 11
OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
Ensemble Monitoring for AI Control: Diverse Signals Outweigh More Compute cs.AI · 2026-05-14 · unverdicted · none · ref 15
Diverse ensembles of prompted and fine-tuned GPT-4.1-Mini monitors achieve 2.4x better detection of flawed code solutions than homogeneous ensembles on adversarial inputs.
Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index cs.LG · 2026-04-28 · unverdicted · none · ref 12
A composite Collapse Index based on incremental discrete Morse homology provides low-latency early warning of representational collapse during neural network training.
LLM Safety From Within: Detecting Harmful Content with Internal Representations cs.AI · 2026-04-20 · unverdicted · none · ref 21
SIREN identifies safety neurons via linear probing on internal LLM layers and combines them with adaptive weighting to detect harm, outperforming prior guard models with 250x fewer parameters.
AIM: Asymmetric Information Masking for Visual Question Answering Continual Learning cs.CV · 2026-04-16 · unverdicted · none · ref 51
AIM applies modality-specific masks to balance stability and plasticity in asymmetric VLMs, achieving SOTA average performance and reduced forgetting on continual VQA v2 and GQA while preserving generalization to novel compositions.
Geometric Routing Enables Causal Expert Control in Mixture of Experts cs.AI · 2026-04-15 · unverdicted · none · ref 9
Cosine-similarity routing in low-dimensional space makes MoE experts monosemantic by construction and enables direct causal control via centroid interventions.
A Layer-wise Analysis of Supervised Fine-Tuning cs.LG · 2026-04-12 · unverdicted · none · ref 5
Middle layers (20-80%) remain stable during SFT while final layers are sensitive, enabling Mid-Block Efficient Tuning that outperforms LoRA by up to 10.2% on GSM8K with reduced parameter count.
How Do Language Models Compose Functions? cs.CL · 2025-10-02 · conditional · none · ref 36
LLMs solve compositional factual recall either by computing intermediates or directly, with mechanism choice correlated to translation geometry in embedding spaces.
AIM-CoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning cs.CV · 2025-09-30 · unverdicted · none · ref 12
AIM-CoT enhances interleaved multimodal chain-of-thought reasoning by adding context-enhanced attention generation, active visual probing via information foraging, and dynamic attention-shift triggering.
VisualBERT: A Simple and Performant Baseline for Vision and Language cs.CV · 2019-08-09 · conditional · none · ref 96
VisualBERT is a Transformer model that implicitly aligns text and image regions through self-attention and achieves competitive or superior results on VQA, VCR, NLVR2, and Flickr30K after pre-training on captions.
TAPIOCA: Why Task- Aware Pruning Improves OOD model Capability cs.LG · 2026-05-14 · unverdicted · none · ref 21
Task-aware pruning improves OOD performance by removing layers that distort task-adapted representation profiles, realigning OOD inputs with the geometry observed on ID data.
HyperLens: Quantifying Cognitive Effort in LLMs with Fine-grained Confidence Trajectory cs.AI · 2026-05-07 · unverdicted · none · ref 11
HyperLens reveals that deeper transformer layers magnify small confidence changes into fine-grained trajectories, allowing quantification of cognitive effort where complex tasks demand more and standard SFT can reduce it.

BERT Rediscovers the Classical NLP Pipeline , publisher =

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer