Factual associations in autoregressive transformers are localized to mid-layer feed-forward modules and can be edited via rank-one model editing while preserving both specificity and generalization on counterfactual tests.
N., Kaiser, ., and Polosukhin, I
2 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Dywave uses wavelet hierarchical decomposition to create event-aligned compact token sequences for heterogeneous IoT signals, yielding up to 12% accuracy gains and 75% shorter inputs on mainstream sequence models across five datasets.
citing papers explorer
-
Locating and Editing Factual Associations in GPT
Factual associations in autoregressive transformers are localized to mid-layer feed-forward modules and can be edited via rank-one model editing while preserving both specificity and generalization on counterfactual tests.
-
Dywave: Event-Aligned Dynamic Tokenization for Heterogeneous IoT Sensing Signals
Dywave uses wavelet hierarchical decomposition to create event-aligned compact token sequences for heterogeneous IoT signals, yielding up to 12% accuracy gains and 75% shorter inputs on mainstream sequence models across five datasets.