Robust feature extraction using temporal context averaging for speaker identification in diverse acoustic environments
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
TARNet: A Temporal-Aware Multi-Scale Architecture for Closed-Set Speaker Identification
TARNet outperforms prior methods on VoxCeleb1 and LibriSpeech for closed-set speaker identification by using a multi-stage dilated temporal encoder fused via attentive statistics pooling.