DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT

2022 · arXiv:2110.01900

2 Pith papers cite this work. Their citation polarity has not yet been classified.

representative citing papers

End-to-End Voice Intent Recognition for Spontaneous Human-Drone Interaction with Naive Users

eess.AS · 2026-06-19 · unverdicted · novelty 6.0

An end-to-end SLU architecture with frozen SSL acoustic encoder, LSTM classification head, and cross-modal distillation achieves 93% accuracy on simple commands and 82% on spontaneous speech at 7 ms latency on the new VoiceStick corpus, outperforming cascade baselines.

SEAM: Shortcut-Aware Real-Time Detection of Scripted vs. Spontaneous Speech for Interview Guardrails

eess.AS · 2026-06-05 · conditional · novelty 6.0

SEAM achieves 0.971 ROC-AUC on external interview data for real-time scripted speech detection by combining shortcut-prevention data techniques with a compact audio backbone.

