Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting
We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow. These models are useful for recognizing "command triggers" in speech-based interfaces (e.g., "Hey Siri"), which serve as explicit cues for audio recordings of utterances that are sent to the cloud for full speech recognition. Evaluation on Google's recently released Speech Commands Dataset shows that our reimplementation is comparable in accuracy and provides a starting point for future work on the keyword spotting task.
Forward citations
Cited by 2 Pith papers
- Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction
  Multimodal intention classifiers fused with Independent Opinion Pool reduce uncertainty and improve accuracy over single-modality baselines in a 7-DoF robot arm collaboration task.
- A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting
  Joint training of speech enhancement and KWS with a novel CRN and Mel features improves noise robustness for small-footprint devices.