Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while enforcing privacy by design, as no personal user data is ever collected. Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small devices. Additionally, we describe a data generation procedure that provides sufficient, high-quality training data without compromising user privacy.
Forward citations
Cited by 5 Pith papers
- MIST: Multimodal Interactive Speech-based Tool-calling Conversational Assistants for Smart Homes
  MIST is a new synthetic speech-based tool-calling dataset for IoT devices that exposes performance gaps between open- and closed-weight multimodal LLMs.
- IPQA: A Benchmark for Core Intent Identification in Personalized Question Answering
  IPQA is a new benchmark that measures how well models identify core user intents from history in personalized question answering, finding that performance is poor and declines with greater question complexity.
- Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings
  TaDSE learns dialogue sentence embeddings via template-guided self-supervised contrastive learning plus synthetic slot-filling augmentation and reports gains on five downstream benchmarks.
- Training LLMs with Reinforcement Learning for Intent-Aware Personalized Question Answering
  IAP uses RL to train LLMs to explicitly infer and apply implicit user intent in single-turn personalized QA, achieving ~7.5% average macro-score gains over baselines on LaMP-QA.
- Automatic Combination of Sample Selection Strategies for Few-Shot Learning
  ACSESS automatically combines 23 sample selection strategies to outperform individual strategies in few-shot learning on text and image datasets.