The Kaldi speech recognition toolkit
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2019 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
End-to-end ASR for code-switched Hindi-English with <50 hours of data shows gains from multi-task learning and corpus balancing but underperforms cascaded baselines.
Citing papers explorer
- LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
  3D-2D-CNN-BLSTM with word-CTC reaches 1.3% WER on GRID seen-speaker lipreading (55% relative gain over LCANet) and 8.6% on unseen speakers (24.5% gain over LipNet).
- End-to-End ASR for Code-switched Hindi-English Speech
  End-to-end ASR for code-switched Hindi-English with <50 hours of data shows gains from multi-task learning and corpus balancing but underperforms cascaded baselines.