Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

· 2006

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Cross-Attention End-to-End ASR for Two-Party Conversations

eess.AS · 2019-07-24 · unverdicted · novelty 6.0

End-to-end ASR model with speaker-specific cross-attention for two-party conversations outperforms standard models on the Switchboard corpus.

Towards Debugging Deep Neural Networks by Generating Speech Utterances

cs.LG · 2019-07-06 · unverdicted · novelty 5.0

Activation maximization applied to a speech command DNN, followed by WaveNet synthesis, produces class-specific utterances that human evaluators can interpret, supporting its use for model debugging.

LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models

cs.CV · 2019-06-25 · unverdicted · novelty 4.0

3D-2D-CNN-BLSTM with word-CTC reaches 1.3% WER on GRID seen-speaker lipreading (55% relative gain over LCANet) and 8.6% on unseen speakers (24.5% gain over LipNet).

End-to-End ASR for Code-switched Hindi-English Speech

eess.AS · 2019-06-22 · unverdicted · novelty 4.0

End-to-end ASR for code-switched Hindi-English with <50 hours of data shows gains from multi-task learning and corpus balancing but underperforms cascaded baselines.

citing papers explorer

Showing 4 of 4 citing papers.

Cross-Attention End-to-End ASR for Two-Party Conversations

eess.AS · 2019-07-24 · unverdicted · none · ref 43

End-to-end ASR model with speaker-specific cross-attention for two-party conversations outperforms standard models on the Switchboard corpus.

Towards Debugging Deep Neural Networks by Generating Speech Utterances

cs.LG · 2019-07-06 · unverdicted · none · ref 13

Activation maximization applied to a speech command DNN, followed by WaveNet synthesis, produces class-specific utterances that human evaluators can interpret, supporting its use for model debugging.

LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models

cs.CV · 2019-06-25 · unverdicted · none · ref 16

3D-2D-CNN-BLSTM with word-CTC reaches 1.3% WER on GRID seen-speaker lipreading (55% relative gain over LCANet) and 8.6% on unseen speakers (24.5% gain over LipNet).

End-to-End ASR for Code-switched Hindi-English Speech

eess.AS · 2019-06-22 · unverdicted · none · ref 13

End-to-end ASR for code-switched Hindi-English with <50 hours of data shows gains from multi-task learning and corpus balancing but underperforms cascaded baselines.
