Nest-rq: Next token prediction for speech self-supervised pre-training

Minglun Han, Ye Bai, Chen Shen, Youjia Huang, Mingkun Huang, Zehua Lin, Linhao Dong, Lu Lu, Yuxuan Wang · 2024 · arXiv 2409.08680

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

F3-Tokenizer: Taming Audio Autoencoder Latents for Understanding and Generation

cs.SD · 2026-06-04 · unverdicted · novelty 5.0

F3-Tokenizer adapts audio autoencoder latents with noise-regularized bottleneck (channel normalization and stochastic perturbation) and a representation encoder (RQ-MTP plus frozen-LLM supervision) to support both high-dimensional understanding representations and normalized continuous generation ta

citing papers explorer


F3-Tokenizer: Taming Audio Autoencoder Latents for Understanding and Generation · cs.SD · 2026-06-04 · unverdicted · none · ref 4
