Elastic Time adds a learned latent predictor to enable dynamic frame rates in fixed-rate neural audio autoencoders, allowing skipped frames to be reconstructed and improving efficiency-quality tradeoffs at deployment time.
Elastic Time: Dynamic Frame Rate Bottlenecks for Neural Audio Coding
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Neural audio autoencoders have become a core component of compression, feature extraction, and generation. However, while existing systems support variable bitrate, the vast majority of models still operate at a fixed latent frame-rate, allocating equal temporal budget to regions with very different information density, which can result in unnecessarily long sequences. We introduce Elastic Time, a dynamic frame-rate bottleneck that converts fixed-frame-rate autoencoders to dynamic ones. Our method learns a lightweight latent predictor used to decide which frames can be skipped and later reconstructed, enabling efficient greedy boundary selection at inference. Experiments show our method enables deployment-time rate control while improving efficiency-quality tradeoffs relative to baselines. Overall, we provide a flexible mechanism for adjusting temporal resolution in audio autoencoders, potentially facilitating more efficient downstream modeling for generation and long-context tasks.
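The skip-and-reconstruct idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the predictor, the error measure, or the selection rule, so everything here is an assumption. The hypothetical `hold_last` function stands in for the learned latent predictor, squared error stands in for whatever criterion the paper uses, and `threshold` is an illustrative knob for deployment-time rate control.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]
Predictor = Callable[[Frame, int], List[float]]


def hold_last(anchor: Frame, steps: int) -> List[float]:
    """Hypothetical stand-in for a learned predictor: repeat the last kept frame."""
    return list(anchor)


def sq_error(a: Frame, b: Frame) -> float:
    """Squared error between two latent frames (an assumed criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_select(frames: Sequence[Frame], threshold: float,
                  predict: Predictor = hold_last) -> List[int]:
    """Scan left to right; keep a frame only if the predictor misses it by more than threshold."""
    if not frames:
        return []
    kept = [0]  # the first frame is always kept as an anchor
    for t in range(1, len(frames)):
        anchor = kept[-1]
        guess = predict(frames[anchor], t - anchor)
        if sq_error(guess, frames[t]) > threshold:
            kept.append(t)
    return kept


def reconstruct(kept_frames: Dict[int, Frame], n: int,
                predict: Predictor = hold_last) -> List[List[float]]:
    """Rebuild all n frames, filling skipped positions from the most recent kept frame."""
    out: List[List[float]] = []
    anchor = None
    for t in range(n):
        if t in kept_frames:
            anchor = t
            out.append(list(kept_frames[t]))
        else:
            if anchor is None:
                raise ValueError("no kept frame precedes position %d" % t)
            out.append(predict(kept_frames[anchor], t - anchor))
    return out


# Toy one-dimensional latents: three near-constant regions.
frames = [[0.0], [0.05], [0.1], [1.0], [1.02], [2.0]]
kept = greedy_select(frames, threshold=0.05)
rebuilt = reconstruct({i: frames[i] for i in kept}, len(frames))
```

In this toy setting, raising `threshold` keeps fewer frames (a shorter sequence with a coarser reconstruction) and lowering it keeps more, which is one simple way to picture adjusting the rate at deployment time.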
fields
cs.SD 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
Elastic Time: Dynamic Frame Rate Bottlenecks for Neural Audio Coding
Elastic Time adds a learned latent predictor to enable dynamic frame rates in fixed-rate neural audio autoencoders, allowing skipped frames to be reconstructed and improving efficiency-quality tradeoffs at deployment time.