Zipformer: A faster and better encoder for automatic speech recognition

2023 · arXiv 2310.11230

2 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

SketchSong: Hierarchical Song Generation with Sketch Planning and Fine-Grained Multi-Track Modeling

cs.SD · 2026-06-02 · unverdicted · novelty 5.0

SketchSong uses temporal sketch planning with high-level tokens and explicit modeling of four tracks (vocals, bass, drums, other) to generate more coherent songs than baselines.

From Objectives to Applications: Aligning Architectural Biases in Audio Self-Supervised Learning

eess.AS · 2026-07-01 · unverdicted · novelty 3.0

A survey that organizes audio SSL into five objective paradigms, relates their demands to architectural biases, and interprets downstream applications as tests of generalization.

citing papers explorer

Showing 2 of 2 citing papers.

SketchSong: Hierarchical Song Generation with Sketch Planning and Fine-Grained Multi-Track Modeling

cs.SD · 2026-06-02 · unverdicted · none · ref 32

SketchSong uses temporal sketch planning with high-level tokens and explicit modeling of four tracks (vocals, bass, drums, other) to generate more coherent songs than baselines.

From Objectives to Applications: Aligning Architectural Biases in Audio Self-Supervised Learning

eess.AS · 2026-07-01 · unverdicted · none · ref 76

A survey that organizes audio SSL into five objective paradigms, relates their demands to architectural biases, and interprets downstream applications as tests of generalization.
