Strongly Recommend Advancing

Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, Alexandre Défossez · 2024 · arXiv 2306.05284

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Codec-Robust Attacks on Audio LLMs

cs.SD · 2026-05-19 · unverdicted · novelty 7.0 · 2 refs

CodecAttack perturbs audio in codec latent space with multi-bitrate EoT to achieve 85.5% average ASR on Opus-compressed Audio LLMs versus under 26% for waveform baselines, with transfer to MP3 and AAC.

Steering Autoregressive Music Generation with Recursive Feature Machines

cs.LG · 2025-10-21 · unverdicted · novelty 7.0

MusicRFM discovers interpretable concept directions in music model hidden states using RFM probes and injects them at inference to steer generation toward desired musical properties without retraining.

Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music

cs.SD · 2026-05-14 · unverdicted · novelty 6.0

Introduces the first large-scale Persian music dataset and shows fine-tuned MusicGen produces compositions more aligned with Persian stylistic conventions via tag-based evaluation.

Step-Audio 2 Technical Report

cs.CL · 2025-07-22 · unverdicted · novelty 6.0

Step-Audio 2 integrates a latent audio encoder, reasoning-centric reinforcement learning, and discrete audio token generation into language modeling to deliver state-of-the-art performance on audio understanding and conversational benchmarks.

Not that Groove: Zero-Shot Symbolic Music Editing

cs.SD · 2025-05-13 · unverdicted · novelty 6.0

The work formalizes zero-shot symbolic drum editing as LLM reasoning over a drumroll grid notation, evaluates it on a new benchmark with automated symbolic unit tests, and reports up to 68% success across eight models.

Woosh: A Sound Effects Foundation Model

cs.SD · 2026-04-02 · accept · novelty 5.0

Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.

Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity

cs.AI · 2026-03-03 · unverdicted · novelty 5.0

Separating acoustic and expectation ANN representations as teacher targets improves EEG music identification beyond baselines and seed ensembles.

Musical Attention Transformer: Music Generation Using a Music-Specific Attention Model

cs.SD · 2026-05-20 · unverdicted · novelty 4.0

The paper introduces Musical Attention, an attention variant that incorporates eight musical features including metadata to generate more coherent and varied music than standard or strided attention baselines.

citing papers explorer

Showing 8 of 8 citing papers.

Codec-Robust Attacks on Audio LLMs · cs.SD · 2026-05-19 · unverdicted · none · ref 66 · 2 links
Steering Autoregressive Music Generation with Recursive Feature Machines · cs.LG · 2025-10-21 · unverdicted · none · ref 2
Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music · cs.SD · 2026-05-14 · unverdicted · none · ref 6
Step-Audio 2 Technical Report · cs.CL · 2025-07-22 · unverdicted · none · ref 14
Not that Groove: Zero-Shot Symbolic Music Editing · cs.SD · 2025-05-13 · unverdicted · none · ref 9
Woosh: A Sound Effects Foundation Model · cs.SD · 2026-04-02 · accept · none · ref 9
Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity · cs.AI · 2026-03-03 · unverdicted · none · ref 43
Musical Attention Transformer: Music Generation Using a Music-Specific Attention Model · cs.SD · 2026-05-20 · unverdicted · none · ref 7
