Polyphonia improves zero-shot stem-specific timbre transfer in polyphonic music by 15.5% target alignment via acoustic-informed attention calibration that uses probabilistic priors to set coarse boundaries.
International Journal of Computer Vision
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
citing papers explorer
- Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
  Polyphonia improves zero-shot stem-specific timbre transfer in polyphonic music by 15.5% target alignment via acoustic-informed attention calibration that uses probabilistic priors to set coarse boundaries.
- LightAVSeg: Lightweight Audio-Visual Segmentation
  LightAVSeg decouples semantic filtering and spatial grounding to achieve linear-cost cross-modal interaction in audio-visual segmentation, reaching 50.4 mIoU on MS3 with 20.5M parameters as a new lightweight state-of-the-art.