PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- A conceptual framework for learning to listen by reward: Curiosity-driven search for novel sources
  Introduces a conceptual framework for curiosity-driven reward-based learning in audio via continuous search for novel sound sources, with an overview of prior work and a proof-of-concept.
- Towards Fine-grained Temporal Perception: Post-Training Large Audio-Language Models with Audio-Side Time Prompt
  TimePro-RL interleaves timestamp embeddings in audio sequences and applies RL post-SFT to boost temporal alignment in LALMs, yielding gains on grounding, event detection, and dense captioning.
- How Class Ontology and Data Scale Affect Audio Transfer Learning
  Larger pre-training data scale and class diversity improve audio transfer learning performance, yet similarity between pre-training and target task has a stronger positive effect.