SoundNet: Learning sound representations from unlabeled video

Yusuf Aytar, Carl Vondrick, Antonio Torralba

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VideoASMR-Bench: Can AI-Generated ASMR Videos Fool VLMs and Humans?

cs.CV · 2025-12-15 · unverdicted · novelty 7.0

VideoASMR-Bench shows state-of-the-art VLMs fail to reliably detect AI-generated ASMR videos from real ones, though humans can still identify the fakes relatively easily.

Semantic Noise Reduction via Teacher-Guided Dual-Path Audio-Visual Representation Learning

cs.SD · 2026-04-09 · unverdicted · novelty 6.0

TG-DP decouples reconstruction and alignment objectives into separate paths with teacher guidance on visibility patterns, yielding SOTA zero-shot audio-video retrieval gains on AudioSet.

citing papers explorer

Showing 2 of 2 citing papers.

VideoASMR-Bench: Can AI-Generated ASMR Videos Fool VLMs and Humans?
cs.CV · 2025-12-15 · unverdicted · none · ref 2
Semantic Noise Reduction via Teacher-Guided Dual-Path Audio-Visual Representation Learning
cs.SD · 2026-04-09 · unverdicted · none · ref 6
