ESDD 2026: Environmental sound deepfake detection challenge evaluation plan

Han Yin, Yang Xiao, Rohan Kumar Das, Jisheng Bai, Ting Dang · 2026 · arXiv 2508.04529

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AudioMosaic: Contrastive Masked Audio Representation Learning

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

AudioMosaic learns general-purpose audio representations through contrastive pre-training with structured spectrogram masking, reaching state-of-the-art results on standard benchmarks and improving audio-language tasks.

DeepFense: A Unified, Modular, and Extensible Framework for Robust Deepfake Audio Detection

cs.SD · 2026-04-09 · accept · novelty 5.0

DeepFense supplies a unified toolkit and large-scale benchmarks showing that pre-trained front-end feature extractors drive most performance differences while top models exhibit strong biases by audio quality, speaker gender, and language.

EnvTriCascade: An Environment-Aware Tri-Stage Cascaded Framework for ESDD2 2026 Challenge

cs.SD · 2026-05-18 · unverdicted · novelty 4.0

EnvTriCascade is a tri-stage cascaded framework using mix-consistency detection followed by dual SSL-based five-class classifiers with cross-branch attention and RawBoost augmentation, achieving 0.8266 Macro-F1 on the ESDD2 2026 challenge test set.

citing papers explorer

AudioMosaic: Contrastive Masked Audio Representation Learning · cs.LG · 2026-05-14 · unverdicted · none · ref 16
DeepFense: A Unified, Modular, and Extensible Framework for Robust Deepfake Audio Detection · cs.SD · 2026-04-09 · accept · none · ref 46
EnvTriCascade: An Environment-Aware Tri-Stage Cascaded Framework for ESDD2 2026 Challenge · cs.SD · 2026-05-18 · unverdicted · none · ref 14
