Synthetic Speech Source Tracing using Metric Learning

Dimitrios Koutsianos; Stavros Zacharopoulos; Themos Stafylakis; Yannis Panagakis

arXiv: 2506.02590 · v1 · pith:WXZJEAYJnew · submitted 2025-06-03 · cs.SD · cs.CL

classification cs.SD cs.CL

keywords source tracing learning resnet synthetic work audio metric

This paper addresses source tracing in synthetic speech: identifying the generative systems behind manipulated audio via speaker-recognition-inspired pipelines. While prior work focuses on spoofing detection, source tracing lacks robust solutions. We evaluate two approaches: classification-based and metric-learning-based. We test both on the MLAADv5 benchmark using ResNet and self-supervised learning (SSL) backbones. The results show that ResNet achieves competitive performance with the metric learning approach, matching and even exceeding SSL-based systems. These results demonstrate ResNet's viability for source tracing while underscoring the need to optimize SSL representations for this task. Our work bridges speaker recognition methodologies with audio forensic challenges, offering new directions for combating synthetic media manipulation.
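
To make the metric-learning idea concrete, here is a minimal sketch of how embedding-based source tracing can score an utterance, assuming embeddings have already been extracted by some backbone. It is not the authors' pipeline: the nearest-centroid rule, the cosine score, the threshold value, the function names, and the toy 2-D embeddings are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_centroids(embeddings, labels):
    """Average the enrollment embeddings of each source system (hypothetical helper)."""
    sums, counts = {}, {}
    for emb, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def trace_source(embedding, centroids, threshold=0.5):
    """Return the closest source system, or "unknown" if no centroid is similar enough.

    The threshold is an assumed value; in practice it would be tuned on held-out data.
    """
    scored = ((label, cosine(embedding, c)) for label, c in centroids.items())
    best_label, best_score = max(scored, key=lambda pair: pair[1])
    if best_score < threshold:
        return "unknown", best_score
    return best_label, best_score


# Toy 2-D embeddings standing in for backbone outputs (assumed data).
enroll = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
labels = ["tts_a", "tts_a", "tts_b", "tts_b"]
centroids = build_centroids(enroll, labels)

known_label, known_score = trace_source([1.0, 0.0], centroids)
unseen_label, unseen_score = trace_source([-1.0, -1.0], centroids)
```

Unlike a closed-set classifier, which must pick one of the training systems, a similarity score of this kind can be thresholded, so an utterance from a generator absent from enrollment can be rejected as "unknown".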

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Dual-Branch Gated Fusion for Open-Set Audio Deepfake Source Tracing
cs.SD · 2026-06 · unverdicted · novelty 6.0

A gated fusion of XLSR-53 and CORES features with energy margin and diversity losses reaches 97.6% ID accuracy and reduces FPR95 by 83.5% relative to the Interspeech 2025 baseline on MLAAD.