UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: method 1

citation-polarity summary

verdicts: UNVERDICTED 11

roles: method 1

polarities: use method 1

representative citing papers

Hierarchical Codec Diffusion for Video-to-Speech Generation

cs.SD · 2026-04-17 · unverdicted · novelty 7.0

HiCoDiT generates speech from video by conditioning low-level RVQ tokens on speaker identity and high-level tokens on facial expressions via a dual-scale normalized diffusion transformer.

Two-Dimensional Quantization for Geometry-Aware Audio Coding

cs.SD · 2025-12-01 · unverdicted · novelty 6.0

Q2D2 uses 2D geometric grid projections to quantize feature pairs in neural audio codecs, yielding implicit codebooks that improve efficiency and utilization over RVQ, VQ, and FSQ while maintaining reconstruction quality.

Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing

cs.SD · 2026-04-30 · unverdicted · novelty 5.0

Few-shot TTS adaptation combined with LLM-guided phoneme editing produces synthetic accented speech that improves ASR word error rates on real accented audio even in cross-speaker and ultra-low-data settings.
