Haizhou Li — Pith Author Registry

Identifiers

name variant Haizhou Li 0.60 · backfill

Papers (29)

What Happens Before Decoding? Prefill Determines GUI Grounding in VLMs cs.CV · 2026 · author #7
Bridging What the Model Thinks and How It Speaks: Self-Aware Speech Language Models for Expressive Speech Generation cs.CL · 2026 · author #10
Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan cs.SD · 2026 · author #8
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis eess.AS · 2026 · author #5
PhiNet: Speaker Verification with Phonetic Interpretability eess.AS · 2026 · author #4
Neural Architecture Search of Time-to-First-Spike-Coded Spiking Neural Networks for Efficient Eye-based Emotion Recognition cs.NE · 2025 · author #6
RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems cs.CL · 2025 · author #4
S2S-Arena: Evaluating Paralinguistic Instruction Following in Speech-to-Speech Models cs.CL · 2025 · author #9
VoiceBench: Benchmarking LLM-Based Voice Assistants cs.CL · 2024 · author #6
Acoustic Modeling for Automatic Lyrics-to-Audio Alignment eess.AS · 2019 · author #3
Code-Switching Detection Using ASR-Generated Language Posteriors cs.CL · 2019 · author #4
Large-Scale Speaker Diarization of Radio Broadcast Archives cs.CL · 2019 · author #6
Multi-Graph Decoding for Code-Switching ASR cs.CL · 2019 · author #5
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019 cs.CL · 2019 · author #5
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences eess.AS · 2019 · author #12
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet eess.AS · 2019 · author #4
Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss eess.AS · 2019 · author #4
Deep Spiking Neural Network with Spike Count based Learning Rule cs.NE · 2019 · author #6
Target Speaker Extraction for Overlapped Multi-Talker Speaker Verification eess.AS · 2019 · author #4
On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition cs.CL · 2018 · author #6
Error Reduction Network for DBLSTM-based Voice Conversion eess.AS · 2018 · author #4
Generative x-vectors for text-independent speaker verification eess.AS · 2018 · author #5
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search cs.CL · 2018 · author #6
A Multi-State Diagnosis and Prognosis Framework with Feature Learning for Tool Condition Monitoring eess.SP · 2018 · author #5
A Cost-Sensitive Deep Belief Network for Imbalanced Classification cs.LG · 2018 · author #3
Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework cs.SD · 2017 · author #7
Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting cs.SD · 2016 · author #5
Spoofing detection under noisy conditions: a preliminary investigation and an initial database cs.LG · 2016 · author #5
Fantastic 4 system for NIST 2015 Language Recognition Evaluation cs.CL · 2016 · author #16

Mentions

2510.13910 #4 · arxiv_oai · confidence 0.70 Haizhou Li
2410.17196 #6 · arxiv_oai · confidence 0.70 Haizhou Li

Frequent Coauthors

Emre Yılmaz 5 shared papers
Eng Siong Chng 5 shared papers
Wei Rao 4 shared papers
Chenglin Xu 3 shared papers
Mingyang Zhang 3 shared papers
Xiong Xiao 3 shared papers
Adem Derinel 2 shared papers
Anthony Larcher 2 shared papers
Benyou Wang 2 shared papers
Berrak Sisman 2 shared papers
Bin Ma 2 shared papers
Chen Zhang 2 shared papers
Chong Zhang 2 shared papers
Feng Jiang 2 shared papers
Geok Soon Hong 2 shared papers
Hanwu Sun 2 shared papers
Ivan Kukanov 2 shared papers
Jichen Yang 2 shared papers
Kay Chen Tan 2 shared papers
Kong Aik Lee 2 shared papers