Wang Shuai
Associate Professor
Nanjing University
Dr. Shuai Wang is an Associate Professor, Distinguished Researcher, and Ph.D. Supervisor at Nanjing University, with a joint appointment at Hetao College. He specializes in intelligent audio signal processing across multimodal acoustic signals (speech, audio events, music). He holds a Ph.D. from Shanghai Jiao Tong University and formerly served as a Senior Researcher at Tencent Photon Studio. He has 40+ publications as first or corresponding author in premier venues (ICASSP, Interspeech) and 10+ patents. Dr. Wang won championships at VoxSRC 2019 and DIHARD 2019 and received Best Paper awards at ISCSLP 2024. His open-source project WeSpeaker receives 10M+ monthly downloads on HuggingFace and is widely adopted in academia and industry. He welcomes research collaborations and exceptional students in speech processing, multimodal AI, and large models.
- Wang S, Chen Z, Han B, Wang H, Liang C, Zhang B, Xiang X, Ding W, Rohdin J, Silnova A, et al. Advancing Speaker Embedding Learning: Wespeaker Toolkit for Research and Production. Speech Communication, 2024.
- Wu W, Chen X, Wang S*, Wang J, Meng L, Wu X, Meng H, Li H. C2AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction. IEEE Journal of Selected Topics in Signal Processing, 2025.
- Ma Y, Wang S*, Liu T, Li H. PhiNet: Speaker Verification with Phonetic Interpretability. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025.
- Yang C, Wang S*, Chen H, Tan W, Yu J, Li H. SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement. NeurIPS, 2025.
- Wang W, Pan Z, Li X, Wang S, Li H. Speech Separation with Pretrained Frontend to Minimize Domain Mismatch. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024.
- Wang S, Yang Y, Wu Z, Qian Y, Yu K. Data Augmentation Using Deep Generative Models for Embedding Based Speaker Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020.