Wu Zhizheng
Associate Professor
HARBIN INSTITUTE OF TECHNOLOGY, SHENZHEN
Education Background
- [2010-2015]: Nanyang Technological University, Singapore, Ph.D. in Computer Engineering (full time)
- [2006-2009]: Nankai University, Tianjin, China, M.Eng. in Computer Science (full time)
- [2003-2006]: Hangzhou Dianzi University, Hangzhou, China, B.Eng. in Computer Science (full time)
Work Experience:
- [2022.8 - Now]: The Chinese University of Hong Kong, Shenzhen, China, Associate Professor
- [2019.4 - 2022.7]: Meta Platforms Inc., USA, Tech Lead/Research Scientist
- [2018.2 - 2019.4]: JD.COM Silicon Valley Research Center, USA, Engineering Director/Research Scientist
- [2016.5 - 2018.2]: Apple Inc., USA, Research Scientist (DRI of TTS core)
- [2014.5 - 2016.5]: University of Edinburgh, UK, Research Fellow
- [2012.3 - 2012.8]: University of Eastern Finland, Finland, Visiting Researcher (Host: Tomi Kinnunen)
- [2009.8 - 2014.5]: Nanyang Technological University, Singapore, Research Assistant
- [2007.11 - 2009.7]: Microsoft Research Asia, China, Intern Researcher
Research Field
Speech Interaction, Speech Generation, Audio Authentication, AI + Music
Email
zhizhengwu@slai.edu.cn
Biography
Dr. Zhizheng Wu is an Associate Professor at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), a Jointly Appointed Professor at the Shenzhen Loop Area Institute, a Senior Research Scientist at the Shenzhen Institute of Big Data, and Deputy Director of the Shenzhen Key Laboratory of Cross-Modal Cognitive Computing. He specializes in speech recognition, speech synthesis, audio understanding, and speech signal processing, with prior affiliations at Microsoft Research Asia, Apple, and Meta.
Academic Publications
Representative Publications:
- Junan Zhang, Jing Yang, Zihao Fang, Yuancheng Wang, Zehua Zhang, Zhuo Wang, Fan Fan, Zhizheng Wu, AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
- Xianghu Yue, Xiaohai Tian, Lu Lu, Malu Zhang, Zhizheng Wu, Haizhou Li, CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
- Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu, An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoders, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
- Junyan Ye, Baichuan Zhou, Zilong Huang, Junan Zhang, Tianyi Bai, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He, Weijia Li, LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models, ICLR 2025
- Yuancheng Wang, Haoyue Zhan, Liwei Liu, Ruihong Zeng, Haotian Guo, Jiachen Zheng, Qiang Zhang, Xueyao Zhang, Shunsi Zhang, Zhizheng Wu, MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer, ICLR 2025
- Xueyao Zhang, Xiaohui Zhang, Kainan Peng, Zhenyu Tang, Vimal Manohar, Yingru Liu, Jeff Hwang, Dangna Li, Yuhao Wang, Julian Chan, Yuan Huang, Zhizheng Wu, Mingbo Ma, Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement, ICLR 2025
- Junyi Ao*, Yuancheng Wang*, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu, SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words, NeurIPS 2024
- Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao, NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models, ICML 2024
- Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao, AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models, NeurIPS 2023
- Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li, Accented Text-to-Speech Synthesis with Limited Data, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Research Awards:
- 2025 Huawei Spark Award, Huawei
- 2024 Best Paper Finalist, IEEE Spoken Language Technology (SLT) 2024
- 2021 - now World Top 2% Scientist, Stanford University (ranked among the top 2% of scientists globally based on citations and research impact)
- 2016 Best Student Paper Award
- 2016 Most Cited Article of Speech Communication
- 2015 First place (Intelligibility task), Blizzard Challenge
- 2012 Best Paper Award, APSIPA Annual Summit and Conference, Asia-Pacific Signal and Information Processing Association (APSIPA)