Wu Zhizheng
Associate Professor
HARBIN INSTITUTE OF TECHNOLOGY, SHENZHEN
Education Background
- [2010-2015]: Nanyang Technological University, Singapore, Ph.D. in Computer Engineering (full time)
- [2006-2009]: Nankai University, Tianjin, China, M.Eng. in Computer Science (full time)
- [2003-2006]: Hangzhou Dianzi University, Hangzhou, China, B.Eng. in Computer Science (full time)
Work Experience:
- [2022.8 - Now]: The Chinese University of Hong Kong, Shenzhen, China, Associate Professor
- [2019.4 - 2022.7]: Meta Platforms Inc., USA, Tech Lead/Research Scientist
- [2018.2 - 2019.4]: JD.COM Silicon Valley Research Center, USA, Engineering Director/Research Scientist
- [2016.5 - 2018.2]: Apple Inc., USA, Research Scientist (DRI of TTS core)
- [2014.5 - 2016.5]: University of Edinburgh, UK, Research Fellow
- [2012.3 - 2012.8]: University of Eastern Finland, Finland, Visiting Researcher (Host: Tomi Kinnunen)
- [2009.8 - 2014.5]: Nanyang Technological University, Singapore, Research Assistant
- [2007.11 - 2009.7]: Microsoft Research Asia, China, Intern Researcher
Research Field
Speech Interaction, Speech Generation, Audio Authentication, AI + Music
Email
zhizhengwu@slai.edu.cn
Biography
Dr. Zhizheng Wu is an Associate Professor at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), a Jointly Appointed Professor at the Shenzhen Loop Area Institute, a Senior Research Scientist at the Shenzhen Institute of Big Data, and Deputy Director of the Shenzhen Key Laboratory of Cross-Modal Cognitive Computing. He specializes in speech recognition, speech synthesis, audio understanding, and speech signal processing, with prior affiliations at Microsoft Research Asia, Apple, and Meta.
Academic Publications
Representative Publications:
- Junan Zhang, Jing Yang, Zihao Fang, Yuancheng Wang, Zehua Zhang, Zhuo Wang, Fan Fan, Zhizheng Wu, AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
- Xianghu Yue, Xiaohai Tian, Lu Lu, Malu Zhang, Zhizheng Wu, Haizhou Li, CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2025
- Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu, An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoders, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
- Junyan Ye, Baichuan Zhou, Zilong Huang, Junan Zhang, Tianyi Bai, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He, Weijia Li, LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models, ICLR 2025
- Yuancheng Wang, Haoyue Zhan, Liwei Liu, Ruihong Zeng, Haotian Guo, Jiachen Zheng, Qiang Zhang, Xueyao Zhang, Shunsi Zhang, Zhizheng Wu, MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer, ICLR 2025
- Xueyao Zhang, Xiaohui Zhang, Kainan Peng, Zhenyu Tang, Vimal Manohar, Yingru Liu, Jeff Hwang, Dangna Li, Yuhao Wang, Julian Chan, Yuan Huang, Zhizheng Wu, Mingbo Ma, Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement, ICLR 2025
- Junyi Ao*, Yuancheng Wang*, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu, SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words, NeurIPS 2024
- Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao, NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models, ICML 2024
- Yuancheng Wang, Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao, AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models, NeurIPS 2023
- Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li, Accented Text-to-Speech Synthesis with Limited Data, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Research Awards:
- 2025 Huawei Spark Award, Huawei
- 2024 Best Paper Finalist, IEEE Spoken Language Technology (SLT) 2024
- 2021 - now World Top 2% Scientist, Stanford University (ranked among the top 2% of scientists globally based on citations and research impact)
- 2016 Best Student Paper Award
- 2016 Most Cited Article of Speech Communication
- 2015 First place (Intelligibility task), Blizzard Challenge
- 2012 Best Paper Award, APSIPA Annual Summit and Conference, Asia-Pacific Signal and Information Processing Association (APSIPA)