Xinyuan Qian
Associate Professor
Center for Language, Intelligence and Machines
Education
- Queen Mary University of London, UK, Computer Science, Doctoral Degree (2015.11-2020.03)
- The University of Edinburgh, UK, Signal Processing and Communications, MSc (2014.09-2015.09)
- The University of Edinburgh, UK, Electronics and Electrical Engineering, BEng (2012.09-2014.09)
- Nanjing University of Aeronautics and Astronautics, Information Engineering, BEng, (2010.09-2012.09)
Work Experience
- 2022.10 – Present: University of Science and Technology Beijing (USTB), China, Associate Professor
- 2022.03 – 2022.09: The Chinese University of Hong Kong, China, Research Assistant
- 2020.02 – 2022.03: National University of Singapore (NUS), Singapore, Research Fellow
- 2017.04 – 2018.12: Fondazione Bruno Kessler, Italy, Research Intern
- 2014.06 – 2014.08: Heriot-Watt University, UK, Research Intern
Xinyuan Qian, PhD, is an Associate Professor at the School of Computer and Communication Engineering, University of Science and Technology Beijing. She earned her doctoral degree in Computer Science from Queen Mary University of London, UK, and conducted research at the University of Edinburgh (UK), FBK Research Center (Italy), and The Chinese University of Hong Kong, Shenzhen. Her research focuses on machine hearing, audio-visual fusion, speaker extraction, and sound analysis and localization. She has led multiple funded projects, including grants from the National Natural Science Foundation of China and the Beijing Natural Science Foundation, and has published more than 70 papers in top-tier journals and conferences such as TASLP, TMM, ICASSP, and CVPR. Research collaborations are warmly welcomed, as are applications from outstanding students interested in intelligent speech technology and audio-visual multimodal interaction.
Academic Achievements:
1. Qian X, Gao J, Zhang Y, et al. SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model. IEEE Journal of Selected Topics in Signal Processing, 2025.
2. Qian X, Zhang Q, Wang J, Guan G, Li H. Deep Cross-modal Retrieval between Spatial Image and Acoustic Speech. IEEE Trans. on Multimedia, 2023.
3. Qian X, Wang Z, Wang J, Guan G, Li H. Audio-Visual Cross-Attention Network for Robotic Speaker Tracking. IEEE/ACM Trans. on Audio, Speech, and Language Processing, 2022.
4. Qian X, Brutti A, Lanz O, Omologo M, Cavallaro A. Audio-Visual Tracking of Concurrent Speakers. IEEE Trans. on Multimedia, 2021.
5. Qian X, Brutti A, Lanz O, Omologo M, Cavallaro A. Multi-speaker Tracking from an Audio-Visual Sensing Device. IEEE Trans. on Multimedia, 2019.