Xinyuan Qian
Associate Professor
Center for Language, Intelligence and Machines
Education
- Queen Mary University of London, UK, Computer Science, Doctoral Degree (2015.11-2020.03)
- The University of Edinburgh, UK, Signal Processing and Communications, MSc (2014.09-2015.09)
- The University of Edinburgh, UK, Electronics and Electrical Engineering, BEng (2012.09-2014.09)
- Nanjing University of Aeronautics and Astronautics, Information Engineering, BEng, (2010.09-2012.09)
Work Experience
- 2022.10 – Present: University of Science and Technology Beijing (USTB), China, Associate Professor
- 2022.03 – 2022.09: The Chinese University of Hong Kong, China, Research Assistant
- 2020.02 – 2022.03: National University of Singapore (NUS), Singapore, Research Fellow
- 2017.04 – 2018.12: Fondazione Bruno Kessler, Italy, Research Intern
- 2014.06 – 2014.08: Heriot-Watt University, UK, Research Intern
Xinyuan Qian, PhD, is an Associate Professor at the School of Computer and Communication Engineering, University of Science and Technology Beijing. She earned her doctoral degree in Computer Science from Queen Mary University of London, UK, and conducted research at the University of Edinburgh (UK), FBK Research Center (Italy), and The Chinese University of Hong Kong, Shenzhen. Her research focuses on machine hearing, audio-visual fusion, speaker extraction, and sound analysis and localization. She has led multiple funded projects, including grants from the National Natural Science Foundation of China and the Beijing Natural Science Foundation, and has published more than 70 papers in top-tier journals and conferences such as TASLP, TMM, ICASSP, and CVPR. Research collaborations are warmly welcomed, as are applications from outstanding students interested in intelligent speech technology and audio-visual multimodal interaction.
Academic Achievements:
1. Qian X, Gao J, Zhang Y, et al. SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model. IEEE Journal of Selected Topics in Signal Processing, 2025.
2. Qian X, Zhang Q, Wang J, Guan G, Li H. Deep Cross-modal Retrieval between Spatial Image and Acoustic Speech. IEEE Trans. on Multimedia, 2023.
3. Qian X, Wang Z, Wang J, Guan G, Li H. Audio-Visual Cross-Attention Network for Robotic Speaker Tracking. IEEE/ACM Trans. on Audio, Speech, and Language Processing, 2022.
4. Qian X, Brutti A, Lanz O, Omologo M, Cavallaro A. Audio-Visual Tracking of Concurrent Speakers. IEEE Trans. on Multimedia, 2021.
5. Qian X, Brutti A, Lanz O, Omologo M, Cavallaro A. Multi-speaker Tracking from an Audio-Visual Sensing Device. IEEE Trans. on Multimedia, 2019.