Xiangyu Yue (岳翔宇)

Assistant Professor

The Chinese University of Hong Kong

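Background

Education:

2016-2022: Ph.D. in Electrical Engineering and Computer Sciences, University of California, Berkeley
2014-2016: Master's degree, Stanford University
2010-2014: Bachelor's degree, Nanjing University

Work Experience:

2022-present: Assistant Professor, The Chinese University of Hong Kong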

Research Areas

Multimodal large models, embodied AI, generative models, etc.

Email

xiangyuyue@slai.edu.cn

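Biography

Xiangyu Yue is an Assistant Professor at MMLab, The Chinese University of Hong Kong. His main research interests include multimodal models, embodied AI, and generative models. He received his Ph.D. in Electrical Engineering and Computer Sciences from the University of California, Berkeley, his master's degree from Stanford University, and his bachelor's degree from Nanjing University. During his Ph.D., he worked at Berkeley AI Research under the supervision of Prof. Alberto Sangiovanni-Vincentelli, a member of the U.S. National Academy of Engineering, and Prof. Kurt Keutzer. He received the Lotfi A. Zadeh Prize, which recognizes outstanding UC Berkeley EECS Ph.D. graduates who have made notable contributions to the theory and applications of soft computing (including neural networks). He has served as an area chair for conferences including CVPR, NeurIPS, ICLR, ICML, and AAAI.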

Publications

[1] K. Feng, K. Gong, B. Li, Z. Guo, Y. Wang, T. Peng, J. Wu, X. Zhang, B. Wang, X. Yue, “Video-R1: Reinforcing Video Reasoning in MLLMs”, Neural Information Processing Systems (NeurIPS), 2025.

[2] J. Han, K. Gong, Y. Zhang, J. Wang, K. Zhang, D. Lin, Y. Qiao, P. Gao, X. Yue, “OneLLM: One Framework to Align All Modalities with Language”, Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[3] Z. Lai, Y. Zhao, Z. Zhao, H. Liu, F. Y. Wang, H. Shi, X. Yang, Q. Lin, J. Huang, Y. Liu, J. Jiang, C. Guo, X. Yue, “Unleashing Vecset Diffusion Model for Fast Shape Generation”, International Conference on Computer Vision (ICCV), 2025. [ICCV Highlight]

[4] Y. Zhang, H. Li, J. Liu, X. Yue, “Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities”, International Conference on Computer Vision (ICCV), 2025.

[5] T. Peng, M. Li, H. Zhou, R. Xia, R. Zhang, L. Bai, S. Mao, B. Wang, C. He, A. Zhou, B. Shi, T. Chen, B. Zhang, X. Yue, “Chimera: Improving Generalist Model with Domain-Specific Experts”, International Conference on Computer Vision (ICCV), 2025.

[6] M. Cai, X. Cun, X. Li, W. Liu, Z. Zhang, Y. Zhang, Y. Shan, X. Yue, “DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation”, Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[7] Y. Zhang, X. Ding, X. Yue, “Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations”, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025.

[8] Y. Huang, X. Dai, J. Wang, X. Qi, Y. Yuan, X. Yue, “Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model”, International Conference on Learning Representations (ICLR), 2025.

[9] C. Tang, X. Ma, E. Su, X. Song, X. Liu, W. Li, L. Bai, W. Ouyang, X. Yue, “UniSTD: Towards Unified Spatio-Temporal Prediction across Diverse Disciplines”, Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

[10] Y. Zhang, H. Li, J. Liu, X. Yue, “Learning Beyond Still Frames: Scaling Vision-Language Models with Video”, International Conference on Computer Vision (ICCV), 2025.