Upcoming | 18th SLAI Seminar | Shenzhen Loop Area Institute

Seminar Title

Understanding Deep Learning Through the Lens of Data Characteristics

About the Speaker

许志钦

Prof. Zhiqin Xu

Zhiqin Xu is a professor at the Institute of Natural Sciences/School of Mathematical Sciences at Shanghai Jiao Tong University. Prof. Xu graduated from the Zhiyuan College of Shanghai Jiao Tong University in 2012 with a bachelor's degree and earned his Ph.D. in Applied Mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2019, Prof. Xu conducted postdoctoral research at New York University Abu Dhabi and the Courant Institute of Mathematical Sciences. His research interests focus on the mathematical foundations and applications of deep learning. In the field of large models, he has discovered mechanisms through which complexity influences the memorization and reasoning of large models. In fundamental deep learning research, in collaboration with others, Prof. Xu has uncovered principles such as the frequency principle, parameter condensation, and energy landscape embedding in deep learning, and has developed multi-scale neural networks. In the field of AI for Science, particularly in combustion chemistry, Prof. Xu has collaborated with others to develop deep learning-based methods for mechanism reduction and surrogate models to accelerate combustion simulations.

Abstract

Understanding the performance of deep learning on practical problems requires considering model characteristics, data characteristics, and the features of the optimization algorithms that connect the two. This talk analyzes data characteristics from perspectives such as function frequency, effective complexity, signal-to-noise ratio, inference complexity, and correlation statistics. It also presents experiments designed to explore the features of models and optimization, with the aim of understanding the generalization ability of deep learning and the reasoning capabilities of language models. In addition, the talk offers practical insights for model training. We find that small initialization encourages models to interpret data through reasoning rather than memorization, which is closely related to the parameter condensation observed in models with small initialization. Furthermore, certain key statistical properties of the data drive the formation of embedding structures and influence the reasoning abilities of models.

Host

Prof. Dong Wang

Date & Time

Monday, December 29, 2025

10:00am-12:00pm

Venue

B411 Lecture Hall, Shenzhen Loop Area Institute

(6 Hongmian Rd, Fubao Sub-Street, Futian District, Shenzhen; navigate to "Shenzhen Loop Area Institute (South Gate)" on the map.)

Online link

Join the Meeting Online

Tencent Meeting ID: 320-722-376