Announcement | SLAI Seminar No. 22

Title

单智能体与多智能体强化学习：基础与应用

Single and Multi-Agent Reinforcement Learning: Fundamentals and Applications

About the Speaker

曹泽宏教授

Prof. Zehong Jimmy Cao

曹泽宏现任澳大利亚阿德莱德大学副教授（北美体系正教授），研究聚焦自然与人工智能体为中心的人工智能，涵盖强化学习、脑机接口及不确定性决策等前沿领域。曹教授曾获多项国际荣誉，包括澳大利亚研究理事会DECRA Fellowship、日本学术振兴会特邀研究员（JSPS Invitational Fellowship）、德国学术交流中心人工智能学者奖（DAAD AInet Fellowship）、百度AI中国青年学者奖、ACM杰出讲者以及爱思唯尔“全球前2%高被引科学家”称号。同时，曹教授担任《IEEE神经网络与学习系统汇刊》及《IEEE模糊系统汇刊》等权威期刊副主编，并在AAMAS、AAAI、IJCAI、KDD、ACM多媒体等国际顶级会议中担任领域主席。

Dr Zehong Jimmy Cao is an Associate Professor (equivalent to Full Professor in the North American system) at Adelaide University, Australia. His research focuses on AI centred on natural and artificial agents, spanning reinforcement learning, brain–computer interfaces, and decision-making under uncertainty. He has received numerous international distinctions, including the Australian Research Council DECRA Fellowship, the JSPS Invitational Fellowship (Japan), the DAAD AInet Fellowship (Germany), the Baidu AI China Young Scholar Award, appointment as an ACM Distinguished Speaker, and a listing among Elsevier’s World’s Top 2% Most Cited Scientists. Dr Cao also serves as an associate editor for leading journals such as IEEE TNNLS and IEEE TFS, and as an area chair for top international conferences including AAMAS, AAAI, IJCAI, KDD, and ACM MM.

Abstract

强化学习（RL）通过智能体与环境的试错交互，使其自主习得最优决策策略。近年来，RL在复杂系统与大语言模型的感知、推理与控制领域取得突破性进展，已成为现代人工智能的核心基石。本讲座将系统介绍RL基础理论，包括状态表示、策略优化与价值函数估计，并深入探讨训练不稳定性、样本效率低、泛化能力有限及实际部署中的安全壁垒等关键挑战。进一步延伸至高维动态环境下的多智能体强化学习（MARL），阐述多智能体如何协同合作以解决大规模现实任务。

Reinforcement learning (RL) enables agents to learn optimal decision-making policies autonomously through trial-and-error interaction with their environment. In recent years, RL has driven major progress in perception, reasoning, and control across complex systems and large language models, establishing itself as a cornerstone of modern AI. This seminar gives a systematic introduction to RL fundamentals, including state representation, policy optimisation, and value function estimation. It also examines key challenges: training instability, low sample efficiency, limited generalisation, and safety barriers in real-world deployment. The seminar then extends to multi-agent RL (MARL) in high-dimensional, dynamic environments, illustrating how multiple agents coordinate and cooperate to solve large-scale real-world tasks.

Host

张治国教授

Prof. Zhiguo Zhang

Date & Time

2026年1月16日（星期五）

上午10:00-12:00

Friday, January 16, 2026

10:00 AM-12:00 PM

Venue

深圳河套学院B411阶梯教室

（深圳市福田区福保街道红棉路6号

地图导航“深圳河套学院-南门”）

B411 Lecture Hall, Shenzhen Loop Area Institute

(6 Hongmian Road, Fubao Subdistrict, Futian District, Shenzhen; search for "Shenzhen Loop Area Institute (South Gate)" in your map app)

Online link

扫码加入会议

Join the Meeting Online

腾讯会议号：535-536-447

Tencent Meeting ID: 535-536-447