[SLAI Seminar] Session 22: Single and Multi-Agent Reinforcement Learning: Fundamentals and Applications (Jan 16, 10:00)

The 22nd session of the SLAI Seminar, "Single and Multi-Agent Reinforcement Learning: Fundamentals and Applications", will be held from 10:00 to 12:00 on Friday, January 16, in Lecture Hall B411. Online participation is welcome (Tencent Meeting ID: 535-536-447).

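Topic: Single and Multi-Agent Reinforcement Learning: Fundamentals and Applications

Time: Friday, January 16, 2026, 10:00-12:00

Venue: Lecture Hall B411, Shenzhen Loop Area Institute (深圳河套学院)

Online participation: Tencent Meeting ID 535-536-447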

About the Speaker:

Dr Zehong Jimmy Cao (曹泽宏) is an Associate Professor (equivalent to Full Professor in the North American system) at Adelaide University, Australia. His research focuses on AI centred on natural and artificial agents, spanning reinforcement learning, brain–computer interfaces, and decision-making under uncertainty. He has received numerous international distinctions, including the Australian Research Council DECRA Fellowship, the JSPS Invitational Fellowship (Japan), the DAAD AInet Fellowship (Germany), the Baidu AI China Young Scholar Award, appointment as an ACM Distinguished Speaker, and inclusion in Elsevier's World's Top 2% Most Cited Scientists list. Dr Cao serves as an associate editor for leading journals such as IEEE Transactions on Neural Networks and Learning Systems (TNNLS) and IEEE Transactions on Fuzzy Systems (TFS), and as an area chair for leading international conferences including AAMAS, AAAI, IJCAI, KDD, and ACM Multimedia (ACM MM).

Abstract:

Reinforcement Learning (RL) enables agents to autonomously acquire optimal decision-making strategies through trial-and-error interaction with their environment. In recent years, RL has driven major progress in perception, reasoning, and control across complex systems and large language models, establishing itself as a cornerstone of modern AI. This seminar presents a systematic introduction to RL fundamentals, including state representation, policy optimisation, and value function estimation. It also examines key challenges: training instability, low sample efficiency, limited generalisation, and safety barriers to real-world deployment. The seminar then extends to multi-agent RL (MARL) in high-dimensional, dynamic environments, illustrating how multiple agents coordinate and cooperate to solve large-scale real-world tasks.