[SLAI Seminar] Session 24: Engineering Faithful and Interpretable AI Systems (Jan 21, 14:30)
The 24th session of the SLAI Seminar will discuss "Engineering Faithful and Interpretable AI Systems" from 2:30 pm to 4:00 pm on Wednesday, January 21st, in the B311 Lecture Hall. Online participation is welcome (Tencent Meeting ID: 216-952-697).
Topic: Engineering Faithful and Interpretable AI Systems
Time: 2:30 pm to 4:00 pm, Wednesday, January 21, 2026
Venue: B311 Lecture Hall, Shenzhen Hetao College
Online: Tencent Meeting ID 216-952-697
About the Speaker:
René Vidal is the Penn Integrates Knowledge and Rachleff University Professor of Electrical and Systems Engineering & Radiology, the Director of the Center for Innovation in Data Engineering and Science (IDEAS), and Co-Chair of Penn AI at the University of Pennsylvania. He is also an Amazon Scholar, an Affiliated Chief Scientist at NORCE, and a former Associate Editor in Chief of TPAMI. His current research focuses on the foundations of deep learning and trustworthy AI and its applications in computer vision and biomedical data science. His lab has made seminal contributions to motion segmentation, action recognition, subspace clustering, matrix factorization, deep learning theory, interpretable AI, and biomedical image analysis. He is an ACM Fellow, AIMBE Fellow, IEEE Fellow, IAPR Fellow, and Sloan Fellow, and has received numerous awards for his work, including the IEEE Edward J. McCluskey Technical Achievement Award, the D'Alembert Faculty Award, the J.K. Aggarwal Prize, the ONR Young Investigator Award, and the NSF CAREER Award, as well as best paper awards in machine learning, computer vision, signal processing, control, and medical robotics.
Abstract:
Large Language Models (LLMs) and Vision Language Models (VLMs) have achieved remarkable performance across a wide range of tasks. However, their growing deployment has exposed fundamental limitations in faithfulness, safety, and transparency. In this talk, Prof. Vidal will present a unified perspective on addressing these challenges through principled model interventions and interpretable decision-making frameworks. He will first introduce Parsimonious Concept Engineering (PaCE), an approach that improves faithfulness and alignment by selectively removing undesirable internal activations, mitigating hallucinations and biased language while preserving linguistic competence. He will then present Information Pursuit (IP), an interpretable-by-design prediction framework that replaces opaque reasoning with a sequence of informative, user-interpretable queries, yielding concise explanations alongside accurate predictions. Results across text, vision, and medical tasks illustrate how these ideas advance transparency without sacrificing performance. Together, these contributions point toward a broader direction for building AI systems that are powerful, faithful, and aligned with human values.
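To make the first idea concrete, below is a minimal sketch of activation-level concept removal in the spirit of PaCE: decompose a hidden activation over a dictionary of concept directions, zero out the coefficients of the undesirable concepts, and reconstruct. The dictionary, the flagged indices, and the plain least-squares decomposition (standing in for the sparse coding the actual method uses) are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 8                      # hidden dimension, number of concept atoms

D = rng.standard_normal((d, k))   # columns = concept directions (hypothetical)
D /= np.linalg.norm(D, axis=0)    # unit-normalize each atom
undesirable = [2, 5]              # atoms flagged as unwanted (assumed labels)

h = rng.standard_normal(d)        # a hidden activation from some model layer

# Decompose the activation over the dictionary (least squares stands in for
# the sparse coding step), then zero out coefficients of unwanted concepts.
c, *_ = np.linalg.lstsq(D, h, rcond=None)
c_clean = c.copy()
c_clean[undesirable] = 0.0

# Reconstruct: keep the part of h outside the dictionary's span (the
# least-squares residual), plus the retained concept components.
residual = h - D @ c
h_clean = residual + D @ c_clean

# Sanity check: re-decomposing the edited activation recovers ~0 weight
# on the removed concepts.
c_check, *_ = np.linalg.lstsq(D, h_clean, rcond=None)
print(np.round(c_check[undesirable], 6))
```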
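Similarly, here is a toy sketch of the query-by-query loop behind an Information-Pursuit-style predictor: at each step, greedily ask the question whose answer is expected to reduce label uncertainty the most, then update the posterior. The query set, the likelihood table, and the entropy-based stopping rule are made-up assumptions for illustration, not the framework's actual implementation.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# P(answer = 1 | class) for each (query, class): 3 binary queries, 4 classes.
L = np.array([[0.9, 0.8, 0.1, 0.2],
              [0.1, 0.9, 0.9, 0.1],
              [0.5, 0.1, 0.5, 0.9]])

posterior = np.full(4, 0.25)                       # uniform prior over classes
true_class, rng = 2, np.random.default_rng(1)
asked = set()

while entropy(posterior) > 0.3 and len(asked) < len(L):
    # Expected information gain of each remaining query under the posterior.
    gains = []
    for q in range(len(L)):
        if q in asked:
            gains.append(-np.inf)
            continue
        p1 = float(L[q] @ posterior)               # P(answer = 1)
        post1 = L[q] * posterior / p1              # posterior if answer = 1
        post0 = (1 - L[q]) * posterior / (1 - p1)  # posterior if answer = 0
        gains.append(entropy(posterior)
                     - (p1 * entropy(post1) + (1 - p1) * entropy(post0)))
    q = int(np.argmax(gains))                      # most informative query
    asked.add(q)
    ans = rng.random() < L[q, true_class]          # simulate the oracle answer
    lik = L[q] if ans else 1 - L[q]
    posterior = lik * posterior / (lik * posterior).sum()
    print(f"query {q} -> answer {int(ans)}, posterior {np.round(posterior, 2)}")
```

The printed trace of queries and answers is itself the explanation: the prediction is justified by a short, human-readable chain of questions rather than an opaque forward pass.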