【SLAI Seminar】24th: Engineering Faithful and Interpretable AI Systems (Jan 21, 14:30)
The 24th session of the SLAI Seminar will discuss "Engineering Faithful and Interpretable AI Systems" from 2:30 pm to 4:00 pm on Wednesday, January 21st, at Lecture Hall B311. Online participation is welcome (Tencent Meeting ID: 216-952-697).
About the Speaker:
René Vidal is the Penn Integrates Knowledge and Rachleff University Professor of Electrical and Systems Engineering & Radiology, the Director of the Center for Innovation in Data Engineering and Science (IDEAS), and Co-Chair of Penn AI at the University of Pennsylvania. He is also an Amazon Scholar, an Affiliated Chief Scientist at NORCE, and a former Associate Editor in Chief of TPAMI. His current research focuses on the foundations of deep learning and trustworthy AI and its applications in computer vision and biomedical data science. His lab has made seminal contributions to motion segmentation, action recognition, subspace clustering, matrix factorization, deep learning theory, interpretable AI, and biomedical image analysis. He is an ACM Fellow, AIMBE Fellow, IEEE Fellow, IAPR Fellow and Sloan Fellow, and has received numerous awards for his work, including the IEEE Edward J. McCluskey Technical Achievement Award, D’Alembert Faculty Award, J.K. Aggarwal Prize, ONR Young Investigator Award, NSF CAREER Award as well as best paper awards in machine learning, computer vision, signal processing, controls, and medical robotics.
Abstract:
Large Language Models (LLMs) and Vision Language Models (VLMs) have achieved remarkable performance across a wide range of tasks. However, their growing deployment has exposed fundamental limitations in faithfulness, safety, and transparency. In this talk, Prof. Vidal will present a unified perspective on addressing these challenges through principled model interventions and interpretable decision-making frameworks. He first introduces Parsimonious Concept Engineering (PaCE), an approach that improves faithfulness and alignment by selectively removing undesirable internal activations, mitigating hallucinations and biased language while preserving linguistic competence. Prof. Vidal then presents Information Pursuit (IP), an interpretable-by-design prediction framework that replaces opaque reasoning with a sequence of informative, user-interpretable queries, yielding concise explanations alongside accurate predictions. Results across text, vision, and medical tasks illustrate how these ideas advance transparency without sacrificing performance. Together, these contributions point toward a broader direction for building AI systems that are powerful, faithful, and aligned with human values.
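The idea of "selectively removing undesirable internal activations" can be made concrete with a much simpler stand-in technique: projecting a hidden-state vector onto the orthogonal complement of a direction associated with an unwanted concept. This is a toy sketch for intuition only, not the actual PaCE method; all names, vectors, and the concept direction below are hypothetical.

```python
import numpy as np

def remove_concept(activation: np.ndarray, concept_dir: np.ndarray) -> np.ndarray:
    """Subtract the component of `activation` along `concept_dir`.

    This is a plain orthogonal projection: the returned vector has zero
    component in the concept direction, leaving the rest unchanged.
    """
    u = concept_dir / np.linalg.norm(concept_dir)  # unit vector for the concept
    return activation - np.dot(activation, u) * u  # project off that direction

# Hypothetical 4-d hidden state with a component along the concept axis.
h = np.array([2.0, 1.0, 0.0, -1.0])
c = np.array([1.0, 0.0, 0.0, 0.0])  # stand-in "undesirable concept" direction

h_clean = remove_concept(h, c)
# h_clean is orthogonal to c, while the other coordinates are untouched.
```

In practice, intervention methods operate on the activations of a trained model and must identify the concept directions rather than assume them, but the projection step above captures the basic editing operation.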