Tom KO Yu Ting

Associate Professor

Deputy Director, Center for Language, Intelligence and Machines

Center for Language, Intelligence and Machines

Education Background

1. Educational Background (in reverse chronological order):

09/2010 to 06/2014 Doctor of Philosophy in Computer Science and Engineering,

The Hong Kong University of Science and Technology

(Advisor: Professor Brian Mak)

01/2008 to 08/2010 Master of Philosophy in Computer Science and Engineering,

The Hong Kong University of Science and Technology

09/2006 to 12/2007 Master of Science in IC Design Engineering,

The Hong Kong University of Science and Technology

09/2000 to 05/2003 Bachelor of Computer Engineering,

The Chinese University of Hong Kong

2. Work Experience (in reverse chronological order):

10/2021 to 12/2024 ByteDance AI Lab, Research Scientist

01/2019 to 09/2021 Southern University of Science and Technology, Assistant Professor (Tenure-track)

06/2014 to 01/2019 Huawei Noah's Ark Lab, Researcher

Research Field

1. Full-duplex speech large models, multimodal large models, deep learning

Class Type

Language, Intelligence and Machines

Personal Website

https://tomkocse.github.io/

tomko@slai.edu.cn

Biography

Tom Ko is a full-time Associate Professor at SLAI and Deputy Director of the LIMA Center. Previously, he was a Research Scientist at ByteDance AI Lab, an Assistant Professor at Southern University of Science and Technology, and a Researcher at Huawei Noah's Ark Lab.

He received his PhD in Computer Science and Engineering from the Hong Kong University of Science and Technology in 2014. He has published over 50 research papers in speech processing and natural language processing, which have received more than 5,100 citations. Two of his pioneering papers on speech augmentation have each received over 1,000 citations.

In recent years, he has focused on applying large language models to speech translation and has contributed to the development of representative systems such as SpeechT5 and CLASI, with related results published at top conferences such as ACL and ICLR. His awards include the Excellent Team Award at ByteDance and the "Future Star" Award at Huawei. He also serves as a program committee member and chair for international conferences such as IWSLT and Interspeech, reflecting his influence in both industry and academia.

Academic Publications

Tom Ko, Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur

"Audio Augmentation for Speech Recognition",

in Proceedings of Interspeech, September, 2015

Tom Ko, Vijayaditya Peddinti, Daniel Povey, Michael L. Seltzer, Sanjeev Khudanpur

"A Study on Data Augmentation of Reverberant Speech for Robust Speech Recognition",

in Proceedings of ICASSP, March, 2017

Yingke Zhu, Tom Ko, David Snyder, Brian Mak, Daniel Povey

"Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification",

in Proceedings of Interspeech, September, 2018

Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

"SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing",

in Proceedings of ACL, May, 2022

Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, …

"WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research",

in IEEE Transactions on Audio, Speech and Language Processing, 2024