Tengda Han (韩腾达) /tən-daː/

Email  /  CV  /  Google Scholar  /  GitHub  /  Twitter


I am a research scientist at Google DeepMind. Previously, I was a post-doctoral research fellow in the Visual Geometry Group (VGG) at the University of Oxford, where I also obtained my PhD under the supervision of Andrew Zisserman. I train neural networks to understand videos.


I obtained a Bachelor of Engineering degree from the Australian National University, majoring in mechanical & material engineering. There, I had the chance to work on deep learning projects with Basura Fernando, Anoop Cherian, Mehrtash Harandi, Stephen Gould and Richard Hartley.

Before that, I studied business administration and law at Renmin University of China, Beijing, for one year.

I am originally from the suburbs of Hangzhou, China.

News
Publications
CountGD: Multi-Modal Open-World Counting
Niki Amini-Naieni, Tengda Han, Andrew Zisserman
NeurIPS, 2024
AutoAD ZERO: A Training-free Framework for Zero-shot Audio Description
Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
ACCV, 2024
It’s Just Another Day: Unique Captioning by Discriminative Prompting
Toby Perrett, Tengda Han, Dima Damen, Andrew Zisserman
[Oral] ACCV, 2024
AutoAD III: The Prequel - Back to the Pixels
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
CVPR, 2024
Semantic Counting from Self-Collages
Lukas Knobel, Tengda Han*, Yuki M. Asano*
CVPR, 2024
AutoAD II: The Sequel - Who, When, and What in Movie Audio Description
Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
ICCV, 2023
Open-world Text-specified Object Counting
Niki Amini-Naieni, Kiana Amini-Naieni, Tengda Han, Andrew Zisserman
[Best Poster Award] BMVC, 2023
AutoAD: Movie Description in Context
Tengda Han*, Max Bain*, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman
[Highlight] CVPR, 2023
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio
Max Bain, Jaesung Huh, Tengda Han, Andrew Zisserman
INTERSPEECH, 2023
Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers
Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M. Asano
Technical report, 2022
Turbo Training with Token Dropout
Tengda Han, Weidi Xie, Andrew Zisserman
BMVC, 2022
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
ECCV, 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millican, Malcolm Reynolds, Roman Ring, Eliza Rutherford, Serkan Cabi, Tengda Han, Zhitao Gong, Sina Samangooei, Marianne Monteiro, Jacob Menick, Sebastian Borgeaud, Andrew Brock, Aida Nematzadeh, Sahand Sharifzadeh, Mikolaj Binkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, Karen Simonyan
NeurIPS, 2022
Temporal Alignment Networks for Long-term Video
Tengda Han, Weidi Xie, Andrew Zisserman
[Oral] CVPR, 2022
Self-supervised Co-Training for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
NeurIPS, 2020
Memory-augmented Dense Predictive Coding for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
[Spotlight] ECCV, 2020
Video Representation Learning by Dense Predictive Coding
Tengda Han, Weidi Xie, Andrew Zisserman
[Oral] Workshop on Large Scale Holistic Video Understanding, ICCV, 2019
Human Action Forecasting by Learning Task Grammars
Tengda Han, Jue Wang, Anoop Cherian, Stephen Gould
Technical report, 2017
Human Pose Forecasting via Deep Markov Models
Sam Toyer, Anoop Cherian, Tengda Han, Stephen Gould
DICTA, 2017
Misc