Staff Researcher at Alibaba and PhD from ML-Labs, Dublin City University
Glasnevin, Dublin
Ireland
Email: lyuchenyang.dcu [at] gmail [dot] com
Google Scholar
Twitter
DBLP
LinkedIn
I am currently a Staff Researcher and Tech Lead at the Alibaba AI Business Group, where I head the speech team. Previously, I was a researcher at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), focusing on multilingual and multimodal large language models. I earned my Ph.D. in Machine Learning from Dublin City University's ML-Labs in 2023, following a Bachelor of Engineering from Northeastern University in China in 2018. My research lies primarily in natural language processing, especially the application of large models, including vision-language models, to multilingual and multimodal tasks. I have published over 30 papers at top-tier conferences such as ACL, EMNLP, NeurIPS, and ACM-MM, and my GPT4Video work was nominated for the Best Paper Award at ACM-MM 2024. My Google Scholar citations exceed 1,600, with an h-index of 19, and the open-source projects I have led or contributed to have collectively garnered over 4k stars on GitHub. I also serve as an area chair, program committee member, shared task organizer, and reviewer for several leading conferences including ICLR, ACL, EMNLP, IJCAI, and ACM-MM. Prior to my current role, I gained extensive research experience in large language models as a research assistant and visiting scholar at Tencent AI Lab, Japan's National Institute of Informatics (NII), and IBM Research-China. My work has been recognized with several awards, including two championships and two runner-up prizes in the IWSLT 2025 speech translation competition, the ACM-MM 2024 Best Paper nomination, the German DAAD AInet Fellowship, the 2023 Irish AI Young Talent of the Year Award, and an SFI PhD Scholarship. Additionally, my research has been featured in media outlets such as the Irish national broadcaster RTÉ, Slator, Irish Tech News, and Irish podcasts.
[2023/06]: We have released Macaw-LLM, our multi-modal large language model!
Graph-Based Video-Language Learning with Multi-Grained Audio-Visual Alignment [PDF][Code][Bibtex] Chenyang Lyu, Wenxi Li, Tianbo Ji, Longyue Wang, Liting Zhou, Cathal Gurrin, Linyi Yang, Yi Yu, Yvette Graham, Jennifer Foster In Proceedings of the 31st ACM International Conference on Multimedia, ACM-MM 2023
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration [PDF][Code][Bibtex] Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, and Zhaopeng Tu. Preprint 2023 (1100+ stars on GitHub, 300,000 views and discussion on Twitter).
Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering [PDF][Code][Bibtex] Chenyang Lyu, Tianbo Ji, Yvette Graham, and Jennifer Foster In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, ACL 2023
Is a Video worth $n\times n$ Images? A Highly Efficient Approach to Transformer-based Video Question Answering [PDF][Code][Bibtex] Chenyang Lyu, Tianbo Ji, Yvette Graham, and Jennifer Foster In Proceedings of The Third Workshop on Simple and Efficient Natural Language Processing, ACL 2023
Exploiting Rich Textual User-Product Context for Improving Personalised Sentiment Analysis [PDF][Code][Bibtex] Chenyang Lyu, Linyi Yang, Yue Zhang, Yvette Graham, Jennifer Foster In Findings of the 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Gated Multi-Modal Fusion with Cross-Modal Contrastive Learning for Video Question Answering [PDF][Code][Bibtex] Chenyang Lyu, Wenxi Li, Tianbo Ji, Liting Zhou, Cathal Gurrin In the 32nd International Conference on Artificial Neural Networks, ICANN 2023
New Trends in Machine Translation with Large Language Models [PDF][Code][Bibtex] Chenyang Lyu, Zefeng Du, Jitao Xu, Derek F. Wong, Yitao Duan, Longyue Wang Symposium on Large Language Models at IJCAI 2023
Document-Level Machine Translation with Large Language Models [PDF] [Slides][Code][Bibtex] Longyue Wang*, Chenyang Lyu*, Tianbo Ji*, Zhirui Zhang*, Dian Yu, Shuming Shi, Zhaopeng Tu Technical Report 2023
Dialogue-to-Video Retrieval [PDF] [Slides][Code][Bibtex] Chenyang Lyu, Manh-Duy Nguyen, Van-Tu Ninh, Liting Zhou, Cathal Gurrin, Jennifer Foster The 45th European Conference on Information Retrieval, ECIR 2023
Extending the Scope of Out-of-Domain: Examining QA models in multiple subdomains [PDF] [Slides][Code][Bibtex] Chenyang Lyu, Jennifer Foster and Yvette Graham The 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022, Workshop on Insights from Negative Results in NLP
Achieving Reliable Human Assessment of Open-Domain Dialogue Systems [PDF][Slides][Code][Bibtex] Tianbo Ji, Yvette Graham, Gareth J. F. Jones, Chenyang Lyu and Qun Liu The 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
DCU-Lorcan at FinCausal 2022: Span-based Causality Extraction from Financial Documents using Pre-trained Language Models [PDF][Bibtex] Chenyang Lyu, Tianbo Ji, Quanwei Sun, Liting Zhou The 13th Language Resources and Evaluation Conference, LREC 2022, Proceedings of the 4th Financial Narrative Processing Workshop
Knowledge and Pre-trained Language Models Inside and Out: a deep-dive into datasets and external knowledge [PDF][slides][Bibtex] Chenyang Lyu PhD Transfer Report
Improving Unsupervised Question Answering via Summarization-Informed Question Generation [PDF][Slides][Bibtex] Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang and Qun Liu The 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Improving Document-Level Sentiment Analysis with User and Product Context [PDF][Slides] [Code][Bibtex] Chenyang Lyu, Jennifer Foster and Yvette Graham The 28th International Conference on Computational Linguistics, COLING 2020
Incorporating Context and Knowledge for Better Sentiment Analysis of Narrative Text [PDF][Bibtex] Chenyang Lyu, Tianbo Ji and Yvette Graham The 42nd European Conference on Information Retrieval, ECIR 2020, The Third International Workshop on Narrative Extraction from Texts