About
Hi, I am a PhD student in the HKU-NLP group, co-supervised by Prof. Lingpeng Kong and Prof. Qi Liu. Previously, I completed my master’s degree at Peking University as a member of LANCO, advised by Prof. Xu Sun, and my bachelor’s degree at Xidian University.
My research interests lie in empowering LLMs with multi-modal abilities and understanding the working mechanisms of LLMs.
I am happy to discuss potential collaboration opportunities. Feel free to reach out!
News
- [2024/09] Two papers (VLFeedback and ICL Survey) were accepted to the EMNLP 2024 Main Conference!
- [2024/07] VITATECS, a benchmark for temporal understanding of VideoLLMs, was accepted to ECCV 2024!
- [2024/06] Check out our Video-MME, the first-ever comprehensive benchmark for Video-LLMs.
- [2024/05] Five papers were accepted to ACL 2024. See you there!
- [2024/02] Check out our Reka Flash and Reka Core, first-tier multimodal LLMs!
- [2023/12] Our paper Label Words are Anchors won the Best Long Paper Award of EMNLP 2023!
Education
- PhD Student, The University of Hong Kong, Sept. 2023 - Now.
- MSc in Computer Science, Peking University, Sept. 2020 - July 2023.
- BE in Software Engineering, Xidian University, Sept. 2016 - July 2020.
Internship
- Reka AI, Multi-modal LLM R&D Intern, July 2023 - Now
- Shanghai AI Lab, Research Intern, July 2022 - June 2023.
  Mentor: Dr. Jingjing Xu
- Toutiao Search, Search Algorithm Intern, Dec. 2021 - June 2022.
- WeChat AI, Research Intern, April 2020 - Nov. 2021.
  Mentors: Dr. Yankai Lin and Dr. Peng Li
Preprints
(#: Equal Contribution)
- M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning
  Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu
  [arxiv, dataset]
Selected Publications
- VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment
  Lei Li #, Zhihui Xie #, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong, Qi Liu
  EMNLP 2024 [project page]
- A Survey for In-context Learning
  Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui
  EMNLP 2024 [arxiv, paper list]
- VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
  Shicheng Li, Lei Li, Shuhuai Ren, Yuanxin Liu, Yi Liu, Rundong Gao, Xu Sun, Lu Hou
  ECCV 2024 [arxiv, dataset]
- Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
  Lei Li #, Yuqi Wang #, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu
  ACL 2024 [project page]
- Can Language Models Understand Physical Concepts?
  Lei Li, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu Sun
  EMNLP 2023 [arxiv, dataset]
- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
  Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
  EMNLP 2023 (Best Long Paper Award) [arxiv, code]
- Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction
  Lei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun
  ECML-PKDD 2022 (Oral) [arxiv, code]
- From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
  Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
  Findings of EMNLP 2022 [url, code]
- Dynamic Knowledge Distillation for Pre-trained Language Models
  Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
  EMNLP 2021 (Oral) [url, code]
- CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
  Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
  Findings of EMNLP 2021 [url, code]
- Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge
  Yi Zhang #, Lei Li #, Yunfang Wu, Qi Su, Xu Sun
  IEEE Transactions on Audio, Speech and Language Processing (TASLP) [arxiv]
- Enhancing Topic-to-essay Generation with External Commonsense Knowledge
  Pengcheng Yang #, Lei Li #, Fuli Luo, Tianyu Liu, Xu Sun
  ACL 2019 [url]
Academic Service
- Area Chair/Action Editor: ACL ARR 2024
- Reviewer/Program Committee: IJCV, ACM CSUR, NeurIPS 2024, COLM 2024, ICML 2024, CVPR 2024, ICLR 2024, NeurIPS 2023 (Top Reviewer Award), ACL (2020 - 2023), EMNLP (2019 - 2023)
- Teaching Assistant:
  - Machine Learning in Trading and Finance (2023 Fall, HKU)
  - Computational Linguistics (2021 Fall, PKU)
Invited Talks
- [2024/10] AI TIME, VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Model Alignment
  Video (coming soon)
- [2023/12] NICE-NLP, Can Large Language Models Understand Physical Concepts?
  Video (in Chinese)
- [2023/03] LCC-Lab, Beijing Language and Culture University, A Survey on In-context Learning
  Video (in Chinese)
Awards
- Best Long Paper Award, EMNLP, 2023
- National Scholarship, Peking University, 2021
- National Scholarship, Xidian University, 2019
- Best Demo Award, DeeCamp, 2018