Hi, I am a PhD student in the HKU-NLP group, co-supervised by Prof. Lingpeng Kong and Prof. Qi Liu. Previously, I completed my master’s degree at Peking University as a member of LANCO, advised by Prof. Xu Sun, and my bachelor’s degree at Xidian University.

My research interests lie in empowering LLMs with multi-modal abilities and understanding the working mechanisms of LLMs.

I am happy to discuss potential collaboration opportunities. Feel free to reach out!

News

  • [2024/09] Two papers (VLFeedback and ICL Survey) were accepted to the EMNLP 2024 Main Conference!
  • [2024/07] VITATECS, a dataset for benchmarking temporal understanding of VideoLLMs, was accepted to ECCV 2024!
  • [2024/06] Check out our Video-MME, the first-ever comprehensive benchmark for Video-LLMs.
  • [2024/05] Five papers were accepted to ACL 2024, see you in Thailand!
  • [2024/02] Check out our Reka Flash and Reka Core, first-tier multimodal LLMs!
  • [2023/12] Our paper Label Words are Anchors won the Best Long Paper Award at EMNLP 2023!

Education

  • PhD Student, The University of Hong Kong, Sept. 2023 - Now.
  • MSc in Computer Science, Peking University, Sept. 2020 - July 2023.
  • BE in Software Engineering, Xidian University, Sept. 2016 - July 2020.

Internship

  • Reka AI, Multi-modal LLM R&D Intern, July 2023 - Now.
  • Shanghai AI Lab, Research Intern, July 2022 - June 2023.

    Mentor: Dr. Jingjing Xu

  • Toutiao Search, Search Algorithm Intern, Dec. 2021 - June 2022.
  • WeChat AI, Research Intern, April 2020 - Nov. 2021.

    Mentors: Dr. Yankai Lin and Dr. Peng Li

Preprints

(#: Equal Contribution)

  • M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning
    Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu
    [arxiv, dataset]

Selected Publications

  • VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment
    Lei Li #, Zhihui Xie #, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong, Qi Liu
    EMNLP 2024 [project page]

  • A Survey for In-context Learning
    Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui
    EMNLP 2024 [arxiv, paper list]

  • VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
    Shicheng Li, Lei Li, Shuhuai Ren, Yuanxin Liu, Yi Liu, Rundong Gao, Xu Sun, Lu Hou
    ECCV 2024 [arxiv, dataset]

  • Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
    Lei Li #, Yuqi Wang #, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu
    ACL 2024 [project page]

  • Can Language Models Understand Physical Concepts?
    Lei Li, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu Sun
    EMNLP 2023 [arxiv, dataset]

  • Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
    Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
    EMNLP 2023 (Best Long Paper Award) [arxiv, code]

  • Distributional Correlation-Aware Knowledge Distillation for Stock Trading Volume Prediction
    Lei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun
    ECML-PKDD 2022 (Oral) [arxiv, code]

  • From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
    Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
    Findings of EMNLP 2022 [url, code]

  • Dynamic Knowledge Distillation for Pre-trained Language Models
    Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
    EMNLP 2021 (Oral) [url, code]

  • CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
    Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
    Findings of EMNLP 2021 [url, code]

  • Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge
    Yi Zhang #, Lei Li #, Yunfang Wu, Qi Su, Xu Sun
    IEEE Transactions on Audio, Speech and Language Processing (TASLP) [arxiv]

  • Enhancing Topic-to-essay Generation with External Commonsense Knowledge
    Pengcheng Yang #, Lei Li #, Fuli Luo, Tianyu Liu, Xu Sun
    ACL 2019 [url]

Academic Service

  • Area Chair/Action Editor: ACL ARR 2024
  • Reviewer/Program Committee: IJCV, ACM CSUR, NeurIPS 2024, COLM 2024, ICML 2024, CVPR 2024, ICLR 2024, NeurIPS 2023 (Top Reviewer Award), ACL (2020 - 2023), EMNLP (2019 - 2023)
  • Teaching Assistant:
    • Machine Learning in Trading and Finance (2023 Fall, HKU)
    • Computational Linguistics (2021 Fall, PKU)

Invited Talks

Awards

  • Best Long Paper Award, EMNLP, 2023
  • National Scholarship, Peking University, 2021
  • National Scholarship, Xidian University, 2019
  • Best Demo Award, DeeCamp, 2018