Hi, I am a PhD student in the HKU-NLP group, co-supervised by Prof. Qi Liu and Prof. Lingpeng Kong.

My research interests lie in empowering LLMs with multi-modal abilities and exploring their emerging capabilities. Previously, I completed my master’s degree at Peking University as a member of LANCO, advised by Prof. Xu Sun, and my bachelor’s degree at Xidian University.

News

  • [2024/02] Check out Reka Flash, an efficient and capable multimodal language model, at Reka Playground!
  • [2023/12] Our paper Label Words are Anchors won the Best Long Paper Award at EMNLP 2023!
  • [2023/10] Four papers were accepted to EMNLP 2023!
  • [2023/09] One paper was accepted to NeurIPS D&B 2023.
  • [2023/06] Our M3IT dataset and Ying-VLM model for multi-modal instruction tuning are released on HuggingFace!

Education

  • PhD Student, The University of Hong Kong, Sept. 2023 - Now.
  • MSc in Computer Science, Peking University, Sept. 2020 - July 2023.
  • BE in Software Engineering, Xidian University, Sept. 2016 - July 2020.

Internship

  • Reka AI, Multi-modal LLM R&D Intern, July 2023 - Now.
  • Shanghai AI Lab, Research Intern, July 2022 - June 2023.

    Mentor: Dr. Jingjing Xu

  • Toutiao Search, Search Algorithm Intern, Dec. 2021 - June 2022.
  • WeChat AI, Research Intern, April 2020 - Nov. 2021.

    Mentors: Dr. Yankai Lin and Dr. Peng Li

Preprints

(#: Equal Contribution)

  • Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
    Lei Li #, Yuqi Wang #, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu
    [project page]

  • Silkie: Preference Distillation for Large Visual Language Models
    Lei Li #, Zhihui Xie #, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong
    [project page]

  • M3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning
    Lei Li, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu Sun, Lingpeng Kong, Qi Liu
    [arxiv, dataset]

  • A Survey for In-context Learning
    Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui
    [arxiv, paper list]

  • VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
    Shicheng Li, Lei Li, Shuhuai Ren, Yuanxin Liu, Yi Liu, Rundong Gao, Xu Sun, Lu Hou
    [arxiv, dataset]

Selected Publications

  • Can Language Models Understand Physical Concepts?
    Lei Li, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu Sun
    EMNLP 2023 [arxiv, dataset]

  • Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
    Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
    EMNLP 2023 (Best Long Paper Award) [arxiv, code]

  • Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction
    Lei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun
    ECML-PKDD 2022 (Oral) [arxiv, code]

  • From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
    Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
    Findings of EMNLP 2022 [url, code]

  • Dynamic Knowledge Distillation for Pre-trained Language Models
    Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
    EMNLP 2021 (Oral) [url, code]

  • CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
    Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
    Findings of EMNLP 2021 [url, code]

  • Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge
    Yi Zhang #, Lei Li #, Yunfang Wu, Qi Su, Xu Sun
    IEEE Transactions on Audio, Speech and Language Processing (TASLP) [arxiv]

  • Enhancing Topic-to-essay Generation with External Commonsense Knowledge
    Pengcheng Yang #, Lei Li #, Fuli Luo, Tianyu Liu, Xu Sun
    ACL 2019 [url]

Academic Service

  • Area Chair/Action Editor: ACL ARR 2024
  • Reviewer/Program Committee: COLM 2024, ICML 2024, CVPR 2024, ICLR 2024, NeurIPS 2023 (Top Reviewer Award), ACL (2020 - 2023), EMNLP (2019 - 2023)
  • Teaching Assistant:
    • Machine Learning in Trading and Finance (2023 Fall, HKU)
    • Computational Linguistics (2021 Fall, PKU)

Awards

  • Best Long Paper Award, EMNLP, 2023
  • National Scholarship, Peking University, 2021
  • National Scholarship, Xidian University, 2019
  • Best Demo Award, DeeCamp, 2018