About Me

I am a fourth-year direct-admission Ph.D. student at the College of Computer Science and Technology, Zhejiang University, where I am fortunate to be advised by Prof. Chao Wu. My research focuses on the generalizability of multimodal large language models and vision-language models, with an earlier concentration on unsupervised domain adaptation and domain generalization. Before joining Zhejiang University, I earned my undergraduate degree at Beijing University of Chemical Technology.

I am on the job market and expect to graduate in the summer of 2025. I am open to both academic and industrial positions, so please feel free to contact me if you have a suitable opening!

Contact

Email: didi_zhu at zju dot edu dot cn / one dot invisible dot numb at gmail dot com

Research Interests

  • Robustness of Multi-modal Large Language Models
  • Generalization of Vision-Language Models
  • Unsupervised Domain Adaptation

Recent News

  • 05 / 2024: One first-author paper has been accepted to KDD 2024.
  • 05 / 2024: One first-author paper has been accepted to ICML 2024.
  • 08 / 2023: One paper on prompt learning is now available online.
  • 07 / 2023: One first-author paper has been accepted to ICCV 2023.
  • 07 / 2023: One first-author paper has been accepted to ACM Multimedia 2023.
  • 05 / 2023: One paper has been accepted to KDD 2023.
  • 11 / 2022: One paper has been accepted to IEEE Transactions on Big Data.
  • 10 / 2022: I passed the mid-term assessment of my doctoral program.
  • 10 / 2022: I received the Outstanding Postgraduate Student Award (2021-2022) from Zhejiang University.
  • 05 / 2022: One paper has been accepted to IJCAI 2021 FL Workshop.

Publications

Refereed Publications

  • Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models. [paper] Didi Zhu, Zhongyisun Sun, Zexi Li, Tao Shen, Ke Yan, Shouhong Ding, Chao Wu, Kun Kuang. ICML 2024.
  • Neural Collapse Anchored Prompt Tuning for Generalizable Vision-Language Models. [paper] Didi Zhu, Zexi Li, Min Zhang, Junkun Yuan, Jiashuo Liu, Kun Kuang, Chao Wu. KDD 2024.
  • Universal Domain Adaptation via Compressive Attention Matching. [paper] Didi Zhu, Yinchuan Li, Junkun Yuan, Zexi Li, Kun Kuang, Chao Wu. ICCV 2023.
  • Generalized Universal Domain Adaptation with Generative Flow Networks. [paper] Didi Zhu, Yinchuan Li, Yunfeng Shao, Jianye Hao, Fei Wu, Kun Kuang, Jun Xiao, Chao Wu. ACM Multimedia (MM) 2023.
  • Ensemble Federated Adversarial Training with Non-IID Data. [paper] Shuang Luo, Didi Zhu, Zexi Li, Chao Wu. International Joint Conference on Artificial Intelligence (IJCAI) FL Workshop 2021.
  • Quantitatively Measuring and Contrastively Exploring Heterogeneity for Domain Generalization. [paper] Yunze Tong, Junkun Yuan, Min Zhang, Didi Zhu, Keli Zhang, Fei Wu, Kun Kuang. KDD 2023.
  • Towards Effective Clustered Federated Learning: A Peer-to-Peer Framework with Adaptive Neighbor Matching. [paper] Zexi Li, Jiaxun Lu, Shuang Luo, Didi Zhu, Yunfeng Shao, Yinchuan Li, Zhimeng Zhang, Yongheng Wang, Chao Wu. IEEE Transactions on Big Data.

Projects

  • Catastrophic Forgetting in Multi-Modal Large Language Models

    • Sep 2023-Feb 2024 in Tencent Youtu Lab
    • Conducted the first comprehensive study revealing catastrophic forgetting in MLLMs such as InstructBLIP and LLaVA.
    • Mitigated the issue with a training-free model grafting technique; this research has been accepted at ICML 2024.
  • Prompt Tuning in Large Vision-Language Models Based on Neural Collapse

    • Apr 2023-Aug 2023 in DCD Lab, Zhejiang University
    • Conducted the first exploration of large vision-language models through the lens of neural collapse from deep learning theory.
    • Tackled class imbalance in generalization tasks for large vision-language models by leveraging neural collapse theory; this research has been accepted at KDD 2024.
  • Out-of-Distribution Detection based on Self-Attention Mechanism

    • Dec 2022-Mar 2023 in Huawei Noah’s Ark Lab
    • Addressed inconsistent source-target label spaces in Universal Domain Adaptation by directly leveraging the self-attention mechanism in ViT.
    • This research has been accepted at ICCV 2023.
  • Unifying Domain Adaptation Variants with Label Heterogeneity based on GFlowNet

    • May 2022-Nov 2022 in Huawei Noah’s Ark Lab
    • Introduced Generalized Universal Domain Adaptation (GUDA), a comprehensive problem setting that unifies all domain adaptation sub-problems involving label heterogeneity.
    • Implemented an exploration-aware active learning strategy based on Generative Flow Networks to effectively address GUDA.
    • This research has been accepted at ACM Multimedia 2023.