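The lab explores frontier problems in vision-language understanding. Its culture is open, egalitarian, simple, and efficient, and it provides ample research support and resources. Our recent focus is on efficient representation and safety of vision-language models. Students who are interested in research and have ideas of their own, who are willing to explore and experiment, or who have strong coding skills are welcome to contact me.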
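There are no openings left for 2025, and I expect to have one opening for 2026. If you are genuinely interested in my lab, please get in touch as early as possible.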
I am broadly interested in how humans and machines understand vision and language. My current work focuses on efficient representation and safety. I am always looking for motivated students who are curious about our research problems, have experience reproducing research papers, and can commit to projects for 6+ months.
Attack as Defense: Safeguarding Large Vision-Language Models from Jailbreaking by Adversarial Attacks.
Chongxin Li, Hanzhang Wang (Corr.), Yuchun Fang
EMNLP Findings. 2025.
Textural or Textual: How Vision-Language Models Read Text in Images.
Hanzhang Wang, Qingyuan Ma
ICML. 2025.
Exploring Intrinsic Dimension for Vision-Language Model Pruning.
Hanzhang Wang, Jiawen Zhang, and Qingyuan Ma
ICML. 2024.
Evolutionary Recurrent Neural Network for Image Captioning.
Hanzhang Wang, Hanli Wang, and Kaisheng Xu
Neurocomputing. 2020.
Swell-and-Shrink: Decomposing Image Captioning by Transformation and Summarization.
Hanzhang Wang, Hanli Wang, and Kaisheng Xu
IJCAI. 2019.
Categorizing Concepts with Basic Level for Vision-to-Language.
Hanzhang Wang, Hanli Wang, and Kaisheng Xu
CVPR. 2018.
2024, Google China Academic Cooperation Project
2023.1 - 2025.12, National Natural Science Foundation of China