Chenhe Gu

I'm a graduate student in computer science at the University of California, Irvine, working on adversarial robustness and safety evaluation for multimodal large language models (MLLMs). My research focuses on understanding the security vulnerabilities of vision-language systems and improving their robustness, particularly against adversarial attacks that manipulate visual and textual inputs. My work contributed to the Dynamic Vision-Language Alignment (DynVLA) Attack, which improves the transferability of adversarial examples across multimodal models by targeting the vision-language connector. This research examines how adversarial perturbations generalize across diverse model architectures, including open-source models such as BLIP2, InstructBLIP, and LLaVA and closed-source systems such as Gemini. I have also taken part in large-scale collaborative work on industry standards for AI safety evaluation, contributing to AILuminate v1.0, an MLCommons benchmark for assessing AI system risk and reliability. Standardized evaluation frameworks of this kind support safer deployment of AI technologies across applications and domains.

Publications

Improving Adversarial Transferability in MLLMs via Dynamic Vision-Language Alignment Attack

Chenhe Gu, Jindong Gu, Andong Hua, Yao Qin

arXiv.org 2025

AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons

Shaona Ghosh, Heather Frase, Adina Williams, Sarah Luger, Paul Röttger, Fazl Barez, Sean McGregor, Kenneth Fricklas, Mala Kumar, Quentin Feuillade-Montixi, Kurt Bollacker, Felix Friedrich, Ryan Tsang, Bertie Vidgen, Alicia Parrish, Chris Knotz, Eleonora Presani, Jonathan Bennion, Marisa Ferrara Boston, Mike Kuniavsky, Wiebke Hutiri, James Ezick, Malek Ben Salem, Rajat Sahay, Sujata Goswami, Usman Gohar, Ben Huang, Supheakmungkol Sarin, Elie Alhajjar, Canyu Chen, Roman Eng, Kashyap Ramanandula Manjusha, Virendra Mehta, Eileen Long, M. Emani, Natan Vidra, Benjamin Rukundo, Abolfazl Shahbazi, Kongtao Chen, Rajat Ghosh, Vithursan Thangarasa, Pierre Peigné, Abhinav Singh, Max Bartolo, Satyapriya Krishna, Mubashara Akhtar, Rafael Gold, Cody Coleman, Luis Oala, Vassil Tashev, Joseph Marvin Imperial, Amy Russ, Sasidhar Kunapuli, Nicolas Miailhe, Julien Delaunay, Bhaktipriya Radharapu, Rajat Shinde, Tuesday, Debojyoti Dutta, Declan Grabb, Ananya Gangavarapu, Saurav Sahay, Agasthya Gangavarapu, P. Schramowski, Stephen Singam, Tom David, Xudong Han, P. Mammen, Tarunima Prabhakar, Venelin Kovatchev, Ahmed Ahmed, Kelvin N. Manyeki, Sandeep Madireddy, Foutse Khomh, Fedor Zhdanov, Joachim Baumann, N. Vasan, Xianjun Yang, Carlos Mougn, J. Varghese, Hussain Chinoy, Seshakrishna Jitendar, M. Maskey, C. Hardgrove, Tianhao Li, Aakash Gupta, Emil Joswin, Yifan Mai, Shachi H. Kumar, Çigdem Patlak, Kevin Lu, Vincent Alessi, Sree Bhargavi Balija, Chenhe Gu, Robert Sullivan, J. Gealy, Matt Lavrisa, James Goel, Peter Mattson, Percy Liang, Joaquin Vanschoren

arXiv.org 2025