This repository is the official implementation of Multi-Paradigm Collaborative Adversarial Attack Against Multimodal Large Language Models. [Paper]
Overview of the proposed MPCAttack: (a) pipeline for generating adversarial examples; (b) pipeline for attacking MLLMs.
To install requirements:
conda create -n MPCAttack python=3.10
conda activate MPCAttack
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U transformers
pip install hydra-core pytorch-lightning opencv-python scipy nltk timm==1.0.1 pandas
pip install git+https://github.com/openai/CLIP.git
Install from requirements file:
pip install -r requirements.txt
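After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of the repository: the `check_environment` helper and the `EXPECTED` table are hypothetical names, and the pins are copied from the install commands above. It uses only the standard library, so it runs even if a package is missing.

```python
from importlib import metadata

# Pins copied from the install commands above; None means "any version".
# This table is an illustrative assumption, not a file shipped with the repo.
EXPECTED = {
    "torch": "2.6.0",
    "torchvision": "0.21.0",
    "torchaudio": "2.6.0",
    "timm": "1.0.1",
    "transformers": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu118" are ignored when comparing.
        base = installed.split("+")[0] if installed else None
        ok = installed is not None and (pinned is None or base == pinned)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "CHECK"
        print(f"{name:14s} {installed or 'missing':16s} {status}")
```

Any row marked `CHECK` points at a package that is missing or differs from the version pinned above.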
Prepare Data
Download the datasets from this link.
Generate Adversarial Examples
python generate_adversarial_examples_MPCAttack.py --output ./MPCAttack
Evaluation
The evaluation is separated into two parts:
- Generate descriptions for clean and adversarial images with the target black-box model
- Evaluate the Attack Success Rate (ASR) and Similarity score
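To make the two reported metrics concrete, here is a minimal sketch of how they could be aggregated from per-image scores. It assumes each adversarial example already has a similarity score in [0, 1] and that an attack counts as successful when the score reaches a threshold. The function names and the `threshold` default are hypothetical; the actual criterion is whatever `gpt_evaluate.py` implements.

```python
def attack_success_rate(similarities, threshold=0.5):
    """Fraction of samples whose similarity score reaches the threshold.

    The threshold value is an illustrative assumption, not the repository's.
    """
    if not similarities:
        raise ValueError("no similarity scores given")
    hits = sum(1 for score in similarities if score >= threshold)
    return hits / len(similarities)


def mean_similarity(similarities):
    """Average similarity score over all samples."""
    if not similarities:
        raise ValueError("no similarity scores given")
    return sum(similarities) / len(similarities)


if __name__ == "__main__":
    scores = [0.2, 0.6, 0.9, 0.4]  # made-up example scores
    print("ASR:", attack_success_rate(scores))
    print("Similarity:", mean_similarity(scores))
```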
For the first part, run:
python blackbox_text_generation.py --output ./MPCAttack --model_name Qwen2.5-VL-7B-Instruct
Note1: The first run of this step generates the text descriptions for the source images, the target images, and the adversarial examples together. On later runs, if the description files for the source and target images already exist, those are skipped to avoid duplicate generation.
Note2: All open-source MLLMs are evaluated using the VLMEvalKit toolkit. To use the latest open-source models, update the vlmeval folder against the upstream VLMEvalKit.
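The skip behaviour described in Note1 follows a common generate-once pattern, sketched below. This is an illustration under assumptions, not the repository's code: the `load_or_generate` helper, the JSON storage format, and the file layout are all hypothetical.

```python
import json
from pathlib import Path


def load_or_generate(path, generate_fn):
    """Return (descriptions, generated) for the given description file.

    If the file already exists it is loaded and generate_fn is not called;
    otherwise generate_fn() is run once and its result is saved as JSON.
    """
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8")), False
    descriptions = generate_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(descriptions, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return descriptions, True
```

With this pattern, only the adversarial-example descriptions need to be regenerated when a new attack output is evaluated against the same source and target images.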
Note3: When the target model is a closed-source model, the corresponding API key must be configured. Create api_keys.yaml in the repository root following this template:
# API Keys for different models
# DO NOT commit this file to git!
gpt4v: "your_api_key"
claude: "your_api_key"
claude4_5: "your_api_key"
gemini: "your_api_key"
gpt4o: "your_api_key"
gpt5: "your_api_key"
gpt-4o-mini: "your_api_key"
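Before launching a run against a closed-source model, it can be useful to check that the relevant entry in api_keys.yaml has been filled in. The sketch below is a hypothetical helper, not the repository's loader. It assumes the file holds one `name: "key"` entry per line, as in the template, and parses only that flat layout with the standard library; a full YAML parser such as PyYAML's `yaml.safe_load` would be the more robust choice.

```python
PLACEHOLDER = "your_api_key"  # placeholder value used in the template


def parse_api_keys(text):
    """Parse flat `name: "key"` lines, ignoring comments and blank lines."""
    keys = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        keys[name.strip()] = value.strip().strip("\"'")
    return keys


def get_api_key(keys, model):
    """Return the key for `model`, or raise if it is missing or a placeholder."""
    key = keys.get(model, "")
    if not key or key == PLACEHOLDER:
        raise KeyError(f"no API key configured for {model!r}")
    return key
```

For example, `get_api_key(parse_api_keys(open("api_keys.yaml").read()), "gpt-4o-mini")` fails early if the evaluation model's key was left at its placeholder value.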
For the second part, run:
python gpt_evaluate.py --output ./MPCAttack --model_name Qwen2.5-VL-7B-Instruct
Note: The evaluation model is gpt-4o-mini, so an API key for it must also be configured (see the gpt-4o-mini entry in the template above).
- Visualization of adversarial images and perturbations.
- Visualization of adversarial images in attacking commercial MLLMs.
We sincerely thank M-Attack and FoA-Attack for their outstanding work.
@article{li2026multi,
title={Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models},
author={Li, Yuanbo and Xu, Tianyang and Hu, Cong and Zhou, Tao and Wu, Xiao-Jun and Kittler, Josef},
journal={arXiv preprint arXiv:2603.04846},
year={2026}
}




