AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant T2I Adversarial Patches (NeurIPS 2025)
Cutting-edge works have demonstrated that text-to-image (T2I) diffusion models can generate adversarial patches that mislead state-of-the-art object detectors in the physical world, revealing detectors' vulnerabilities and risks. However, these methods neglect the patches' attack effectiveness when observed from different viewpoints in the physical world (i.e., the angle robustness of T2I adversarial patches). In this paper, we study the angle robustness of T2I adversarial patches comprehensively: we reveal their angle-robustness issues, show that textual prompts significantly affect the angle robustness of generated patches, and find that task-specific linguistic instructions fail to enhance it. Motivated by these studies, we introduce Angle-Robust Concept Learning (AngleRoCL), a simple and flexible approach that learns a generalizable concept (implemented as learned text embeddings) representing the capability of generating angle-robust patches. The learned concept can be incorporated into textual prompts, guiding T2I models to generate patches whose attack effectiveness is inherently resistant to viewpoint variations. Through extensive simulation and physical-world experiments on five SOTA detectors across multiple views, we demonstrate that AngleRoCL significantly enhances the angle robustness of T2I adversarial patches compared to baseline methods. Our patches maintain high attack success rates even under challenging viewing conditions, with over 50% average relative improvement in attack effectiveness across multiple angles. This research advances the understanding of physically angle-robust patches and provides insights into the relationship between textual concepts and physical properties of T2I-generated content.
- 24.10.2025 | Accepted by NeurIPS 2025!
- 14.06.2025 | Paper available on arXiv: Link
The code requires Python 3.10.16 or later. The file requirements.txt contains the full list of required Python modules.
# Create conda environment
conda create -n anglerocl python=3.10.16
conda activate anglerocl
# Install PyTorch (tested on RTX 3090/4090)
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
# Install dependencies
pip install safetensors==0.5.3
pip install transformers==4.51.3
pip install accelerate==1.4.0
pip install diffusers==0.32.2
pip install opencv-python==4.8.1.78
pip install pandas==2.2.3
pip install kornia==0.6.8
pip install numpy==1.23.1
pip install scikit-learn==1.3.1
# Install MMDetection (for multi-detector evaluation)
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install mmdet
- Stable Diffusion v1.5 - Download from Hugging Face
- Our Pretrained Resources - Google Drive
- angle-robust.safetensors - Trained angle-robust concept embedding
- Generated patches - Pre-generated adversarial patches
- Detector checkpoints - YOLOv3/v5/v10, DETR, Faster R-CNN, RT-DETR weights
The code was tested on NVIDIA RTX 3090/4090 GPUs with 24GB VRAM.
You can train the angle-robust concept using the command below:
accelerate launch anglerocl.py \
--pretrained_model_name_or_path="stable-diffusion-v1-5/stable-diffusion-v1-5" \
--placeholder_token="<angle-robust>" \
--initializer_token="robust" \
--output_dir="runs/anglerocl/${timestamp}" \
--yolo_weights_file="yolov5s.pt" \
--max_train_steps=50000 \
--save_steps=1950 \
--validation_steps=1950

- `pretrained_model_name_or_path`: path to the pre-trained Stable Diffusion model
- `placeholder_token`: the concept token to learn (default: `<angle-robust>`)
- `initializer_token`: token used to initialize the concept (default: `robust`)
- `output_dir`: where the learned embeddings will be saved (`${timestamp}` is replaced automatically)
- `yolo_weights_file`: path to the YOLOv5 detector checkpoint
- `max_train_steps`: total training steps (default: 195000; 50000 recommended for faster training)
- `save_steps`: save embeddings every N steps (default: 1950)
- `validation_steps`: run validation every N steps (default: 1950)
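The `placeholder_token`/`initializer_token` pair follows the standard textual inversion recipe: one new embedding row is added for `<angle-robust>`, initialized from the embedding of `robust`, and only that row is optimized while every other token embedding stays frozen. The sketch below illustrates this mechanism in plain PyTorch; it is not the actual `anglerocl.py` implementation, and the vocabulary size, token ids, and dummy loss are illustrative stand-ins.

```python
import torch
import torch.nn as nn

# Illustrative sizes and ids; real CLIP text encoders use a larger vocabulary.
vocab_size, dim = 1000, 768
embeddings = nn.Embedding(vocab_size + 1, dim)  # one extra row for "<angle-robust>"

robust_id = 42               # stand-in id for the initializer token "robust"
placeholder_id = vocab_size  # id of the newly added "<angle-robust>" token

# Initialize the concept row from the initializer token's embedding.
with torch.no_grad():
    embeddings.weight[placeholder_id] = embeddings.weight[robust_id]

frozen_before = embeddings.weight.detach().clone()

# One illustrative optimization step with a dummy loss; in real training the
# loss comes from the diffusion objective guiding the angle-robust concept.
optimizer = torch.optim.SGD(embeddings.parameters(), lr=0.1)
loss = embeddings(torch.tensor([placeholder_id])).pow(2).mean()
loss.backward()

# Mask gradients so only the concept row is updated; all other token
# embeddings stay frozen, as in standard textual inversion.
mask = torch.zeros_like(embeddings.weight)
mask[placeholder_id] = 1.0
embeddings.weight.grad *= mask
optimizer.step()
```

After training, the learned row is what gets saved (e.g., `angle-robust.safetensors`) and later loaded into the pipeline via `pipe.load_textual_inversion`.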
After training the angle-robust concept, you can generate adversarial patches using the learned embedding.
You can generate datasets using the scripts in the dataset/ folder:
# Generate NDDA baseline dataset
python dataset/NDDA.py \
--output-dir=<output_path> \
--num-images=50
# Generate NDDA + AngleRoCL dataset
python dataset/NDDA_textual_inversion.py \
--output-dir=<output_path> \
--num-images=50
# Generate with prompt tuning
python dataset/NDDA_tuneprompt.py \
--output-dir=<output_path> \
--num-images=50 \
--group="all"

You can also generate a single image using generate.py:
from diffusers import StableDiffusionPipeline
import torch
model_id = "stable-diffusion-v1-5/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
# Load the trained angle-robust concept
repo_id_embeds = "<path_to_learned_embeds>"
pipe.load_textual_inversion(repo_id_embeds)
# Generate image with angle-robust concept
prompt = "A <angle-robust> blue stop sign with 'abcd' on it"
image = pipe(prompt, num_inference_steps=25, guidance_scale=7.5).images[0]
image.save("angle_robust_stop_sign.png")

You can evaluate the generated patches using the testing scripts in the test/ folder.
There are four testing scripts organized in two groups:
# Process all textures in a single folder
python test/multiview_detection_folder_multidetector.py \
--texture-folder=<texture_folder> \
--detector=yolov5 \
--yolov5-model=<yolov5_checkpoint> \
--target-class-id=11 \
--output-dir=<output_dir>
# Batch process multiple folders (e.g., different prompt categories)
python test/multiview_detection_folder_batch_multidetector.py \
--base-dir=<base_dir> \
--detector=yolov5 \
--yolov5-model=<yolov5_checkpoint> \
--target-class-id=11 \
--output-base-dir=<output_base_dir>
# Process all textures in a single folder with environment backgrounds
python test/multiview_detection_folder_multidetector_environment.py \
--texture-folder=<texture_folder> \
--environment-folder=<environment_folder> \
--detector=yolov5 \
--yolov5-model=<yolov5_checkpoint> \
--target-class-id=11 \
--output-dir=<output_dir>
# Batch process multiple folders with environment backgrounds
python test/multiview_detection_folder_batch_multidetector_environment.py \
--base-dir=<base_dir> \
--environment-folder=<environment_folder> \
--detector=yolov5 \
--yolov5-model=<yolov5_checkpoint> \
--output-base-dir=<output_base_dir>

- `_folder` scripts: process all patches within one folder
- `_folder_batch` scripts: process all patches across multiple subfolders (batch mode with category grouping)
- Without `_environment`: test on pure color backgrounds
- With `_environment`: test on real-world environment backgrounds
Example Directory Structure:
# For _folder scripts:
patches/
├── image_001.png
├── image_002.png
└── image_003.png

# For _folder_batch scripts:
all_patches/
├── blue_square_stop_sign/
│   ├── image_001.png
│   └── image_002.png
├── stop_sign_with_hello/
│   ├── image_001.png
│   └── image_002.png
└── yellow_triangle_stop_sign/
    ├── image_001.png
    └── image_002.png
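Given a batch layout like the one above, the single-folder script can also be driven one subfolder at a time. The following dry-run sketch only prints the commands it would execute; the folder names match the example structure, while the checkpoint path and output directory are placeholders you should replace.

```shell
# Dry run: print one evaluation command per patch category.
# yolov5s.pt and results/ are illustrative values, not repo defaults.
for folder in all_patches/blue_square_stop_sign all_patches/stop_sign_with_hello all_patches/yellow_triangle_stop_sign; do
  echo python test/multiview_detection_folder_multidetector.py \
    --texture-folder="$folder" \
    --detector=yolov5 \
    --yolov5-model=yolov5s.pt \
    --target-class-id=11 \
    --output-dir="results/$(basename "$folder")"
done
```

Dropping the `echo` runs the commands for real.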
Output files:
- `angle_confidence.csv` - detection confidence at each angle
- `confidence_curve.png` - visualization of the angle-confidence curve
- `aasr_analysis/` - detailed AASR metrics and analysis
- `summary_results.csv` - (for `_folder_batch` scripts) aggregated results and category-wise statistics
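For quick post-processing, the per-angle CSV can be summarized with pandas. Note that the column names (`angle`, `confidence`), the success threshold, and the inline sample data below are assumptions for illustration; check the schema of the actual `angle_confidence.csv` produced by the scripts.

```python
import io
import pandas as pd

# Stand-in for angle_confidence.csv; column names are assumed, not verified.
csv_text = """angle,confidence
-60,0.40
-30,0.08
0,0.05
30,0.09
60,0.15
"""
df = pd.read_csv(io.StringIO(csv_text))

# Count an angle as a successful attack when detection confidence drops
# below a chosen threshold, then average across angles (an AASR-style metric).
threshold = 0.25
success = df["confidence"] < threshold
aasr = success.mean()
print(f"AASR: {aasr:.2f}")  # 4 of 5 angles below threshold -> 0.80
```

Swapping `io.StringIO(csv_text)` for a real file path applies the same summary to actual results.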
We thank the authors of the following outstanding open-source repositories for their valuable code and contributions:
- Hugging Face Diffusers β the foundation of our text-to-image pipeline
- P2P: Prompt-to-Perturb β pioneering work on text-guided adversarial attacks that greatly inspired this project (CVPR 2025)
- yolov5_adversarial β excellent reference implementation for physical-world adversarial patch attacks on YOLO detectors
@article{ji2025anglerocl,
title={AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant T2I Adversarial Patches},
author={Ji, Wenjun and Fu, Yuxiang and Ying, Luyang and Fan, Deng-Ping and Wang, Yuyi and Cheng, Ming-Ming and Tsang, Ivor and Guo, Qing},
journal={arXiv preprint arXiv:2506.09538},
year={2025}
}
