This is the supplementary material and code base used to run the experiments in the paper titled "Finding the Weakest Link: Adversarial Attack against Multi-Agent Communications", which was accepted as an Extended Abstract to AAMAS 2026.
Due to size constraints, we are unable to share our data. Instead, we provide instructions below on how to replicate our results.
We used Python v3.9.21.

Install the required libraries with

```
pip install -r requirements.txt
```
To run the main experiment, use the command

```
python main_experiment.py
```

To plot the results of the main experiment, use the command

```
python plot_main_experiment.py <filename>
```

To plot the data we collected, use the command

```
python plot_main_experiment.py author_data
```

To save the figures, use the `-save_fig` flag.
Update `tempo_threshold_config.py` to change the tempo thresholds used in the experiment.

Update `model_file_config.py` to select the models used in the experiment.
For the attack, we use PGD with a step size of 0.1 and 20 steps, with k = 1 and Δm = 1. We selected the delta value using the sweep search method described below.
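As a rough illustration, a PGD loop with these settings can be sketched as follows. The `grad_fn` interface, the toy quadratic loss, and the use of the magnitude as an L∞ projection bound are assumptions for illustration, not the repository's actual implementation.

```python
import numpy as np

def pgd_attack(message, grad_fn, epsilon=1.0, step_size=0.1, num_steps=20):
    """Sketch of PGD on a communicated message.

    step_size=0.1 and num_steps=20 match the values reported above;
    epsilon plays the role of the attack magnitude (an assumption).
    """
    perturbed = message.copy()
    for _ in range(num_steps):
        grad = grad_fn(perturbed)                # gradient of the attack loss
        perturbed += step_size * np.sign(grad)   # signed ascent step
        # project back into the epsilon-ball around the original message
        perturbed = np.clip(perturbed, message - epsilon, message + epsilon)
    return perturbed

# toy example: push the message away from a zero reference vector
grad_fn = lambda m: m            # gradient of 0.5 * ||m||^2
adv = pgd_attack(np.full(4, 0.5), grad_fn)
```

In this toy case the perturbation saturates at the projection bound, i.e. `message + epsilon`.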
We attempt to control the attack rate, i.e. the proportion of attacked steps, using the threshold selection method described below. However, we found that threshold selection could not control the attack rate tightly enough for a fair comparison between methods. Instead, we control for the attack rate by binning results into low (0.25), medium (0.5), and high (0.75) attack-rate bins, each with a width of 0.125.
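The binning step can be sketched as below; interpreting 0.125 as the full bin width centered on each value is our assumption.

```python
def attack_rate_bin(rate, centers=(0.25, 0.5, 0.75), width=0.125):
    """Assign a measured attack rate to a bin, or None if it falls outside all bins."""
    labels = ("low", "medium", "high")
    for label, center in zip(labels, centers):
        # a rate belongs to a bin if it lies within half a bin width of the center
        if abs(rate - center) <= width / 2:
            return label
    return None
```

Runs whose attack rate lands between bins would simply be excluded from the comparison.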
Our full results are shown in the results file.
For our experiments, we trained three independent systems per communication method per environment.
We show the training curve in the graph below.
We trained each system for either 30,000 or 100,000 episodes.
We used the same hyperparameters for all configurations and present them in the table below.
| Hyperparameter | Value |
|---|---|
| Number of hidden layers | 1 |
| Number of hidden nodes | 64 |
| Initial learning rate | 1e-3 |
| Learning rate step | 100 |
| Learning rate gamma | 0.9 |
| Initial epsilon | 1 |
| Epsilon decay | 0.995 |
| Minimum epsilon | 0.01 |
| Gamma | 0.99 |
We note that the epsilon hyperparameter refers to the epsilon-greedy exploration method \cite{mnih_playing_2013}, not the attack magnitude ϵ.
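For concreteness, the exploration and learning-rate schedules implied by the table can be written as follows; that both decays are applied once per episode is our assumption.

```python
def epsilon_at(episode, eps0=1.0, decay=0.995, eps_min=0.01):
    """Epsilon-greedy exploration rate: multiplicative decay, floored at the minimum."""
    return max(eps_min, eps0 * decay ** episode)

def lr_at(episode, lr0=1e-3, step=100, gamma=0.9):
    """Step-decayed learning rate: multiplied by gamma every `step` episodes."""
    return lr0 * gamma ** (episode // step)
```

With these values, epsilon starts at 1, decays by 0.5% per episode, and bottoms out at 0.01; the learning rate drops by 10% every 100 episodes.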
To replicate the training of agents, use the `train_agents.py` file:

```
python train_agents.py -alg <ALG> -env_config <ENV>
```

ALG = [RIAL, ObservationSharing]

ENV = [Navigation, PredatorPreyOrthogonal, PredatorPreyDiagonal, SmallTrafficJunction, LargeTrafficJunction]

Agents are saved in the `data\models` folder.
Update model_file_config.py to attack and plot newly trained models.
Our approach to selecting the attack magnitude uses a grid search from zero to two. We present the reward and the attack success rate under different attack magnitudes in the figures below. In these figures, we abbreviate the observation sharing communication method to "OBS" and the weighted, maximum, and untargeted loss functions to W, M, and U, respectively. We also abbreviate the navigation environment to Nav, the orthogonal and diagonal PredatorPrey environments to PP-O and PP-D, and the small and large TrafficJunction environments to TJ-S and TJ-L.
Perturbations from the weighted and maximum loss functions cause a consistent degradation of system reward on the observation sharing system in all environments and on the RIAL system in the diagonal PredatorPrey environment. The weighted loss perturbations also cause a consistent degradation on navigation and orthogonal PredatorPrey. However, in the diagonal PredatorPrey environment, the untargeted loss only scales with the attack magnitude up to a value of one; beyond this, higher attack magnitudes do not increase its effect on the reward. This may be due to the larger number of actions creating a local optimum for the untargeted loss function. RIAL-trained systems in the navigation and orthogonal PredatorPrey environments are unaffected by all attacks at any magnitude in our range.
We also observe that increasing the attack magnitude increases the success rate for all attacks, environments, and communication methods. The untargeted loss achieves a consistently higher success rate than the weighted and maximum losses. Despite this, it does not have a significantly greater impact on the reward than either the maximum or the weighted loss.
Based on the results of this search, we identified a magnitude of one as an appropriate value for further comparisons between methods.
To run the magnitude sweep, use the command

```
python magnitude_sweep.py
```

To plot the results, use the command

```
python plot_magnitude_sweeps.py <filename>
```

To save the figures, use the `-save_fig` flag.
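Abstractly, the magnitude sweep amounts to evaluating the attacked system at each grid point and comparing rewards. A minimal sketch, in which the grid resolution and the `evaluate` callback are assumptions rather than the script's actual interface:

```python
import numpy as np

# grid search over attack magnitudes from zero to two (step size assumed)
magnitudes = np.linspace(0.0, 2.0, 9)

def sweep(evaluate, magnitudes):
    """Map each magnitude to the mean episode reward under attack (stand-in)."""
    return {float(m): evaluate(m) for m in magnitudes}

# toy evaluate: reward degrades linearly with magnitude
results = sweep(lambda m: 1.0 - 0.4 * m, magnitudes)
most_damaging = min(results, key=results.get)  # magnitude with the lowest reward
```

In practice the chosen magnitude trades off reward degradation against the saturation effects discussed above, rather than simply taking the most damaging grid point.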
To attempt to control the attack rate and enable a fair comparison of different tempo methods, we experimentally determined appropriate thresholds. For computational efficiency, we measure the different tempo metrics of the systems without attacking those systems.
To collect tempo threshold data, run the command

```
python tempo_threshold_selection.py
```

To get the threshold data, use the command

```
python get_tempo_thresholds <filename> -alg <ALG> -env_config <ENV>
```

The results can be used to update `tempo_threshold_config.py`.
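One plausible way to map the measured, unattacked tempo metrics to a threshold for a target attack rate (an illustration of the idea, not necessarily what the selection script does) is to take the corresponding quantile of the measured values:

```python
import numpy as np

def threshold_for_rate(tempo_values, target_rate):
    """Pick a threshold so that roughly `target_rate` of steps exceed it."""
    # the (1 - target_rate) quantile leaves target_rate of the mass above it
    return float(np.quantile(tempo_values, 1.0 - target_rate))

# toy tempo measurements spread uniformly over [0, 1]
tempos = np.linspace(0.0, 1.0, 101)
thr = threshold_for_rate(tempos, 0.25)
```

Because the metrics are measured without attacking, the realized attack rate can drift from the target, which is why we additionally bin results by the observed attack rate.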
Details about our environments can be found in the Environment README.
Please cite our work:

```
@inproceedings{standen_finding_2026,
  author = {Standen, Maxwell and Kim, Junae and Szabo, Claudia},
  title = {Finding the Weakest Link: Adversarial Attack against Multi-Agent Communications},
  year = {2026},
  booktitle = {International {Conference} on {Autonomous} {Agents} and {Multiagent} {Systems}},
}
```