This is the codebase for the paper "Test vs Mutant: Adversarial LLM Agents for Robust Unit Test Generation".
AdverTest is a mutation-guided, dual-agent framework where a Test Generation Agent (T) and a Mutant Generation Agent (M) co-evolve: M keeps creating challenging, compilable mutants while T iteratively generates and repairs tests to kill them. The loop is driven by both coverage and mutation score to improve real fault detection, not only coverage.
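The alternation described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the implementation in generate.py: `generate_tests`, `generate_mutants`, and `kills` are hypothetical callables standing in for agent T, agent M, and test execution, and the coverage signal is left out for brevity.

```python
from typing import Callable, List, Tuple


def adversarial_loop(
    generate_tests: Callable[[List[str]], List[str]],
    generate_mutants: Callable[[List[str]], List[str]],
    kills: Callable[[str, str], bool],
    rounds: int = 3,
) -> Tuple[List[str], float]:
    """Alternate between the two agents for a fixed number of rounds.

    All three callables are hypothetical stand-ins:
    generate_tests plays agent T, generate_mutants plays agent M,
    and kills(test, mutant) reports whether a test fails on a mutant.
    """
    tests: List[str] = []
    mutants: List[str] = []
    score = 0.0
    for _ in range(rounds):
        # M proposes new mutants, given the tests it should try to evade.
        mutants.extend(generate_mutants(tests))
        # Mutants that no current test kills become the feedback for T.
        survivors = [m for m in mutants if not any(kills(t, m) for t in tests)]
        # T adds tests aimed at the surviving mutants.
        tests.extend(generate_tests(survivors))
        # Mutation score: fraction of all mutants killed by at least one test.
        killed = sum(1 for m in mutants if any(kills(t, m) for t in tests))
        score = killed / len(mutants) if mutants else 0.0
    return tests, score
```

In this sketch the mutation score is recomputed after every round, so a real driver could stop early once the score or the coverage stops improving.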
AdverTest expects:
Java 8 (OpenJDK 1.8). Set JAVA_HOME and ensure java -version reports 1.8. (This is the version that Defects4J 2.X.X expects.)
Defects4J v2.1.0 (https://github.com/rjust/defects4j/tree/v2.1.0) and GrowingBugs (https://github.com/liuhuigmail/GrowingBugRepository). Refer to these repositories to obtain the datasets and install them correctly.
After installing a dataset correctly, you can run defects4j commands in your shell. (However, you cannot use the Defects4J and GrowingBugs commands at the same time.)
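Before running anything else, it may help to check both requirements programmatically. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes only that java -version prints a quoted version string and that the defects4j executable is on your PATH.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_version(version_output: str) -> Optional[str]:
    """Extract the quoted version (e.g. '1.8.0_292') from `java -version` output."""
    match = re.search(r'version "([^"]+)"', version_output)
    return match.group(1) if match else None


def is_java8(version_output: str) -> bool:
    """Return True when the reported version is 1.8 or 1.8.x."""
    version = parse_java_version(version_output)
    return version is not None and (version == "1.8" or version.startswith("1.8."))


def preflight() -> bool:
    """Check that Java 8 and the defects4j command are both available."""
    java = shutil.which("java")
    if java is None:
        return False
    # `java -version` writes to stderr, so inspect both streams.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    has_java8 = is_java8(result.stderr + result.stdout)
    has_defects4j = shutil.which("defects4j") is not None
    return has_java8 and has_defects4j
```

Calling `preflight()` returns False when either requirement is missing, which is a cheaper failure than discovering the problem partway through a generation run.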
You can download the projects either with download.py or with a script of your own. Make sure they are placed in the folders that download.py specifies.
After downloading the projects, you still need to modify defects4j.build.xml (defects4j/framework/projects/defects4j.build.xml) to add the dependencies for generated tests:
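If you write your own script, a checkout can be driven through the standard defects4j checkout command. The sketch below is an assumption-laden illustration rather than a copy of download.py: the helper names and the `projects/` directory layout are hypothetical, so adapt the target folder to the one download.py uses.

```python
import subprocess
from pathlib import Path
from typing import List


def checkout_command(project: str, bug_id: int, work_dir: Path, fixed: bool = True) -> List[str]:
    """Build the `defects4j checkout` command for one bug.

    Defects4J version ids are the bug number followed by 'f' (fixed) or 'b' (buggy).
    """
    version = f"{bug_id}{'f' if fixed else 'b'}"
    return ["defects4j", "checkout", "-p", project, "-v", version, "-w", str(work_dir)]


def checkout_fixed(project: str, bug_id: int, base_dir: Path) -> Path:
    """Check out the fixed version into a hypothetical <base_dir>/<project>_<id>_fixed folder."""
    work_dir = base_dir / f"{project}_{bug_id}_fixed"
    work_dir.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(checkout_command(project, bug_id, work_dir), check=True)
    return work_dir
```

For example, `checkout_fixed("Lang", 1, Path("projects"))` would run `defects4j checkout -p Lang -v 1f -w projects/Lang_1_fixed`.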
<property name="mock-junit.jar" value="xxxx"/>
<property name="objnesis.jar" value="xxxx"/>
<property name="mockito.jar" value="xxxx"/>
<property name="byte-buddy.jar" value="xxxx"/>
First add these property definitions, then update the compile.gen.tests classpath and the run.gen.tests classpath in the same file.
(The jars are in this repository's lib folder; set each value to the path of the corresponding jar.)
For GrowingBugs, do the same.
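The property step can also be scripted. The sketch below is a hypothetical helper, not part of this repository: it assumes the properties sit directly under the root element, and the jar path in the usage note is a placeholder. It does not touch the compile.gen.tests or run.gen.tests classpaths, which still need a manual edit. Note that parsing with xml.etree.ElementTree this way drops comments and the XML declaration, so back up the file first or simply edit it by hand.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def add_jar_properties(build_xml: str, jars: Dict[str, str]) -> str:
    """Add <property name=... value=.../> entries that are not defined yet.

    New properties are inserted right after the last existing top-level
    <property>, or at the top of the project if there is none.
    """
    root = ET.fromstring(build_xml)
    existing = {prop.get("name") for prop in root.findall("property")}
    children = list(root)
    last_property = max(
        (i for i, child in enumerate(children) if child.tag == "property"),
        default=-1,
    )
    insert_at = last_property + 1
    for name, path in jars.items():
        if name in existing:
            continue  # keep whatever value is already configured
        root.insert(insert_at, ET.Element("property", {"name": name, "value": path}))
        insert_at += 1
    return ET.tostring(root, encoding="unicode")
```

A call such as `add_jar_properties(text, {"mockito.jar": "/path/to/lib/mockito.jar"})` returns the updated XML as a string; the path shown is a placeholder for wherever the jar lives on your machine.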
EvoSuite and Randoop: we use the versions that Defects4J provides; see https://defects4j.org/html_doc/gen_tests.html for details.
ChatUniTest: https://github.com/ZJU-ACES-ISE/ChatUniTest/tree/python
HITS: https://github.com/eecshope/HITS
Follow the instructions in their GitHub repositories.
For coverage, we also use what Defects4J provides (Cobertura); see https://defects4j.org/html_doc/run_coverage.html for details.
- We suggest creating a new conda environment.
- Set your API key in model.py.
- Optionally, change the logging settings in config.py.
- Run python generate.py (it also runs the generated tests).
- By default, tests are generated in the working directory of the fixed version of each project. To save them elsewhere, run python copy_files.py.
- To run saved tests or tests generated by the baselines, run test.py.
For the case study section of the paper, we provide the tests generated by the baseline methods and by AdverTest in the ./case folder.
.
├── README.md # Project overview, setup, and usage
├── case/ # A case generated by all methods (the case demonstrated in the paper)
│ ├── HITS/...
│ ├── ChatUniTest/...
│ └── ...
├── .gitignore # Git ignore rules
├── config.py # Configuration
├── model.py # LLM model settings
├── generate.py # Entry point
├── enhance_testcase.py # Test augmentation
├── enhance_mutants.py # Mutant augmentation
├── coverage.py # Coverage
├── extract.py # Extracts code from the PUT
├── function.py # Functions used for mutant generation and test generation
├── copy_files.py # Copies files
├── compress.py # Pack/zip artifacts from a run
├── download.py # Downloads projects
├── test.py # Runs saved tests and tests generated by other baselines
├── testIniGenPrompt.py # Initial test generation
├── lib/ # Third-party tools/binaries
│ └── ...
└── __pycache__/ # Python bytecode caches (ignored)