TrustAIRLab/Unsafe-LLM-Based-Search

Official repository for "Unsafe LLM-Based Search: Quantitative Analysis and Mitigation of Safety Risks in AI Web Search".

License: Apache 2.0 | Python 3.10+ | arXiv


Introduction

This repository provides the agent framework for the Risk Mitigation part of our paper. The XGBoost-detector and PhishLLM-detector are included for comparison. The code for the PhishLLM-detector is available at: https://github.com/code-philia/PhishLLM

Project Structure

agent_defense/
├── src/
│   ├── agent.py                     # build_agent
│   ├── prompt.py                    # prompts
│   ├── tools.py                     # tool calling (change the tools by modifying the `return_tools` function; the HtmlLLM-detector's prompt is in the `is_malicious` function)
│   ├── utils.py                     # XGBoost-detector method
│   ├── selenium_fetcher.py          # HtmlLLM-detector method for fetching HTML content (optional)
│   ├── template.csv                 # template for the basic test
│   └── XGBoostClassifier.pickle.dat # XGBoost-detector model weights
├── template.json                    # template for the basic test
├── requirement.txt                  # required packages; install with `pip install -r requirement.txt`
├── prompt_defense.py                # prompt-based defense code
└── main.py                          # runs the defense (uses the HtmlLLM-detector (ours) by default)
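
To show how these pieces fit together, here is a hypothetical wiring sketch; build_agent and return_tools are the names listed above, but every signature below is an assumption, not necessarily the repository's actual API:

    # Hypothetical wiring sketch only; the real signatures in src/agent.py
    # and src/tools.py may differ.
    from src.agent import build_agent    # assumed import path
    from src.tools import return_tools   # assumed import path

    agent = build_agent(tools=return_tools())              # assumed signature
    print(agent.invoke("Is https://example0.com safe?"))   # assumed agent API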

How to Run

Setup

  1. Install all required packages for your environment (pip install -r requirement.txt).
  2. Register an OpenAI API key (see the OpenAI tutorial) and paste the key into './src/openai.txt'. A loading sketch follows below.
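
For reference, here is a minimal sketch of reading the key from that file, assuming the official openai Python package (the repository's own loading logic may differ):

    # Minimal sketch: read the API key from ./src/openai.txt (Setup step 2).
    # Assumes the official `openai` package; the repository may load the key differently.
    from pathlib import Path
    from openai import OpenAI

    api_key = Path("./src/openai.txt").read_text().strip()
    client = OpenAI(api_key=api_key)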

For Batch Comparison (Shown in Our Paper)

  1. Prepare batch_result.csv in the format below. Use the is_malicious function to obtain the results and write them to this CSV file for batch comparison (see the sketch after these steps):

    phish_prediction is the result of the PhishLLM-detector, while malicious is the result of our method, the HtmlLLM-detector.

    url,phish_prediction,malicious
    https://example0.com,benign,False
    https://example1.com,benign,True
    
  2. Prepare input.json in the format below:

    [
        {
            "LLM": "The platform name",
            "Query": "The Query",
            "Risk": "main",
            "content": {
                "output": "The output of AIPSE",
                "resource": [
                    "https://example0.com",
                    "https://example1.com"
                ]
            }
        }
    ]
  3. Basic Test Run

    We provide all the template files. To run a basic test, first set the parameters in the main.py, tools.py, and prompt_defense.py files, then simply run:

    python main.py
    python prompt_defense.py

    You can switch detectors by changing the current_url_detector_function parameter in the return_tools function in tools.py (a sketch appears in the next section). After the basic test runs, it automatically generates a template_output.json file for verification.
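
Referring back to step 1, here is a hedged sketch of producing batch_result.csv; it assumes is_malicious can be imported from src/tools.py and returns a boolean, while phishllm_predict is a hypothetical placeholder for the PhishLLM-detector:

    # Sketch: write batch_result.csv in the format from step 1.
    # `is_malicious` is assumed importable from src/tools.py and to return a
    # boolean; `phishllm_predict` is a hypothetical placeholder for the
    # PhishLLM-detector (https://github.com/code-philia/PhishLLM).
    import csv

    from src.tools import is_malicious  # assumed import path

    def phishllm_predict(url: str) -> str:
        # Hypothetical stand-in; replace with the real PhishLLM-detector call.
        return "benign"

    urls = ["https://example0.com", "https://example1.com"]

    with open("batch_result.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["url", "phish_prediction", "malicious"])
        for url in urls:
            writer.writerow([url, phishllm_predict(url), is_malicious(url)])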

For Single Query

You can test a single query directly by changing the return_tools function in tools.py, as sketched below.
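
Here is a hypothetical sketch of that change; the name current_url_detector_function comes from the section above, and whether it is edited in place or passed as an argument depends on the actual code:

    # Hypothetical sketch: choose which URL detector the agent's tools use.
    # In the repository you edit `current_url_detector_function` inside
    # `return_tools` in src/tools.py; all names below are assumptions.
    from src.tools import is_malicious     # HtmlLLM-detector (assumed import)
    from src.utils import xgboost_detect   # assumed name for the XGBoost-detector

    # Default: the HtmlLLM-detector (ours).
    current_url_detector_function = is_malicious
    # Alternative for comparison: the XGBoost-detector.
    # current_url_detector_function = xgboost_detect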

Note:

The domains used in our case study will expire a few days after January 4, 2026. We have archived their content via the Wayback Machine. Please refer to the list below for details about the archived webpages:

These domains are no longer under our control and have been released back into the domain market. As such, we are no longer responsible for their content or any communications originating from them.

⚠️ Caution: Any messages or information sent from these domains no longer represent us.

Citation

@inproceedings{UnsafeSearch2025,
  title     = {Unsafe LLM-Based Search: Quantitative Analysis and Mitigation of Safety Risks in AI Web Search},
  author    = {Zeren Luo and Zifan Peng and Yule Liu and Zhen Sun and Mingchen Li and Jingyi Zheng and Xinlei He},
  booktitle = {34th USENIX Security Symposium (USENIX Security 25)},
  publisher = {USENIX},
  year      = {2025}
}
