Backend Project

This is a backend project using FastAPI to provide an API for chat prompts and Gemma API integration.

FastAPI Documentation

FastAPI automatically generates interactive API documentation using Swagger UI. Once the server is running, open http://localhost:5000/docs (the port matches the --port 5000 value in the uvicorn command used to start the server; substitute your own port if you change it). From that page you can call the API endpoints directly in the browser.
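The Swagger UI page is rendered from an OpenAPI schema that FastAPI serves at /openapi.json by default. As a rough sketch, the same schema can be read from a script to list the available endpoints. The helper names below are illustrative and not part of this project, and the route names in the usage comment are hypothetical.

```python
import json
from urllib.request import urlopen

# HTTP methods that can appear as keys of an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_schema(base_url):
    """Download the OpenAPI schema that backs the Swagger UI page."""
    with urlopen(f"{base_url.rstrip('/')}/openapi.json") as response:
        return json.load(response)


def list_routes(schema):
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            if method in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


# Usage with a running server (replace the URL with your own host and port):
#   for method, path in list_routes(fetch_schema("http://localhost:5000")):
#       print(method, path)
```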

Changing the Face Token

To change the face token used for authentication with the Gemma API, update the HEADERS variable in the main.py file:

HEADERS = {"Authorization": "Bearer your_new_face_token_here"}

Replace your_new_face_token_here with your new face token.
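If you would rather not hard-code the token in main.py, one possible alternative is to read it from an environment variable. This is only a sketch: the variable name GEMMA_API_TOKEN and the build_headers helper are assumptions for illustration and do not exist in the project.

```python
import os
from typing import Dict, Optional


def build_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Build the Authorization header from an explicit token or the environment."""
    # GEMMA_API_TOKEN is a hypothetical variable name, not one the project defines.
    token = token or os.environ.get("GEMMA_API_TOKEN")
    if not token:
        raise RuntimeError("No token found: pass one in or set GEMMA_API_TOKEN.")
    return {"Authorization": f"Bearer {token}"}


# In main.py this could replace the literal assignment:
#   HEADERS = build_headers()
```

Keeping the token out of the source file also makes it harder to commit it to the repository by accident.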

Clone the Dataset

Before running the BERT notebook, clone the dataset from https://github.com/verazuo/jailbreak_llms. Place the CSVs in the bert directory, or change the file URLs in the notebook to point to their location.

BERT Model

The BERT model used in this project is defined in the Bert_model.ipynb notebook. Run the notebook to train the model; after training, the final model is saved in the final_bert_model directory. Make sure main.py loads the model from this directory, otherwise the application will not function correctly.

Installation

To set up the project, first clone the repository:

git clone https://your-repository-url.git
cd your-project-directory

Then, install the required dependencies:

pip install -r requirements.txt

Running the Server

To start the FastAPI server, run the following command:

uvicorn main:app --host 0.0.0.0 --port 5000

The server will be available at http://localhost:5000.

Prompt Dataset

jailbreak_prompts.csv and regular_prompts.csv contain the prompts collected in our study. The meaning of each column is as follows.

Column	Description
platform	The platform of the prompt, e.g., Reddit.
source	The source of the prompt, e.g., r/ChatGPT.
prompt	The extracted prompt.
jailbreak	True if the prompt is identified as a jailbreak prompt.
created_at	Creation time of the prompt, obtained from the source.
date	Date extracted from the `created_at` value.
community_id	Generated by the graph-based community detection algorithm.
community_name	Community name as referred to in our paper.
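As a quick illustration of how these columns can be used, the sketch below reads a prompt CSV with the standard csv module and filters on the jailbreak column. It assumes the column stores the text "True" or "False"; the sample rows are made up and only stand in for the real files.

```python
import csv
import io
from collections import Counter


def load_prompts(fileobj):
    """Read a prompt CSV into a list of dicts keyed by column name."""
    return list(csv.DictReader(fileobj))


def is_jailbreak(row):
    # Assumption: the jailbreak column holds the text "True" or "False".
    return row["jailbreak"].strip().lower() == "true"


def count_by_platform(rows):
    """Count how many prompts come from each platform."""
    return Counter(row["platform"] for row in rows)


# Tiny in-memory sample standing in for jailbreak_prompts.csv (values are invented).
sample = io.StringIO(
    "platform,source,prompt,jailbreak\n"
    "reddit,r/ChatGPT,example prompt A,True\n"
    "reddit,r/ChatGPT,example prompt B,False\n"
    "discord,example-server,example prompt C,True\n"
)
rows = load_prompts(sample)
jailbreaks = [row for row in rows if is_jailbreak(row)]
platform_counts = count_by_platform(rows)
```

For the real data, open the file with `open("jailbreak_prompts.csv", newline="", encoding="utf-8")` and pass the file object to load_prompts.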

Forbidden Question Set

The complete forbidden question set is forbidden_question_set.csv.zip.

It consists of 46,800 samples (= 13 scenarios $\times$ 30 questions $\times$ 5 repetitions $\times$ 8 communities $\times$ 3 prompts).
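The sample count can be checked by multiplying the factors listed above. The dictionary keys are just labels chosen for this sketch.

```python
import math

# Factors taken from the sentence above; the key names are only labels.
factors = {
    "scenarios": 13,
    "questions_per_scenario": 30,
    "repetitions": 5,
    "communities": 8,
    "prompts_per_community": 3,
}
total_samples = math.prod(factors.values())
print(total_samples)  # 46800
```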

The meaning of each column is as follows.

Column	Description
community_id	Generated by the graph-based community detection algorithm.
community_name	Community name as referred to in our paper.
prompt_type	The type of prompt, i.e., the earliest, the latest, or the one with the highest closeness in the community.
prompt	The extracted prompt.
content_policy_id	Content policy id.
content_policy_name	The content policy name, e.g., illegal activity.
q_id	Question id.
question	The question.
response_idx	Index of the response; each question is asked five times.
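Because the question set ships as a zip archive, it can be read without unpacking it first. The sketch below is one way to do that with the standard library and then count the rows per (community, prompt type, content policy, question) combination. It assumes the archive contains a single UTF-8 CSV; the in-memory archive and its rows are invented for demonstration.

```python
import csv
import io
import zipfile
from collections import defaultdict


def read_zipped_csv(zip_source):
    """Yield rows (as dicts) from the first .csv member of a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        name = next(n for n in archive.namelist() if n.endswith(".csv"))
        with archive.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


def responses_per_question(rows):
    """Count rows for each (community, prompt type, policy, question) key."""
    counts = defaultdict(int)
    for row in rows:
        key = (row["community_id"], row["prompt_type"],
               row["content_policy_id"], row["q_id"])
        counts[key] += 1
    return dict(counts)


# Small in-memory archive standing in for forbidden_question_set.csv.zip.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr(
        "sample.csv",
        "community_id,prompt_type,content_policy_id,q_id,response_idx\n"
        "0,earliest,1,1,0\n"
        "0,earliest,1,1,1\n"
        "0,latest,1,1,0\n",
    )
buf.seek(0)
counts = responses_per_question(read_zipped_csv(buf))
```

With the real file, pass the path "forbidden_question_set.csv.zip" to read_zipped_csv; if the description above holds, every key should then map to five rows.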
