CareerBot-RAG

CareerBot-RAG is a project that analyzes CVs and provides skill and occupation suggestions based on the ESCO dataset.

Published paper: https://www.theseus.fi/handle/10024/874901?show=full

Prerequisites

Python 3.8+
Node.js 14+
npm 6+

Installation

Environment

Create a virtual environment (venv):
```
python -m venv env
```

Activate the venv:

Windows:

.\env\Scripts\activate

Unix or macOS:

source env/bin/activate
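
To confirm that activation worked, one option is a quick check from Python itself. The sketch below is not part of the repository, and the helper names are hypothetical. It relies on the fact that `sys.prefix` and `sys.base_prefix` differ inside a venv, and it also checks the Python 3.8+ prerequisite listed above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def python_is_supported(minimum=(3, 8)) -> bool:
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python version supported:", python_is_supported())
```

If the first line prints `False`, the activation command was probably run in a different shell session.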

Backend

Install Python dependencies:
```
pip install -r requirements.txt
```

Frontend

Navigate to the UI directory:
```
cd UI
```
Install npm dependencies:
```
npm install
```

Configuration

OpenAI API key:
- The project uses a .env file in the Backend directory to store the OpenAI API key.
- Create or edit the .env file:
```
OPENAI_API_KEY=your_new_api_key_here
```
- Make sure not to commit this file to version control to keep your API key secure.
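
For illustration, here is a minimal sketch of how a key stored this way could be read. It is an assumption, not the project's actual loading code: the backend may well use a library such as python-dotenv instead, and the function names `load_env_file` and `get_openai_api_key` are hypothetical. Only the standard library is used.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_api_key(env_path="Backend/.env"):
    """Return the key from the process environment, else from the .env file."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key and Path(env_path).is_file():
        key = load_env_file(env_path).get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to the .env file")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication error from the first API call.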
Constants: The main settings are defined in Backend/app/config.py:
- EMBEDDING_MODEL_NAME: The OpenAI model used to generate embeddings (default: "text-embedding-3-small").
- GENERATION_MODEL_NAME: The OpenAI model used for text generation (default: "gpt-4o-mini").
- NUMBER_DOC_PER_ITEM: Number of documents retrieved from the vector database for each item. Higher values pass more candidates into the context of the LLM picking step (default: 1).
- LLM_MAX_PICKS: Maximum number of items identified from the CV and shown to the user, as [skills, occupations] (default: [15, 5]).
- NB_SUGGESTED_SKILLS: Number of skills suggested to the user (default: 20).
- NB_SUGGESTED_OCCUPATIONS: Number of occupations suggested to the user (default: 10).
You can modify these constants in config.py to adjust the application's behavior. For example:
```
EMBEDDING_MODEL_NAME = "text-embedding-3-large"
NB_SUGGESTED_SKILLS = 15
```
Note: Changing some of these constants (especially EMBEDDING_MODEL_NAME) may require regenerating the FAISS index.
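
To make the effect of NUMBER_DOC_PER_ITEM concrete, the following toy sketch retrieves the nearest documents for each CV item and merges them into one candidate list. It is a pure-Python stand-in for the FAISS lookup, not the project's code: the 2-D vectors, the labels, and the function names are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, k):
    """Return the labels of the k index entries most similar to query_vec."""
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[1]), reverse=True)
    return [label for label, _ in ranked[:k]]


def build_candidates(item_vectors, index, k):
    """Collect up to k documents per CV item, dropping duplicates, keeping order."""
    candidates = []
    for vec in item_vectors:
        for label in retrieve(vec, index, k):
            if label not in candidates:
                candidates.append(label)
    return candidates


# Toy 2-D "embeddings" standing in for ESCO skill vectors (hypothetical data).
index = [
    ("python programming", [1.0, 0.0]),
    ("data analysis", [0.8, 0.6]),
    ("project management", [0.0, 1.0]),
]
cv_items = [[0.9, 0.1], [0.1, 0.9]]

# k plays the role of NUMBER_DOC_PER_ITEM.
print(build_candidates(cv_items, index, k=1))
# ['python programming', 'project management']
print(build_candidates(cv_items, index, k=2))
# ['python programming', 'data analysis', 'project management']
```

With k=1 each item contributes only its single best match. With k=2 the picking step sees an extra candidate, at the price of a longer prompt.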

FAISS Index

Creating a New FAISS Index (Optional)

Update EMBEDDING_MODEL_NAME in Backend/app/config.py to your desired model (text-embedding-3-small is the default and recommended).
Navigate to the indexing directory:
```
cd POC/indexing
```
Run the FAISS index creation script:
```
python create_FAISS_index.py
```

This will create new FAISS index files in the data/processed_data/FAISS_index directory.

After creating the new index, you must run the script to update the options list:
```
cd ../../utils
python creating_options_list.py
```

This script generates updated JSON files for skills and occupations options, which are used by the frontend.

Note: Creating a new index can be time-consuming, incurs OpenAI API costs, and may require significant computational resources, especially for larger models.
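
One reason a model change forces a rebuild is vector size: by default text-embedding-3-small returns 1536-dimensional vectors and text-embedding-3-large returns 3072-dimensional ones, so queries embedded with one model cannot be searched against an index built with the other. The sketch below shows a simple guard. The function name is hypothetical, and the index dimension is passed in as a plain integer (with FAISS it would typically come from the index's `d` attribute).

```python
# Default output dimensions of the two OpenAI embedding models named in this README.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def dimension_mismatch(index_dimension: int, model_name: str) -> bool:
    """Return True when the index vector size differs from the model's output size."""
    try:
        expected = EMBEDDING_DIMENSIONS[model_name]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model_name}") from None
    return index_dimension != expected


# An index built with the small model cannot serve queries from the large one.
print(dimension_mismatch(1536, "text-embedding-3-large"))  # True
print(dimension_mismatch(1536, "text-embedding-3-small"))  # False
```

A matching dimension is necessary but not sufficient: vectors from different models are not comparable even when their sizes agree, so the index should be regenerated whenever EMBEDDING_MODEL_NAME changes.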

Running the Application

Backend

From the repository root, navigate to the Backend directory:
```
cd Backend
```
Start the FastAPI server:
```
uvicorn app.main:app --reload
```

The backend will be available at http://localhost:8000.
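
To verify that the server is actually running, a small reachability check can help. This sketch assumes the app keeps FastAPI's default interactive documentation page at `/docs`; if the project disables it, the check will report `False` even though the server is up. The function name `backend_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def backend_is_up(base_url="http://localhost:8000", timeout=2.0) -> bool:
    """Return True when base_url/docs answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer such as 404.
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_up())
```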

Frontend

In a separate terminal, navigate to the UI directory from the repository root:
```
cd UI
```
Start the Next.js development server:
```
npm run dev
```

The frontend will be available at http://localhost:3000.

Usage

Open your browser and go to http://localhost:3000
Upload a CV file (PDF format)
Click "Analyze" to process the CV
View the suggested skills and occupations based on the CV content

Additional Information

File Structure

Backend/: Contains the FastAPI backend
UI/: Contains the Next.js frontend
POC/: Proof of concept scripts and experiments
data/: Data files and processed indexes

Notes

This work was done as a project for a Haaga-Helia thesis.
This project uses OpenAI models, and each API call costs money. Ensure you have sufficient OpenAI API credits.
The application uses the ESCO (European Skills, Competences, Qualifications and Occupations) dataset.
The API key shown has been revoked.
