AI Tool

Answers specific questions on scientific publications (~2500–3000 articles in total).
Automates the question-answering process without sacrificing answer quality.
Addresses optimization goals:
- Reduce tokens used
- Provide clear and better responses
- Lower the hallucination rate
- Get the most from AI
Returns a pre-defined Excel document with Q/A columns.
Used in a pharma research lab at CHU Sainte-Justine, Montreal to rework their website.

How to Use the Project

Step 1 – Download the Project

If you received this as a ZIP file, unzip it on your Desktop.

Right-click the folder and choose 'Open in terminal'.

Then skip to Step 2.

If you’re using Git:

On your Desktop, right-click and choose 'Open in terminal'.

Once the terminal is open, run these commands:

git clone https://github.com/simy46/ImpactPharma.git
cd ImpactPharma

Step 2 – Add Your PDFs

Place all the PDF files you want to analyze inside the /pdfs folder of the project.

Only the PDFs in this folder will be processed, so choose the ones that YOU want analyzed.

Example: ImpactPharma/pdfs/*.pdf
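
Before a long run, it can help to confirm which files the folder actually contains. The snippet below is a minimal sketch and not part of the project: the helper name list_pdfs is hypothetical, and it assumes the PDFs sit directly inside the pdfs folder (no subfolders).

```python
from pathlib import Path


def list_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by path."""
    folder = Path(folder)
    if not folder.is_dir():
        # A missing folder is treated as "no PDFs" rather than an error.
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    pdfs = list_pdfs("pdfs")
    print(f"{len(pdfs)} PDF file(s) found")
    for p in pdfs:
        print(" -", p.name)
```

Run it from the ImpactPharma folder so that the relative path pdfs resolves correctly.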

Step 3 – Run on Windows

Make sure you have Python 3.10 or newer installed.

Then, in the terminal, run:

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
python main.py

Output

After the script finishes, results will be saved in:

/outputs/resultats_YYYYMMDD_HHMMSS.xlsx : Excel with all extracted answers
/logs/pipeline_YYYYMMDD_HHMMSS.log : detailed processing log for each PDF
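
The YYYYMMDD_HHMMSS part of both names is a timestamp of the run. As an illustration only, the sketch below shows one way such names can be built with the standard library; run_paths is a hypothetical helper, and the project's own code may build its file names differently.

```python
from datetime import datetime
from pathlib import Path


def run_paths(now=None, root="."):
    """Build the Excel and log paths for one run (illustrative helper)."""
    now = now or datetime.now()
    # %Y%m%d_%H%M%S gives e.g. 20250710_140509, which sorts chronologically.
    stamp = now.strftime("%Y%m%d_%H%M%S")
    root = Path(root)
    excel_path = root / "outputs" / f"resultats_{stamp}.xlsx"
    log_path = root / "logs" / f"pipeline_{stamp}.log"
    return excel_path, log_path


if __name__ == "__main__":
    excel_path, log_path = run_paths()
    print(excel_path)
    print(log_path)
```

Using one timestamp for both files makes it easy to match an Excel result with the log of the run that produced it.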

Notes

Don't edit the Excel file while the script is running.
If you have questions or errors, share the log file with the developer.
Each run writes to a new file, /outputs/resultats_YYYYMMDD_HHMMSS.xlsx, with the date and time of the run in the file name.
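
When several runs have accumulated, the log to share is usually the most recent one. The following is a small sketch, assuming the logs live in a logs folder and follow the pipeline_YYYYMMDD_HHMMSS.log pattern; latest_log is a hypothetical helper, not a function of the project.

```python
from pathlib import Path


def latest_log(log_dir="logs"):
    """Return the newest pipeline log in `log_dir`, or None if there is none."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        return None
    # The timestamp in the name sorts alphabetically in chronological order,
    # so the last name after sorting belongs to the most recent run.
    logs = sorted(log_dir.glob("pipeline_*.log"))
    return logs[-1] if logs else None


if __name__ == "__main__":
    print(latest_log() or "No log file found yet")
```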

Motivation

This is work for a research lab at a Montreal hospital (the most popular one, if you ask me). I am happy to help them achieve their goal.

It is just a GPT-4 wrapper

I am no LLM dev (yet, as of 10/07/2025), but I'm leaning towards that path more and more. I do love research and might get a paper out of this project.

https://impactpharmacie.org/index.php?p=greeter.php

Simple Approach (abstraction)

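import os

for pdf in pdf_files:
    # Extract the text once per PDF; it is reused for every category.
    text = PDFLoader.extract_text(pdf)
    # Assumes pdf is a file path; the file name labels the Excel row.
    pdf_name = os.path.basename(pdf)
    responses = {}

    for category in categories:
        # One prompt and one API call per question category.
        prompt = PromptManager.build_prompt(category, text)
        answer_raw = APIManager.ask(prompt)
        parsed = ResponseParser.parse(answer_raw)
        responses.update(parsed)

    # One Excel row per PDF, holding the answers from all categories.
    ExcelWriter.insert_row(pdf_name, responses)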
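
To try the control flow without PDFs or an API key, the same loop can be written as a function that receives each component as a plain callable. This is only a sketch: process_pdfs and the lambda stand-ins are hypothetical, the category names are made up, and none of it reflects the real classes in the project's core folder.

```python
from pathlib import Path


def process_pdfs(pdf_files, categories, extract_text, build_prompt, ask, parse):
    """Run the per-PDF, per-category loop and return (file name, answers) rows."""
    rows = []
    for pdf in pdf_files:
        text = extract_text(pdf)
        responses = {}
        for category in categories:
            prompt = build_prompt(category, text)
            answer_raw = ask(prompt)
            # Each parsed answer is a dict; merge it into the row for this PDF.
            responses.update(parse(answer_raw))
        rows.append((Path(pdf).name, responses))
    return rows


if __name__ == "__main__":
    # Hypothetical stand-ins: no PDF is read and no API is called.
    rows = process_pdfs(
        pdf_files=["pdfs/example.pdf"],
        categories=["design", "population"],
        extract_text=lambda pdf: "full text of the article",
        build_prompt=lambda category, text: f"{category}: {text}",
        ask=lambda prompt: prompt.split(":")[0] + "=unknown",
        parse=lambda raw: dict([raw.split("=", 1)]),
    )
    print(rows)
```

Passing the components in as arguments keeps the loop testable: the real text extractor, prompt builder, API client and parser can be swapped in later without changing the loop itself.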

