Elcasvi/Code-Fixer-LLM-Agent

Code-Fixer-LLM-Agent

This project brings together tools for analyzing and compiling source code and for fine-tuning language models (LLMs) to analyze and correct it. It combines C++ and Python components with notebooks for LLM fine-tuning.

Component Descriptions

1. CPP_Compiler

Contains a compiler and parser for source code written in C++:

  • lexer/: Implements lexical analysis (tokenization) of source code.
  • parser/: Implements syntactic analysis.
  • grammar/tok.txt: List of tokens recognized by the compiler.
  • grammar/output.txt: Sequence of token indices generated by the lexer.
  • grammar/token_mapper.py: A Python script that maps the indices in output.txt to the token names in tok.txt and prints the resulting token sequence, respecting line breaks (NEWLINE).

Using token_mapper.py

Run this command from the root of the project (from within the CPP_Compiler/grammar/ directory, use token_mapper.py as the script path instead):

python3 CPP_Compiler/grammar/token_mapper.py

The script will print the sequence of tokens corresponding to the indices in output.txt, separating lines each time it encounters the token NEWLINE.
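
The mapping step described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of token_mapper.py: the function name `map_tokens` is hypothetical, and it assumes that tok.txt lists one token name per line, that the indices in output.txt are zero-based, and that NEWLINE itself is not printed. The sample token names and indices are made up and stand in for the two files.

```python
def map_tokens(token_names, indices):
    """Map token indices to names, starting a new output line at each NEWLINE.

    Assumption: indices are zero-based positions into token_names.
    """
    lines, current = [], []
    for idx in indices:
        name = token_names[idx]
        if name == "NEWLINE":
            # Close the current line; NEWLINE itself is not emitted.
            lines.append(" ".join(current))
            current = []
        else:
            current.append(name)
    if current:
        # Flush a trailing line that was not terminated by NEWLINE.
        lines.append(" ".join(current))
    return lines


# Hypothetical in-memory stand-ins for tok.txt and output.txt.
token_names = ["INT", "IDENT", "ASSIGN", "NUMBER", "SEMICOLON", "NEWLINE"]
indices = [0, 1, 2, 3, 4, 5, 1, 2, 3, 4]

for line in map_tokens(token_names, indices):
    print(line)
```

Keeping the mapping in a pure function that takes lists, rather than reading the files directly, makes the logic easy to check on small inputs; reading tok.txt and output.txt would then be a thin wrapper around it.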

2. fine-tuning-LLM

Contains notebooks and scripts for fine-tuning language models (LLMs):

  • Notebooks: Fine-tuning examples and experiments using various frameworks and datasets.
  • dataset/: Datasets and scripts for preparing training data.

Requirements

Install the necessary dependencies for fine-tuning with:

pip install -r fine-tuning-LLM/requirements.txt
