DeepCodeProbe: A Probe into Machine Learning Models Specializing in Software Maintenance

CommissarSilver/DeepCodeProbe

DeepCodeProbe is a tool for probing small ML models trained on syntactic representations of code, in order to interpret both their syntax-learning capabilities and the representations they learn. The tool is designed to be model-agnostic and can be used with any model that takes an AST or CFG as input.

Dependencies

The experiments were carried out using Python 3.10. Install the dependencies for DeepCodeProbe with:

pip install -r requirements.txt

Additionally, each of the models under study has its own dependencies. To train the models and replicate the results, install the dependencies for each model in its own virtual environment.

AST-NN

First, create a virtual environment and activate it:

python -m venv astnn
source astnn/bin/activate

Then, install the dependencies for AST-NN:

pip install -r src/astnn/requirements.txt

FuncGNN

First, create a virtual environment and activate it:

python -m venv funcgnn
source funcgnn/bin/activate

Then, install the dependencies for FuncGNN:

pip install -r src/funcgnn/requirements.txt

SummarizationTF

First, create a virtual environment and activate it:

python -m venv summarizationtf
source summarizationtf/bin/activate

Then, install the dependencies for SummarizationTF:

pip install -r src/summarization_tf/requirements.txt

CodeSumDRL

First, create a virtual environment and activate it:

python -m venv code_sum_drl
source code_sum_drl/bin/activate

Then, install the dependencies for CodeSumDRL:

pip install -r src/code_sum_drl/requirements.txt
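The four per-model setups above follow the same pattern, so they can be generated from a single table. The sketch below is illustrative and not part of the repository: it only prints the commands rather than running them, and it assumes a POSIX layout in which each environment's pip lives at `<env>/bin/pip`, which avoids activating the environment first. The names `MODEL_ENVS` and `setup_commands` are hypothetical.

```python
# Environment names and requirements paths as listed in the sections above.
MODEL_ENVS = {
    "astnn": "src/astnn/requirements.txt",
    "funcgnn": "src/funcgnn/requirements.txt",
    "summarizationtf": "src/summarization_tf/requirements.txt",
    "code_sum_drl": "src/code_sum_drl/requirements.txt",
}


def setup_commands(env_name, requirements):
    """Return the shell commands that create one environment and install into it."""
    # Assumption: POSIX virtual environment layout (<env>/bin/pip).
    return [
        f"python -m venv {env_name}",
        f"{env_name}/bin/pip install -r {requirements}",
    ]


if __name__ == "__main__":
    for env_name, requirements in MODEL_ENVS.items():
        for command in setup_commands(env_name, requirements):
            print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the output can be reviewed and pasted into a shell.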

Training The Models

In order to train the models, you need to download the dataset. Each model has its own dataset. The datasets can be downloaded from the following links:

After downloading the datasets, put them in the dataset directory at the root of each model's source directory. Afterwards, you can train the models by following the instructions in each model's README file:

  • AST-NN: src/ast_nn/README.md
  • FuncGNN: src/funcgnn/README.md
  • SummarizationTF: src/summarization_tf/README.md
  • CodeSumDRL: src/code_sum_drl/README.md
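Before starting a training run, it can help to confirm that every dataset is in place. The following is a minimal sketch, not part of the repository: `missing_dataset_dirs` is a hypothetical helper, and it assumes the dataset folder is literally named `dataset` inside each model's source directory, as described above. The model directories are passed in explicitly so that the exact paths can be taken from the repository.

```python
from pathlib import Path


def missing_dataset_dirs(model_dirs, dataset_dirname="dataset"):
    """Return the model directories that lack a non-empty dataset folder."""
    missing = []
    for model_dir in model_dirs:
        dataset = Path(model_dir) / dataset_dirname
        # A folder that is absent or empty is treated as "not downloaded yet".
        if not dataset.is_dir() or not any(dataset.iterdir()):
            missing.append(str(model_dir))
    return missing
```

Calling `missing_dataset_dirs(["src/funcgnn", "src/summarization_tf"])` from the repository root returns the subset of those directories that still need data; an empty list means all of them are populated.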

Training The Probes

After training the models, you can train the probes by running the following command:

python src/probe_model.py --model {model_name}

Here, {model_name} is the name of the model whose probe you want to train. Note that each model requires a different probe configuration; the configurations for each model are outlined in probe_model.py.

Reproducing The Validation Results

After training the probes and the models, you can reproduce the validation results by running the following command:

python src/validate_probe.py --model {model_name}

Here, {model_name} is the name of the model whose probe you want to evaluate. As with training the probes, each model requires a different probe configuration; the configurations for each model are outlined in validate_probe.py.
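Probe training and validation take the same `--model` argument, so the two steps can be chained for a given model. The sketch below is an assumption-laden illustration, not part of the repository: `probe_commands` and `run_pipeline` are hypothetical names, the accepted values of `--model` are not listed here and must be taken from probe_model.py, and `sys.executable` stands in for `python` so that the currently active interpreter is used.

```python
import subprocess
import sys


def probe_commands(model_name):
    """Build the two commands documented above: train the probe, then validate it."""
    return [
        [sys.executable, "src/probe_model.py", "--model", model_name],
        [sys.executable, "src/validate_probe.py", "--model", model_name],
    ]


def run_pipeline(model_name, runner=subprocess.run):
    """Run both steps in order, stopping at the first one that fails."""
    for command in probe_commands(model_name):
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so validation never runs after failed training.
        runner(command, check=True)
```

Run it from the repository root so that the relative `src/...` paths resolve. The `runner` parameter exists only so the sequence can be exercised without launching real processes.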
