
Code Smells in AI4SE

Welcome! This repository contains the source code and experimental results for our paper, "A Causal Perspective on Measuring, Explaining and Mitigating Smells in LLM-Generated Code".
Our work investigates the prevalence of code smells in code generated by large language models (LLMs), provides insights into their causes, and explores strategies for mitigation. Here you will find all materials necessary to reproduce our analyses and findings.


Prerequisites

To use GPU for predictions, ensure you have PyTorch installed and a compatible GPU available.

Check your GPU setup with:

```shell
nvidia-smi
```

Example output:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 12.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:61:00.0 Off |                    0 |
| N/A   33C    P0    34W / 250W |      4MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
```
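Since predictions run through PyTorch, it is also worth confirming that the GPU is visible from Python itself, not just from the driver. A minimal sketch (it assumes nothing beyond an optional PyTorch install and falls back to CPU):

```python
# Minimal sanity check: does PyTorch see a usable GPU?
# Falls back to CPU if PyTorch is missing or no GPU is available.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"  # PyTorch not installed; predictions would run on CPU

print(f"Predictions will run on: {device}")
```

If this prints `cpu` while `nvidia-smi` shows a GPU, the installed PyTorch build was likely compiled without CUDA support.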

Installation

  1. Create a virtual environment (using conda, mamba, or virtualenv) and activate it:

     ```shell
     mamba create -n code-smell-env
     conda activate code-smell-env
     ```

  2. Navigate to the project base path:

     ```shell
     cd CodeSmells
     ```

  3. Install dependencies:

     ```shell
     pip install .
     ```

  4. Compile and install the code_smell_lib using nbdev:

     ```shell
     cd code_smell_lib
     nbdev_export
     pip install .
     ```

Note: Some dependencies may not be included in requirements.txt. Install them manually if prompted.


General Instructions

  • The dataset folder contains the datasets (CodeSmellData) used for our experiments.
  • The notebooks folder contains Jupyter notebooks for each research question.

Notebook Overview

| Research Question | Subfolder | Notebook File | Function/Explanation |
| --- | --- | --- | --- |
| RQ1 (Measure) | baseline | 01_information_gain.ipynb | Computes information gain for baseline model comparisons. |
| RQ1 (Measure) | robustness_generation | 05_1_data_engineering-CodeLlama.ipynb | Processes logits for robustness generation experiments (CodeLlama model). |
| RQ1 (Measure) | robustness_generation | 06_alignment_and_aggregation.ipynb | Aggregates and aligns logits for robustness generation experiments. |
| RQ1 (Measure) | robustness_transformations | 04_1_data_engineering-CodeLlama.ipynb | Processes logits for robustness transformation experiments (CodeLlama model). |
| RQ1 (Measure) | robustness_transformations | 05_alignment_and_aggregation.ipynb | Aggregates and aligns logits for robustness transformation experiments. |
| RQ2 (Explain) | causal_analysis | 01_dataset_preprocessing.ipynb | Preprocesses datasets for causal analysis experiments. |
| RQ2 (Explain) | causal_analysis | 02_causal_analysis.ipynb | Performs causal analysis to identify relationships between code smells and model outputs. |
| RQ2 (Explain) | causal_analysis | 03_result_analysis.ipynb | Analyzes and visualizes results from causal analysis. |
| RQ2 (Explain) | prompting | 04_alignment_and_aggregation.ipynb | Aggregates and aligns logits for PSC computation in prompting experiments. |
| RQ3 (Mitigation) | mitigation | 01_dataset_preparation.ipynb | Prepares datasets for mitigation experiments. |
| RQ3 (Mitigation) | mitigation | 02_extractor_CausalLM.ipynb | Extracts logits from CausalLM models for mitigation analysis. |
| RQ3 (Mitigation) | mitigation | 03_2_data_engineering-CausalLM.ipynb | Processes extracted logits for mitigation experiments. |
| RQ3 (Mitigation) | mitigation | 04_alignment_and_aggregation.ipynb | Aggregates and aligns logits to compute Propensity Smelly Score (PSC) for mitigation. |
| RQ3 (Mitigation) | mitigation | 05_analysis.ipynb | Performs statistical analysis and visualization for mitigation results. |
| RQ4 (Pipeline) | pipeline | 01_extractor_CausalLM.ipynb | Extracts logits from CausalLM models for pipeline experiments. |
| RQ4 (Pipeline) | pipeline | 02_2_data_engineering-CausalLM.ipynb | Processes logits for pipeline experiments. |
| RQ4 (Pipeline) | pipeline | 03_alignment_and_aggregation.ipynb | Aggregates and aligns logits for PSC computation in pipeline experiments. |
| RQ4 (Survey) | survey | result_analysis.ipynb | Analyzes survey results related to code smell perceptions. |
| All RQs | extension | models.md | Documents the models used in all experiments. |
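The information-gain computation in 01_information_gain.ipynb follows the standard entropy-reduction formulation; the notebook defines the exact variant used for the baseline comparisons. As a rough illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a label sequence."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction from partitioning `labels` into `groups`.
    Illustrative sketch only; see the notebook for the exact
    formulation used in the baseline comparisons."""
    total = len(labels)
    return entropy(labels) - sum(len(g) / total * entropy(g) for g in groups)

# A perfectly separating split recovers the full entropy of the labels:
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
```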

Subfolder Explanations

  • baseline: Notebooks for baseline model comparisons and metrics.
  • causal_analysis: Notebooks for dataset preprocessing, causal analysis, and result visualization.
  • mitigation: Notebooks for preparing data, extracting logits, engineering features, aggregating results, and analyzing mitigation strategies.
  • pipeline: Notebooks for extracting, processing, and aggregating logits in pipeline experiments.
  • prompting: Notebooks for aggregation and analysis in prompting experiments.
  • robustness_generation: Notebooks for robustness generation experiments, including data engineering and aggregation.
  • robustness_transformations: Notebooks for robustness transformation experiments, including data engineering and aggregation.
  • survey: Notebooks for analyzing survey data on code smell perceptions.

code_smells_lib/nbs

This folder contains the nbdev source notebooks that implement and document the main components of the code_smells_lib library. Each notebook focuses on a specific aspect of code smell detection and analysis:

  • 00_ast_utils.ipynb: Utility functions for parsing and analyzing Python Abstract Syntax Trees (ASTs), essential for identifying structural code smells.
  • 01_pos_tagging.ipynb: Implements part-of-speech tagging for code tokens, supporting advanced code analysis and smell detection.
  • 02_smell_detectors.ipynb: Core logic for detecting various code smells, including heuristics and rule-based approaches.
  • 03_metrics.ipynb: Defines metrics for quantifying code smells and evaluating code quality.
  • 04_data_processing.ipynb: Functions for loading, cleaning, and transforming code datasets used in experiments.
  • 05_visualization.ipynb: Tools for visualizing code smell distributions and analysis results.
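To give a flavor of the AST-based structural checks in 00_ast_utils.ipynb, here is a hypothetical sketch (not taken from code_smell_lib) that flags a classic structural smell, the long parameter list, using Python's standard `ast` module:

```python
import ast

def long_parameter_lists(source, max_params=5):
    """Flag functions whose parameter count exceeds `max_params`.
    Hypothetical example of an AST-based structural smell check;
    not the library's actual detector."""
    tree = ast.parse(source)
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            count = len(node.args.args) + len(node.args.kwonlyargs)
            if count > max_params:
                findings.append((node.name, count))
    return findings

snippet = "def widget(a, b, c, d, e, f):\n    return a"
print(long_parameter_lists(snippet))  # [('widget', 6)]
```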

Each notebook is designed for literate programming: code cells implement functionality, while markdown cells explain usage and design decisions. The code is exported to Python modules using nbdev_export, ensuring that documentation and implementation remain synchronized.


scripts

The scripts folder contains shell-executable versions of the main analysis pipelines found in the notebooks. Each subfolder corresponds to a research question or experimental setting (e.g., causal_analysis, mitigation, pipeline, prompting, robustness_generation, robustness_transformations) and includes scripts and configuration files tailored for batch execution.

Key points:

  • Purpose: These scripts automate the execution of data preprocessing, model inference, code smell detection, and result aggregation steps, making it easy to reproduce experiments from the command line.
  • Structure: The folder structure mirrors the organization of the main notebooks, with each subfolder containing the relevant Python scripts and shell scripts (run_script.sh) for its experimental stage.
  • Usage: To run an experiment, navigate to the desired subfolder and execute the provided shell script. Logs and outputs are saved in dedicated directories for easy inspection.

This setup enables reproducible, large-scale experiments and is ideal for running analyses on remote servers or clusters.


survey

The survey folder contains materials and analysis related to the user study conducted for our research. This study investigates human perceptions of code smells in generated code.

  • control.pdf and treatment.pdf: These files contain the survey forms shown to participants. The control version presents code snippets without explicit code smell annotations, while the treatment version includes additional information or highlighting related to code smells.
  • result_analysis.ipynb: This notebook performs statistical analysis of the survey responses. It processes Likert-scale ratings from participants, compares control and treatment groups, and applies significance tests (e.g., Mann-Whitney U) to assess the impact of code smell annotations on user judgments.

Use this folder to reproduce the survey analysis and explore how code smell explanations affect developer perceptions.
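The Mann-Whitney U statistic used to compare the control and treatment groups can be sketched in plain Python. This is an illustrative sketch only (the notebook presumably uses a statistics library for the full test, including the p-value, which is omitted here):

```python
def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic for sample_a vs. sample_b.
    Ties receive average ranks. Illustrative sketch; omits the
    p-value computation done in result_analysis.ipynb."""
    values = list(sample_a) + list(sample_b)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values, then assign the average rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_a = len(sample_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2

# Completely separated groups give an extreme U of 0:
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0
```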


causation plots

The causation plots folder contains additional visualizations that were excluded from the main paper but provide valuable insights into the robustness of the Propensity Smelly Score (PSC) for each type of code smell. These plots illustrate how PSC and related metrics behave under various robustness experiments, helping to further understand the stability and reliability of code smell detection across different scenarios.

  • The plots are organized by robustness experiment and include both mean and median statistics for actual and maximum probabilities associated with code smells.
  • You will find visualizations such as code_smell_actual_prob_mean, code_smell_actual_prob_median, code_smell_max_prob_mean, and code_smell_max_prob_median in PDF and PNG formats.
  • These results can be used to explore the sensitivity of PSC to code transformations and generation settings, offering deeper context for interpreting the main findings.

Researchers interested in the detailed behavior of PSC under robustness conditions can refer to these plots for supplementary analysis.
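For readers unfamiliar with the metric, PSC aggregates the per-token probabilities a model assigns to code flagged as smelly; the exact definition lives in the alignment-and-aggregation notebooks. As a purely hypothetical sketch of one plausible aggregation (a length-normalized geometric mean, not necessarily the paper's formulation):

```python
import math

def psc_sketch(token_logprobs):
    """Hypothetical PSC-style aggregation: geometric mean of the
    per-token probabilities assigned to a smelly snippet. The exact
    definition used in the paper is in the
    *_alignment_and_aggregation notebooks."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_logprob)

# A uniform per-token probability of 0.5 yields a score of ~0.5:
print(psc_sketch([math.log(0.5)] * 4))
```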
