Huawei-TTE/MCeT

Directory structure

├── README.md # This file
└── model_evaluation/ # The model evaluation module
        └── model_evaluation/
            ├── prompts/ # Prompts used by the evaluation
            ├── llm_correctness/ # Code
            ├── Ferrari/ # Sequence diagrams from the paper "Model Generation with LLMs: From Requirements to UML Sequence Diagrams"
            ├── config_mcet_seq.json # Template to configure the tool
            └── mcet_sequence.py # Main script

Installation

Create and activate a new Python virtual environment:

python3 -m venv myvenv
source myvenv/bin/activate

Install the module from the model_evaluation directory:

cd model_evaluation
pip install -e .

This script was developed and tested on Windows Subsystem for Linux (WSL) and uses Linux paths; it has not been tested on native Windows.

Configuring the tool

The tool is configured through a JSON configuration file: config_mcet_seq.json

Copy it first, then add your API keys and proxy URLs. Do not commit your config file if it contains API keys or proxy URLs.

Example:

cp config_mcet_seq.json config_mcet_seq_keys.json
echo config_mcet_seq_keys.json >> .gitignore

Here's a breakdown of the configuration options:

LLM Configuration

  • generation_config:
    • server: Specifies the URL of the server hosting the underlying LLM.

    • model: Specifies the name of the language model served by that server.

    • llm_config:

      • temperature: Controls the randomness of the model's output (range: 0-1).
      • top_p: The cumulative probability mass of top tokens considered for generation, i.e. nucleus sampling (range: 0-1).

API Keys and Proxies

  • api_key: API key for the LLM server.
  • server_proxy:
    • http: Specifies a proxy, if needed, in the form http://[name]:[password]@[proxy_url]:[port]

Advanced options

Experiment Settings

Configures the prompts used in the evaluation.

  • prompts: Paths to JSON files containing prompts for each experiment.

Execution Settings

  • num_votes: Number of votes required for consensus in the relations evaluation.
  • num_retries: Number of retries allowed for failed LLM operations.
  • parallel: Enables or disables parallel LLM calls (Boolean).
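The voting and retry semantics can be sketched as follows. This is an assumption based on the option descriptions above, not the tool's actual implementation; consensus_vote and ask_llm are hypothetical names.

```python
from collections import Counter

def consensus_vote(ask_llm, num_votes: int, num_retries: int):
    """Assumed semantics: collect num_votes answers from the LLM,
    retrying each failed call up to num_retries times, then return
    the majority answer."""
    answers = []
    for _ in range(num_votes):
        for attempt in range(num_retries + 1):
            try:
                answers.append(ask_llm())
                break
            except RuntimeError:
                # Give up only once the retry budget is exhausted.
                if attempt == num_retries:
                    raise
    # Majority vote over the collected answers.
    return Counter(answers).most_common(1)[0][0]

# Example with a stubbed LLM call that always answers "yes":
print(consensus_vote(lambda: "yes", num_votes=3, num_retries=2))  # -> yes
```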

To modify the configuration, simply edit this JSON file with the desired settings.

Note: Ensure that API keys and proxy URLs are kept secure and not exposed in public repositories.
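For orientation, here is a sketch of what a filled-in config might look like. The key names come from the breakdown above, but the exact nesting and value types (e.g., whether prompts is a list) are assumptions; all URLs, model names, and keys are placeholders.

```json
{
  "generation_config": {
    "server": "https://llm.example.com/v1",
    "model": "my-model-name",
    "llm_config": {
      "temperature": 0.2,
      "top_p": 0.9
    }
  },
  "api_key": "YOUR_API_KEY",
  "server_proxy": {
    "http": "http://user:password@proxy.example.com:8080"
  },
  "prompts": ["prompts/experiment1.json"],
  "num_votes": 3,
  "num_retries": 2,
  "parallel": true
}
```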

Running the tool

Run the tool as follows:

cd model_evaluation/model_evaluation/
python correctness_property.py -r [requirement text] -p [plantuml model] -d [output directory] --config_path [config file]
  • The -r argument is the relative path to the file containing the requirement text; it must have a .txt extension.

  • The -p argument is the relative path to the PlantUML model file; it must have a .puml extension.

  • The -d argument is the relative path to the directory that will store the output; if it does not exist, the script creates it.

  • The --config_path argument is the path to the configuration file.

All of these paths are relative to the working directory (the output of the pwd command).

The final results are saved in [output directory]/[requirement text name].json

Example run:

cd model_evaluation/model_evaluation/
python correctness_property.py -r Ferrari/1.autopilot.v0.txt -p Ferrari/1.autopilot.v0.puml -d out_seq --config_path config_0.3.1_seq_template.json

The script runs and stores a log in out_seq/out.log

All issues detected in the sequence diagram are saved in out_seq/1.autopilot.v0.txt_result.json

The output for sequence diagrams is a set of explanations of issues found by each check within the tool (e.g., message checking).
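The result file's exact schema is not documented here; as a sketch of post-processing it, assuming the JSON maps each check name (e.g., message checking) to a list of issue explanations:

```python
import json

# Hypothetical result payload; the real schema produced by
# correctness_property.py may differ.
sample = json.loads(
    '{"message_check": ["Message m1 has no matching requirement"],'
    ' "lifeline_check": []}'
)

def count_issues(results: dict) -> int:
    """Total number of issue explanations across all checks."""
    return sum(len(v) for v in results.values() if isinstance(v, list))

print(count_issues(sample))  # -> 1
```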

About

Behavioral Model Correctness Evaluation using Large Language Models
