```
├── README.md                 # This file
└── model_evaluation/         # The model evaluation module
    └── model_evaluation/
        ├── prompts/              # Prompts used by the evaluation
        ├── llm_correctness/      # Code
        ├── Ferrari/              # Sequence diagrams from the paper "Model Generation with LLMs: From Requirements to UML Sequence Diagrams"
        ├── config_mcet_seq.json  # Template to configure the tool
        └── mcet_sequence.py      # Main script
```
Create and activate a new Python virtual environment:

```bash
python3 -m venv myvenv
source myvenv/bin/activate
```

Install the module in the `model_evaluation` directory:

```bash
cd model_evaluation
pip install -e .
```

This script was developed and tested on Windows Subsystem for Linux (WSL) and uses Linux paths; it has not been tested on Windows.
The tool is configured through a JSON configuration file: `config_mcet_seq.json`.
Copy it first and add your API keys and proxy URLs. Do not commit your config files if they contain API keys or proxy URLs.
Example:

```bash
cp config_mcet_seq.json config_mcet_seq_keys.json
echo config_mcet_seq_keys.json >> .gitignore
```

Here's a breakdown of the configuration options:
### LLM Configuration

- `generation_config`:
  - `server`: The URL of the server where the underlying LLM is hosted.
  - `model`: The name of the language model served by that server.
  - `llm_config`:
    - `temperature`: Controls the randomness of the model's output (range: 0-1).
    - `top_p`: Specifies the proportion of top tokens to consider for generation (range: 0-1).
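Putting these options together, the `generation_config` block of the file might look like the following sketch; the server URL, model name, and sampling values are illustrative placeholders, not defaults shipped with the template:

```json
{
  "generation_config": {
    "server": "https://llm.example.com/v1",
    "model": "example-model-name",
    "llm_config": {
      "temperature": 0.2,
      "top_p": 0.9
    }
  }
}
```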
### API Keys and Proxies

- `api_key`: API key for the LLM server.
- `server_proxy`:
  - `http`: Specify a proxy if needed, in the form `http://[name]:[password]@[proxy_url]:[port]`.
  - `https`: Specify a proxy if needed, in the same form.
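For example, a minimal sketch of this part of the file, assuming a hypothetical user, password, and proxy host:

```json
{
  "api_key": "YOUR_API_KEY",
  "server_proxy": {
    "http": "http://alice:secret@proxy.example.com:8080",
    "https": "http://alice:secret@proxy.example.com:8080"
  }
}
```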
### Advanced Options

#### Experiment Settings

Configures the prompts used in the evaluation.

- `prompts`: Paths to JSON files containing prompts for each experiment (see the sketch after the Execution Settings list).

#### Execution Settings

- `num_votes`: Number of votes required for consensus in the relations evaluation.
- `num_retries`: Number of retries allowed for failed LLM operations.
- `parallel`: Enables or disables parallel LLM calls (Boolean).
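A combined sketch of the experiment and execution settings; the experiment name and prompt file path under `prompts` are hypothetical, while the remaining keys follow the options above:

```json
{
  "prompts": {
    "message_check": "prompts/message_check.json"
  },
  "num_votes": 3,
  "num_retries": 2,
  "parallel": true
}
```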
To modify the configuration, simply edit this JSON file with the desired settings.
Note: Ensure that API keys and proxy URLs are kept secure and not exposed in public repositories.
Run the tool as follows:

```bash
cd model_evaluation/model_evaluation/
python correctness_property.py -r [requirement text] -p [plantuml model] -d [output directory] --config_path [config file]
```
- The `-r` argument is the relative path to the file containing the requirement text; it should have a `.txt` extension.
- The `-p` argument is the relative path to the PlantUML model file; it should have a `.puml` extension.
- The `-d` argument is the relative path to the directory that will store the output; if it does not exist, the script will create it.
- For `--config_path`, provide the configuration file.

All of these paths are relative to the working directory (the output of the `pwd` command).
The final results are saved in `[output directory]/[requirement text name].json`.
```bash
cd model_evaluation/model_evaluation/
python correctness_property.py -r Ferrari/1.autopilot.v0.txt -p Ferrari/1.autopilot.v0.puml -d out_seq --config_path config_0.3.1_seq_template.json
```

The script runs and stores a log in `out_seq/out.log`.
All issues detected in the sequence diagram are saved in `out_seq/1.autopilot.v0.txt_result.json`.

The output for sequence diagrams is a set of explanations of issues found by each check within the tool (e.g., message checking).