diff --git a/burn-algorithms/COCOS.md b/burn-algorithms/COCOS.md index 9fe8a6d..7c759c0 100644 --- a/burn-algorithms/COCOS.md +++ b/burn-algorithms/COCOS.md @@ -6,6 +6,13 @@ This folder contains some machine learning examples to help you get started with The following documentation helps you to run the examples in the enclave. +## Table of Contents +- [Prerequisites](#prerequisites) +- [Setup](#setup) +- [Running Examples with CVMS](#running-examples-with-cvms) +- [WASM Module Example](#wasm-module-example) +- [Notes](#notes) + ## Prerequisites - Git @@ -17,302 +24,253 @@ The following documentation helps you to run the examples in the enclave. - [Rust](https://www.rust-lang.org/tools/install) - [Go](https://golang.org/doc/install) -### Build qemu images +## Setup -Clone the cocos and buildroot repositories. +### Build QEMU Images -```bash -git clone https://github.com/ultravioletrs/cocos.git -``` +1. **Clone the repositories:** -prepare cocos directory. - -```bash -mkdir -p cocos/cmd/manager/img && mkdir -p cocos/cmd/manager/tmp -``` - -```bash -git clone https://github.com/buildroot/buildroot.git -``` + ```bash + git clone https://github.com/ultravioletrs/cocos.git + git clone https://github.com/buildroot/buildroot.git + ``` -Change the directory to buildroot. +2. **Prepare cocos directory:** -```bash -cd buildroot -``` + ```bash + mkdir -p cocos/cmd/manager/img && mkdir -p cocos/cmd/manager/tmp + ``` -Build the cocos qemu image. +3. **Build the cocos qemu image:** -```bash -make BR2_EXTERNAL=../cocos/hal/linux cocos_defconfig -``` + ```bash + cd buildroot + make BR2_EXTERNAL=../cocos/hal/linux cocos_defconfig + make -j4 && cp output/images/bzImage output/images/rootfs.cpio.gz ../cocos/cmd/manager/img + ``` -```bash -make -j4 && cp output/images/bzImage output/images/rootfs.cpio.gz ../cocos/cmd/manager/img -``` + The above commands will build the cocos qemu image and copy the kernel image and rootfs to the manager image directory. 
-The above commands will build the cocos qemu image and copy the kernel image and rootfs to the manager image directory.
+4. **Generate key pair:**

-### Get Key Pair
+   ```bash
+   cd ../cocos
+   make all
+   ./build/cocos-cli keys -k="rsa"
+   ```

-Generates a new public/private key pair using an algorithm of the users choice(This happens in the cocos directory).
+### Build Example Algorithm

-If you are not in the cocos directory, change the directory to cocos.
+For the addition example, build the addition algorithm:

```bash
+cd burn-algorithms
+cargo build --release --bin addition --features cocos
+cp target/release/addition ../cocos/
cd ../cocos
```

-```bash
-make cli
-```
+## Running Examples with CVMS

-```bash
-./build/cocos-cli keys -k="rsa"
-```
+The modern approach uses the CVMS (Computation Management Server) for streamlined workflow management.

-## Running the examples
+### Finding Your Host IP Address

-Start computation server(this happens in the cocos directory).
+Find your host machine's IP address (avoid using localhost):

```bash
-go run ./test/computations/main.go public.pem false
+ip a
```

-For the addition example we can build the addition algorithm(this happens in the burn-algorithms directory).
+Look for your network interface (e.g., wlan0 for WiFi, eth0 for Ethernet) and note the inet address. For example, if you see `192.168.1.100`, use that as your `<HOST_IP>`.

-```bash
-cargo build --release --bin addition --features cocos
-```
+### Start Core Services

-Copy the built binary from `/target/release/addition` to the directoy where you will run the computation server.
+#### Start the Computation Management Server (CVMS)

-```bash
-cp target/release/addition ../../cocos
-```

-For example, to run the `addition` algorithm, run the following command (this happens in the cocos directory). Since the addition algorithm does not require any dataset, the dataset path is empty.
+From your cocos directory, start the CVMS server with the addition algorithm:

```bash
-go run ./test/computations/main.go ./addition public.pem false
+HOST=<HOST_IP> go run ./test/cvms/main.go -algo-path ./addition -public-key-path public.pem -attested-tls-bool false
```

-Start the manager.
+Expected output:

```bash
-cd cmd/manager
+{"time":"...","level":"INFO","msg":"cvms_test_server service gRPC server listening at :7001 without TLS"}
```

-The manager requires the vhost_vsock kernel module to be loaded. Load the module with the following command.
+#### Start the Manager

-```bash
-sudo modprobe vhost_vsock
-```
+Navigate to the cocos/cmd/manager directory and start the Manager:

```bash
+cd cmd/manager
sudo \
MANAGER_QEMU_SMP_MAXCPUS=4 \
-MANAGER_GRPC_URL=localhost:7001 \
+MANAGER_QEMU_MEMORY_SIZE=4G \
+MANAGER_GRPC_HOST=localhost \
+MANAGER_GRPC_PORT=7002 \
MANAGER_LOG_LEVEL=debug \
-MANAGER_QEMU_USE_SUDO=false \
-MANAGER_QEMU_ENABLE_SEV=false \
-MANAGER_QEMU_SEV_CBITPOS=51 \
MANAGER_QEMU_ENABLE_SEV_SNP=false \
MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \
MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \
go run main.go
```

-This will start on a specific port called `agent_port`, which will be in the manager logs.
-
-For example,

```bash
{"time":"2024-07-26T11:45:08.503149211+03:00","level":"INFO","msg":"manager_test_server service gRPC server listening at :7001 without TLS"}
{"time":"2024-07-26T11:45:14.827479501+03:00","level":"DEBUG","msg":"received who am on ip address [::1]:47936"}
received agent event
&{event_type:"vm-provision" timestamp:{seconds:1721983514 nanos:832365721} computation_id:"1" originator:"manager" status:"starting"}
received agent event
&{event_type:"vm-provision" timestamp:{seconds:1721983514 nanos:833034946} computation_id:"1" originator:"manager" status:"in-progress"}
received agent log
&{message:"char device redirected to /dev/pts/15 (label compat_monitor0)\n" computation_id:"1" level:"debug" timestamp:{seconds:1721983514 nanos:849595083}}
received agent log
&{message:"\x1b[2J\x1b[0" computation_id:"1" level:"debug" timestamp:{seconds:1721983515 nanos:215753406}}
received agent event
&{event_type:"vm-provision" timestamp:{seconds:1721983527 nanos:970098872} computation_id:"1" originator:"manager" status:"complete"}
received runRes
&{agent_port:"43045" computation_id:"1"}
received agent log
&{message:"Transition: receivingManifest -> receivingManifest\n" computation_id:"1" level:"DEBUG" timestamp:{seconds:1721983527 nanos:966911139}}
received agent log
```

-## Uploading the algorithm and dataset
+### Create CVM and Upload Algorithm

-```bash
-export AGENT_GRPC_URL=localhost:43045
-```
+#### Create CVM

-Upload the algorithm to the enclave.
+From your cocos directory:

```bash
-./build/cocos-cli algo ./addition ./private.pem
+export MANAGER_GRPC_URL=localhost:7002
+./build/cocos-cli create-vm --log-level debug --server-url "<HOST_IP>:7001"
```

-Upload the dataset to the enclave.
Since this algorithm does not require a dataset, we can skip this step.
+**Important:** Note the id and port from the cocos-cli output.

-```bash
-./build/cocos-cli dataset ./private.pem
-```
-
-## Downloading the results
-
-After the computation has been completed, you can download the results from the enclave.
+Expected output:

```bash
-./build/cocos-cli result ./private.pem
+🔗 Connected to manager using without TLS
+🔗 Creating a new virtual machine
+✅ Virtual machine created successfully with id <cvm_id> and port <agent_port>
```

-This will generate a `result.zip` file in the current directory. Unzip the file to get the result.
+#### Export Agent gRPC URL
+
+Set the AGENT_GRPC_URL using the port noted in the previous step:

```bash
-unzip result.zip
+export AGENT_GRPC_URL=localhost:<agent_port>
```

-For the addition example, we can read the output from the `result.txt` file which contains the result of the addition algorithm. This file is gotten from the `result.zip` file.
+#### Upload Addition Algorithm

-To read the result, run the following command.
+From your cocos directory:

```bash
-cat result.txt
+./build/cocos-cli algo ./addition ./private.pem
```

-You can also build the addition algorithm with the `read` feature to read the result from the enclave.
+Expected output:

```bash
-cargo build --release --bin addition --features read
+🔗 Connected to agent without TLS
+Uploading algorithm file: ./addition
+🚀 Uploading algorithm [████████████████████████████████████████████████████████████████] [100%]
+Successfully uploaded algorithm! ✔
```

-Run the binary with the `result.txt` file as an argument.
-
-```bash
-./target/release/addition ../../cocos/results.txt
-```
+#### Upload Dataset (if required)

-This will output the result of the addition algorithm.
+For algorithms that require datasets:

```bash
-"[5.141593, 4.0, 5.0, 8.141593]"
+./build/cocos-cli data <dataset_path> ./private.pem
```

-Terminal recording of the above steps:
-
-[![asciicast](https://asciinema.org/a/LzH6RLi1r69hYBhx3qaOIn4ER.svg)](https://asciinema.org/a/LzH6RLi1r69hYBhx3qaOIn4ER)
+**Note:** The addition example doesn't require a dataset, so this step can be skipped.

-## Wasm Module
+#### Download Results

-For the addition inference example we can build the addition algorithm(this happens in the burn-algorithms/addition-inference directory).
+After the computation completes:

```bash
-cargo build --release --target wasm32-wasip1 --features cocos
+./build/cocos-cli result ./private.pem
```

-Copy the built wasm module from `/target/wasm32-wasip1/release/addition_inference.wasm` to the directoy where you will run the computation server.
+Expected output:

```bash
-cp ../target/wasm32-wasip1/release/addition-inference.wasm ../../../cocos
+🔗 Connected to agent without TLS
+⏳ Retrieving computation result file
+📥 Downloading result [████████████████████████████████████████████████████████████████] [100%]
+Computation result retrieved and saved successfully as results.zip! ✔
```

-For example, to run the `addition-inference` algorithm, run the following command (this happens in the cocos directory). Since the addition-inference algorithm does not require any dataset, the dataset path is empty.
+#### Extract and View Results

```bash
-go run ./test/computations/main.go ./addition-inference.wasm public.pem true
+unzip results.zip
+cat results/result.txt
```

-Start the manager.
+For the addition example, you can also read the result using the built binary: ```bash -cd cmd/manager +cd burn-algorithms +cargo build --release --bin addition --features read +./target/release/addition ../cocos/results/result.txt ``` +Expected output: + ```bash -sudo \ -MANAGER_QEMU_SMP_MAXCPUS=4 \ -MANAGER_GRPC_URL=localhost:7001 \ -MANAGER_LOG_LEVEL=debug \ -MANAGER_QEMU_USE_SUDO=false \ -MANAGER_QEMU_ENABLE_SEV=false \ -MANAGER_QEMU_SEV_CBITPOS=51 \ -MANAGER_QEMU_ENABLE_SEV_SNP=false \ -MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ -MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ -go run main.go +"[5.141593, 4.0, 5.0, 8.141593]" ``` -This will start on a specific port called `agent_port`, which will be in the manager logs. - -For example, +#### Remove CVM -```bash -{"time":"2024-08-06T10:54:53.42640029+03:00","level":"INFO","msg":"manager_test_server service gRPC server listening at :7001 without TLS"} -{"time":"2024-08-06T10:54:55.953576985+03:00","level":"DEBUG","msg":"received who am on ip address [::1]:50528"} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1722930895 nanos:957553381} computation_id:"1" originator:"manager" status:"starting"} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1722930895 nanos:958021704} computation_id:"1" originator:"manager" status:"in-progress"} -received agent log -&{message:"char device redirected to /dev/pts/10 (label compat_monitor0)\n" computation_id:"1" level:"debug" timestamp:{seconds:1722930896 nanos:39152844}} -received agent log -&{message:"\x1b[" computation_id:"1" level:"debug" timestamp:{seconds:1722930898 nanos:319429985}} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1722930911 nanos:886580521} computation_id:"1" originator:"manager" status:"complete"} -received runRes -&{agent_port:"46593" computation_id:"1"} -received agent log -&{message:"Transition: receivingManifest -> receivingManifest\n" 
computation_id:"1" level:"DEBUG" timestamp:{seconds:1722930911 nanos:859764366}}
received agent event
```

Use the `<cvm_id>` obtained during CVM creation:

```bash
-export AGENT_GRPC_URL=localhost:46593
+./build/cocos-cli remove-vm <cvm_id>
```

-Upload the algorithm to the enclave. The `-a wasm` flag is used to specify that the algorithm is a wasm module.
+Expected output:

```bash
-./build/cocos-cli algo ./addition-inference.wasm ./private.pem -a wasm
+🔗 Connected to manager using without TLS
+🔗 Removing virtual machine
+✅ Virtual machine removed successfully
```

-Upload the dataset to the enclave. Since this algorithm does not require a dataset, we can skip this step.
+## WASM Module Example

-```bash
-./build/cocos-cli dataset ./private.pem
-```
+### Build WASM Module

-After the computation has been completed, you can download the results from the enclave.
+For the addition inference example:

```bash
-./build/cocos-cli result ./private.pem
+cd burn-algorithms/addition-inference
+cargo build --release --target wasm32-wasip1 --features cocos
+cp ../target/wasm32-wasip1/release/addition-inference.wasm ../../../cocos/
+cd ../../../cocos
```

-This will generate a `result.zip` file in the current directory. Unzip the file to get the result.
+### Run with CVMS
+
+Start the CVMS server with the WASM module:

```bash
-unzip result.zip
+HOST=<HOST_IP> go run ./test/cvms/main.go -algo-path ./addition-inference.wasm -public-key-path public.pem -attested-tls-bool false
```

-For the addition example, we can read the output from the `results/results.txt` file which contains the result of the addition algorithm. This file is gotten from the `result.zip` file.
-
-To read the result, run the following command.
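Both the binary and the WASM flows require copying the agent port from the create-vm output into `AGENT_GRPC_URL` by hand. A small shell helper can pull it out of a captured log line instead — this is a hypothetical sketch, assuming the agent port is the last number printed on the CLI's "created successfully" line, which may not hold for other cocos-cli versions:

```shell
# extract_port: print the last number found on stdin.
# Hypothetical helper -- assumes the agent port is the final numeric
# field of the cocos-cli "created successfully" line.
extract_port() {
  grep -o '[0-9]\+' | tail -n 1
}

# Usage sketch (flags as in the steps above):
# PORT=$(./build/cocos-cli create-vm ... | grep 'created successfully' | extract_port)
# export AGENT_GRPC_URL=localhost:$PORT
```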
+Follow the same CVM creation and management steps as above, but upload the algorithm with the WASM flag: ```bash -cat results/results.txt +./build/cocos-cli algo ./addition-inference.wasm ./private.pem -a wasm ``` -Terminal recording of the above steps: +## Notes -[![asciicast](https://asciinema.org/a/vKnxV4A9HXloD8g1xJRvw7h3T.svg)](https://asciinema.org/a/vKnxV4A9HXloD8g1xJRvw7h3T) +- **Memory Requirements**: 4GB is sufficient for most basic examples; increase as needed for complex algorithms +- **WASM Support**: Both binary and WASM modules are supported, with WASM providing better portability +- **Security**: The enclave provides confidential computing capabilities for sensitive workloads +- **Datasets**: Not all algorithms require datasets; the addition example works without external data +- **Results Format**: Results are packaged in ZIP files and can contain multiple output files -## Conclusion +### Terminal Recordings -This documentation has shown how to run the addition algorithm in the enclave using the Cocos system. The addition algorithm was built as a binary target and a wasm module. The results were downloaded from the enclave and read to get the output of the addition algorithm. This process can be followed to run other algorithms in the enclave using the Cocos system. 
+- Binary example: [![asciicast](https://asciinema.org/a/LzH6RLi1r69hYBhx3qaOIn4ER.svg)](https://asciinema.org/a/LzH6RLi1r69hYBhx3qaOIn4ER) +- WASM example: [![asciicast](https://asciinema.org/a/vKnxV4A9HXloD8g1xJRvw7h3T.svg)](https://asciinema.org/a/vKnxV4A9HXloD8g1xJRvw7h3T) diff --git a/burn-algorithms/addition-inference/Cargo.toml b/burn-algorithms/addition-inference/Cargo.toml index 9f5f0f1..1a4d981 100644 --- a/burn-algorithms/addition-inference/Cargo.toml +++ b/burn-algorithms/addition-inference/Cargo.toml @@ -11,6 +11,6 @@ cocos = [] read = [] [dependencies] -burn = { version = "0.15.0", default-features = false, features = ["ndarray"] } +burn = { version = "0.16.0", default-features = false, features = ["ndarray"] } futures = "0.3.30" lib = { path = "../lib" } diff --git a/burn-algorithms/addition/Cargo.toml b/burn-algorithms/addition/Cargo.toml index d47042e..c274572 100644 --- a/burn-algorithms/addition/Cargo.toml +++ b/burn-algorithms/addition/Cargo.toml @@ -35,5 +35,5 @@ path = "src/main.rs" required-features = ["read"] [dependencies] -burn = { version = "0.15.0", features = ["ndarray", "wgpu"] } +burn = { version = "0.16.0", features = ["ndarray", "wgpu"] } lib = { path = "../lib" } diff --git a/burn-algorithms/agnews/Cargo.toml b/burn-algorithms/agnews/Cargo.toml index 82efeed..d793320 100644 --- a/burn-algorithms/agnews/Cargo.toml +++ b/burn-algorithms/agnews/Cargo.toml @@ -27,7 +27,7 @@ required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "fusion", "ndarray", diff --git a/burn-algorithms/agnews/src/training.rs b/burn-algorithms/agnews/src/training.rs index 3bd366a..d26617c 100644 --- a/burn-algorithms/agnews/src/training.rs +++ b/burn-algorithms/agnews/src/training.rs @@ -73,7 +73,8 @@ pub fn train( let lr_scheduler = NoamLrSchedulerConfig::new(config.learning_rate) .with_warmup_steps(1000) .with_model_size(config.transformer.d_model) - .init(); + .init() + 
.unwrap(); let learner = if cfg!(feature = "cocos") { LearnerBuilder::new(artifact_dir) diff --git a/burn-algorithms/cifar10/Cargo.toml b/burn-algorithms/cifar10/Cargo.toml index 306361a..0e5c785 100644 --- a/burn-algorithms/cifar10/Cargo.toml +++ b/burn-algorithms/cifar10/Cargo.toml @@ -26,14 +26,14 @@ path = "src/main.rs" required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "ndarray", "train", "vision", "wgpu", ] } -burn-common = "0.15.0" +burn-common = "0.16.0" serde = { version = "1.0.203", features = ["derive", "std"] } lib = { path = "../lib" } flate2 = "1.0.31" diff --git a/burn-algorithms/imdb/Cargo.toml b/burn-algorithms/imdb/Cargo.toml index 75a4c93..92f2305 100644 --- a/burn-algorithms/imdb/Cargo.toml +++ b/burn-algorithms/imdb/Cargo.toml @@ -26,7 +26,7 @@ path = "src/main.rs" required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "ndarray", "train", diff --git a/burn-algorithms/imdb/src/training.rs b/burn-algorithms/imdb/src/training.rs index 5a69e9f..0b333a0 100644 --- a/burn-algorithms/imdb/src/training.rs +++ b/burn-algorithms/imdb/src/training.rs @@ -76,7 +76,8 @@ pub fn train + 'static>( let lr_scheduler = NoamLrSchedulerConfig::new(config.learning_rate) .with_warmup_steps(1000) .with_model_size(config.transformer.d_model) - .init(); + .init() + .unwrap(); let learner = if cfg!(feature = "cocos") { LearnerBuilder::new(artifact_dir) diff --git a/burn-algorithms/iris-inference/Cargo.toml b/burn-algorithms/iris-inference/Cargo.toml index ffb6eb3..93c476b 100644 --- a/burn-algorithms/iris-inference/Cargo.toml +++ b/burn-algorithms/iris-inference/Cargo.toml @@ -10,7 +10,7 @@ description.workspace = true cocos = [] [dependencies] -burn = { version = "0.15.0", default-features = false, features = ["ndarray"] } +burn = { version = "0.16.0", default-features = false, features = 
["ndarray"] } futures = "0.3.30" serde = { version = "1.0.203", features = ["derive"] } serde_json = "1.0.120" diff --git a/burn-algorithms/iris/Cargo.toml b/burn-algorithms/iris/Cargo.toml index 4aca2eb..a273805 100644 --- a/burn-algorithms/iris/Cargo.toml +++ b/burn-algorithms/iris/Cargo.toml @@ -26,7 +26,7 @@ path = "src/main.rs" required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "ndarray", "train", diff --git a/burn-algorithms/lib/Cargo.toml b/burn-algorithms/lib/Cargo.toml index b7272b0..7017875 100644 --- a/burn-algorithms/lib/Cargo.toml +++ b/burn-algorithms/lib/Cargo.toml @@ -8,4 +8,4 @@ description.workspace = true publish = false [dependencies] -burn = { version = "0.15.0", default-features = false } +burn = { version = "0.16.0", default-features = false } diff --git a/burn-algorithms/mnist-inference/Cargo.toml b/burn-algorithms/mnist-inference/Cargo.toml index 98c1899..6b51dbd 100644 --- a/burn-algorithms/mnist-inference/Cargo.toml +++ b/burn-algorithms/mnist-inference/Cargo.toml @@ -10,7 +10,7 @@ description.workspace = true cocos = [] [dependencies] -burn = { version = "0.15.0", default-features = false, features = ["ndarray"] } +burn = { version = "0.16.0", default-features = false, features = ["ndarray"] } futures = "0.3.30" serde_json = "1.0.120" lib = { path = "../lib" } diff --git a/burn-algorithms/mnist/Cargo.toml b/burn-algorithms/mnist/Cargo.toml index a818be6..fc78e96 100644 --- a/burn-algorithms/mnist/Cargo.toml +++ b/burn-algorithms/mnist/Cargo.toml @@ -26,13 +26,13 @@ path = "src/main.rs" required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "ndarray", "train", "vision", "wgpu", ] } -burn-common = "0.15.0" +burn-common = "0.16.0" serde = { version = "1.0.203", features = ["derive", "std"] } lib = { path = "../lib" } diff --git 
a/burn-algorithms/winequality-inference/Cargo.toml b/burn-algorithms/winequality-inference/Cargo.toml index 6615c2b..6187844 100644 --- a/burn-algorithms/winequality-inference/Cargo.toml +++ b/burn-algorithms/winequality-inference/Cargo.toml @@ -10,7 +10,7 @@ description.workspace = true cocos = [] [dependencies] -burn = { version = "0.15.0", default-features = false, features = ["ndarray"] } +burn = { version = "0.16.0", default-features = false, features = ["ndarray"] } futures = "0.3.30" serde = { version = "1.0.203", features = ["derive"] } serde_json = "1.0.120" diff --git a/burn-algorithms/winequality/Cargo.toml b/burn-algorithms/winequality/Cargo.toml index 7747de7..8b6be06 100644 --- a/burn-algorithms/winequality/Cargo.toml +++ b/burn-algorithms/winequality/Cargo.toml @@ -26,7 +26,7 @@ path = "src/main.rs" required-features = ["cocos"] [dependencies] -burn = { version = "0.15.0", features = [ +burn = { version = "0.16.0", features = [ "dataset", "ndarray", "train", diff --git a/covid19/README.md b/covid19/README.md index aca276c..6cb3eb3 100644 --- a/covid19/README.md +++ b/covid19/README.md @@ -15,7 +15,7 @@ PyTorch can be installed from the [PyTorch website](https://pytorch.org/get-star For example, to install on a linux system with ROCm support, you can run: ```bash -pip3 install torch~=2.3.1 torchvision~=0.18.1 --index-url https://download.pytorch.org/whl/rocm6.0 +pip3 install torch~=2.6.0 torchvision~=0.21.0 --index-url https://download.pytorch.org/whl/rocm6.0 ``` ## Install @@ -48,7 +48,7 @@ python train.py ## Test Model -Inference can be done using `predict.py`. Anyfile in the `datasets/test` directory can be used for testing. +Inference can be done using `predict.py`. Any file in the `datasets/test` directory can be used for testing. ```bash python predict.py --model results/model.pth --image datasets/test/COVID/COVID-2.png @@ -56,83 +56,104 @@ python predict.py --model results/model.pth --image datasets/test/COVID/COVID-2. 
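Any file under `datasets/test` works for inference; to smoke-test on an arbitrary image rather than the fixed `COVID-2.png`, a small helper can pick one at random. This is a hypothetical convenience wrapper, assuming the directory layout described above:

```shell
# pick_test_image DIR: print a randomly chosen .png from DIR (recursively).
# Hypothetical helper for smoke-testing predict.py on arbitrary test images.
pick_test_image() {
  find "$1" -name '*.png' | shuf -n 1
}

# Usage sketch:
# python predict.py --model results/model.pth --image "$(pick_test_image datasets/test)"
```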
## Testing with Cocos -Make sure you have the Cocos repository cloned and eos buildroot installed. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) +Make sure you have the Cocos repository cloned and set up. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) -Clone the ai repository which has the COVID-19 model: +### Prerequisites -```bash -git clone https://github.com/ultravioletrs/ai.git -``` +1. **Clone the repositories:** -```bash -cd ai/covid19 -``` + ```bash + git clone https://github.com/ultravioletrs/cocos.git + git clone https://github.com/ultravioletrs/ai.git + ``` -Download the data from [COVID-19 Radiography Database](https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database) dataset. +2. **Navigate to the COVID-19 directory:** -You can install kaggle-cli to download the dataset: + ```bash + cd ai/covid19 + ``` -```bash -pip install kaggle -``` +3. 
**Download and prepare the dataset:**

   Install kaggle-cli to download the dataset:

-```bash
-kaggle datasets download -d tawsifurrahman/covid19-radiography-database
-```
+   ```bash
+   pip install kaggle
+   ```

-Prepare the dataset:
+   Set the [kaggle API key](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#api-credentials) and download the dataset:

-```bash
-python tools/prepare_datasets.py covid19-radiography-database.zip -d datasets
-```
+   ```bash
+   kaggle datasets download -d tawsifurrahman/covid19-radiography-database
+   ```

-Zip the folders:
+   Prepare the dataset:

-```bash
-zip -r datasets/h1.zip datasets/h1
-zip -r datasets/h2.zip datasets/h2
-zip -r datasets/h3.zip datasets/h3
-```
+   ```bash
+   python tools/prepare_datasets.py covid19-radiography-database.zip -d datasets
+   ```

-Change the directory to cocos:
+   Zip the hospital datasets for uploading:

-```bash
-cd ../../cocos
-```
+   ```bash
+   cd datasets
+   zip -r h1.zip h1
+   zip -r h2.zip h2
+   zip -r h3.zip h3
+   cd ..
+   ```

-Build cocos artifacts:
+4. **Change to cocos directory and build artifacts:**

-```bash
-make all
-```
+   ```bash
+   cd ../../cocos
+   make all
+   ```
+
+5. **Generate keys:**

-Before running the computation server, we need to issue certificates for the computation server and the client. This can be done by running the following commands:
+   ```bash
+   ./build/cocos-cli keys -k="rsa"
+   ```
+
+### Finding Your Host IP Address
+
+Find your host machine's IP address (avoid using localhost):

```bash
-./build/cocos-cli keys -k="rsa"
+ip a
```

-Run the computation server:
+Look for your network interface (e.g., wlan0 for WiFi, eth0 for Ethernet) and note the inet address. For example, if you see `192.168.1.100`, use that as your `<HOST_IP>`.
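If you prefer not to eyeball the `ip a` output, the first non-loopback IPv4 address can be extracted with a one-line awk filter. This is a hypothetical convenience, assuming the standard `ip -4 addr` output format:

```shell
# first_lan_ip: read `ip -4 addr`-style output on stdin and print the
# first inet address that is not in the loopback 127.0.0.0/8 range.
first_lan_ip() {
  awk '/inet /{ip=$2; sub(/\/.*/, "", ip); if (ip !~ /^127\./) {print ip; exit}}'
}

# Usage sketch:
# HOST=$(ip -4 addr | first_lan_ip)
```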
+
+### Start Core Services
+
+#### Start the Computation Management Server (CVMS)
+
+From your cocos directory, start the CVMS server with the COVID-19 algorithm and datasets:

```bash
-go run ./test/computations/main.go ../ai/covid19/train.py public.pem false ../ai/covid19/datasets/h1.zip ../ai/covid19/datasets/h2.zip ../ai/covid19/datasets/h3.zip
+cd cocos
+HOST=<HOST_IP> go run ./test/cvms/main.go -algo-path ../ai/covid19/train.py -public-key-path public.pem -attested-tls-bool false -data-paths ../ai/covid19/datasets/h1.zip,../ai/covid19/datasets/h2.zip,../ai/covid19/datasets/h3.zip
```

-On another terminal, run manager:
+Expected output:

```bash
-cd cmd/manager
+{"time":"...","level":"INFO","msg":"cvms_test_server service gRPC server listening at :7001 without TLS"}
```

-Make sure you have the `bzImage` and `rootfs.cpio.gz` in the `cmd/manager/img` directory.
+#### Start the Manager
+
+Navigate to the cocos/cmd/manager directory and start the Manager (increase memory for COVID-19 training):

```bash
+cd cocos/cmd/manager
sudo \
MANAGER_QEMU_SMP_MAXCPUS=4 \
MANAGER_QEMU_MEMORY_SIZE=25G \
-MANAGER_GRPC_URL=localhost:7001 \
+MANAGER_GRPC_HOST=localhost \
+MANAGER_GRPC_PORT=7002 \
MANAGER_LOG_LEVEL=debug \
MANAGER_QEMU_ENABLE_SEV_SNP=false \
MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \
@@ -140,254 +161,250 @@ MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \
go run main.go
```

-After sometime you will see the computation server will output a port number. This port number is the port on which the computation server is running. You can use this port number to run the client.
- -The logs will look like this: +Expected output: ```bash -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1721816170 nanos:825593350} computation_id:"1" originator:"manager" status:"starting"} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1721816170 nanos:826050932} computation_id:"1" originator:"manager" status:"in-progress"} -received agent log -&{message:"char device redirected to /dev/pts/17 (label compat_monitor0)\n" computation_id:"1" level:"debug" timestamp:{seconds:1721816170 nanos:927805046}} -received agent log -&{message:"qemu-system-x86_64: warning: Number of hotpluggable cpus requested (64) exceeds the recommended cpus supported by KVM (24)\n" computation_id:"1" level:"error" timestamp:{seconds:1721816170 nanos:953823551}} -received agent log -&{message:"S" computation_id:"1" level:"debug" timestamp:{seconds:1721816172 nanos:583261451}} -received agent log -&{message:"e" computation_id:"1" level:"debug" timestamp:{seconds:1721816172 nanos:583288633}} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1721816191 nanos:936892540} computation_id:"1" originator:"manager" status:"complete"} -received runRes -&{agent_port:"46589" computation_id:"1"} -received agent log -&{message:"Transition: receivingManifest -> receivingManifest\n" computation_id:"1" level:"DEBUG" timestamp:{seconds:1721816191 nanos:933814929}} -received agent log -&{message:"agent service gRPC server listening at :7002 without TLS" computation_id:"1" level:"INFO" timestamp:{seconds:1721816191 nanos:934464476}} -received agent event -&{event_type:"receivingAlgorithm" timestamp:{seconds:1721816191 nanos:934190831} computation_id:"1" originator:"agent" status:"in-progress"} +{"time":"...","level":"INFO","msg":"Manager started without confidential computing support"} +{"time":"...","level":"INFO","msg":"manager service gRPC server listening at localhost:7002 without TLS"} ``` -On another terminal, upload the artifacts to the 
computation server:

+### Create CVM and Upload COVID-19 Algorithm

-```bash
-export AGENT_GRPC_URL=localhost:
-```
+#### Create CVM
+
+From your cocos directory:

```bash
-./build/cocos-cli algo ../ai/covid19/train.py ./private.pem -a python -r ../ai/covid19/requirements.txt
+export MANAGER_GRPC_URL=localhost:7002
+./build/cocos-cli create-vm --log-level debug --server-url "<HOST_IP>:7001"
```

-Upload the data to the computation server:
+**Important:** Note the id and port from the cocos-cli output.
+
+Expected output:

```bash
-./build/cocos-cli data ../ai/covid19/datasets/h1.zip ./private.pem
-./build/cocos-cli data ../ai/covid19/datasets/h2.zip ./private.pem
-./build/cocos-cli data ../ai/covid19/datasets/h3.zip ./private.pem
+🔗 Connected to manager using without TLS
+🔗 Creating a new virtual machine
+✅ Virtual machine created successfully with id <cvm_id> and port <agent_port>
```

-When the results are ready, download the results:
+Expected CVMS server output:

```bash
-./build/cocos-cli result ./private.pem
+&{message:"Method InitComputation for computation id 1 took ... to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}}
+&{event_type:"ReceivingAlgorithm" timestamp:{...} computation_id:"1" originator:"agent" status:"InProgress"}
+&{message:"agent service gRPC server listening at 10.0.2.15:<agent_port> without TLS" computation_id:"1" level:"INFO" timestamp:{...}}
```

-The above will generate a `results.zip` file.
Copy this file to the ai directory:

+#### Export Agent gRPC URL
+
+Set the AGENT_GRPC_URL using the port noted in the previous step (default 6100):

```bash
-cp results.bin ../ai/covid19/
+export AGENT_GRPC_URL=localhost:<agent_port>
```

-Test the model with the test data:
+#### Upload COVID-19 Algorithm

-```bash
-cd ../ai/covid19
-```
+From your cocos directory:

```bash
-unzip results.zip -d results
+./build/cocos-cli algo ../ai/covid19/train.py ./private.pem -a python -r ../ai/covid19/requirements.txt
```

-The image can be any image from the test dataset:
+Expected output:

```bash
-python predict.py --model results/model.pth --image datasets/test/COVID/COVID-2.png
+🔗 Connected to agent without TLS
+Uploading algorithm file: ../ai/covid19/train.py
+🚀 Uploading algorithm [████████████████████████████████████████████████████████████████] [100%]
+Successfully uploaded algorithm! ✔
```

-## Testing with Prism
-
-Make sure you have the Cocos repository cloned and eos buildroot installed.
This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) +#### Upload COVID-19 Datasets -Clone the ai repository which has the COVID-19 model: +Upload each hospital dataset: ```bash -git clone https://github.com/ultravioletrs/ai.git +./build/cocos-cli data ../ai/covid19/datasets/h1.zip ./private.pem -d +./build/cocos-cli data ../ai/covid19/datasets/h2.zip ./private.pem -d +./build/cocos-cli data ../ai/covid19/datasets/h3.zip ./private.pem -d ``` +Expected output for each upload: + ```bash -cd ai/covid19 +🔗 Connected to agent without TLS +Uploading dataset: ../ai/covid19/datasets/h1.zip +📦 Uploading data [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Successfully uploaded dataset! ✔ ``` -Download the data from [COVID-19 Radiography Database](https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database) dataset. - -You can install kaggle-cli to download the dataset: +Watch the CVMS server logs for training progress. The COVID-19 training process may take significant time due to the large dataset and model complexity. Look for completion messages: ```bash -pip install kaggle +&{message:"Method Data took ... 
to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Starting"} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Completed"} +&{event_type:"ConsumingResults" timestamp:{...} computation_id:"1" originator:"agent" status:"Ready"} ``` -Set the [kaggle API key](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#api-credentials) and download the dataset: +#### Download COVID-19 Results + +From your cocos directory: ```bash -kaggle datasets download -d tawsifurrahman/covid19-radiography-database +./build/cocos-cli result ./private.pem ``` -Prepare the dataset: +Expected output: ```bash -python tools/prepare_datasets.py covid19-radiography-database.zip -d datasets +🔗 Connected to agent without TLS +⏳ Retrieving computation result file +📥 Downloading result [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Computation result retrieved and saved successfully as results.zip! ✔ ``` -Zip the folders: +#### Test the Trained Model -```bash -zip -r datasets/h1.zip datasets/h1 -zip -r datasets/h2.zip datasets/h2 -zip -r datasets/h3.zip datasets/h3 -``` +1. Copy results to the COVID-19 directory: -Start prism + ```bash + cp results.zip ../ai/covid19/ + ``` -```bash -git clone https://github.com/ultravioletrs/prism.git -``` +2. Navigate to COVID-19 directory and extract results: -```bash -cd prism -``` + ```bash + cd ../ai/covid19 + unzip results.zip -d results + ``` -```bash -make run -``` +3. 
Test the model with test data (use any image from the test dataset): + + ```bash + python predict.py --model results/model.pth --image datasets/test/COVID/COVID-2.png + ``` + +Expected output will show the predicted class (Normal, Viral Pneumonia, or COVID-19). -The following recording will demonstrate how to setup prism - https://jam.dev/c/8067f697-4eaa-407f-875a-17119e4f3901 +#### Remove COVID-19 CVM -Build cocos artifacts: +Use the `` obtained during CVM creation: ```bash -make all +./build/cocos-cli remove-vm ``` -Before running the computation server, we need to issue certificates for the computation server and the client. This can be done by running the following commands: +Expected output: ```bash -./build/cocos-cli keys -k="rsa" +🔗 Connected to manager using without TLS +🔗 Removing virtual machine +✅ Virtual machine removed successfully ``` -You need to have done the following: +## Testing with Prism -- Create a user at `localhost:9095` -- Create a workspace -- Login to the created workspace -- Create a backend with `localhost` as the address -- Issue Certs for the backend, request download and download the certs -- Unzip the folder and copy the contents to the managers `cmd/manager/` directory under `cocos` folder -- Start the manager with the backend address: Take note the memory size is set to `25G` since we will be downloading pytorch and pretrained model inside the VM +Prism provides a web-based interface for managing Cocos computations. - ```bash - cd cmd/manager - ``` +### Prerequisites - Make sure you have the `bzImage` and `rootfs.cpio.gz` in the `cmd/manager/img` directory. +1. 
**Clone and start Prism:** - ```bash - sudo \ (main|…1⚑2) - MANAGER_QEMU_SMP_MAXCPUS=4 \ - MANAGER_QEMU_MEMORY_SIZE=25G \ - MANAGER_GRPC_URL=localhost:7011 \ - MANAGER_LOG_LEVEL=debug \ - MANAGER_QEMU_ENABLE_SEV_SNP=false \ - MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ - MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ - MANAGER_GRPC_CLIENT_CERT=cert.pem \ - MANAGER_GRPC_CLIENT_KEY=key.pem \ - MANAGER_GRPC_SERVER_CA_CERTS=ca.pem \ - go run main.go - ``` + ```bash + git clone https://github.com/ultravioletrs/prism.git + cd prism + make run + ``` -- Create the covid computation. To get the filehash for all the files go to `cocos` folder and use the cocos-cli. For the file names use `h1.zip`, `h2.zip`, `h3.zip` and `train.py` +2. **Prepare COVID-19 datasets** (follow the same steps as in the Cocos section above) - ```bash - ./build/cocos-cli file-hash ../ai/covid19/datasets/h1.zip - ``` +3. **Build Cocos artifacts:** - ```bash - ./build/cocos-cli file-hash ../ai/covid19/datasets/h2.zip - ``` + ```bash + cd cocos + make all + ``` - ```bash - ./build/cocos-cli file-hash ../ai/covid19/datasets/h3.zip - ``` +4. **Generate keys:** - ```bash - ./build/cocos-cli file-hash ../ai/covid19/train.py - ``` + ```bash + ./build/cocos-cli keys -k="rsa" + ``` -- After the computation has been created upload your public key generate by `cocos-cli`. This key will enable you to upload the datatsets and algorithms and also download the results. +### Prism Setup Process - ```bash - ./build/cocos-cli keys - ``` +1. **Create a user account** +2. **Create a workspace** +3. **Login to the created workspace** +4. **Create a CVM** and wait for it to come online +5. **Create the computation** and set a name and description. +6. Add participants using computation roles. 
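The next section generates sha3-256 checksums for the computation assets with `cocos-cli checksum`. If you want to cross-check a value independently, the same digest can be computed with Python's standard `hashlib`. The sketch below is illustrative only: it assumes the checksum is the plain sha3-256 digest of the raw file bytes, and the commented-out path is just an example.

```python
import hashlib

def file_sha3_256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex-encoded sha3-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha3_256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Example (substitute any asset you plan to upload):
# print(file_sha3_256("../ai/covid19/train.py"))
```

Comparing this digest against what `cocos-cli checksum` prints is a quick sanity check before pasting checksums into Prism.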
-- Click run computation and wait for the vm to be provisioned.Copy the aggent port number and export `AGENT_GRPC_URL` +### Create COVID-19 Computation - ```bash - export AGENT_GRPC_URL=localhost: - ``` +To create the computation in Prism, you'll need sha3-256 checksums for all datasets and the algorithm. Generate these from the cocos folder: -- After vm has been provisioned upload the datasets and the algorithm +```bash +./build/cocos-cli checksum ../ai/covid19/datasets/h1.zip +./build/cocos-cli checksum ../ai/covid19/datasets/h2.zip +./build/cocos-cli checksum ../ai/covid19/datasets/h3.zip +./build/cocos-cli checksum ../ai/covid19/train.py +``` - ```bash - ./build/cocos-cli algo ../ai/covid19/train.py ./private.pem -a python -r ../ai/covid19/requirements.txt - ``` +Use the file names `h1.zip`, `h2.zip`, `h3.zip`, and `train.py` when creating the computation asset in Prism. Link the assets to the computation. - ```bash - ./build/cocos-cli data ../ai/covid19/datasets/h1.zip ./private.pem - ``` +### Upload Public Key - ```bash - ./build/cocos-cli data ../ai/covid19/datasets/h2.zip ./private.pem - ``` +After creating the computation, upload your public key generated by `cocos-cli`. This enables you to upload datasets/algorithms and download results: - ```bash - ./build/cocos-cli data ../ai/covid19/datasets/h3.zip ./private.pem - ``` +```bash +cat public.pem +``` -- The computation will run and you will get an event that the results are ready. You can download the results by running the following command: +### Run Computation - ```bash - ./build/cocos-cli results ./private.pem - ``` +1. **Click "Run Computation"** and select an available cvm. +2. **Copy the agent port number** and export it: -The above will generate a `results.zip` file. Copy this file to the ai directory: + ```bash + export AGENT_GRPC_URL=localhost: + ``` -```bash -cp results.bin ../ai/covid19/ -``` +3. 
**Upload the algorithm and datasets:**

+   ```bash
+   ./build/cocos-cli algo ../ai/covid19/train.py ./private.pem -a python -r ../ai/covid19/requirements.txt
+   ./build/cocos-cli data ../ai/covid19/datasets/h1.zip ./private.pem
+   ./build/cocos-cli data ../ai/covid19/datasets/h2.zip ./private.pem
+   ./build/cocos-cli data ../ai/covid19/datasets/h3.zip ./private.pem
+   ```

-```bash
-cd ../ai/covid19
-```
+4. **Monitor the computation** through the Prism interface until you receive an event indicating that the results are ready.

-```bash
-unzip results.zip -d results
-```
+5. **Download the results:**
+
+   ```bash
+   ./build/cocos-cli result ./private.pem
+   ```
+
+### Test Results

-The image can be any image from the test dataset:
+Follow the same testing steps as in the Cocos section:

```bash
+cp results.zip ../ai/covid19/
+cd ../ai/covid19
+unzip results.zip -d results
python predict.py --model results/model.pth --image datasets/test/COVID/COVID-2.png
```
+
+## Notes
+
+- The COVID-19 model training is computationally intensive and may require significant time and resources
+- Ensure adequate memory allocation (25GB recommended) when running the Manager
+- The model works with chest X-ray images and classifies them into three categories
+- Test images are available in the `datasets/test` directory after running the preparation script
diff --git a/fraud-detection/README.md b/fraud-detection/README.md
index e08ea1b..e45bfff 100644
--- a/fraud-detection/README.md
+++ b/fraud-detection/README.md
@@ -4,6 +4,7 @@ This project aims to detect fraudulent credit card transactions using machine le
Given the class imbalance ratio, the script measures performance using the Area Under the Precision-Recall Curve (AUPRC).

## Dataset
+
The dataset used in this project contains transactions made by European cardholders over a period of two days in September 2013. It has been modified using Principal Component Analysis (PCA) for confidentiality reasons.
The dataset contains 492 frauds out of 284,807 transactions and is highly unbalanced: the positive class (frauds) accounts for only 0.172% of all transactions.

The dataset includes the following features:
@@ -13,99 +14,146 @@ The dataset includes the following features:
- `Amount`: Transaction amount.
- `Class`: Label indicating whether the transaction is fraudulent (1) or not (0).
+## Setup Virtual Environment
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Install

-Fetch the data from Kaggle - [Fraud Detection Database](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud) dataset
+Fetch the data from the Kaggle [Credit Card Fraud Database](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud) dataset:

```bash
kaggle datasets download -d mlg-ulb/creditcardfraud
```

To run the above command you need the [kaggle cli](https://github.com/Kaggle/kaggle-api) installed and API credentials set up. This can be done by following [this documentation](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#kaggle-api).
+
You will get `creditcardfraud.zip` in the folder. Then run:
+
```bash
unzip creditcardfraud.zip -d datasets/
```

-This will extract the contents of `creditcardfraud.zip` into `datasets` directory
+This will extract the contents of `creditcardfraud.zip` into the `datasets` directory.

-### Train Model
+## Train Model

To train the model, run:
+
```bash
python fraud-detection.py
-```
+```
+
The script will produce the model file `fraud_model.ubj`.
-### Test Model
+
+## Test Model
+
Inference can be done using `prediction.py`. Make sure to move the `fraud_model.ubj` file to the `datasets` directory as well.

```bash
python prediction.py
```
+
The results are `.png` images that represent the confusion matrix and the AUPRC:

![](images/AUPRC.png)
![](images/C-matrix.png)

## Testing with Cocos

-Make sure you have the Cocos repository cloned and eos buildroot installed.
This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) +Make sure you have the Cocos repository cloned and set up. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) -Clone the ai repository which has the fraud detection algorithm: +### Prerequisites -```bash -git clone https://github.com/ultravioletrs/ai.git -``` +1. **Clone the repositories:** -```bash -cd ai/fraud-detection -``` + ```bash + git clone https://github.com/ultravioletrs/cocos.git + git clone https://github.com/ultravioletrs/ai.git + ``` -Download the data from [Kaggle](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud/download?datasetVersionNumber=3). +2. **Navigate to the fraud detection directory:** -Extract the `creditcard.csv` file into the datasets directory + ```bash + cd ai/fraud-detection + ``` -```bash -unzip archive.zip -d datasets -``` +3. **Download and prepare the dataset:** -Change the directory to cocos: + Install kaggle-cli to download the dataset: -```bash -cd ../../cocos -``` + ```bash + pip install kaggle + ``` -Build cocos artifacts: + Set the [kaggle API key](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#api-credentials) and download the dataset: -```bash -make all -``` + ```bash + kaggle datasets download -d mlg-ulb/creditcardfraud + ``` -Before running the computation server, we need to issue public and private key pairs. This can be done by running the following commands: + Prepare the dataset: + + ```bash + unzip creditcardfraud.zip -d datasets/ + ``` + +4. **Change to cocos directory and build artifacts:** + + ```bash + cd ../../cocos + make all + ``` + +5. 
**Generate keys:** + + ```bash + ./build/cocos-cli keys -k="rsa" + ``` + +### Finding Your Host IP Address + +Find your host machine's IP address (avoid using localhost): ```bash -./build/cocos-cli keys -k rsa +ip a ``` -Run the computation server: +Look for your network interface (e.g., wlan0 for WiFi, eth0 for Ethernet) and note the inet address. For example, if you see `192.168.1.100`, use that as your ``. + +### Start Core Services + +#### Start the Computation Management Server (CVMS) + +From your cocos directory, start the CVMS server with the fraud detection algorithm and dataset: ```bash -go run ./test/computations/main.go ../ai/fraud-detection/ fraud-detection.py public.pem false ../ai/fraud-detection/datasets/creditcard.csv +cd cocos +HOST= go run ./test/cvms/main.go -algo-path ../ai/fraud-detection/fraud-detection.py -public-key-path public.pem -attested-tls-bool false -data-paths ../ai/fraud-detection/datasets/creditcard.csv ``` -On another terminal, run manager: +Expected output: ```bash -cd cmd/manager +{"time":"...","level":"INFO","msg":"cvms_test_server service gRPC server listening at :7001 without TLS"} ``` -Make sure you have the `bzImage` and `rootfs.cpio.gz` in the `cmd/manager/img` directory. +#### Start the Manager + +Navigate to the cocos/cmd/manager directory and start the Manager: ```bash +cd cocos/cmd/manager sudo \ MANAGER_QEMU_SMP_MAXCPUS=4 \ -MANAGER_QEMU_MEMORY_SIZE=25G \ -MANAGER_GRPC_URL=localhost:7001 \ +MANAGER_QEMU_MEMORY_SIZE=8G \ +MANAGER_GRPC_HOST=localhost \ +MANAGER_GRPC_PORT=7002 \ MANAGER_LOG_LEVEL=debug \ MANAGER_QEMU_ENABLE_SEV_SNP=false \ MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ @@ -113,236 +161,244 @@ MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ go run main.go ``` -After sometime you will see the computation server will output a port number. This port number is the port on which the computation server is running. You can use this port number to run the client. 
- -The logs will look like this: +Expected output: ```bash -{"time":"2024-08-19T10:23:50.068445068+03:00","level":"INFO","msg":"manager_test_server service gRPC server listening at :7001 without TLS"} -{"time":"2024-08-19T10:24:17.767534539+03:00","level":"DEBUG","msg":"received who am on ip address [::1]:45608"} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1724052258 nanos:76069455} computation_id:"1" originator:"manager" status:"starting"} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1724052258 nanos:76390596} computation_id:"1" originator:"manager" status:"in-progress"} -received agent log -&{message:"char device redirected to /dev/pts/5 (label compat_monitor0)\n" computation_id:"1" level:"debug" timestamp:{seconds:1724052258 nanos:140448274}} -received agent log -&{message:"qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:EDX.mmxext [bit 22]\nqemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:EDX.fxsr-opt [bit 25]\nqemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.cr8legacy [bit 4]\nqemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]\nqemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.misalignsse [bit 7]\nqemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.osvw [bit 9]\n" computation_id:"1" level:"error" timestamp:{seconds:1724052258 nanos:177482826}} -received agent log -&{message:"\x1b[2J\x1b[01;01H" computation_id:"1" level:"debug" timestamp:{seconds:1724052258 nanos:485734327}} -received agent event -&{event_type:"vm-provision" timestamp:{seconds:1724052271 nanos:480190393} computation_id:"1" originator:"manager" status:"complete"} -received runRes -&{agent_port:"6050" computation_id:"1"} -received agent log -&{message:"Transition: receivingManifest -> receivingManifest\n" 
computation_id:"1" level:"DEBUG" timestamp:{seconds:1724052271 nanos:479293635}} -received agent event -&{event_type:"receivingAlgorithm" timestamp:{seconds:1724052271 nanos:480207098} computation_id:"1" originator:"agent" status:"in-progress"} -received agent log -&{message:"agent service gRPC server listening at :7002 without TLS" computation_id:"1" level:"INFO" timestamp:{seconds:1724052271 nanos:480676615}} -received agent event -&{event_type:"receivingData" timestamp:{seconds:1724052647 nanos:92491532} computation_id:"1" originator:"agent" status:"in-progress"} -received agent log -&{message:"Transition: receivingData -> receivingData\n" computation_id:"1" level:"DEBUG" timestamp:{seconds:1724052647 nanos:92466438}} -received agent event -&{event_type:"running" timestamp:{seconds:1724052722 nanos:889675666} computation_id:"1" originator:"agent" status:"in-progress"} -received agent log -&{message:"computation run started" computation_id:"1" level:"DEBUG" timestamp:{seconds:1724052722 nanos:889653708}} -received agent log -&{message:"Collecting pandas~=2.2.2 (from -r /tmp/requirements.txt3799616143 (line 1))\n" computation_id:"1" level:"DEBUG" timestamp:{seconds:1724052725 nanos:908283024}} -received agent log -&{message:" Downloading pandas-2.2.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)\n" computation_id:"1" level:"DEBUG" timestamp:{seconds:1724052726 nanos:447295301}} -received agent log - +{"time":"...","level":"INFO","msg":"Manager started without confidential computing support"} +{"time":"...","level":"INFO","msg":"manager service gRPC server listening at localhost:7002 without TLS"} ``` -On another terminal, upload the artifacts to the computation server: +### Create CVM and Upload Fraud Detection Algorithm + +#### Create CVM + +From your cocos directory: ```bash -export AGENT_GRPC_URL=localhost: +export MANAGER_GRPC_URL=localhost:7002 +./build/cocos-cli create-vm --log-level debug --server-url ":7001" ``` -```bash 
-./build/cocos-cli algo ../ai/fraud-detection/fraud-detection.py ./private.pem -a python -r ../ai/fraud-detection/requirements.txt +**Important:** Note the id and port from the cocos-cli output. -``` +Expected output: -Output: +```bash +🔗 Connected to manager using without TLS +🔗 Creating a new virtual machine +✅ Virtual machine created successfully with id and port ``` -2024/08/19 10:30:27 Uploading algorithm binary: ../../ai/fraud-detection/fraud-detection.py -Uploading algorithm... 100% [===================================================================================>] -2024/08/19 10:30:47 Successfully uploaded algorithm + +Expected CVMS server output: + +```bash +&{message:"Method InitComputation for computation id 1 took ... to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}} +&{event_type:"ReceivingAlgorithm" timestamp:{...} computation_id:"1" originator:"agent" status:"InProgress"} +&{message:"agent service gRPC server listening at 10.0.2.15: without TLS" computation_id:"1" level:"INFO" timestamp:{...}} ``` +#### Export Agent gRPC URL -Upload the data to the computation server: +Set the AGENT_GRPC_URL using the port noted in the previous step (default 6100): ```bash -./build/cocos-cli data ../ai/fraud-detection/datasets/creditcard.csv ./private.pem +export AGENT_GRPC_URL=localhost: ``` -Output: -``` -2024/08/19 10:31:41 Uploading dataset CSV: ../../ai/fraud-detection/datasets/creditcard.csv -Uploading data... 100% [========================================================================================>] -2024/08/19 10:32:02 Successfully uploaded dataset -``` +#### Upload Fraud Detection Algorithm -When the results are ready, download the results: +From your cocos directory: ```bash -./build/cocos-cli result ./private.pem -``` -Output: -``` -2024/08/19 10:55:01 Retrieving computation result file -2024/08/19 10:55:21 Computation result retrieved and saved successfully! 
+./build/cocos-cli algo ../ai/fraud-detection/fraud-detection.py ./private.pem -a python -r ../ai/fraud-detection/requirements.txt ``` -The above will generate a `results.zip` file. Copy this file to the ai directory: +Expected output: ```bash -cp results.zip ../ai/fraud-detection/ +🔗 Connected to agent without TLS +Uploading algorithm file: ../ai/fraud-detection/fraud-detection.py +🚀 Uploading algorithm [███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Successfully uploaded algorithm! ✔ ``` -Test the model with the test data: +#### Upload Fraud Detection Dataset + +Upload the fraud detection dataset: ```bash -cd ../ai/fraud-detection +./build/cocos-cli data ../ai/fraud-detection/datasets/creditcard.csv ./private.pem ``` +Expected output: + ```bash -unzip results.zip -d results +🔗 Connected to agent without TLS +Uploading dataset: ../ai/fraud-detection/datasets/creditcard.csv +📦 Uploading data [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Successfully uploaded dataset! ✔ ``` -The image can be any image from the test dataset: +Watch the CVMS server logs for training progress. Look for completion messages: ```bash -python prediction.py +&{message:"Method Data took ... 
to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Starting"} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Completed"} +&{event_type:"ConsumingResults" timestamp:{...} computation_id:"1" originator:"agent" status:"Ready"} ``` -## Testing with Prism - -Make sure you have the Cocos repository cloned and eos buildroot installed. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) +#### Download Fraud Detection Results -Clone the ai repository which has the fraud detection algorithm: +From your cocos directory: ```bash -git clone https://github.com/ultravioletrs/ai.git +./build/cocos-cli result ./private.pem ``` +Expected output: + ```bash -cd ai/fraud-detection +🔗 Connected to agent without TLS +⏳ Retrieving computation result file +📥 Downloading result [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Computation result retrieved and saved successfully as results.zip! ✔ ``` -For detailed instructions on how to fetch datasets from Kaggle, please refer to the [Datasets](#dataset) section. -In your browser launch to PRISM SaaS at https://prism.ultraviolet.rs. +#### Test the Trained Model + +1. Copy results to the fraud detection directory: + + ```bash + cp results.zip ../ai/fraud-detection/ + ``` + +2. Navigate to fraud detection directory and extract results: + + ```bash + cd ../ai/fraud-detection + unzip results.zip -d results + ``` + +3. Test the model: -- Create a user at https://prism.ultraviolet.rs. -- Create a workspace -- Login to the created workspace -- Create a backend. 
-- Issue Certs for the backend, request download and download the certs -- Unzip the folder and copy the contents to the managers `cmd/manager/` directory under `cocos` folder -- Start the manager with the backend address. + ```bash + python prediction.py + ``` -The following recordings demonstrate how to set up prism and run computation: [Part1](https://jam.dev/c/a9d0771c-eea7-4b91-8e78-b856a8fab1a6) and [Part2](https://jam.dev/c/a6e66c22-fdd9-42c0-9231-f8e3f074d28e) +Expected output will show confusion matrix and AUPRC visualizations as .png images. -Build cocos artifacts: +#### Remove Fraud Detection CVM + +Use the `` obtained during CVM creation: ```bash -make all +./build/cocos-cli remove-vm ``` -Before running the computation server, we need to issue certificates for the computation server and the client. This is done from the PRISM SaaS. -Public/Private key pairs are needed for the users that will provide the algorithm, dataset and consume the results. - -This can be done by running the following commands: +Expected output: ```bash -./build/cocos-cli keys -k rsa +🔗 Connected to manager using without TLS +🔗 Removing virtual machine +✅ Virtual machine removed successfully ``` -You need to have done the following: +## Testing with Prism +Prism provides a web-based interface for managing Cocos computations. - ```bash - cd cmd/manager - ``` +### Prerequisites - Make sure you have the `bzImage` and `rootfs.cpio.gz` in the `cmd/manager/img` directory. +1. 
**Clone and start Prism:** - ```bash - sudo \ - MANAGER_QEMU_SMP_MAXCPUS=4 \ - MANAGER_QEMU_MEMORY_SIZE=25G \ - MANAGER_GRPC_URL=localhost:7011 \ - MANAGER_LOG_LEVEL=debug \ - MANAGER_QEMU_ENABLE_SEV_SNP=false \ - MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ - MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ - MANAGER_GRPC_CLIENT_CERT=cert.pem \ - MANAGER_GRPC_CLIENT_KEY=key.pem \ - MANAGER_GRPC_SERVER_CA_CERTS=ca.pem \ - go run main.go - ``` + ```bash + git clone https://github.com/ultravioletrs/prism.git + cd prism + make run + ``` -- Create the fraud detection computation. To get the filehash for all the files go to `cocos` folder and use the cocos-cli. For the file names use `creditcard.csv` and `fraud-detection.py` +2. **Prepare fraud detection dataset** (follow the same steps as in the Cocos section above) - ```bash - ./build/cocos-cli file-hash ../ai/fraud-detection/datasets/creditcard - ``` +3. **Build Cocos artifacts:** - ```bash - ./build/cocos-cli file-hash ../ai/fraud-detection/fraud-detection.py - ``` + ```bash + cd cocos + make all + ``` -- After the computation has been created, each user needs to upload their public key generated by `cocos-cli`. This key will enable the respective user to upload the datatsets and algorithms and also download the results. +4. **Generate keys:** - ```bash - ./build/cocos-cli keys -k rsa - ``` + ```bash + ./build/cocos-cli keys -k="rsa" + ``` -- Click run computation and wait for the vm to be provisioned.Copy the aggent port number and export `AGENT_GRPC_URL` +### Prism Setup Process - ```bash - export AGENT_GRPC_URL=localhost: - ``` +1. **Create a user account** +2. **Create a workspace** +3. **Login to the created workspace** +4. **Create a CVM** and wait for it to come online +5. **Create the computation** and set a name and description. +6. Add participants using computation roles. 
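Before creating the fraud-detection computation, it helps to recall why this example is evaluated with AUPRC rather than plain accuracy. The Dataset section quotes 492 frauds out of 284,807 transactions; the sketch below is plain-Python arithmetic on those published counts (no real data is read) showing how misleading accuracy becomes at that imbalance:

```python
# Class counts quoted in the Dataset section (not read from creditcard.csv).
frauds, total = 492, 284_807
positive_rate = frauds / total  # about 0.00173, i.e. roughly 0.172%

# A degenerate model that labels every transaction "not fraud" still scores
# very high accuracy while catching zero fraud:
accuracy_all_negative = (total - frauds) / total
recall_all_negative = 0.0

# In expectation, a random classifier's precision-recall curve sits at the
# positive rate, so positive_rate is the AUPRC baseline a model must beat.
auprc_baseline = positive_rate

print(f"positive rate:         {positive_rate:.5f}")
print(f"all-negative accuracy: {accuracy_all_negative:.4f}")
print(f"all-negative recall:   {recall_all_negative:.1f}")
print(f"random AUPRC baseline: {auprc_baseline:.5f}")
```

This is why the AUPRC plot produced by `prediction.py` is a far more informative number here than raw accuracy.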
-- After vm has been provisioned upload the datasets and the algorithm +### Create Fraud Detection Computation - ```bash - ./build/cocos-cli algo ../ai/fraud-detection/fraud-detection.py ./private.pem -a python -r ../ai/fraud-detection/requirements.txt - ``` +To create the computation in Prism, you'll need sha3-256 checksums for the dataset and algorithm. Generate these from the cocos folder: - ```bash - ./build/cocos-cli data ../ai/fraud-detection/creditcard.csv ./private.pem - ``` +```bash +./build/cocos-cli checksum ../ai/fraud-detection/datasets/creditcard.csv +./build/cocos-cli checksum ../ai/fraud-detection/fraud-detection.py +``` -- The computation will run, and you will get an event that the results are ready. You can download the results by running the following command: +Use the file names `creditcard.csv` and `fraud-detection.py` when creating the computation asset in Prism. Link the assets to the computation. - ```bash - ./build/cocos-cli results ./private.pem - ``` +### Upload Public Key -The above will generate a `results.zip` file. Copy this file to the ai directory: +After creating the computation, upload your public key generated by `cocos-cli`. This enables you to upload datasets/algorithms and download results: ```bash -cp results.bin ../ai/fraud-detection/ +cat public.pem ``` -Test the model with the test data: +### Run Computation -```bash -cd ../ai/fraud-detection -``` +1. **Click "Run Computation"** and select an available cvm. +2. **Copy the agent port number** and export it: -```bash -unzip results.zip -d results -``` + ```bash + export AGENT_GRPC_URL=localhost: + ``` -The image can be any image from the test dataset: +3. **Upload the algorithm and dataset:** + + ```bash + ./build/cocos-cli algo ../ai/fraud-detection/fraud-detection.py ./private.pem -a python -r ../ai/fraud-detection/requirements.txt + ./build/cocos-cli data ../ai/fraud-detection/datasets/creditcard.csv ./private.pem + ``` + +4. 
**Monitor the computation** through the Prism interface until you receive an event indicating results are ready + +5. **Download the results:** + + ```bash + ./build/cocos-cli result ./private.pem + ``` + +### Test Results + +Follow the same testing steps as in the Cocos section: ```bash +cp results.zip ../ai/fraud-detection/ +cd ../ai/fraud-detection +unzip results.zip -d results python prediction.py ``` + +## Notes + +- The fraud detection model training is relatively fast compared to image-based models +- Memory allocation of 8GB should be sufficient for the Manager +- The model works with credit card transaction data and produces binary classification (fraud/non-fraud) +- Results include confusion matrix and AUPRC visualizations to evaluate model performance diff --git a/rul-turbofan/README.md b/rul-turbofan/README.md index da93df0..8df37aa 100644 --- a/rul-turbofan/README.md +++ b/rul-turbofan/README.md @@ -1,4 +1,4 @@ -111# Remaining Useful Life (RUL) Prediction with LSTM +# Remaining Useful Life (RUL) Prediction with LSTM This repository contains code and resources for predicting the Remaining Useful Life (RUL) of machinery using Long Short-Term Memory (LSTM) neural networks. The dataset used for this project is provided by NASA and was downloaded from Kaggle. @@ -6,12 +6,21 @@ This repository contains code and resources for predicting the Remaining Useful - [Introduction](#introduction) - [Dataset](#dataset) - [Model Architecture](#model-architecture) +- [Setup Virtual Environment](#setup-virtual-environment) +- [Install](#install) +- [Train Model](#train-model) +- [Test Model](#test-model) - [Results](#results) +- [Testing with Cocos](#testing-with-cocos) +- [Testing with Prism](#testing-with-prism) +- [Notes](#notes) ## Introduction + Predicting the Remaining Useful Life (RUL) of machinery is crucial for maintenance planning and avoiding unexpected failures. 
This project leverages a Long Short-Term Memory (LSTM) neural network to predict RUL based on sensor data. ## Dataset + The dataset used in this project is from NASA's Prognostics Data Repository, available on Kaggle. It consists of sensor measurements from machinery over time, including data from various operational conditions. ### Experimental Scenario @@ -19,7 +28,6 @@ The dataset used in this project is from NASA's Prognostics Data Repository, ava This project serves as an experimental exploration into predictive maintenance using machine learning techniques. While the results are promising, it's important to note that this is an experimental setup. Download datasets: - [NASA Dataset on Kaggle](https://www.kaggle.com/datasets/behrad3d/nasa-cmaps) -or you can also download datasets here: - [NASA website](https://www.nasa.gov/intelligent-systems-division/discovery-and-systems-health/pcoe/pcoe-data-set-repository/) The NASA CMAPSS dataset includes the following key components: @@ -31,53 +39,92 @@ The NASA CMAPSS dataset includes the following key components: Dataset is divided into four data sets: -Data Set: FD001 -Train trjectories: 100 -Test trajectories: 100 -Conditions: ONE (Sea Level) -Fault Modes: ONE (HPC Degradation) - -Data Set: FD002 -Train trjectories: 260 -Test trajectories: 259 -Conditions: SIX -Fault Modes: ONE (HPC Degradation) +**Data Set: FD001** +- Train trajectories: 100 +- Test trajectories: 100 +- Conditions: ONE (Sea Level) +- Fault Modes: ONE (HPC Degradation) -Data Set: FD003 -Train trjectories: 100 -Test trajectories: 100 -Conditions: ONE (Sea Level) -Fault Modes: TWO (HPC Degradation, Fan Degradation) +**Data Set: FD002** +- Train trajectories: 260 +- Test trajectories: 259 +- Conditions: SIX +- Fault Modes: ONE (HPC Degradation) -Data Set: FD004 -Train trjectories: 248 -Test trajectories: 249 -Conditions: SIX -Fault Modes: TWO (HPC Degradation, Fan Degradation) +**Data Set: FD003** +- Train trajectories: 100 +- Test trajectories: 100 +- 
Conditions: ONE (Sea Level) +- Fault Modes: TWO (HPC Degradation, Fan Degradation) - Each data set is further divided into training and test subsets. Each time series is from a different engine, the data can be considered to be from a fleet of engines of the same type. Each engine starts with different degrees of initial wear and manufacturing variation which is unknown to the user. +**Data Set: FD004** +- Train trajectories: 248 +- Test trajectories: 249 +- Conditions: SIX +- Fault Modes: TWO (HPC Degradation, Fan Degradation) +Each data set is further divided into training and test subsets. Each time series is from a different engine, the data can be considered to be from a fleet of engines of the same type. Each engine starts with different degrees of initial wear and manufacturing variation which is unknown to the user. The goal of using this dataset is to leverage these measurements and settings to predict the Remaining Useful Life (RUL) of the engines accurately. ## Model Architecture -The model architecture is based on LSTM (Long Short Term Memory), which is well-suited for time series prediction tasks. -To train and evaluate the model, run the following command: -`python RUL_training.py` +The model architecture is based on LSTM (Long Short Term Memory), which is well-suited for time series prediction tasks. + +## Setup Virtual Environment + +```bash +python3 -m venv venv +source venv/bin/activate +pip install -r requirements.txt +``` + +PyTorch can be installed from the [PyTorch website](https://pytorch.org/get-started/locally/). Follow the instructions to match your specific system configuration (e.g., CUDA version, OS). + +## Install + +Fetch the data from Kaggle - [NASA CMAPS Dataset](https://www.kaggle.com/datasets/behrad3d/nasa-cmaps) dataset + +```bash +kaggle datasets download -d behrad3d/nasa-cmaps +``` + +To run the above command you would need [kaggle cli](https://github.com/Kaggle/kaggle-api) installed and API credentials setup. 
This can be done by following [this documentation](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#kaggle-api). + +You will get `nasa-cmaps.zip` in the folder + +Extract the dataset: + +```bash +unzip nasa-cmaps.zip -d datasets/ +``` + +This will create `datasets` directory with the NASA CMAPSS dataset files. + +## Train Model + +To do the training, execute: + +```bash +python RUL_training.py +``` + +## Test Model + +Inference can be done using `pred_model.py`. The model file will be saved after training. + +```bash +python pred_model.py +``` ## Results The performance of the model is evaluated using metrics such as Mean Squared Error (Train Loss), Validation Mean Squared Error (Val Loss) and R² Score (Coefficient of Determination) -After the training process is completed, the algorithm saves the trained model to a file. This allows you to reuse the model for predictions without needing to retrain it each time. The model is saved in a `pth` format. Additionally, it generates graphs of training loss and validation loss over epochs to help visualize the model's learning process. This helps in visualizing the learning process and diagnosing potential issues like overfitting. +After the training process is completed, the algorithm saves the trained model to a file. This allows you to reuse the model for predictions without needing to retrain it each time. The model is saved in a `pth` format. Additionally, it generates graphs of training loss and validation loss over epochs to help visualize the model's learning process. ![](images/val-r2.png) -To use model, run: - -`python pred_model.py` - Visualize Predictions: After running the script, it will generate a plot showing the predicted RUL versus the actual RUL. This plot helps in understanding how well the model predicts the Remaining Useful Life. @@ -86,141 +133,354 @@ Here's an example of how the plot might look: ![](images/rul.png) In this plot: - - The x-axis represents the time cycles. 
- The y-axis represents the Remaining Useful Life (RUL). The blue curve represents the actual RUL, while the red curve represents the predicted RUL. This visualization helps assess how well the model predicts the RUL compared to the ground truth. -## Testing with Prism +## Testing with Cocos + +Make sure you have the Cocos repository cloned and set up. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) + +### Prerequisites + +1. **Clone the repositories:** + + ```bash + git clone https://github.com/ultravioletrs/cocos.git + git clone https://github.com/ultravioletrs/ai.git + ``` + +2. **Navigate to the RUL directory:** + + ```bash + cd ai/rul-turbofan + ``` + +3. **Download and prepare the dataset:** + + Install kaggle-cli to download the dataset: + + ```bash + pip install kaggle + ``` + + Set the [kaggle API key](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md#api-credentials) and download the dataset: + + ```bash + kaggle datasets download -d behrad3d/nasa-cmaps + ``` + + Prepare the dataset: + + ```bash + unzip nasa-cmaps.zip -d datasets/ + ``` + +4. **Change to cocos directory and build artifacts:** + + ```bash + cd ../../cocos + make all + ``` + +5. **Generate keys:** + + ```bash + ./build/cocos-cli keys -k="rsa" + ``` + +### Finding Your Host IP Address + +Find your host machine's IP address (avoid using localhost): + +```bash +ip a +``` + +Look for your network interface (e.g., wlan0 for WiFi, eth0 for Ethernet) and note the inet address. For example, if you see `192.168.1.100`, use that as your ``. 
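If you want to sanity-check the address you picked from `ip a`, a short Python sketch can confirm it is a routable, non-loopback IPv4 address before you use it as the host value. This helper is illustrative only and not part of cocos-cli:

```python
import ipaddress

def usable_host_ip(addr: str) -> bool:
    """Return True if addr parses as an IPv4 address that is neither
    loopback (127.x.x.x) nor unspecified (0.0.0.0), i.e. a reasonable
    host value for the CVMS server instead of localhost."""
    try:
        ip = ipaddress.IPv4Address(addr)
    except ValueError:  # hostnames such as "localhost" are rejected too
        return False
    return not (ip.is_loopback or ip.is_unspecified)

print(usable_host_ip("192.168.1.100"))  # True
print(usable_host_ip("127.0.0.1"))      # False
```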
+ +### Start Core Services + +#### Start the Computation Management Server (CVMS) + +From your cocos directory, start the CVMS server with the RUL algorithm and datasets: + +```bash +cd cocos +HOST= go run ./test/cvms/main.go -algo-path ../ai/rul-turbofan/RUL_training.py -public-key-path public.pem -attested-tls-bool false -data-paths ../ai/rul-turbofan/datasets/train_FD001.txt,../ai/rul-turbofan/datasets/test_FD001.txt,../ai/rul-turbofan/datasets/RUL_FD001.txt +``` + +Expected output: + +```bash +{"time":"...","level":"INFO","msg":"cvms_test_server service gRPC server listening at :7001 without TLS"} +``` + +#### Start the Manager -Make sure you have the Cocos repository cloned and eos buildroot installed. This can be done by following the instructions in the [Cocos Documentation](https://docs.cocos.ultraviolet.rs/getting-started/) +Navigate to the cocos/cmd/manager directory and start the Manager: -Clone the ai repository which has the rul-turbofan algorithm: +```bash +cd cocos/cmd/manager +sudo \ +MANAGER_QEMU_SMP_MAXCPUS=4 \ +MANAGER_QEMU_MEMORY_SIZE=8G \ +MANAGER_GRPC_HOST=localhost \ +MANAGER_GRPC_PORT=7002 \ +MANAGER_LOG_LEVEL=debug \ +MANAGER_QEMU_ENABLE_SEV_SNP=false \ +MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ +MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ +go run main.go +``` + +Expected output: + +```bash +{"time":"...","level":"INFO","msg":"Manager started without confidential computing support"} +{"time":"...","level":"INFO","msg":"manager service gRPC server listening at localhost:7002 without TLS"} +``` + +### Create CVM and Upload RUL Algorithm + +#### Create CVM + +From your cocos directory: ```bash -git clone https://github.com/ultravioletrs/ai.git +export MANAGER_GRPC_URL=localhost:7002 +./build/cocos-cli create-vm --log-level debug --server-url ":7001" ``` +**Important:** Note the id and port from the cocos-cli output. 
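When scripting this step, the id and port can be pulled out of the CLI output with a regex. The message shape used here is an assumption modelled on the expected output documented for `create-vm`; adjust the pattern if your cocos-cli version prints differently:

```python
import re
from typing import Optional, Tuple

def parse_create_vm(output: str) -> Optional[Tuple[str, str]]:
    """Extract the CVM id and agent port from `cocos-cli create-vm` output.
    The message format is an assumption based on the expected output in
    these docs, not a stable cocos-cli contract."""
    m = re.search(r"created successfully with id (\S+) and port (\d+)", output)
    return (m.group(1), m.group(2)) if m else None

line = "✅ Virtual machine created successfully with id 1 and port 6100"
print(parse_create_vm(line))  # ('1', '6100')
```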
+ +Expected output: + +```bash +🔗 Connected to manager using without TLS +🔗 Creating a new virtual machine +✅ Virtual machine created successfully with id and port +``` + +Expected CVMS server output: + +```bash +&{message:"Method InitComputation for computation id 1 took ... to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}} +&{event_type:"ReceivingAlgorithm" timestamp:{...} computation_id:"1" originator:"agent" status:"InProgress"} +&{message:"agent service gRPC server listening at 10.0.2.15: without TLS" computation_id:"1" level:"INFO" timestamp:{...}} +``` + +#### Export Agent gRPC URL + +Set the AGENT_GRPC_URL using the port noted in the previous step (default 6100): + ```bash -cd ai/rul-turbofan +export AGENT_GRPC_URL=localhost: ``` -In your browser launch to PRISM SaaS at https://prism.ultraviolet.rs. -- Create a user at https://prism.ultraviolet.rs. -- Create a workspace -- Login to the created workspace -- Create a backend. -- Issue Certs for the backend, request download and download the certs -- Unzip the folder and copy the contents to the managers `cmd/manager/` directory under `cocos` folder -- Start the manager with the backend address. 
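Before uploading anything, you can confirm that the agent endpoint actually accepts connections. A best-effort sketch, assuming `AGENT_GRPC_URL` holds a plain `host:port` value (it only probes the TCP port; it does not speak gRPC):

```python
import os
import socket

def agent_reachable(timeout: float = 2.0) -> bool:
    """Probe the endpoint named in AGENT_GRPC_URL with a plain TCP connect.
    Returns False if the variable is unset, malformed, or the port refuses."""
    host, _, port = os.environ.get("AGENT_GRPC_URL", "").rpartition(":")
    if not host or not port.isdigit():
        return False
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False
```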
+#### Upload RUL Algorithm -The following recordings demonstrate how to set up prism and run computation: [Part1](https://jam.dev/c/a9d0771c-eea7-4b91-8e78-b856a8fab1a6) and [Part2](https://jam.dev/c/a6e66c22-fdd9-42c0-9231-f8e3f074d28e) +From your cocos directory: -Build cocos artifacts: +```bash +./build/cocos-cli algo ../ai/rul-turbofan/RUL_training.py ./private.pem -a python -r ../ai/rul-turbofan/requirements.txt +``` + +Expected output: ```bash -make all +🔗 Connected to agent without TLS +Uploading algorithm file: ../ai/rul-turbofan/RUL_training.py +🚀 Uploading algorithm [███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Successfully uploaded algorithm! ✔ ``` -Before running the computation server, we need to issue certificates for the computation server and the client. This is done from the PRISM SaaS. -Public/Private key pairs are needed for the users that will provide the algorithm, dataset and consume the results. +#### Upload RUL Datasets -This can be done by running the following commands: +Upload each dataset file: ```bash -./build/cocos-cli keys +./build/cocos-cli data ../ai/rul-turbofan/datasets/train_FD001.txt ./private.pem +./build/cocos-cli data ../ai/rul-turbofan/datasets/test_FD001.txt ./private.pem +./build/cocos-cli data ../ai/rul-turbofan/datasets/RUL_FD001.txt ./private.pem ``` -You need to have done the following: +Expected output for each upload: +```bash +🔗 Connected to agent without TLS +Uploading dataset: ../ai/rul-turbofan/datasets/train_FD001.txt +📦 Uploading data [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Successfully uploaded dataset! 
✔ +``` - ```bash - cd cmd/manager - ``` +Watch the CVMS server logs for training progress. Look for completion messages: - Make sure you have the `bzImage` and `rootfs.cpio.gz` in the `cmd/manager/img` directory. +```bash +&{message:"Method Data took ... to complete without errors" computation_id:"1" level:"INFO" timestamp:{...}} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Starting"} +&{event_type:"Running" timestamp:{...} computation_id:"1" originator:"agent" status:"Completed"} +&{event_type:"ConsumingResults" timestamp:{...} computation_id:"1" originator:"agent" status:"Ready"} +``` - ```bash - sudo \ - MANAGER_QEMU_SMP_MAXCPUS=4 \ - MANAGER_QEMU_MEMORY_SIZE=25G \ - MANAGER_GRPC_URL=localhost:7011 \ - MANAGER_LOG_LEVEL=debug \ - MANAGER_QEMU_ENABLE_SEV_SNP=false \ - MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd \ - MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd \ - MANAGER_GRPC_CLIENT_CERT=cert.pem \ - MANAGER_GRPC_CLIENT_KEY=key.pem \ - MANAGER_GRPC_SERVER_CA_CERTS=ca.pem \ - go run main.go - ``` +#### Download RUL Results -- Create the rul-training computation. To get the filehash for all the files go to `cocos` folder and use the cocos-cli. 
For the file names use `RUL_FD001.txt`, `test_FD001.txt`, `train_FD001.txt` and `rul-training.py` +From your cocos directory: - ```bash - go run ./cmd/cli/main.go checksum ../ai/rul-turbofan/datasets/RUL_FD001.txt - ``` +```bash +./build/cocos-cli result ./private.pem +``` - ```bash - go run ./cmd/cli/main.go checksum ../ai/rul-turbofan/datasets/train_FD001.txt - ``` +Expected output: + +```bash +🔗 Connected to agent without TLS +⏳ Retrieving computation result file +📥 Downloading result [██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████] [100%] +Computation result retrieved and saved successfully as results.zip! ✔ +``` + +#### Test the Trained Model + +1. Copy results to the RUL directory: ```bash - go run ./cmd/cli/main.go checksum ../ai/rul-turbofan/datasets/test_FD001.txt - ``` + cp results.zip ../ai/rul-turbofan/ + ``` + +2. Navigate to RUL directory and extract results: ```bash - go run ./cmd/cli/main.go checksum ../ai/rul-turbofan/rul-training.py - ``` + cd ../ai/rul-turbofan + unzip results.zip -d results + ``` -- After the computation has been created, each user needs to upload their public key generated by `cocos-cli`. This key will enable the respective user to upload the datatsets and algorithms and also download the results. +3. Test the model: - ```bash - ./build/cocos-cli keys - ``` + ```bash + python pred_model.py + ``` -- Click run computation and wait for the vm to be provisioned. Copy the aggent port number and export `AGENT_GRPC_URL` +Expected output will show RUL prediction plots and performance metrics. 
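The copy-and-unzip steps above can also be scripted. A minimal sketch of unpacking the downloaded archive; the member names are illustrative, since the actual contents depend on what the training script saves:

```python
import zipfile
from pathlib import Path

def extract_results(archive: str = "results.zip", dest: str = "results") -> list:
    """Unpack the archive downloaded with `cocos-cli result` into dest
    and return the extracted member names (e.g. the trained .pth model)."""
    out = Path(dest)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out)
        return zf.namelist()
```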
- ```bash - export AGENT_GRPC_URL=localhost: - ``` +#### Remove RUL CVM -- After vm has been provisioned upload the datasets and the algorithm +Use the `` obtained during CVM creation: - ```bash - ./build/cocos-cli algo -a python ../ai/rul-turbofan/rul-training.py ./private.pem -r ../ai/rul-training/requirements.txt - ``` +```bash +./build/cocos-cli remove-vm +``` - ```bash - ./build/cocos-cli data ../ai/rul-turbofan/datasets/RUL_FD001.txt ./private.pem - ``` +Expected output: - ```bash - ./build/cocos-cli data ../ai/rul-turbofan/datasets/test_FD001.txt ./private.pem - ``` +```bash +🔗 Connected to manager using without TLS +🔗 Removing virtual machine +✅ Virtual machine removed successfully +``` + +## Testing with Prism + +Prism provides a web-based interface for managing Cocos computations. - ```bash - ./build/cocos-cli data ../ai/rul-turbofan/datasets/train_FD001.txt ./private.pem - ``` -- The computation will run, and you will get an event that the results are ready. You can download the results by running the following command: +### Prerequisites - ```bash - ./build/cocos-cli results ./private.pem - ``` +1. **Clone and start Prism:** -The above will generate a `results.zip` file. Copy this file to the ai directory: + ```bash + git clone https://github.com/ultravioletrs/prism.git + cd prism + make run + ``` + +2. **Prepare RUL datasets** (follow the same steps as in the Cocos section above) + +3. **Build Cocos artifacts:** + + ```bash + cd cocos + make all + ``` + +4. **Generate keys:** + + ```bash + ./build/cocos-cli keys -k="rsa" + ``` + +### Prism Setup Process + +1. **Create a user account** +2. **Create a workspace** +3. **Login to the created workspace** +4. **Create a CVM** and wait for it to come online +5. **Create the computation** and set a name and description. +6. Add participants using computation roles. + +### Create RUL Computation + +To create the computation in Prism, you'll need sha3-256 checksums for all datasets and the algorithm. 
Generate these from the cocos folder: ```bash -cp results.bin ../ai/rul-turbofan/ +./build/cocos-cli checksum ../ai/rul-turbofan/datasets/train_FD001.txt +./build/cocos-cli checksum ../ai/rul-turbofan/datasets/test_FD001.txt +./build/cocos-cli checksum ../ai/rul-turbofan/datasets/RUL_FD001.txt +./build/cocos-cli checksum ../ai/rul-turbofan/RUL_training.py ``` -Test the model with the test data: +Use the file names `train_FD001.txt`, `test_FD001.txt`, `RUL_FD001.txt`, and `RUL_training.py` when creating the computation asset in Prism. Link the assets to the computation. + +### Upload Public Key + +After creating the computation, upload your public key generated by `cocos-cli`. This enables you to upload datasets/algorithms and download results: ```bash -cd ../ai/rul-turbofan +cat public.pem ``` +### Run Computation + +1. **Click "Run Computation"** and select an available cvm. +2. **Copy the agent port number** and export it: + + ```bash + export AGENT_GRPC_URL=localhost: + ``` + +3. **Upload the algorithm and datasets:** + + ```bash + ./build/cocos-cli algo ../ai/rul-turbofan/RUL_training.py ./private.pem -a python -r ../ai/rul-turbofan/requirements.txt + ./build/cocos-cli data ../ai/rul-turbofan/datasets/train_FD001.txt ./private.pem + ./build/cocos-cli data ../ai/rul-turbofan/datasets/test_FD001.txt ./private.pem + ./build/cocos-cli data ../ai/rul-turbofan/datasets/RUL_FD001.txt ./private.pem + ``` + +4. **Monitor the computation** through the Prism interface until you receive an event indicating results are ready + +5. 
**Download the results:** + + ```bash + ./build/cocos-cli result ./private.pem + ``` + +### Test Results + +Follow the same testing steps as in the Cocos section: + ```bash +cp results.zip ../ai/rul-turbofan/ +cd ../ai/rul-turbofan unzip results.zip -d results -``` \ No newline at end of file +python pred_model.py +``` + +## Notes + +- The RUL model training with LSTM networks is moderately resource-intensive +- Memory allocation of 8GB should be sufficient for the Manager +- The model works with time-series sensor data from aircraft engines +- Results include visualization plots showing predicted vs actual RUL values +- The dataset contains multiple fault scenarios (FD001-FD004) - this example uses FD001 +- LSTM networks are particularly well-suited for sequential data and time-series prediction tasks
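For reference, the R² score cited in the Results section, used alongside train and validation MSE to judge the model, can be computed from predicted and actual RUL values as follows (a minimal reference implementation, not the training script's own code):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    1.0 means perfect prediction; 0.0 means no better than the mean RUL."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

print(r2_score([110.0, 95.0, 80.0], [110.0, 95.0, 80.0]))  # 1.0
```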