Real-time scheduling and inference framework for Online Failure Prediction models based on TensorFlow Lite (TFLite) Micro running on ESP32 devices
- Overview
- Requirements & Installation
- Configuration
- Usage
- Testing workflow
- Project Structure
- License
- Contact
This project provides a real-time scheduling and inference framework for Online Failure Prediction models based on TensorFlow Lite (TFLite) Micro running on ESP32 devices. It includes:
- Event gathering: a serial (UART) or in-memory event gatherer that feeds the model inferences.
- Observation Windows (OW): buffer incoming events, then schedule inferences to run after a fixed delay.
- Model management: automatic code generation that embeds the TFLite binaries and their execution metadata (model inference time, decision threshold, triggering events).
- Scheduler: a planner that serializes model inferences within the Maximum Inference Time (MIT) window so they never overlap, computing each model's start time and deadline.
- Offloading: when a model can't be scheduled within the window, it's offloaded (OFFLOAD).
- Runtime: a single persistent worker task that executes queued inferences, enforces start times and deadlines via high-resolution timers, and logs events.
- Deadline Timer: forcibly aborts any inference that exceeds its deadline, logging an URGENT_OFFLOAD.
- Logging & Analysis: efficient UART-based CSV logging and desktop scripts (plot_results.py, manage_test.py) for real-time monitoring, offline analysis, result visualization and automated testing.
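The scheduling and offloading behaviour described above can be illustrated with a small desktop-side sketch. It is not the firmware's planner: the `plan` function, the `Slot` type, the priority order of the input list and the rule "deadline = start time + expected execution time" are all assumptions made for illustration. Only `MIT_MS` and the OFFLOAD outcome come from this README.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIT_MS = 5000  # Maximum Inference Time window (ms), same value as in config.h


@dataclass
class Slot:
    model: str
    start_ms: int     # offset from the start of the MIT window
    deadline_ms: int  # assumed here to be start_ms + expected execution time


def plan(models: List[Tuple[str, int]], mit_ms: int = MIT_MS):
    """Serialize inferences back to back; offload whatever does not fit.

    `models` is a list of (name, exec_time_ms) pairs, visited in list order.
    Returns (scheduled slots, names of offloaded models).
    """
    slots, offloaded = [], []
    cursor = 0  # next free instant inside the window
    for name, exec_ms in models:
        if cursor + exec_ms <= mit_ms:
            slots.append(Slot(name, cursor, cursor + exec_ms))
            cursor += exec_ms  # the next model starts when this one should end
        else:
            offloaded.append(name)  # the framework would log this as OFFLOAD
    return slots, offloaded


slots, offloaded = plan([("model1", 2000), ("model2", 2500), ("model3", 1000)])
for s in slots:
    print(s.model, s.start_ms, s.deadline_ms)
print("OFFLOAD:", offloaded)
```

With these hypothetical execution times, `model1` and `model2` fill 4500 ms of the 5000 ms window, so `model3` no longer fits and is offloaded.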
- ESP32-WROOM physical board. Tests were run on an ESP32-WROOM Core i3 2367 at 2.4 GHz, with 448 KB ROM, 320 KB data DRAM, 200 KB instruction IRAM and 4 MB flash memory.
- ESP-IDF v6.0-dev or later
- Python ≥ 3.8
- pip, bash, xxd
# Clone repo
git clone https://github.com/youruser/ACES-RTEdgeInference.git
cd ACES-RTEdgeInference
# Install ESP-IDF
# Follow Espressif's [official guide](https://docs.espressif.com/projects/esp-idf/en/v6.0-dev/get-started/index.html).
cd esp-idf
git checkout v6.0-dev
./install.sh
. ./export.sh
cd ..
# Install Python dependencies
python3 -m pip install -r requirements.txt
main/config.h:
#define OW_MS (10000) // Observation window (ms)
#define MIT_MS (5000) // Maximum Inference Time allowed (ms)
#define TUNIT_MS (400) // Events probing cycle (ms)
typedef int8_t event_t; // Events basic type
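As a quick sanity check of how these three parameters relate, the following sketch counts how many probing cycles fit in one observation window and in one MIT window. The helper name `cycles_in` is hypothetical, and reading the ratio as "probing cycles per window" is an assumption about how the firmware uses the values.

```python
OW_MS = 10000    # observation window (ms)
MIT_MS = 5000    # maximum inference time allowed (ms)
TUNIT_MS = 400   # events probing cycle (ms)


def cycles_in(window_ms: int, tunit_ms: int) -> int:
    """Number of complete probing cycles that fit in a window."""
    if window_ms <= 0 or tunit_ms <= 0:
        raise ValueError("durations must be positive")
    return window_ms // tunit_ms


print(cycles_in(OW_MS, TUNIT_MS))   # 25 probing cycles per observation window
print(cycles_in(MIT_MS, TUNIT_MS))  # 12 complete cycles per MIT window
```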
sdkconfig settings:
CONFIG_ESP_TIMER_SUPPORTS_ISR_DISPATCH_METHOD=y
CONFIG_TIMER_TASK_AFFINITY_CPU0=y
CONFIG_FREERTOS_HZ=1000
Arena size per model in tflite_runner.cpp: adjust ARENA_PER_MODEL.
- Generate model registry
From tflite files in ./models, with exec_times.csv, thresholds.csv, triggers.csv.
scripts/generate_models.sh
- Build & Flash
scripts/run.sh (serial|memory)
- Plot results
Generates results/<logfile>.jpg stacked-bar counts per window.
plot_results.py <logfile>
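The per-window counts behind such a stacked-bar chart can be tallied along these lines. This is a sketch, not the logic of plot_results.py: the CSV column names (`window`, `model`, `event`), the sample rows and the `PREDICTION` label are assumptions. Only OFFLOAD and URGENT_OFFLOAD are event names taken from this README.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical log excerpt; the real log format may differ.
SAMPLE_LOG = """window,model,event
0,model1,PREDICTION
0,model2,OFFLOAD
1,model1,PREDICTION
1,model2,URGENT_OFFLOAD
1,model3,PREDICTION
"""


def counts_per_window(log_file):
    """Return {window index: Counter of event labels} for a CSV log."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(log_file):
        counts[int(row["window"])][row["event"]] += 1
    return counts


counts = counts_per_window(io.StringIO(SAMPLE_LOG))
for window in sorted(counts):
    print(window, dict(counts[window]))
```

Each `Counter` holds the height of one bar segment per event label, so the result maps directly onto one stacked bar per window.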
- (Optional) Place the models' tflite files, exec_times.csv, thresholds.csv and triggers.csv in models/.
- (Optional) Run:
scripts/generate_models.sh
- (Optional) Prepare the tests/testN folder with the raw *-inout.csv files and config.h.
- Run (it may take several minutes):
python3 scripts/manage_test.py tests/testN -M 10
- Inspect tests/testN/*metrics.csv and tests/testN/monitor*.jpg.
- experiments/exp01: regular multimodel example with Predictions/Offloads
- experiments/exp02: multimodel example with UrgentOffload
- experiments/exp03: multimodel example with UrgentOffload and Tu less than MITs
- experiments/exp04: exp03 example with serial input data
.
├── config/ # Source configuration
│ └── config.h # typedefs & parameters configuration
├── data/ # Test data: input events (to feed either serial gatherer or in-memory events)
│ ├── events_serial.txt # csv input events, a line at each time
│ └── memory_events.h # in-memory input events from events_serial.txt by generate_memory_data.py
├── main/ # ESP32 firmware source
│ ├── CMakeLists.txt # idf components register
│ ├── idf_component.yml # idf requirements
│ └── *.[ch] # C/C++ code
├── models/ # *.tflite models & metadata CSVs
│ ├── exec_times.csv # model execution times (ms), one line per model, in the folder's file-name order
│ ├── model*.tflite # tflite files
│ ├── thresholds.csv # decision thresholds (0..1), one line per model, in the folder's file-name order
│ └── triggers.csv # model event triggers (-|*|events list), one line per model
├── results/ # results files, generated by ESP32 on board software execution
├── scripts/ # Desktop automation and analysis scripts
│ ├── generate_memory_data.py # Generates memory data from events_serial.txt
│ ├── generate_models.sh # Embeds models + exec_times + thresholds
│ ├── generate_serial_data.py # Puts events_serial.txt in the serial input
│ ├── list_ops.py # Support to identify op codes in tflite files
│ ├── manage_test.py # End-to-end test harness for k-fold data
│ ├── map_ops.py # Support to identify tflite operations in codes
│ ├── plot_results.py # Plot per-window stacked bar charts
│ └── run.sh # Run ESP32 board code with (memory|serial) events
├── tests/ # Test campaigns `testN` subfolders
│ ├── test1/ #
│ │ ├── config.h # Test source configuration
│ │ ├── events_serial.txt # Test aggregated input data
│ │ ├── model*-inout.csv # Model test data, input events, expected results
│ │ ├── monitor*.jpg # JPG Graphical test results (plot_results)
│ │ ├── monitor*.svg # SVG Graphical test results (plot_results)
│ │ ├── monitor*.log # ESP32 execution log
│ │ └── test1_metrics.csv # overall metrics per model
│ ├── test2/ #
│ └── testN/ #
├── CMakeLists.txt # idf project configuration
├── LICENSE.txt # MIT license
├── README.md # This file
├── requirements.txt # Python deps
└── sdkconfig # idf.py menuconfig physical configuration
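The metadata files in models/ pair with the tflite files by position. A minimal sketch of that pairing follows, assuming "names order in the folder" means sorted file names. The `ModelMeta` type and `build_registry` function are hypothetical and do not describe what generate_models.sh actually emits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelMeta:
    name: str
    exec_time_ms: float
    threshold: float  # decision threshold in 0..1
    trigger: str      # "-", "*" or an events list, kept as raw text


def build_registry(model_files: List[str], exec_times: List[str],
                   thresholds: List[str], triggers: List[str]) -> List[ModelMeta]:
    """Pair the i-th line of each CSV with the i-th model in sorted name order."""
    names = sorted(model_files)
    if not (len(names) == len(exec_times) == len(thresholds) == len(triggers)):
        raise ValueError("each CSV needs exactly one line per model")
    registry = []
    for name, exec_ms, thr, trig in zip(names, exec_times, thresholds, triggers):
        threshold = float(thr)
        if not 0.0 <= threshold <= 1.0:
            raise ValueError(f"threshold out of range for {name}: {threshold}")
        registry.append(ModelMeta(name, float(exec_ms), threshold, trig.strip()))
    return registry


registry = build_registry(
    ["model2.tflite", "model1.tflite"],  # hypothetical file names
    ["1200", "800"],                     # lines of exec_times.csv
    ["0.5", "0.7"],                      # lines of thresholds.csv
    ["*", "-"],                          # lines of triggers.csv
)
for meta in registry:
    print(meta)
```

Validating the line counts up front catches the most likely mistake, a CSV that went out of step with the models folder, before any code is generated.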
- This project is licensed under the MIT License.
- Author: Juan C. Dueñas, ACES team, special thanks to Edith Galala
- Email: juancarlos.duenas@upm.es