Releases: SpeyTech/certifiable-inference (15 Jan 00:49)

🎉 v0.1.0: Certifiable CNN Inference Engine

Production-ready release of the Murray Deterministic Computing Platform (MDCP) reference implementation for safety-critical AI inference.


🔒 Patent-Protected Architecture

Built on MDCP (UK Patent GB2521625.0) - a deterministic computing architecture designed for systems where "mostly correct" isn't an option.


✨ What's New

This release provides a complete, certification-ready CNN inference stack:

Core Components

  • Fixed-Point Arithmetic (Q16.16) - Bit-perfect across all platforms
  • Matrix Operations (GEMM with 64-bit accumulation)
  • 2D Convolution (Deterministic sliding window)
  • Activation Functions (ReLU, Leaky ReLU, in-place processing)
  • Max Pooling (2×2 stride-2 dimension reduction)
  • Deterministic Hash Table (Sorted iteration order)

Key Properties

  • Zero dynamic allocation - No malloc() after initialization
  • Deterministic execution - Same input → Same output, always
  • Bounded timing - <5% P95 jitter measured (algorithmic variance)
  • WCET analyzable - Fixed iteration counts, no data-dependent branches
  • Platform independent - Bit-perfect across x86, ARM, RISC-V


📊 Verification Results

Test Coverage

  • 7 test suites - 100% passing
  • 70+ test cases - All verified
  • 1000+ determinism iterations - Bit-perfect repeatability proven
  • 10,000+ timing measurements - Consistency verified

Timing Benchmarks (Linux, User-Space)

Conv2D (16×16 → 14×14):

Mean:  13,625 ns
P50:   13,348 ns
P95:   13,435 ns  (154ns variance = 1.16% ✅ EXCELLENT)
P99:   20,817 ns  (OS scheduler interference)
Max:  189,305 ns  (rare context switch)

Matrix Multiply (10×10 × 10×10):

Mean:   6,258 ns
P50:    6,053 ns
P95:    6,116 ns  (132ns variance = 2.21% ✅ EXCELLENT)
P99:   10,548 ns  (OS scheduler interference)
Max:  122,084 ns  (rare context switch)

Key Finding: 95% of executions fall within 1-2% timing variance. The P99 outliers are Linux scheduler artifacts and are expected to disappear on RTOS deployments.


🎯 Target Applications

This implementation is designed for safety-critical systems requiring:

  • DO-178C (Aerospace) - Level A/B capable
  • ISO 26262 (Automotive) - ASIL-D ready
  • IEC 62304 (Medical Devices) - Class C capable
  • IEC 61508 (Industrial) - SIL 3/4 ready

Use cases:

  • Medical devices (pacemakers, surgical robots, insulin pumps)
  • Aerospace systems (flight control, satellite processors)
  • Automotive (ADAS, autonomous driving)
  • Industrial safety controllers

🔗 Interactive Demo

Try it live: https://inference.speytech.com/

The emulator demonstrates:

  • Real-time convolution operations
  • Sobel edge detection (vertical/horizontal)
  • Live timing statistics (P50/P95/P99 percentiles)
  • Pattern cycling (vertical bar, cross, diagonal, checkerboard)
  • Throughput metrics and sparkline visualization

See determinism in action - same pattern → same output, every time.


📚 Documentation

Complete requirements traceability maintained in docs/requirements/:

  • SRS-001 - Matrix Operations
  • SRS-002 - Fixed-Point Arithmetic
  • SRS-003 - Memory Management
  • SRS-004 - Convolution
  • SRS-005 - Activation Functions
  • SRS-006 - Numerical Stability
  • SRS-007 - Deterministic Execution Timing
  • SRS-008 - Max Pooling

Each document includes:

  • Mathematical specifications
  • Compliance mappings (DO-178C, ISO 26262, IEC 62304)
  • Verification methods
  • Traceability to code and tests

🚀 Quick Start

# Clone the repository
git clone https://github.com/williamofai/certifiable-inference.git
cd certifiable-inference

# Build
mkdir build && cd build
cmake ..
make

# Run tests
make test-all

# Run timing benchmarks
make benchmarks

# Try the edge detection example
./edge_detection

🏗️ Technical Highlights

Fixed-Point Arithmetic (Q16.16)

  • Range: -32,768 to +32,767.99998
  • Resolution: 0.0000152588 (1/65536)
  • No floating-point drift
  • Deterministic rounding (round-to-nearest)

Resource Bounds

  • All buffers pre-allocated by caller
  • O(1) stack usage for all operations
  • No recursion
  • No variable-length arrays

Deterministic Timing

  • Fixed iteration counts (dimension-based only)
  • No data-dependent branches in hot paths
  • 64-bit accumulation prevents overflow
  • Sequential memory access patterns

🔬 What Makes This Different

vs. TensorFlow Lite:

  • TensorFlow Lite: Data-dependent optimization (non-deterministic)
  • This: Fixed execution path (deterministic) ✅

vs. ONNX Runtime:

  • ONNX: Dynamic memory allocation (timing variance)
  • This: Pre-allocated buffers (predictable) ✅

vs. PyTorch Mobile:

  • PyTorch: Python runtime overhead (jitter)
  • This: Pure C, no runtime overhead (minimal jitter) ✅

💼 Commercial Licensing

This implementation is dual-licensed:

  • Open Source: GNU General Public License v3.0 (GPLv3)
  • Commercial: Available for proprietary use in safety-critical systems

The architecture is protected by UK Patent GB2521625.0 (MDCP - Murray Deterministic Computing Platform).

For commercial licensing and certification support:


👨‍💻 About

Built by William Murray - 30 years of UNIX infrastructure experience applied to deterministic computing for safety-critical systems.

  • Visiting Scholar at Heriot-Watt University
  • Creator of C-Sentinel (semantic security monitor)
  • Creator of c-from-scratch (educational C course)
  • Patent holder: GB2521625.0 (MDCP)
  • Based in the Scottish Highlands 🏴󠁧󠁢󠁳󠁣󠁴󠁿

🙏 Acknowledgments

If even one person finds this useful for medical devices, aerospace, or automotive AI, it was worth sharing.

Built from the Scottish Highlands. Pure C99. Zero compromises.


📖 Learn More


Building deterministic AI systems for when lives depend on the answer.