Enterprise-grade hardware-accelerated machine learning inference with IPFS network-based distribution
- Overview
- Installation
- Quick Start
- Architecture
- Supported Hardware
- Supported Models
- Documentation
- IPFS & Distributed Features
- Performance & Optimization
- Troubleshooting
- Testing & Quality
- Contributing
- License
IPFS Accelerate Python combines cutting-edge hardware acceleration, distributed computing, and IPFS network integration to deliver blazing-fast machine learning inference across multiple platforms and devices - from data centers to browsers.
- 🔥 8+ Hardware Platforms - CPU, CUDA, ROCm, OpenVINO, Apple MPS, WebNN, WebGPU, Qualcomm
- 🌐 Distributed by Design - IPFS content addressing, P2P inference, global caching
- 🤖 300+ Models - Full HuggingFace compatibility + custom architectures
- 🌍 Browser-Native - WebNN & WebGPU for client-side acceleration
- 📊 Production Ready - Real-time monitoring, enterprise security, compliance validation
- ⚡ High Performance - Intelligent caching, batch processing, model optimization
# 1. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# 2. Install IPFS Accelerate
pip install -U pip setuptools wheel
pip install ipfs-accelerate-py
# 3. Verify installation
python -c "from ipfs_accelerate_py import IPFSAccelerator; print('✅ Ready!')"Choose the profile that matches your needs:
| Profile | Use Case | Installation |
|---|---|---|
| Core | Basic inference | pip install ipfs-accelerate-py |
| Full | Models + API server | pip install ipfs-accelerate-py[full] |
| MCP | MCP server extras | pip install ipfs-accelerate-py[mcp] |
| Dev | Development setup | pip install -e . |
📚 Detailed instructions: Installation Guide | Troubleshooting | Getting Started
from ipfs_accelerate_py import IPFSAccelerator
# Initialize with automatic hardware detection
accelerator = IPFSAccelerator()
# Load any HuggingFace model
model = accelerator.load_model("bert-base-uncased")
# Run inference (automatically optimized for your hardware)
result = model.inference("Hello, world!")
print(result)
# Start the MCP server for automation
ipfs-accelerate mcp start
# Run inference directly
ipfs-accelerate inference generate \
--model bert-base-uncased \
--input "Hello, world!"
# List available models and hardware
ipfs-accelerate models list
ipfs-accelerate hardware status
# Start GitHub Actions autoscaler
ipfs-accelerate github autoscaler
| Example | Description | Complexity |
|---|---|---|
| Basic Usage | Simple inference with BERT | Beginner |
| Hardware Selection | Choose specific accelerator | Intermediate |
| Distributed Inference | P2P model sharing | Advanced |
| Browser Integration | WebNN/WebGPU in browsers | Advanced |
📖 More examples: examples/ | Quick Start Guide
IPFS Accelerate Python is built on a modular, enterprise-grade architecture:
┌─────────────────────────────────────────────────────────┐
│ Application Layer │
│ Python API • CLI • MCP Server • Web Dashboard │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────┴────────────────────────────────────┐
│ Hardware Abstraction Layer │
│ Unified interface across 8+ hardware platforms │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────┴────────────────────────────────────┐
│ Inference Backends │
│ CPU • CUDA • ROCm • MPS • OpenVINO • WebNN • WebGPU │
└────────────────────┬────────────────────────────────────┘
│
┌────────────────────┴────────────────────────────────────┐
│ IPFS Network Layer │
│ Content addressing • P2P • Distributed caching │
└─────────────────────────────────────────────────────────┘
- Hardware Abstraction: Unified API across 8+ platforms with automatic selection
- IPFS Integration: Content-addressed storage, P2P distribution, intelligent caching
- Performance Modeling: ML-powered optimization and resource management
- MCP Server: Model Context Protocol for standardized automation
- Monitoring: Real-time metrics, profiling, and analytics
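As a minimal illustration of the hardware abstraction layer above, the sketch below falls back through backends using only the IPFSAccelerator(device=...) constructor shown elsewhere in this README; the preference order and the exception handling are assumptions, not documented behavior.
from ipfs_accelerate_py import IPFSAccelerator
# Illustrative preference order; adjust to the hardware you actually have
PREFERRED_DEVICES = ["cuda", "rocm", "mps", "openvino", "cpu"]
def make_accelerator():
    for device in PREFERRED_DEVICES:
        try:
            return IPFSAccelerator(device=device)  # force a specific backend
        except Exception:
            continue  # backend unavailable on this machine; try the next one
    return IPFSAccelerator()  # fall back to automatic selection
accelerator = make_accelerator()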
📐 Detailed architecture: docs/architecture/overview.md | CI/CD
Run anywhere - from powerful servers to edge devices and browsers:
| Platform | Status | Acceleration | Requirements | Performance |
|---|---|---|---|---|
| CPU (x86/ARM) | ✅ | SIMD, AVX | Any | Good |
| NVIDIA CUDA | ✅ | GPU + TensorRT | CUDA 11.8+ | Excellent |
| AMD ROCm | ✅ | GPU + HIP | ROCm 5.0+ | Excellent |
| Apple MPS | ✅ | Metal | M1/M2/M3 | Excellent |
| Intel OpenVINO | ✅ | CPU/GPU | Intel HW | Very Good |
| WebNN | ✅ | Browser NPU | Chrome, Edge | Good |
| WebGPU | ✅ | Browser GPU | Modern browsers | Very Good |
| Qualcomm | ✅ | Mobile DSP | Snapdragon | Good |
The framework automatically detects and selects the best available hardware:
# Automatic (recommended)
accelerator = IPFSAccelerator() # Uses best available
# Manual selection
accelerator = IPFSAccelerator(device="cuda") # Force CUDA
accelerator = IPFSAccelerator(device="mps") # Force Apple MPS⚙️ Hardware guides: Hardware Optimization | Platform Support
| Category | Models | Status |
|---|---|---|
| Text | BERT, RoBERTa, DistilBERT, ALBERT, GPT-2/Neo/J, T5, BART, Pegasus, Sentence Transformers | ✅ |
| Vision | ViT, DeiT, BEiT, ResNet, EfficientNet, DETR, YOLO | ✅ |
| Audio | Whisper, Wav2Vec2, WavLM, Audio Transformers | ✅ |
| Multimodal | CLIP, BLIP, LLaVA | ✅ |
| Custom | PyTorch models, ONNX, TensorFlow (converted) | ✅ |
# From HuggingFace Hub
model = accelerator.load_model("bert-base-uncased")
# From IPFS (content-addressed)
model = accelerator.load_model("ipfs://QmXxxx...")
# Local model
model = accelerator.load_model("./my_model/")
# With specific hardware
model = accelerator.load_model("gpt2", device="cuda")🤖 Full model list: Supported Models | Custom Models Guide
| Guide | Description | Audience |
|---|---|---|
| Getting Started | Complete beginner tutorial | Everyone |
| Quick Start | Get running in 5 minutes | Everyone |
| Installation | Detailed setup instructions | Users |
| FAQ | Common questions & answers | Everyone |
| API Reference | Complete API documentation | Developers |
| Architecture | System design & components | Architects |
| Hardware Optimization | Platform-specific tuning | Engineers |
| Testing Guide | Testing & benchmarking | QA/DevOps |
| Topic | Resources |
|---|---|
| IPFS & P2P | IPFS Integration • P2P Networking |
| GitHub Actions | Autoscaler • CI/CD |
| Docker & K8s | Container Guide • Deployment |
| MCP Server | MCP Setup • Protocol Docs |
| Browser Support | WebNN/WebGPU • Examples |
Our documentation has been professionally audited (January 2026):
- ✅ 200+ files covering all features
- ✅ 93/100 quality score (Excellent)
- ✅ Comprehensive - From beginner to expert
- ✅ Well-organized - Clear structure and navigation
- ✅ Verified - All examples tested and working
📋 Documentation Hub: docs/ | Full Index | Audit Report
IPFS integration provides enterprise-grade distributed computing:
- 🔐 Content Addressing - Cryptographically secure, immutable model distribution
- 🌍 Global Network - Automatic peer discovery and geographic optimization
- ⚡ Intelligent Caching - Multi-level LRU caching across the network
- 🔄 Load Balancing - Automatic distribution across available peers
- 🛡️ Fault Tolerance - Robust error handling and fallback mechanisms
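To make the content-addressing idea concrete, here is a conceptual sketch using plain SHA-256 (not the actual IPFS CID format): the address is derived from the bytes themselves, so identical content resolves to the same address on every peer and any modification changes the address.
import hashlib
def content_address(data: bytes) -> str:
    # Same bytes -> same address on any peer; tampering changes the address
    return "sha256-" + hashlib.sha256(data).hexdigest()
print(content_address(b"model weights ..."))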
# Enable P2P inference
accelerator = IPFSAccelerator(enable_p2p=True)
# Model is automatically shared across peers
model = accelerator.load_model("bert-base-uncased")
# Inference uses best available peer
result = model.inference("Distributed AI!")| Feature | Description | Status |
|---|---|---|
| P2P Workflow Scheduler | Distributed task execution with merkle clocks | ✅ |
| GitHub Actions Cache | Distributed cache for CI/CD | ✅ |
| Autoscaler | Dynamic runner provisioning | ✅ |
| MCP Server | Model Context Protocol (14+ tools) | ✅ |
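As a rough illustration of the merkle-clock idea behind the P2P workflow scheduler (a conceptual sketch, not the scheduler's actual implementation): each event's identifier hashes its payload together with the identifiers of the events it has already seen, so peers can merge histories and agree on causal order without a central coordinator.
import hashlib, json
def event_id(payload: dict, parents: list) -> str:
    # The id commits to both the payload and everything observed before it
    blob = json.dumps({"payload": payload, "parents": sorted(parents)}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()
root = event_id({"task": "load bert-base-uncased"}, [])
child = event_id({"task": "run inference"}, [root])
print(root[:12], "->", child[:12])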
🌐 Learn more: IPFS Guide | P2P Architecture | Network Setup
# Run all tests
pytest
# Run specific test suite
pytest test/test_inference.py
# Run with coverage report
pytest --cov=ipfs_accelerate_py --cov-report=html
# Run benchmarks
python data/benchmarks/run_benchmarks.py
| Metric | Status | Details |
|---|---|---|
| Test Coverage | ✅ | Comprehensive test suite |
| Documentation | ✅ 93/100 | Audit Report |
| Code Quality | ✅ | Linted, type-checked |
| Security | ✅ | Regular vulnerability scans |
| Performance | ✅ | Benchmarked across platforms |
🧪 Testing guide: docs/guides/testing/TESTING_README.md | CI/CD Setup
| Hardware | Model | Throughput | Latency |
|---|---|---|---|
| NVIDIA RTX 3090 | BERT-base | ~2000 samples/sec | <1ms |
| Apple M2 Max | BERT-base | ~800 samples/sec | 2-3ms |
| Intel i9 (CPU) | BERT-base | ~100 samples/sec | 10-15ms |
| WebGPU (Browser) | BERT-base | ~50 samples/sec | 20-30ms |
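Numbers like these depend heavily on batch size, precision, and input length. A rough way to measure your own hardware, using only the load_model and batch_inference calls shown elsewhere in this README (the workload and batch size below are arbitrary assumptions):
import time
from ipfs_accelerate_py import IPFSAccelerator
accelerator = IPFSAccelerator()
model = accelerator.load_model("bert-base-uncased")
inputs = ["Hello, world!"] * 256  # arbitrary sample workload
start = time.perf_counter()
results = model.batch_inference(inputs, batch_size=32)
elapsed = time.perf_counter() - start
print(f"throughput: {len(inputs) / elapsed:.1f} samples/sec")
print(f"avg latency: {1000 * elapsed / len(inputs):.2f} ms/sample")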
# Enable mixed precision for 2x speedup
accelerator = IPFSAccelerator(precision="fp16")
# Use batch processing for better throughput
results = model.batch_inference(inputs, batch_size=32)
# Enable model quantization for 4x memory reduction
model = accelerator.load_model("bert-base-uncased", quantize=True)
# Use intelligent caching for repeated queries
accelerator = IPFSAccelerator(enable_cache=True)
📊 Performance guide: Hardware Optimization | Benchmarking
| Issue | Solution |
|---|---|
| Import errors | pip install --upgrade ipfs-accelerate-py |
| CUDA not found | Install CUDA Toolkit 11.8+ |
| Slow inference | Check hardware selection, enable caching |
| Memory errors | Use quantization, reduce batch size |
| Connection issues | Check IPFS daemon, firewall settings |
# Verify installation
python -c "import ipfs_accelerate_py; print(ipfs_accelerate_py.__version__)"
# Check hardware detection
ipfs-accelerate hardware status
# Test basic inference
ipfs-accelerate inference test
# View logs
ipfs-accelerate logs --tail 100
🆘 Get help: Troubleshooting Guide | FAQ | GitHub Issues
We welcome contributions! Here's how to get started:
- Fork & Clone: Get your own copy of the repository
- Create Branch: git checkout -b feature/your-feature
- Make Changes: Follow our coding standards
- Run Tests: pytest to ensure everything works
- Submit PR: Open a pull request with a clear description
- 🐛 Bug Reports - Found an issue? Let us know!
- 📚 Documentation - Help improve guides and examples
- 🧪 Testing - Add tests for edge cases
- 🌍 Translations - Translate docs to other languages
- 💡 Features - Suggest or implement new features
- 💬 GitHub Discussions - Ask questions, share ideas
- 🐛 Issue Tracker - Report bugs, request features
- 🔐 Security Policy - Report security vulnerabilities
- 📧 Email: starworks5@gmail.com
📖 Full guides: CONTRIBUTING.md | Code of Conduct | Security Policy
This project is licensed under the GNU Affero General Public License v3.0 or later (AGPLv3+).
What this means:
- ✅ Free to use, modify, and distribute
- ✅ Commercial use allowed
- ✅ Patent protection included
- ⚠️ Source code must be disclosed for network services
- ⚠️ Modifications must use the same license
Built with amazing open source technologies:
- HuggingFace Transformers - ML model ecosystem
- IPFS - Distributed file system
- PyTorch - Deep learning framework
- FastAPI - Modern web framework
Special thanks to all contributors who make this project possible! 🌟
- 📋 Changelog - Version history and release notes
- 🔐 Security Policy - Security reporting and best practices
- 🤝 Contributing Guide - How to contribute
- 📄 License - AGPLv3+ license details
If you find this project useful:
- ⭐ Star this repository on GitHub
- 📢 Share with your network
- 🐛 Report issues to help improve it
- 💡 Contribute features or fixes
- 📝 Write about your experience
Made with ❤️ by Benjamin Barber and contributors