# Connect 4 in AR with OpenCV
Real-time video processing backend for Connect4AR, an augmented reality Connect 4 game controlled by hand gestures. The FastAPI server handles WebRTC video streams, runs hand tracking with MediaPipe, and renders game-state overlays in real time with OpenCV. A simple React app, bootstrapped with create-react-app, lives in the `frontend` folder.
## Features

- **WebRTC Video Streaming**: Receives video streams from browser clients using aiortc
- **Real-Time Hand Tracking**: MediaPipe integration for precise hand landmark detection
- **Gesture Recognition**: Custom pinch detection algorithm with 5-frame temporal smoothing
- **Game Logic**: Complete Connect 4 implementation with win-condition validation
- **AR Overlay Rendering**: OpenCV-based game board and chip rendering on live video
- **Low Latency**: Optimized for <100ms end-to-end latency at 720p/1080p
- **Adaptive Resolution Handling**: Automatically scales incoming frames to 1280×720 using OpenCV when the browser sends lower resolutions during initial connection or bandwidth fluctuations
## Architecture

```
Client Browser (WebRTC)
        ↓ Video Stream (1280×720 @ 20fps)
FastAPI Backend (aiortc)
        ↓ Frame Processing
MediaPipe Hand Tracking
        ↓ Landmark Detection
Game Logic & OpenCV Rendering
        ↓ Processed Video
Client Browser (WebRTC)
```
## Tech Stack

- **FastAPI**: Async web framework for REST API endpoints
- **aiortc**: WebRTC implementation for Python
- **MediaPipe**: Google's hand tracking solution
- **OpenCV**: Computer vision and image processing
- **NumPy**: Efficient numerical operations
- **Docker**: Containerization for deployment
- **GCP Compute Engine**: Hosts a custom TURN server (coTURN) and the Dockerized backend
## Prerequisites

- Python 3.9+
- Docker (optional, for containerized deployment)
- TURN server (for production NAT traversal)
## Installation

1. Clone the repository

```bash
git clone https://github.com/yourusername/connect4ar-backend.git
cd connect4ar-backend
```

2. Install dependencies

```bash
pip install -r requirements.txt
```

3. Run the server

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

4. Test the API

```bash
curl http://localhost:8000/
```

## Docker Deployment

1. Build the image

```bash
docker build -t connect4ar-backend:latest .
```

2. Run the container

```bash
docker run -d \
  -p 8000:8000 \
  -e TURN_SERVER_IP=your.turn.server.ip \
  -e TURN_USERNAME=username \
  -e TURN_PASSWORD=password \
  --name connect4ar-backend \
  connect4ar-backend:latest
```

## Environment Variables

| Variable | Description | Default |
|---|---|---|
| `TURN_SERVER_IP` | TURN server IP address for NAT traversal | Required |
| `TURN_USERNAME` | TURN server authentication username | Required |
| `TURN_PASSWORD` | TURN server authentication password | Required |
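For local (non-Docker) runs, the same variables can be exported in the shell before starting the server; the values below are placeholders, not real credentials:

```shell
# Placeholder values -- replace with your TURN server details before running.
export TURN_SERVER_IP=203.0.113.10
export TURN_USERNAME=myuser
export TURN_PASSWORD=change-me
# Then start the server in the same shell:
# uvicorn main:app --host 0.0.0.0 --port 8000
```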
## ICE Server Configuration

The backend uses the following ICE servers:
- Google STUN servers (public IP discovery)
- Custom TURN server (relay for NAT traversal)

Configuration in `main.py`:

```python
ICE_SERVERS = [
    RTCIceServer(urls="stun:stun.l.google.com:19302"),
    RTCIceServer(
        urls=f"turn:{TURN_SERVER_IP}:3478",
        username=TURN_USERNAME,
        credential=TURN_PASSWORD,
    ),
]
```

## API Endpoints

### WebRTC Offer

Initiates a WebRTC connection by receiving an SDP offer from the client.

Request body:

```json
{
  "sdp": "v=0\r\no=- ...",
  "type": "offer",
  "resolution": 720
}
```

Response:

```json
{
  "sdp": "v=0\r\no=- ...",
  "type": "answer"
}
```

### Close Connections

Closes all active peer connections.

### Reset Game

Resets the game board state for all active games.

### Toggle Tracking

Toggles visibility of the hand-tracking landmarks overlay.

### Health Check

Health check endpoint.

Response:

```json
{
  "status": "ok",
  "message": "FastAPI backend is running!",
  "turn_server": "34.145.38.148",
  "ice_servers": 3
}
```

### ICE Configuration

Returns the ICE server configuration (useful for debugging).
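As a sketch, the offer/answer payloads above can be modeled as plain dataclasses; the names below are hypothetical, and the actual SDP negotiation (done by aiortc's `RTCPeerConnection` in the real server) is elided:

```python
from dataclasses import dataclass

# Hypothetical models mirroring the request/response JSON shown above.

@dataclass
class Offer:
    sdp: str
    type: str               # expected to be "offer"
    resolution: int = 720   # requested processing height (720 or 1080)

@dataclass
class Answer:
    sdp: str
    type: str = "answer"

def build_answer(offer: Offer, local_sdp: str) -> Answer:
    # In the real backend, local_sdp would come from aiortc after
    # setRemoteDescription/createAnswer; here we only model the payload shape.
    if offer.type != "offer":
        raise ValueError("expected an SDP offer")
    return Answer(sdp=local_sdp)
```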
## Gesture Recognition

The system detects pinch gestures using MediaPipe hand landmarks:
- Calculates the distance between the thumb tip (landmark 4) and the index tip (landmark 8)
- Pinch threshold: 5% of frame width
- 5-frame temporal smoothing to reduce jitter

### Game Controls

- **Grab**: Pinch above the board to grab a chip
- **Drag**: Move your hand horizontally to select a column
- **Release**: Release the pinch to drop the chip into the column
- **Win Detection**: Automatically checks for 4-in-a-row (horizontal, vertical, diagonal)
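The pinch logic described above can be sketched as follows; this is illustrative, not the repo's actual `game.py`:

```python
from collections import deque
import math

PINCH_FRAC = 0.05    # pinch threshold: 5% of frame width
SMOOTH_FRAMES = 5    # temporal smoothing window

class PinchDetector:
    """Sketch: pinch = thumb tip near index tip, smoothed over 5 frames."""

    def __init__(self, frame_width: int):
        self.thresh = frame_width * PINCH_FRAC
        self.history = deque(maxlen=SMOOTH_FRAMES)

    def update(self, thumb_tip, index_tip) -> bool:
        # thumb_tip / index_tip: (x, y) pixel coords of MediaPipe landmarks 4 and 8
        dist = math.hypot(thumb_tip[0] - index_tip[0], thumb_tip[1] - index_tip[1])
        self.history.append(dist < self.thresh)
        # Majority vote over the recent frames suppresses single-frame jitter
        return sum(self.history) > len(self.history) // 2

det = PinchDetector(frame_width=1280)      # thresh = 64 px
print(det.update((100, 100), (130, 100)))  # fingertips 30 px apart -> True
```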
## Game Logic

The `Connect4` class implements:
- 7×6 game board (configurable)
- Turn-based gameplay (Player 1: Red, Player 2: Yellow)
- Win-condition checking in 4 directions
- Valid-move validation

## AR Overlay Rendering

- **Board Overlay**: Semi-transparent blue board with white holes
- **Chip Animation**: Smooth falling animation using time-based interpolation
- **Real-time Updates**: Game state rendered at video frame rate
- **Hand Tracking Visualization**: Optional landmark overlay (toggle with `/toggle_tracking`)
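The 4-direction win check described above can be sketched like this; the repo's `Connect4` class may differ in details:

```python
ROWS, COLS = 6, 7  # standard Connect 4 dimensions (configurable in the real class)

def check_win(board, player) -> bool:
    """board: ROWS x COLS list of lists (0 = empty); True if player has 4-in-a-row."""
    # Directions: horizontal, vertical, down-right diagonal, down-left diagonal
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        for r in range(ROWS):
            for c in range(COLS):
                if all(
                    0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                    and board[r + i * dr][c + i * dc] == player
                    for i in range(4)
                ):
                    return True
    return False

b = [[0] * COLS for _ in range(ROWS)]
for c in range(4):          # four red chips along the bottom row
    b[5][c] = 1
print(check_win(b, 1))      # True
```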
## Performance Optimizations

**Frame Processing**
- Efficient NumPy operations for board rendering
- Direct pixel manipulation with OpenCV
- Minimal memory allocation per frame

**Hand Tracking**
- Single-hand mode to reduce computation
- Confidence thresholds tuned for reliability (0.7 detection, 0.5 tracking)

**Video Encoding**
- Configurable resolution (720p/1080p)
- Maintains original frame rate
- Uses `VideoFrame` format for efficient aiortc integration
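As an illustration of the vectorized-NumPy point above, a semi-transparent board overlay can be composited without any per-pixel Python loop; this sketch approximates `cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0)` using fixed-point arithmetic:

```python
import numpy as np

def blend_overlay(frame: np.ndarray, overlay: np.ndarray, alpha: float = 0.6) -> np.ndarray:
    """Composite a semi-transparent overlay onto a frame (both uint8, same shape)."""
    # uint16 intermediate avoids overflow; the shift replaces a float divide
    out = (overlay.astype(np.uint16) * int(alpha * 256)
           + frame.astype(np.uint16) * int((1 - alpha) * 256)) >> 8
    return out.astype(np.uint8)

frame = np.full((720, 1280, 3), 200, dtype=np.uint8)   # plain gray video frame
board = np.zeros_like(frame)
board[..., 0] = 255                                    # blue board (BGR channel 0)
print(blend_overlay(frame, board).shape)               # (720, 1280, 3)
```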
## Troubleshooting

**Problem**: ICE connection fails or disconnects
- Solution: Verify the TURN server is running and accessible
- Check: Firewall allows UDP ports 49152-65535 for TURN relay

**Problem**: High latency or dropped frames
- Solution: Reduce resolution or increase instance CPU/memory
- Check: Network bandwidth and server load

**Problem**: Gestures not detected consistently
- Solution: Ensure good lighting and contrast
- Check: Hand is within the camera frame and fully visible

**Problem**: False pinch detections
- Solution: Adjust the threshold in `game.py`: `thresh = w * 0.05`
## Dependencies

```
fastapi
uvicorn
aiortc
opencv-python
mediapipe
numpy
python-av
```

See `requirements.txt` for the complete list with versions.
## Known Limitations

- Single simultaneous game session per backend instance
- Requires GPU/hardware acceleration for optimal performance at 1080p
- TURN server required for production deployment (NAT traversal)

## Future Improvements

- AI opponent using the minimax algorithm
- WebGL-accelerated rendering
- Horizontal scaling with a load balancer
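The minimax opponent on the roadmap could start from a sketch like the one below: depth-limited, no alpha-beta pruning, scoring only outright wins and losses. All names are illustrative, since this feature does not exist in the repo yet:

```python
ROWS, COLS = 6, 7

def valid_moves(board):
    """Columns whose top cell is still empty."""
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    """Return a new board with player's chip dropped into col."""
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):
        if new[r][col] == 0:
            new[r][col] = player
            break
    return new

def wins(board, p):
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        for r in range(ROWS):
            for c in range(COLS):
                if all(0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                       and board[r + i * dr][c + i * dc] == p for i in range(4)):
                    return True
    return False

def minimax(board, depth, maximizing, ai=2, human=1):
    """Returns (score, column): +1 AI win, -1 human win, 0 unknown/draw."""
    if wins(board, ai):
        return 1, None
    if wins(board, human):
        return -1, None
    moves = valid_moves(board)
    if depth == 0 or not moves:
        return 0, None
    best = (-2, None) if maximizing else (2, None)
    for col in moves:
        child = drop(board, col, ai if maximizing else human)
        score, _ = minimax(child, depth - 1, not maximizing, ai, human)
        if maximizing and score > best[0]:
            best = (score, col)
        elif not maximizing and score < best[0]:
            best = (score, col)
    return best
```

With three AI chips already on the bottom row, a depth-1 search finds the winning fourth column; a production version would add alpha-beta pruning and a positional heuristic for deeper lookahead.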
## License

MIT License - see the LICENSE file for details.

## Author

James Wen
- GitHub: @notjamesw

## Acknowledgments

- MediaPipe team for hand tracking models
- aiortc contributors for the Python WebRTC implementation
- Google for STUN server infrastructure

**Note**: This backend requires a corresponding frontend client. See the Connect4AR Frontend for the React web application.