Turn your Pixel into a free, private AI API server.
Run Gemini Nano on the Tensor G5 chip and expose it as a local REST API — no cloud, no API keys, no cost.
An Android app that turns your Pixel phone into a self-hosted AI inference server. It runs an HTTP server directly on the device, accepting OpenAI-compatible API requests over your local network.
Under the hood it uses Google's on-device AI stack:
- Gemini Nano via ML Kit Prompt API — hardware-accelerated on the Tensor G5 TPU
- MediaPipe LLM as fallback — for custom open-weight models like Gemma 3n
All inference runs entirely on-device. Your data never leaves the phone.
- OpenAI-compatible API — drop-in replacement for `client.chat.completions.create()`
- Streaming support — real-time Server-Sent Events (SSE) token streaming
- Zero configuration — install, tap Start, done
- Fully offline — no internet required after install
- Private by design — prompts and responses stay on your device
- Background service — keeps serving even when the app is minimized
- Custom model support — bring your own Gemma, LLaMA, or other compatible models
Download the APK from Releases and install:

```shell
adb install app-debug.apk
```

Or build from source (see Building below).
Open Pixel10 AI Server, tap Start Server. The app will load the model and display your device's IP address.
From any device on the same network:
```shell
curl http://<phone-ip>:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is the Tensor G5 chip?"}]
  }'
```

All endpoints follow the OpenAI API format.
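A non-streaming call returns a single JSON body. Since the server advertises OpenAI compatibility, the response presumably follows the standard chat-completions shape; a minimal sketch of extracting the reply (the body below is a hand-written stand-in, not captured server output):

```python
import json

# Hand-written stand-in for a server response; field names follow the
# OpenAI chat-completions format the server claims compatibility with.
raw = """{
  "id": "chatcmpl-1",
  "object": "chat.completion",
  "model": "pixel10-on-device",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The Tensor G5 is Google's SoC."},
      "finish_reason": "stop"
    }
  ]
}"""

# The assistant's reply lives at choices[0].message.content.
reply = json.loads(raw)["choices"][0]["message"]["content"]
print(reply)
```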
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat completion (supports streaming) |
| POST | /v1/completions | Text completion |
| GET | /v1/models | List available models |
| GET | /health | Server status, device info, uptime |
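With `"stream": true`, the chat endpoint emits Server-Sent Events. Assuming each event carries an OpenAI-style chunk (`choices[0].delta.content`, terminated by `data: [DONE]` — an assumption, not verified against this app), a client can assemble the full reply like this:

```python
import json

def assemble_sse(lines):
    """Concatenate content deltas from OpenAI-style SSE chunk lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank separators
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        text.append(delta.get("content", ""))
    return "".join(text)

# Simulated stream (chunk field names are assumed, per the lead-in):
stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Waves "}}]}',
    'data: {"choices": [{"delta": {"content": "rise and fall"}}]}',
    "data: [DONE]",
]
print(assemble_sse(stream))  # Waves rise and fall
```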
```shell
curl http://<phone-ip>:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pixel10-on-device",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing briefly."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024,
    "stream": false
  }'
```

```shell
curl -N http://<phone-ip>:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pixel10-on-device",
    "messages": [{"role": "user", "content": "Write a haiku about the ocean."}],
    "stream": true
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://<phone-ip>:8080/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="pixel10-on-device",
    messages=[{"role": "user", "content": "Hello from my laptop!"}]
)
print(response.choices[0].message.content)
```

If Gemini Nano isn't available on your device, you can use custom open-weight models:
- Download a compatible model (e.g. Gemma 3n E2B)
- Push it to the device:

```shell
adb push gemma-3n-E2B.task /data/data/com.pixel10.ai/files/
```

- Restart the app — it auto-detects model files

Supported formats: `.task`, `.bin`, `.tflite`
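After the restart, you can confirm the model was picked up by querying `/v1/models`. Assuming the endpoint returns OpenAI's model-list shape (`{"object": "list", "data": [{"id": ...}]}` — an assumption, not verified against this app), a quick check might look like:

```python
import json
from urllib.request import urlopen

def model_ids(payload):
    """Extract model ids from an OpenAI-style model-list response."""
    return [m["id"] for m in payload.get("data", [])]

def loaded_models(base_url):
    """Query the server's /v1/models endpoint and return model ids."""
    with urlopen(f"{base_url}/v1/models") as resp:
        return model_ids(json.load(resp))

# Offline check against a hand-written response body (shape is assumed):
sample = {"object": "list", "data": [{"id": "gemma-3n-E2B"}]}
print(model_ids(sample))  # ['gemma-3n-E2B']

# Against a running phone (IP is a placeholder):
# print(loaded_models("http://<phone-ip>:8080"))
```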
| Device | Chip | Backend |
|---|---|---|
| Pixel 10 / Pro / Pro XL | Tensor G5 | Gemini Nano (TPU-accelerated) |
| Pixel 9 series | Tensor G4 | Gemini Nano (TPU-accelerated) |
| Pixel 8 series | Tensor G3 | Gemini Nano (TPU-accelerated) |
| Other Android 12+ | Various | MediaPipe with custom models |
```shell
git clone https://github.com/alexpolo1/Pixel10-ai.git
cd Pixel10-ai
./gradlew assembleDebug
adb install app/build/outputs/apk/debug/app-debug.apk
```

Requirements: JDK 17+, Android SDK 35
```text
com.pixel10.ai/
├── Pixel10AIApp.kt            # Application init
├── inference/
│   ├── OnDeviceModel.kt       # Unified inference interface
│   ├── GeminiNanoModel.kt     # ML Kit Prompt API backend
│   └── MediaPipeModel.kt      # MediaPipe LLM backend
├── server/
│   ├── AIApiServer.kt         # NanoHTTPD REST server
│   ├── ApiModels.kt           # Request/response models
│   └── ApiServerService.kt    # Foreground service
└── ui/
    └── MainActivity.kt        # Server controls & dashboard
```
This project is provided for educational and experimental purposes only.
The Gemini Nano model is accessed through the ML Kit GenAI API, which is subject to Google's ML Kit Terms of Service and the GenAI API Additional Terms. Exposing on-device models as a network API may not be a documented use case under those terms. Users are responsible for reviewing and complying with all applicable terms of service.
This project is not affiliated with, endorsed by, or sponsored by Google.
| Library | License |
|---|---|
| ML Kit GenAI | Google ToS |
| MediaPipe | Apache 2.0 |
| NanoHTTPD | BSD 3-Clause |
| Gson | Apache 2.0 |
| AndroidX | Apache 2.0 |
| Kotlin Coroutines | Apache 2.0 |
Copyright 2025 alexpolo1
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
See LICENSE for the full text.