A unified AI proxy server that provides free access to multiple LLM providers through Puter.js SDK
**TL;DR:** This project runs a local proxy server that gives you free access to GPT-4o, Claude, DeepSeek, Gemini, Grok, Mistral, and Qwen models through the Puter.js SDK, with no expensive API keys needed!
Features • Quick Start • Documentation • Models • Contributing
- Features
- Why Puter.js Proxy?
- Quick Start
- Documentation
- Available Models
- Architecture
- Configuration Options
- Development
- Performance
- Contributing
- License
- Disclaimer
- Contact
| Feature | Description |
|---|---|
| Free Access | No expensive API keys required |
| Multi-Provider | Access 18+ LLM models from one endpoint |
| Auto-Routing | Intelligent model selection based on task |
| High Performance | Low latency, optimized caching |
| Privacy First | All requests route through the local proxy |
| Docker Ready | Easy deployment with containers |
| Standard APIs | OpenAI- and Anthropic-compatible endpoints |
| Easy Setup | One-command installation |
Traditional AI API costs:

| | GPT-4o | Claude 3 Opus |
|---|---|---|
| Price per 1M tokens | $30.00 | $15.00 |
| Per 1K requests | ~$0.06 | ~$0.03 |
| Monthly (heavy use) | $500+ | $250+ |

Puter.js Proxy:

| | Puter.js | Savings |
|---|---|---|
| Price per 1M tokens | FREE* | 100% |
| Per 1K requests | FREE* | 100% |
| Monthly (heavy use) | FREE* | $0 |

\* Through the Puter.js free tier.
- Node.js 18+ (Node.js 22 recommended)
- npm or yarn
- Git
```bash
# 1. Clone the repository
git clone https://github.com/mulkymalikuldhrs/jsputer-proxy.git
cd jsputer-proxy

# 2. Run setup script
chmod +x setup.sh
./setup.sh

# 3. Start the server
npm start

# 4. Test it!
curl -X POST http://localhost:3333/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"Hello!"}]}'
```

Example response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "deepseek-chat",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20
  }
}
```

Manual installation:

```bash
# Clone the repository
git clone https://github.com/mulkymalikuldhrs/jsputer-proxy.git
cd jsputer-proxy

# Install dependencies
npm install

# Create .env file
cp .env.example .env
# Edit .env and add your Puter.js token (optional for testing)

# Start the server
npm start
```

```bash
# Using Docker
docker build -t jsputer-proxy .
docker run -p 3333:3333 jsputer-proxy
```

```bash
# Install as a service
sudo cp puter-proxy.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable puter-proxy
sudo systemctl start puter-proxy

# Check status
sudo systemctl status puter-proxy
```

Create a `.env` file in the project root:

```bash
# Puter.js Authentication Token (optional for basic usage)
# Get your token from https://puter.com/#/account
PUTER_AUTH_TOKEN=your_token_here

# Server Configuration
PORT=3333
NODE_ENV=production

# Logging
LOG_LEVEL=info
```

Edit your /home/mulky/opencode.json:
```json
{
  "provider": {
    "puter-proxy": {
      "models": {
        "deepseek-chat": {},
        "gpt-5-chat": {},
        "gpt-4o": {},
        "gpt-4o-mini": {},
        "gemini-2.0-flash": {},
        "claude-opus-4-5-latest": {},
        "claude-sonnet-4": {},
        "claude-haiku-4-5": {},
        "grok-3": {},
        "mistral-large-2512": {},
        "codestral-2508": {},
        "qwen-2.5-coder-32b-instruct": {}
      },
      "options": {
        "baseURL": "http://localhost:3333/v1"
      }
    }
  }
}
```

Endpoint: `POST http://localhost:3333/v1/chat/completions`
```bash
curl -X POST http://localhost:3333/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000,
    "stream": false
  }'
```

Endpoint: `POST http://localhost:3333/v1/messages`
```bash
curl -X POST http://localhost:3333/v1/messages \
  -H "Content-Type: application/json" \
  -H "Anthropic-Api-Version: 2023-06-01" \
  -d '{
    "model": "claude-opus-4-5-latest",
    "messages": [
      {"role": "user", "content": "Write a Python function for Fibonacci"}
    ],
    "max_tokens": 1000
  }'
```

Endpoint: `POST http://localhost:3333/chat`
Automatically selects the best model based on your query:
```bash
curl -X POST http://localhost:3333/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Build a REST API in Python"}],
    "stream": false
  }'
```

JavaScript (OpenAI SDK):

```javascript
// Using the OpenAI SDK
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'any-key', // Puter.js doesn't require a real API key
  baseURL: 'http://localhost:3333/v1'
});

async function chat() {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: false
  });
  console.log(response.choices[0].message.content);
}

chat();
```

Python:

```python
import requests

def chat_with_ai(message: str, model: str = "deepseek-chat") -> str:
    """Chat with any available model."""
    response = requests.post(
        'http://localhost:3333/v1/chat/completions',
        json={
            'model': model,
            'messages': [{'role': 'user', 'content': message}],
            'stream': False
        }
    )
    response.raise_for_status()
    data = response.json()
    return data['choices'][0]['message']['content']

# Example usage
print(chat_with_ai("Hello! How are you?", "gpt-4o"))
```

CLI:

```bash
# Interactive chat
node cli.mjs

# One-liner
echo "Hello!" | node cli.mjs
```

| Model | Provider | Type | Best For |
|---|---|---|---|
| `deepseek-chat` | DeepSeek | Reasoning | General purpose, planning |
| `gpt-5-chat` | OpenAI | General | Latest OpenAI model |
| `gpt-4o` | OpenAI | General | Complex reasoning, code |
| `gpt-4o-mini` | OpenAI | Fast | Quick tasks, simple queries |
| `gemini-2.0-flash` | Google | Fast | Balanced performance |
| `gemini-2.0-flash-lite` | Google | Ultra-Fast | Lightweight tasks |
| `claude-opus-4-5-latest` | Anthropic | Code/Analysis | Best for code, architecture |
| `claude-sonnet-4` | Anthropic | Balanced | Code + analysis |
| `claude-haiku-4-5` | Anthropic | Fast | Quick tasks |
| `grok-3` | xAI | General | xAI's flagship model |
| `grok-3-fast` | xAI | Fast | Quick responses |
| `grok-2-vision` | xAI | Vision | Image understanding |
| `mistral-large-2512` | Mistral | General | Mistral's best model |
| `mistral-small-2506` | Mistral | Fast | Quick tasks |
| `mistral-medium-2508` | Mistral | Balanced | General use |
| `codestral-2508` | Mistral | Code | Code generation |
| `devstral-medium-2507` | Mistral | Code | Development tasks |
| `qwen-2.5-coder-32b-instruct` | Qwen | Code | Dedicated coding |
Unavailable models:

- `o1` / `o3`: OpenAI reasoning models (not supported by Puter.js)
- `gemini-2.5-pro`: not yet available via Puter.js
- `claude-3-5-sonnet`: replaced by `claude-sonnet-4`
The router automatically selects the best model:
```
QUERY TYPE DETECTION

BUILDING  (code, implement, debug, refactor, sql...)  -> claude-opus-4-5-latest
PLANNING  (plan, design, architecture, strategy...)   -> deepseek-chat
REASONING (solve, explain, calculate, prove...)       -> gpt-4o
FAST      (simple question, <100 chars)               -> gpt-4o-mini
DEFAULT                                               -> deepseek-chat
```
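The keyword dispatch above can be sketched in plain JavaScript. This is a minimal illustration only; the project's actual logic lives in `src/router.js` and the keyword lists here are abbreviated from the table above:

```javascript
// Minimal sketch of keyword-based model routing, mirroring the categories
// above. Illustration only; the real implementation is in src/router.js.
const ROUTES = [
  { model: 'claude-opus-4-5-latest', keywords: ['code', 'implement', 'debug', 'refactor', 'sql'] },
  { model: 'deepseek-chat',          keywords: ['plan', 'design', 'architecture', 'strategy'] },
  { model: 'gpt-4o',                 keywords: ['solve', 'explain', 'calculate', 'prove'] },
];

function pickModel(messages) {
  const text = messages.map(m => m.content || '').join(' ').toLowerCase();
  for (const route of ROUTES) {
    if (route.keywords.some(k => text.includes(k))) return route.model;
  }
  // Short, simple queries go to the fast model; everything else to the default.
  return text.length < 100 ? 'gpt-4o-mini' : 'deepseek-chat';
}

console.log(pickModel([{ role: 'user', content: 'Please debug this function' }]));
// -> claude-opus-4-5-latest
```

Routes are checked in order, so a query that matches both a building and a planning keyword is sent to the code model first.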
```
               Puter.js Proxy (localhost:3333)
                            |
       +--------------------+--------------------+
       |                    |                    |
    /chat         /v1/chat/completions     /v1/messages
(Auto-Routing)    (OpenAI Compatible)      (Anthropic)
       |                    |                    |
       +--------------------+--------------------+
                            |
                   Router (router.js)
              Intelligent Model Selection
                            |
                Puter.js SDK (@heyputer)
           puter.ai.chat(messages, options)
                            |
                 Puter API (api.puter.com)
    GPT-4o / Claude / DeepSeek / Gemini / Grok / Mistral
```
```javascript
// In index.js
const app = express();
app.use(express.json({ limit: '50mb' })); // Large payload support
app.use(express.urlencoded({ extended: true, limit: '50mb' }));
```

Add to your middleware:

```javascript
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100, // limit each IP to 100 requests per minute
  message: { error: 'Too many requests' }
});

app.use(limiter);
```

```bash
# Enable debug mode
DEBUG=* npm start

# Or set the log level
LOG_LEVEL=debug npm start
```

Project structure:

```
puter-proxy/
├── src/                   # Source files
│   ├── index.js           # Main server
│   ├── index-https.js     # HTTPS server
│   ├── index-debug.js     # Debug server with logging
│   ├── client.js          # Puter.js client
│   ├── router.js          # Auto-routing logic
│   └── globals.js         # Polyfills
├── docs/                  # Documentation
│   ├── images/            # Images and diagrams
│   ├── MODELS.md          # Model guide
│   └── API.md             # API documentation
├── scripts/               # Utility scripts
├── package.json           # Dependencies
├── setup.sh               # Installation script
├── puter-proxy.service    # Systemd service
└── README.md              # This file
```
Edit `router.js`:

```javascript
export function pickModel(messages) {
  const text = messages.map(m => m.content || "").join(" ").toLowerCase();

  // Add new model detection (the keyword must be lowercase, since text is lowercased)
  if (text.includes("newkeyword")) {
    return "new-model-name";
  }

  // ... existing logic
}
```

```bash
# Test all models
npm test

# Test a specific model
npm test -- --model=gpt-4o

# Test proxy endpoints
npm run test:proxy
```

| Model | First Token | Full Response | Throughput |
|---|---|---|---|
| gpt-4o | ~500ms | ~1.5s | 45 tokens/s |
| deepseek-chat | ~800ms | ~1.7s | 35 tokens/s |
| claude-opus-4 | ~1.2s | ~2.7s | 28 tokens/s |
| gpt-4o-mini | ~400ms | ~1.0s | 60 tokens/s |
- Use non-streaming for simple queries
- Reuse connections with keep-alive
- Cache responses for repeated queries
- Batch requests when possible
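For example, caching responses for repeated queries can be sketched as below. This is a minimal in-memory illustration, not the proxy's actual caching; `callModel` is a hypothetical stand-in for the real upstream request:

```javascript
// Minimal in-memory response cache for repeated queries. `callModel` is a
// hypothetical stand-in for the upstream request; the proxy's own caching
// may work differently.
const cache = new Map();

async function cachedChat(model, messages, callModel) {
  const key = JSON.stringify({ model, messages });
  if (cache.has(key)) return cache.get(key); // cache hit: skip the network
  const reply = await callModel(model, messages);
  cache.set(key, reply);
  return reply;
}

// Usage: the second identical call is served from the cache.
(async () => {
  let upstreamCalls = 0;
  const fakeCall = async () => { upstreamCalls++; return 'Hello!'; };
  const msgs = [{ role: 'user', content: 'Hello!' }];

  await cachedChat('gpt-4o-mini', msgs, fakeCall);
  await cachedChat('gpt-4o-mini', msgs, fakeCall);
  console.log(upstreamCalls); // -> 1 (only one upstream request was made)
})();
```

A real deployment would also want a size bound or TTL on the cache, since LLM responses can be large.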
We welcome contributions! Please see our Contributing Guide for details.
- Bug Reports: report bugs or issues
- Feature Requests: suggest new features
- Documentation: improve docs and tutorials
- Pull Requests: submit code changes
- Translations: help translate docs
```bash
# Fork the repository, then clone your fork
git clone https://github.com/YOUR-USERNAME/jsputer-proxy.git
cd jsputer-proxy

# Create a feature branch
git checkout -b feature/amazing-feature

# Make changes and commit
git commit -m "Add amazing feature"

# Push to your fork
git push origin feature/amazing-feature

# Open a Pull Request
```

This project is licensed under the MIT License; see the LICENSE file for details.
Important: This project uses Puter.js SDK to access AI models. By using this software:
- You agree to Puter.js's Terms of Service and Privacy Policy
- This software is provided "as is" without warranty of any kind
- Usage may be subject to Puter.js's fair use policies
- The maintainers are not responsible for any costs, damages, or issues arising from use
Note: While Puter.js provides free access, please use responsibly and check their documentation for current limits and availability.
Mulky Malikul Dhaher
- Email: mulkymalikuldhaher@email.com
- GitHub: @mulkymalikuldhrs
- LinkedIn: Mulky Malikul Dhaher
- Puter.js Documentation
- Issue Tracker
- Discussions
This project wouldn't be possible without these amazing resources:
- Puter.js - The official JavaScript SDK for Puter
- Express.js - Fast, unopinionated, minimalist web framework
- Node.js - JavaScript runtime
- dotenv - Environment variable management
- Puter.js Getting Started Guide
- Puter.js Auth Documentation
- Puter.js AI/Chat API
- OpenAI API Compatible
- Anthropic Messages API
If this project helped you, please:
- Star the repository
- Share on Twitter
- Spread the word on social media
- Buy me a coffee
Made with ❤️ by Mulky Malikul Dhaher
Puter.js Proxy: Free AI Access for Everyone
See CHANGELOG.md for a detailed list of changes.
puter.js, puter-js, ai-proxy, llm-proxy, free-ai, free-llm,
chatgpt-alternative, claude-alternative, deepseek, gemini,
grok, mistral, qwen, openai-compatible, anthropic-compatible,
local-ai-server, self-hosted-ai, ai-gateway, llm-gateway,
nodejs, express, javascript, typescript, ai-api, llm-api,
free-api-key, no-api-key, budget-ai, cheap-ai, affordable-ai
Last updated: January 2026 | Version 1.0.0
