---
title: Quick Start
description: Get started with Visca AI Gateway in under 5 minutes
---

## Welcome

This guide will help you make your first API request to Visca AI Gateway. You'll learn how to:

  • Obtain your API key
  • Make your first request
  • Use streaming responses
  • Handle errors effectively

**Prerequisites**: You'll need an account at [gateway.visca.ai](https://gateway.visca.ai) or a self-hosted instance. See the [Self-Host Guide](/self-host) for deployment options.

## Step 1: Get Your API Key

Visit [gateway.visca.ai](https://gateway.visca.ai) and create an account or log in to your existing account. Go to the **API Keys** section in your dashboard. Click **Create API Key**, give it a descriptive name, and optionally set usage limits.
<Warning>
  Copy your API key immediately—it won't be shown again. Store it securely in your environment variables or secrets manager.
</Warning>
Add your API key to your environment:
```bash
export VISCA_API_KEY="vsk_your_api_key_here"
```
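If you want to confirm the variable is actually visible to your program before making a request, a quick stdlib-only check works (a sketch; the `vsk_` prefix comes from the key format shown in this guide):

```python
import os

# Read the key from the environment rather than hardcoding it.
api_key = os.environ.get("VISCA_API_KEY", "")

# Keys in this guide start with "vsk_"; warn if that's not what we found.
if not api_key.startswith("vsk_"):
    print("VISCA_API_KEY is missing or does not look like a Visca key")
```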

## Step 2: Install SDK (Optional)

Visca AI Gateway works with any HTTP client, but it is fully compatible with OpenAI's official SDKs. Here's how to install one for your language:

```bash
pip install openai
```

```bash
npm install openai
# or
yarn add openai
# or
pnpm add openai
```

```bash
go get github.com/sashabaranov/go-openai
```

```bash
gem install ruby-openai
```

## Step 3: Make Your First Request

Choose your preferred language and make your first request:

```python
import os

from openai import OpenAI

# Initialize the client (read the key from the environment in production)
client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

# Create a chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=150
)

# Print the response
print(response.choices[0].message.content)
```
```javascript
import OpenAI from 'openai';

// Initialize the client
const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

// Create a chat completion
async function main() {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in simple terms.' }
    ],
    max_tokens: 150
  });

  console.log(response.choices[0].message.content);
}

main();
```
```typescript
import OpenAI from 'openai';

// Initialize the client
const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

// Create a chat completion
async function main(): Promise<void> {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in simple terms.' }
    ],
    max_tokens: 150
  });

  console.log(response.choices[0].message.content);
}

main();
```
```bash
curl https://api.visca.ai/v1/chat/completions \
  -H "Authorization: Bearer $VISCA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ],
    "max_tokens": 150
  }'
```
```go
package main

import (
    "context"
    "fmt"
    "os"

    openai "github.com/sashabaranov/go-openai"
)

func main() {
    // Create a custom configuration (key read from the environment)
    config := openai.DefaultConfig(os.Getenv("VISCA_API_KEY"))
    config.BaseURL = "https://api.visca.ai/v1"

    client := openai.NewClientWithConfig(config)

    // Create a chat completion
    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "gpt-4o",
            Messages: []openai.ChatCompletionMessage{
                {
                    Role:    openai.ChatMessageRoleSystem,
                    Content: "You are a helpful assistant.",
                },
                {
                    Role:    openai.ChatMessageRoleUser,
                    Content: "Explain quantum computing in simple terms.",
                },
            },
            MaxTokens: 150,
        },
    )

    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    fmt.Println(resp.Choices[0].Message.Content)
}
```

## Step 4: Try Streaming Responses

For real-time applications, use streaming to receive responses as they're generated:

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

# Stream the response
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

async function streamResponse() {
  const stream = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Write a short poem about AI.' }],
    stream: true
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
}

streamResponse();
```
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

async function streamResponse(): Promise<void> {
  const stream = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Write a short poem about AI.' }],
    stream: true
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
}

streamResponse();
```
```bash
curl https://api.visca.ai/v1/chat/completions \
  -H "Authorization: Bearer $VISCA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Write a short poem about AI."}
    ],
    "stream": true
  }'
```

## Available Models

Visca AI Gateway supports 50+ models across multiple providers. Here are some popular options:

**OpenAI**

- `gpt-4o` - Most capable, multimodal model
- `gpt-4o-mini` - Fast and affordable
- `gpt-4-turbo` - Previous generation flagship
- `gpt-3.5-turbo` - Fast and cost-effective
- `dall-e-3` - Image generation

**Anthropic**

- `claude-3-5-sonnet-20241022` - Latest and most capable
- `claude-3-opus-20240229` - Most powerful Claude model
- `claude-3-sonnet-20240229` - Balanced performance
- `claude-3-haiku-20240307` - Fastest and most affordable

**Google**

- `gemini-2.0-flash-exp` - Latest experimental model
- `gemini-1.5-pro` - Advanced reasoning and multimodal
- `gemini-1.5-flash` - Fast and efficient

**Open Source**

- `llama-3.1-405b` - Meta's largest model
- `llama-3.1-70b` - Powerful open model
- `mixtral-8x22b` - Mistral's mixture of experts
- `qwen-2.5-72b` - High-quality Chinese/English model

To see all available models, make a request to the `/v1/models` endpoint or check your dashboard.
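The `/v1/models` endpoint returns an OpenAI-style list object. A minimal sketch of pulling the model IDs out of such a response (the sample payload below is illustrative, not real gateway output):

```python
import json

def model_ids(models_json: str) -> list[str]:
    """Return the sorted model IDs from a /v1/models response body."""
    payload = json.loads(models_json)
    return sorted(item["id"] for item in payload.get("data", []))

# Illustrative payload in the OpenAI-compatible list format.
sample = '{"object": "list", "data": [{"id": "gpt-4o"}, {"id": "claude-3-haiku-20240307"}]}'
print(model_ids(sample))  # ['claude-3-haiku-20240307', 'gpt-4o']
```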

## Using Different Providers

Simply change the model name to use a different provider:

```python
# OpenAI
response = client.chat.completions.create(model="gpt-4o", ...)

# Anthropic
response = client.chat.completions.create(model="claude-3-5-sonnet-20241022", ...)

# Google
response = client.chat.completions.create(model="gemini-2.0-flash-exp", ...)

# Open source via Groq (ultra-fast)
response = client.chat.completions.create(model="llama-3.1-70b", ...)
```

## Request Metadata

Track requests with custom metadata for analytics and cost allocation:

```python
import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "X-Visca-Metadata": json.dumps({
            "user_id": "user_123",
            "app_name": "my_app",
            "environment": "production"
        })
    }
)
```
```javascript
// Per-request headers go in the second (options) argument of the Node SDK.
const response = await client.chat.completions.create(
  {
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }]
  },
  {
    headers: {
      'X-Visca-Metadata': JSON.stringify({
        user_id: 'user_123',
        app_name: 'my_app',
        environment: 'production'
      })
    }
  }
);
```
```bash
curl https://api.visca.ai/v1/chat/completions \
  -H "Authorization: Bearer $VISCA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Visca-Metadata: {\"user_id\":\"user_123\",\"app_name\":\"my_app\"}" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'
```

Learn more about metadata tracking.

## Error Handling

Always implement proper error handling for production applications:

```python
import os

from openai import OpenAI, APIError, RateLimitError, APIConnectionError

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")

except APIConnectionError as e:
    print(f"Connection error: {e}")

except APIError as e:
    print(f"API error: {e}")
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

try {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }]
  });

  console.log(response.choices[0].message.content);

} catch (error) {
  if (error.status === 429) {
    console.error('Rate limit exceeded:', error.message);
  } else if (error.status >= 500) {
    console.error('Server error:', error.message);
  } else {
    console.error('API error:', error.message);
  }
}
```
```typescript
import OpenAI from 'openai';
import { APIError } from 'openai/error';

const client = new OpenAI({
  baseURL: 'https://api.visca.ai/v1',
  apiKey: process.env.VISCA_API_KEY
});

try {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }]
  });

  console.log(response.choices[0].message.content);

} catch (error) {
  if (error instanceof APIError) {
    console.error(`API Error (${error.status}):`, error.message);
  } else {
    console.error('Unexpected error:', error);
  }
}
```

### Common Error Codes

| Status Code | Meaning | Solution |
|-------------|---------|----------|
| 400 | Bad Request | Check your request parameters |
| 401 | Unauthorized | Verify your API key is correct |
| 403 | Forbidden | Check API key permissions and rate limits |
| 429 | Rate Limit | Implement exponential backoff and retry |
| 500 | Server Error | Retry with exponential backoff |
| 503 | Service Unavailable | Provider is down; will auto-failover if configured |
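The retry advice in this table can be reduced to a small helper. A sketch (the retryable status set and 30-second cap are assumptions for illustration, not gateway guarantees):

```python
# Status codes that are worth retrying: rate limits and transient server errors.
RETRYABLE_STATUSES = {429, 500, 503}

def should_retry(status: int, attempt: int, max_retries: int = 3) -> bool:
    """Retry only transient errors, and only up to max_retries attempts."""
    return status in RETRYABLE_STATUSES and attempt < max_retries

def backoff_seconds(attempt: int) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at 30s."""
    return min(2.0 ** attempt, 30.0)

print(should_retry(429, 0))  # True
print(should_retry(401, 0))  # False
print(backoff_seconds(2))    # 4.0
```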

## Best Practices

- Never hardcode API keys in your source code
- Use environment variables or secrets management
- Rotate keys regularly
- Set up usage limits per key
- Use different keys for development and production

Implement retries with exponential backoff:

```python
import time
from openai import OpenAI, APIError
def make_request_with_retry(client, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello!"}]
            )
        except APIError as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # Exponential backoff
            time.sleep(wait_time)
```
- Use the dashboard to track requests, costs, and latency
- Set up alerts for unusual spending patterns
- Add metadata to requests for detailed analytics
- Review cost reports regularly
- Use streaming for real-time applications
- Set appropriate `max_tokens` to control costs
- Choose the right model for your use case (cost vs. capability)
- Enable caching for repeated queries
- Use routing strategies for optimal latency/cost

## Next Steps

Set up cost optimization and automatic failover

<Card title="API Keys & Security" icon="shield" href="/docs/features/api-keys">
  Configure fine-grained access control and limits
</Card>

<Card title="Vision & Multimodal" icon="image" href="/docs/features/vision">
  Work with images and vision models
</Card>

<Card title="Self-Host" icon="server" href="/self-host">
  Deploy on your own infrastructure
</Card>

## Need Help?

- Explore advanced features
- Get help from the community
- Report bugs and request features
- Contact our support team