[Feature] Together AI Platform Integration with Native Provider Support #82

@Rohitjoshi9023

Summary

Add native support for the Together AI platform as a first-class provider in the Copilot SDK. This includes seamless integration with Together AI's diverse model catalog, real-time inference API, and enterprise-grade features such as batch processing and usage monitoring.

Problem / Use Case

  1. Growing Together AI Adoption: More teams are migrating to Together AI for cost-effective, open-source model inference at scale
  2. Feature Parity: Currently, accessing Together AI requires manual API integration without built-in fallback chains or routing strategies (e.g., falling back to open-source models when proprietary APIs are rate-limited)
  3. Consistency Gaps: No standardized way to handle Together AI alongside other providers (OpenAI, Anthropic, etc.) in a single SDK

Proposed Solution

1. Together AI Provider Implementation

  • Add a TogetherAIProvider class (see the sketch after this list) with full support for:
    • Text Generation: 70+ open-source models (Llama, Mistral, Falcon, etc.)
    • Batch Processing API: Asynchronous batch job submission for large-scale processing
    • Embeddings: Via Together AI embeddings endpoint
    • Streaming: Real-time token streaming with proper error handling
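Below is a rough sketch of the text-generation and streaming surface, assuming Together's OpenAI-compatible `/v1/chat/completions` endpoint; the class shape and method names (`generate`, `stream`) are placeholders, not existing SDK interfaces:

```python
import json
import os

import httpx


class TogetherAIProvider:
    """Provider sketch: the real SDK base classes and hooks would replace this."""

    def __init__(self, api_key: str | None = None,
                 base_url: str = "https://api.together.xyz/v1",
                 timeout: float = 60.0):
        self.api_key = api_key or os.environ["TOGETHER_API_KEY"]
        self.base_url = base_url.rstrip("/")
        self.client = httpx.Client(
            timeout=timeout,
            headers={"Authorization": f"Bearer {self.api_key}"},
        )

    def generate(self, model: str, messages: list[dict], **params) -> str:
        # Together exposes an OpenAI-compatible chat completions endpoint.
        resp = self.client.post(
            f"{self.base_url}/chat/completions",
            json={"model": model, "messages": messages, **params},
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def stream(self, model: str, messages: list[dict], **params):
        # Server-sent events: each line is "data: <json>", ending with "data: [DONE]".
        with self.client.stream(
            "POST",
            f"{self.base_url}/chat/completions",
            json={"model": model, "messages": messages, "stream": True, **params},
        ) as resp:
            resp.raise_for_status()
            for line in resp.iter_lines():
                if not line.startswith("data: "):
                    continue
                payload = line[len("data: "):]
                if payload.strip() == "[DONE]":
                    break
                delta = json.loads(payload)["choices"][0]["delta"].get("content")
                if delta:
                    yield delta
```

Embeddings and batch submission would follow the same pattern against their respective endpoints.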

2. Model Registry Integration

  • Register Together AI models in the SDK's model registry with metadata (an example entry follows this list):
    • Input/output token limits
    • Pricing per 1M tokens
    • Context window size
    • Quantization levels available
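One possible shape for a registry entry; the schema and all numbers below are illustrative placeholders, and actual limits and pricing would come from Together's model catalog:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TogetherModelSpec:
    model_id: str
    context_window: int             # total context size, in tokens
    max_output_tokens: int
    input_price_per_1m: float       # USD per 1M input tokens
    output_price_per_1m: float      # USD per 1M output tokens
    quantizations: tuple[str, ...]  # e.g. ("fp16", "fp8")


TOGETHER_MODELS = {
    "meta-llama/Llama-3-70b-chat-hf": TogetherModelSpec(
        model_id="meta-llama/Llama-3-70b-chat-hf",
        context_window=8192,
        max_output_tokens=4096,
        input_price_per_1m=0.90,    # placeholder, not actual pricing
        output_price_per_1m=0.90,   # placeholder, not actual pricing
        quantizations=("fp16", "fp8"),
    ),
}
```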

3. Fallback & Routing Compatibility

  • Enable fallback chains across Together AI models (e.g., falling back from Llama-70B to Llama-7B on rate limits; see the sketch after this list)
  • Support routing strategies (priority, round-robin) specifically optimized for Together AI's load balancing
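A minimal sketch of the fallback behaviour, assuming the `TogetherAIProvider.generate()` sketch above and treating 429/503 responses as the signal to move to the next model; the model names are examples only:

```python
import httpx

# Preferred model first, cheaper fallback second.
FALLBACK_CHAIN = [
    "meta-llama/Llama-3-70b-chat-hf",
    "meta-llama/Llama-3-8b-chat-hf",
]


def generate_with_fallback(provider, messages, chain=FALLBACK_CHAIN):
    last_error = None
    for model in chain:
        try:
            return provider.generate(model=model, messages=messages)
        except httpx.HTTPStatusError as err:
            # Fall through only on rate limiting / overload; re-raise anything else.
            if err.response.status_code in (429, 503):
                last_error = err
                continue
            raise
    raise last_error
```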

4. Authentication & Configuration

  • Environment variable support: TOGETHER_API_KEY
  • Configurable base URL for self-hosted Together inference endpoints
  • Request timeout and retry logic aligned with Together's SLA (configuration sketch below)
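A configuration sketch along those lines; the timeouts, retry budget, and the `TOGETHER_BASE_URL` override variable are illustrative assumptions, not values or names taken from Together's documentation:

```python
import os

import httpx

# API key from the environment; base URL overridable for self-hosted endpoints.
TOGETHER_API_KEY = os.environ["TOGETHER_API_KEY"]
TOGETHER_BASE_URL = os.environ.get("TOGETHER_BASE_URL", "https://api.together.xyz/v1")

client = httpx.Client(
    base_url=TOGETHER_BASE_URL,
    headers={"Authorization": f"Bearer {TOGETHER_API_KEY}"},
    timeout=httpx.Timeout(connect=5.0, read=60.0, write=10.0, pool=5.0),
    transport=httpx.HTTPTransport(retries=3),  # retries cover connection errors only
)
```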

Alternatives Considered

  1. Using Together AI's SDK directly - Lacks centralized error handling and routing across multiple providers
  2. Generic HTTP client - Would require duplicating error handling and rate limit logic
  3. Async wrapper layer - Insufficient for seamless integration with existing fallback/routing infrastructure

Additional Context

  • Together AI API Reference: https://docs.together.ai/reference
  • Community request for multi-provider support with open-source models
  • Competitive advantage: Better support for cost-sensitive applications
