This directory contains documentation for the internal components (scripts) of the CustomGroqChat package.
The CustomGroqChat package is built with a modular architecture, where each component handles a specific responsibility. This design enables flexible customization, extensibility, and easier maintenance.

                           ┌─────────────┐
                           │ GroqClient  │
                           └─────┬───────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │ RequestHandler  │
                        └────┬─────┬──────┘
                             │     │
               ┌─────────────┘     └──────────────┐
               │                                  │
    ┌──────────▼─────────┐              ┌─────────▼────────┐
    │    QueueManager    │              │   TokenCounter   │
    └──────────┬─────────┘              └──────────────────┘
               │
               │
    ┌──────────▼─────────┐     ┌────────────────────┐
    │  RateLimitHandler  │◄────┤    ConfigLoader    │
    └──────────┬─────────┘     └────────────────────┘
               │
    ┌──────────▼─────────┐
    │     APIClient      │
    └────────────────────┘

| Component | Documentation |
|---|---|
| Package Overview | Package Exports |
| Main Interface | GroqClient |
| Request Processing | RequestHandler |
| Queue Management | QueueManager |
| API Communication | APIClient |
| Rate Limiting | RateLimitHandler |
| Configuration | ConfigLoader |
| Token Counting | TokenCounter |
| Error Handling | Exceptions |
- GroqClient: The primary interface for applications integrating with the Groq Cloud API. It provides high-level methods for chat completions, text completions, and other API features, abstracting away the complexity of request handling, queuing, and rate limiting.
- RequestHandler: Validates and prepares requests before they're sent to the API. Handles token counting and parameter validation, and converts application-level requests to API-compatible formats.
- QueueManager: Implements a priority queue system for handling concurrent requests efficiently. Maintains separate queues for different priority levels and ensures requests are processed in the appropriate order.
- APIClient: Handles direct communication with the Groq Cloud API. Manages connection pooling, timeout handling, and response parsing. Abstracts the HTTP layer, providing a clean interface for other components.
- RateLimitHandler: Tracks and enforces rate limits for different models. Implements token bucket algorithms to handle both per-minute and per-day limits efficiently.
- ConfigLoader: Manages configuration loading from files or environment variables. Validates configuration values and provides sensible defaults.
- TokenCounter: Accurately counts tokens for different request types. Ensures requests stay within model context limits and helps track token-based rate limits.
- Exceptions: Defines the custom exception hierarchy used throughout the package, providing detailed error information and handling guidance.
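
To make the token bucket idea mentioned for the RateLimitHandler concrete, here is a minimal sketch of two buckets (per-minute and per-day) that must both have capacity before a request is allowed. The class names `TokenBucket` and `DualWindowLimiter`, their methods, and the injectable `now` parameter are assumptions made for illustration; they are not the package's actual RateLimitHandler API.

```python
import time
from typing import Optional


class TokenBucket:
    """Hypothetical token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: float, refill_per_second: float,
                 now: Optional[float] = None):
        self.capacity = float(capacity)
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # the bucket starts full
        self.updated = time.monotonic() if now is None else now

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.updated = now


class DualWindowLimiter:
    """Hypothetical limiter: a request needs one token from both buckets."""

    def __init__(self, per_minute: int, per_day: int,
                 now: Optional[float] = None):
        start = time.monotonic() if now is None else now
        self.minute = TokenBucket(per_minute, per_minute / 60.0, start)
        self.day = TokenBucket(per_day, per_day / 86400.0, start)

    def try_acquire(self, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        self.minute.refill(current)
        self.day.refill(current)
        # Check both buckets before consuming, so a rejected request
        # never drains only one of them.
        if self.minute.tokens < 1 or self.day.tokens < 1:
            return False
        self.minute.tokens -= 1
        self.day.tokens -= 1
        return True
```

With `DualWindowLimiter(per_minute=2, per_day=3)`, two immediate calls to `try_acquire()` succeed and a third is rejected until the per-minute bucket refills; once three requests have gone through, the per-day bucket becomes the limiting factor. Passing `now` explicitly is only a convenience for deterministic tests.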
- The GroqClient uses the RequestHandler to prepare and validate requests.
- The RequestHandler uses the TokenCounter to validate token limits and the QueueManager to queue requests.
- The QueueManager uses the RateLimitHandler to check rate limits and the APIClient to send requests.
- The RateLimitHandler uses the ConfigLoader to get rate limit configurations.
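
The dependency chain above can be sketched as plain constructor injection. Every class below is a toy stand-in named after the component it mirrors; the method names (`load`, `can_send`, `enqueue`, `drain`, `submit`, `chat`), the config keys, and the convention that a lower number means higher priority are all assumptions for illustration, not the package's real interfaces. No network call is made.

```python
import heapq
from itertools import count

# Toy stand-ins only: names and signatures are hypothetical.


class ToyConfigLoader:
    def load(self) -> dict:
        return {"max_context_tokens": 8, "requests_per_batch": 2}


class ToyRateLimitHandler:
    def __init__(self, config: dict):
        self.remaining = config["requests_per_batch"]

    def can_send(self) -> bool:
        return self.remaining > 0

    def record_send(self) -> None:
        self.remaining -= 1


class ToyAPIClient:
    def send(self, request: dict) -> dict:
        # Echo the prompt instead of calling a remote API.
        return {"echo": request["prompt"]}


class ToyTokenCounter:
    def count(self, text: str) -> int:
        return len(text.split())  # crude whitespace count, not a real tokenizer


class ToyQueueManager:
    def __init__(self, rate_limiter, api_client):
        self.rate_limiter = rate_limiter
        self.api_client = api_client
        self._heap = []
        self._order = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, request: dict, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), request))

    def drain(self) -> list:
        # Send queued requests in priority order while the limiter allows it.
        responses = []
        while self._heap and self.rate_limiter.can_send():
            _, _, request = heapq.heappop(self._heap)
            self.rate_limiter.record_send()
            responses.append(self.api_client.send(request))
        return responses


class ToyRequestHandler:
    def __init__(self, token_counter, queue_manager, max_tokens: int):
        self.token_counter = token_counter
        self.queue_manager = queue_manager
        self.max_tokens = max_tokens

    def submit(self, prompt: str, priority: int = 1) -> None:
        if self.token_counter.count(prompt) > self.max_tokens:
            raise ValueError("prompt exceeds the context limit")
        self.queue_manager.enqueue({"prompt": prompt}, priority)


class ToyGroqClient:
    def __init__(self, request_handler):
        self.request_handler = request_handler

    def chat(self, prompt: str, priority: int = 1) -> None:
        self.request_handler.submit(prompt, priority)


def build_client():
    """Wire the components in the order the relationships describe."""
    config = ToyConfigLoader().load()
    queue = ToyQueueManager(ToyRateLimitHandler(config), ToyAPIClient())
    handler = ToyRequestHandler(ToyTokenCounter(), queue,
                                config["max_context_tokens"])
    return ToyGroqClient(handler), queue
```

In this sketch, each class only knows about the collaborators passed to its constructor, which is what lets any one of them be swapped for a different implementation without touching the others.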
Each component is designed to be:
- Single-responsibility: Each component focuses on one aspect of the system.
- Loosely coupled: Components interact through well-defined interfaces.
- Testable: Each component can be tested in isolation.
- Configurable: Behavior can be customized through configuration.
- Extensible: New functionality can be added without modifying existing code.
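
As a rough illustration of the loosely coupled and testable principles, the sketch below has a component depend on small interfaces rather than concrete classes, so it can be exercised with test doubles. `Limiter`, `Sender`, `Dispatcher`, `FixedLimiter`, and `RecordingSender` are hypothetical names invented for this example; they do not correspond to classes in the package.

```python
from typing import Optional, Protocol


class Limiter(Protocol):
    """Hypothetical interface: decides whether a request may go out now."""

    def allow(self) -> bool: ...


class Sender(Protocol):
    """Hypothetical interface: delivers a request and returns a response."""

    def send(self, request: dict) -> dict: ...


class Dispatcher:
    """Depends only on the two interfaces above, not on concrete classes."""

    def __init__(self, limiter: Limiter, sender: Sender):
        self.limiter = limiter
        self.sender = sender

    def dispatch(self, request: dict) -> Optional[dict]:
        if not self.limiter.allow():
            return None  # caller may retry later
        return self.sender.send(request)


class FixedLimiter:
    """Test double that always gives the same answer."""

    def __init__(self, allowed: bool):
        self.allowed = allowed

    def allow(self) -> bool:
        return self.allowed


class RecordingSender:
    """Test double that records requests instead of using the network."""

    def __init__(self):
        self.sent = []

    def send(self, request: dict) -> dict:
        self.sent.append(request)
        return {"ok": True}
```

A test can then build `Dispatcher(FixedLimiter(False), RecordingSender())` and check that nothing was sent, without any configuration file, queue, or HTTP connection being involved.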
- Examples: Shows how these components are used together