Model Selection Guide #150
Merged
Changes from all commits (18 commits)
- 15914e8 add Model Selection Guide (danstarns)
- 8006249 typo (danstarns)
- a2b064c trunk (danstarns)
- 015121e image change (danstarns)
- 30d6068 new lines in cards + trunk fixes (danstarns)
- 8800748 Update agents/model-selection.mdx (danstarns)
- 7a1f72b Merge branch 'main' into model-selection-guide (danstarns)
- 47f2fc7 Update agents/model-selection.mdx (danstarns)
- ae1c162 updates, less long winded, default to GPT-4.1 (danstarns)
- 7f272ba remove repomix (danstarns)
- 3706034 remove repeated block (danstarns)
- 1615947 Update model-selection.mdx (fengjessica)
- 75ce1ac Update model-selection.mdx (fengjessica)
- 56989c4 update image (fengjessica)
- 442c5ab resolve merge conflict (johnymontana)
- 56238bc trunk fmt (johnymontana)
- bc6b0d5 style updates (johnymontana)
- fb60e00 format (johnymontana)
agents/model-selection.mdx
@@ -0,0 +1,82 @@
---
title: "Model Selection Guide"
sidebarTitle: "Choose the Right Model"
description:
  "Select the optimal model for your agent based on your goals and use case."
---

Choosing the right model is essential to building effective agents. This guide
helps you evaluate trade-offs, pick the right model for your use case, and
iterate quickly.

![]()
## Key considerations

- **Accuracy and output quality:** Advanced logic, mathematical problem-solving,
  and multi-step analysis may require high-capability models.
- **Domain expertise:** Performance varies by domain (for example, creative
  writing, code, scientific analysis). Review model benchmarks or test with your
  own examples.
- **Context window:** Long documents, extensive conversations, or large
  codebases require models with longer context windows.
- **Embeddings:** For semantic search or similarity, consider embedding models.
  These aren't for text generation (see the sketch after this list).
- **Latency:** Real-time apps may need low-latency responses. Smaller models (or
  “Mini,” “Nano,” and “Flash” variants) typically respond faster than larger
  models.
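
For the embeddings consideration above, the model returns vectors rather than
text, and you compare the vectors yourself. Here is a minimal sketch using the
OpenAI Python SDK; the embedding model name and the environment-based API key
are assumptions, so substitute whatever embedding model your provider exposes.

```python
from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment; pass base_url to point the
# client at any other OpenAI-compatible provider instead.
client = OpenAI()


def embed(texts: list[str]) -> list[list[float]]:
    """Return one embedding vector per input string."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in resp.data]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: close to 1.0 means the texts are semantically similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)


query_vec, doc_vec = embed(["reset my password", "How to change your account password"])
print(f"similarity: {cosine(query_vec, doc_vec):.3f}")
```
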
## Models by task / use case at a glance

| Task / use case                          | Example models                                     | Key strengths                                  | Considerations                             |
| ---------------------------------------- | -------------------------------------------------- | ---------------------------------------------- | ------------------------------------------ |
| General-purpose conversation             | Claude 4 Sonnet, GPT-4.1, Gemini Pro               | Balanced, reliable, creative                   | May not handle edge cases as well          |
| Complex reasoning and research           | Claude 4 Opus, o3, Gemini 2.5 Pro                  | Highest accuracy, multi-step analysis          | Higher cost; best when quality is critical |
| Creative writing and content             | Claude 4 Opus, GPT-4.1, Gemini 2.5 Pro             | High-quality output, creativity, style control | High cost for premium content              |
| Document analysis and summarization      | Claude 4 Opus, Gemini 2.5 Pro, Llama 3.3           | Handles long inputs, comprehension             | Higher cost, slower                        |
| Real-time apps                           | Claude 3.5 Haiku, GPT-4o Mini, Gemini 1.5 Flash 8B | Low latency, high throughput                   | Less nuanced, shorter context              |
| Semantic search and embeddings           | OpenAI Embedding 3, Nomic AI, Hugging Face         | Vector search, similarity, retrieval           | Not for text generation                    |
| Custom model training & experimentation  | Llama 4 Scout, Llama 3.3, DeepSeek, Mistral        | Open source, customizable                      | Requires setup, variable performance       |
<Note>
  Hypermode provides access to the most popular open source and commercial
  models through the [Hypermode Model Router](/model-router). We're constantly
  evaluating model usage and adding new models to our catalog based on demand.
</Note>
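
Switching models through the router is typically just a different model name in
the request. The sketch below uses the OpenAI Python SDK against an
OpenAI-compatible endpoint; the environment variable names and model
identifiers are placeholders, so check the [Model Router](/model-router) docs
for the exact base URL and model catalog.

```python
import os

from openai import OpenAI

# Placeholder endpoint and key: substitute the values from your workspace
# (see the Model Router documentation).
client = OpenAI(
    base_url=os.environ["MODEL_ROUTER_BASE_URL"],
    api_key=os.environ["MODEL_ROUTER_API_KEY"],
)


def ask(model: str, prompt: str) -> str:
    """Send one user message to the given model and return its reply."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


# Trying an alternative model is a one-line change.
print(ask("gpt-4.1", "Summarize our refund policy in two sentences."))
print(ask("claude-sonnet-4", "Summarize our refund policy in two sentences."))
```
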
## Get started

You can change models at any time in your agent settings. Start with a
general-purpose model, then iterate and optimize as you learn more about your
agent's needs.

1. [**Create an agent**](/create-agent) with GPT-4.1 (default).
2. **Define clear instructions and [connections](/connections)** for the
   agent's role.
3. **Test with real examples** from your workflow.
4. **Refine and iterate** based on results.
5. **Evaluate alternatives** once you understand patterns and outcomes (see the
   sketch below).
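
One lightweight way to run steps 3 and 5 is to send the same real prompts from
your workflow to each candidate model and compare the answers side by side.
This sketch reuses the same OpenAI-compatible client setup as above; the
prompts, model names, and environment variables are illustrative only.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["MODEL_ROUTER_BASE_URL"],  # placeholder, as above
    api_key=os.environ["MODEL_ROUTER_API_KEY"],
)

# Real prompts pulled from your workflow beat synthetic benchmarks.
test_prompts = [
    "Draft a status update for the weekly engineering sync.",
    "List the action items from the pasted meeting notes.",
]

# Start with the default model, then add the alternatives you want to evaluate.
candidate_models = ["gpt-4.1", "gpt-4.1-mini"]

for prompt in test_prompts:
    for model in candidate_models:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        print(f"--- {model} ---")
        print(resp.choices[0].message.content, "\n")
```
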
<Tip>
  **Value first, optimize second.** Clarify the task requirements before tuning
  for specialized capabilities or cost.
</Tip>
## Comparison of select large language models

| Model                | Best For                            | Considerations                          | Context Window+      | Speed     | Cost++ |
| -------------------- | ----------------------------------- | --------------------------------------- | -------------------- | --------- | ------ |
| **Claude 4 Opus**    | Complex reasoning, long docs        | Higher cost, slower than lighter models | Very long (200K+)    | Moderate  | $$$$   |
| **Claude 4 Sonnet**  | General-purpose, balanced workloads | Less capable than Opus for edge cases   | Long (100K+)         | Fast      | $$$    |
| **GPT-4.1**          | Most tasks, nuanced output          | Higher cost, moderate speed             | Long (128K)          | Moderate  | $$$    |
| **GPT-4.1 Mini**     | High-volume, cost-sensitive         | Less nuanced, shorter context           | Medium (32K-64K)     | Very Fast | $$     |
| **o3**               | General chat, broad compatibility   | May lack latest features/capabilities   | Medium (32K-64K)     | Fast      | $$     |
| **Gemini 2.5 Pro**   | Up-to-date info                     | Limited access, higher cost             | Long (128K+)         | Moderate  | $$$    |
| **Gemini 2.5 Flash** | Real-time, rapid responses          | Shorter context, less nuanced           | Medium (32K-64K)     | Very Fast | $$     |
| **Llama 4 Scout**    | Privacy, customization, open source | Variable performance                    | Medium-Long (varies) | Fast      | $      |

<sup>
  \+ Context window sizes are approximate and may vary by deployment/version.
</sup>
<sup>++ Relative cost per 1K tokens ($ = lowest, $$$$ = highest)</sup>