Commit bf3f760

danstarns, ryanfoxtyler, fengjessica, and johnymontana authored
Model Selection Guide (#150)
* add Model Selection Guide
* typo
* trunk
* image change
* new lines in cards + trunk fixes
* Update agents/model-selection.mdx

Co-authored-by: Ryan Fox-Tyler <60440289+ryanfoxtyler@users.noreply.github.com>

* Update agents/model-selection.mdx

Co-authored-by: Ryan Fox-Tyler <60440289+ryanfoxtyler@users.noreply.github.com>

* updates, less long winded, default to GPT-4.1
* remove repomix
* remove repeated block
* Update model-selection.mdx
* Update model-selection.mdx
* update image
* trunk fmt
* style updates
* format

---------

Co-authored-by: Ryan Fox-Tyler <60440289+ryanfoxtyler@users.noreply.github.com>
Co-authored-by: fengjessica <jessica.feng@gmail.com>
Co-authored-by: William Lyon <lyonwj@gmail.com>
1 parent bd19f5d commit bf3f760

5 files changed, with 90 additions and 1509 deletions

agents/model-selection.mdx

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
---
title: "Model Selection Guide"
sidebarTitle: "Choose the Right Model"
description:
  "Select the optimal model for your agent based on your goals and use case."
---

Choosing the right model is essential to building effective agents. This guide
helps you evaluate trade-offs, pick the right model for your use case, and
iterate quickly.

![Select your model](/images/agents/model-selection.png)

## Key considerations

- **Accuracy and output quality:** Advanced logic, mathematical problem-solving,
  and multi-step analysis may require high-capability models.
- **Domain expertise:** Performance varies by domain (for example, creative
  writing, code, scientific analysis). Review model benchmarks or test with your
  own examples.
- **Context window:** Long documents, extensive conversations, or large
  codebases require models with longer context windows.
- **Embeddings:** For semantic search or similarity, consider embedding models.
  These aren't for text generation.
- **Latency:** Real-time apps may need low-latency responses. Smaller models (or
  “Mini,” “Nano,” and “Flash” variants) typically respond faster than larger
  models.

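The considerations above can be read as a rough decision procedure. The following is a minimal sketch, not part of Hypermode's API: the `Requirements` fields, the `suggest_model_class` function, and the priority order of the checks are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical description of what an agent needs from its model."""

    needs_embeddings: bool = False   # semantic search or similarity
    complex_reasoning: bool = False  # advanced logic, math, multi-step analysis
    long_inputs: bool = False        # long documents, conversations, codebases
    real_time: bool = False          # latency-sensitive application


def suggest_model_class(req: Requirements) -> str:
    """Map requirements to a broad model class.

    The order of the checks is an assumption made for illustration;
    adjust it to match your own priorities.
    """
    if req.needs_embeddings:
        return "embedding model"
    if req.complex_reasoning:
        return "high-capability model"
    if req.long_inputs:
        return "long-context model"
    if req.real_time:
        return "small, low-latency model"
    return "general-purpose model"
```

For example, `suggest_model_class(Requirements(real_time=True))` returns `"small, low-latency model"`, while a call with no flags set falls through to `"general-purpose model"`.
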
## Models by task / use case at a glance

| Task / use case                         | Example models                                     | Key strengths                                  | Considerations                       |
| --------------------------------------- | -------------------------------------------------- | ---------------------------------------------- | ------------------------------------ |
| General-purpose conversation            | Claude 4 Sonnet, GPT-4.1, Gemini Pro               | Balanced, reliable, creative                   | May not handle edge cases as well    |
| Complex reasoning and research          | Claude 4 Opus, o3, Gemini 2.5 Pro                  | Highest accuracy, multi-step analysis          | Higher cost; quality-critical use    |
| Creative writing and content            | Claude 4 Opus, GPT-4.1, Gemini 2.5 Pro             | High-quality output, creativity, style control | High cost for premium content        |
| Document analysis and summarization     | Claude 4 Opus, Gemini 2.5 Pro, Llama 3.3           | Handles long inputs, comprehension             | Higher cost, slower                  |
| Real-time apps                          | Claude 3.5 Haiku, GPT-4o Mini, Gemini 1.5 Flash 8B | Low latency, high throughput                   | Less nuanced, shorter context        |
| Semantic search and embeddings          | OpenAI Embedding 3, Nomic AI, Hugging Face         | Vector search, similarity, retrieval           | Not for text generation              |
| Custom model training & experimentation | Llama 4 Scout, Llama 3.3, DeepSeek, Mistral        | Open source, customizable                      | Requires setup, variable performance |

<Note>
  Hypermode provides access to the most popular open source and commercial
  models through the [Hypermode Model Router](/model-router). We're constantly
  evaluating model usage and adding new models to our catalog based on demand.
</Note>

## Get started

You can change models at any time in your agent settings. Start with a
general-purpose model, then iterate and optimize as you learn more about your
agent's needs.

1. [**Create an agent**](/create-agent) with GPT-4.1 (default).
2. **Define clear instructions and [connections](/connections)** for the agent's
   role.
3. **Test with real examples** from your workflow.
4. **Refine and iterate** based on results.
5. **Evaluate alternatives** once you understand patterns and outcomes.

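Steps 3 through 5 amount to running the same examples against several models and comparing the results. The sketch below shows one way this could look. It assumes you supply your own `call_model` function (hypothetical here, since this guide does not specify a client API) and a `score` function; `compare_models` and `exact_match` are illustrative names, not part of any Hypermode library.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# (prompt, expected answer) pairs drawn from your own workflow.
Example = Tuple[str, str]


def compare_models(
    models: List[str],
    examples: List[Example],
    call_model: Callable[[str, str], str],  # hypothetical: (model, prompt) -> response
    score: Callable[[str, str], float],     # hypothetical: (response, expected) -> 0.0 to 1.0
) -> Dict[str, float]:
    """Return each model's mean score over the same non-empty set of examples."""
    results: Dict[str, float] = {}
    for model in models:
        scores = [
            score(call_model(model, prompt), expected)
            for prompt, expected in examples
        ]
        results[model] = mean(scores)
    return results


def exact_match(response: str, expected: str) -> float:
    """Simplest possible scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if response.strip().lower() == expected.strip().lower() else 0.0
```

Exact matching is rarely enough for open-ended agent output; in practice the scorer is where most of the task-specific judgment goes, so treat `exact_match` as a placeholder.
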
<Tip>
  **Value first, optimize second.** Clarify the task requirements before tuning
  for specialized capabilities or cost.
</Tip>

## Comparison of select large language models

| Model                | Best for                            | Considerations                          | Context window+      | Speed     | Cost++ |
| -------------------- | ----------------------------------- | --------------------------------------- | -------------------- | --------- | ------ |
| **Claude 4 Opus**    | Complex reasoning, long docs        | Higher cost, slower than lighter models | Very long (200K+)    | Moderate  | $$$$   |
| **Claude 4 Sonnet**  | General-purpose, balanced workloads | Less capable than Opus for edge cases   | Long (100K+)         | Fast      | $$$    |
| **GPT-4.1**          | Most tasks, nuanced output          | Higher cost, moderate speed             | Long (128K)          | Moderate  | $$$    |
| **GPT-4.1 Mini**     | High-volume, cost-sensitive         | Less nuanced, shorter context           | Medium (32K-64K)     | Very Fast | $$     |
| **o3**               | General chat, broad compatibility   | May lack latest features/capabilities   | Medium (32K-64K)     | Fast      | $$     |
| **Gemini 2.5 Pro**   | Up-to-date info                     | Limited access, higher cost             | Long (128K+)         | Moderate  | $$$    |
| **Gemini 2.5 Flash** | Real-time, rapid responses          | Shorter context, less nuanced           | Medium (32K-64K)     | Very Fast | $$     |
| **Llama 4 Scout**    | Privacy, customization, open source | Variable performance                    | Medium-Long (varies) | Fast      | $      |

<sup>
  \+ Context window sizes are approximate and may vary by deployment/version.
</sup>
<sup>++ Relative cost per 1K tokens ($ = lowest, $$$$ = highest)</sup>
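
The Speed and Cost columns can be used to narrow the list mechanically. The following sketch transcribes a few rows of the table above into code and filters them. The `ModelRow` type, the `SPEED_RANK` ordering, and the `shortlist` function are illustrative assumptions, and the values are the table's relative ratings, not measured figures.

```python
from dataclasses import dataclass
from typing import List

# Ordinal ranks for the table's Speed column (higher is faster).
SPEED_RANK = {"Moderate": 1, "Fast": 2, "Very Fast": 3}


@dataclass(frozen=True)
class ModelRow:
    name: str
    speed: str  # value from the Speed column
    cost: int   # number of $ signs in the Cost column


# A few rows transcribed from the comparison table.
MODELS: List[ModelRow] = [
    ModelRow("Claude 4 Opus", "Moderate", 4),
    ModelRow("Claude 4 Sonnet", "Fast", 3),
    ModelRow("GPT-4.1", "Moderate", 3),
    ModelRow("GPT-4.1 Mini", "Very Fast", 2),
    ModelRow("Gemini 2.5 Flash", "Very Fast", 2),
    ModelRow("Llama 4 Scout", "Fast", 1),
]


def shortlist(min_speed: str, max_cost: int) -> List[str]:
    """Names of models at least as fast as min_speed and no costlier
    than max_cost, cheapest first (ties keep table order)."""
    matches = [
        m for m in MODELS
        if SPEED_RANK[m.speed] >= SPEED_RANK[min_speed] and m.cost <= max_cost
    ]
    return [m.name for m in sorted(matches, key=lambda m: m.cost)]
```

For example, `shortlist("Very Fast", 2)` returns `["GPT-4.1 Mini", "Gemini 2.5 Flash"]`. A shortlist like this is only a starting point; the candidates still need testing against real examples.
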
