First Ollama call is slow (cold start) #1

@lucharo

Description

The first call to Ollama after starting v2t takes significantly longer than subsequent calls. This appears to be an Ollama cold-start issue.

Logs

2025-12-12 12:08:32.828 | INFO     | voice2text:process_audio:136 - Transcribing...
2025-12-12 12:08:35.247 | INFO     | voice2text:process_audio:141 - Raw: Hey, how is it going? (2.42s)
2025-12-12 12:08:35.248 | INFO     | voice2text:process_audio:147 - Cleaning up...
2025-12-12 12:09:01.533 | INFO     | voice2text:process_audio:165 - Clean: How is it going? (26.28s)  <-- FIRST CALL: 26s
2025-12-12 12:09:04.926 | SUCCESS  | voice2text:process_audio:172 - Pasted!

2025-12-12 12:09:14.448 | INFO     | voice2text:process_audio:136 - Transcribing...
2025-12-12 12:09:16.807 | INFO     | voice2text:process_audio:141 - Raw: Hey, how's it going? (2.36s)
2025-12-12 12:09:16.808 | INFO     | voice2text:process_audio:147 - Cleaning up...
2025-12-12 12:09:17.389 | INFO     | voice2text:process_audio:165 - Clean: How's it going? (0.58s)   <-- SECOND CALL: 0.6s

Analysis

  • First Ollama cleanup: 26.28s
  • Second Ollama cleanup: 0.58s

This is a ~45x difference (26.28s vs. 0.58s). The most likely cause is Ollama loading the model into memory on the first call, then serving subsequent calls from the already-resident model.

Possible solutions (for future consideration)

  • Warm up Ollama on startup (similar to how we warm up Whisper)
  • Document this as expected behavior
  • Investigate whether Ollama has a keep-alive or preload option
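A minimal sketch of the warm-up idea, assuming v2t talks to a local Ollama server on the default port: Ollama's `/api/generate` endpoint accepts a request with an empty prompt, which loads the model into memory without generating tokens, and a `keep_alive` field that controls how long the model stays resident afterwards. The endpoint URL, timeout, and model name below are assumptions, not taken from the v2t codebase.

```python
import json
import urllib.request

# Default Ollama endpoint; adjust if v2t is configured differently.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_warmup_payload(model: str, keep_alive: str = "30m") -> dict:
    # An empty-prompt /api/generate request asks Ollama to load the model
    # into memory without producing output; keep_alive keeps it resident
    # so later cleanup calls skip the cold start.
    return {"model": model, "prompt": "", "keep_alive": keep_alive}


def warm_up(model: str) -> None:
    payload = json.dumps(build_warmup_payload(model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        urllib.request.urlopen(req, timeout=60).read()
    except OSError:
        # Server not reachable yet; the first real call pays the cost instead.
        pass


if __name__ == "__main__":
    warm_up("llama3.2")  # hypothetical model name; use whatever v2t runs
```

Calling `warm_up()` once at v2t startup, alongside the existing Whisper warm-up, would move the ~26s load off the first user-visible cleanup call. Setting `keep_alive` generously (or to `-1` to pin the model indefinitely) also prevents the cold start from recurring after idle periods.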
