@j93hahn j93hahn commented Jan 27, 2026

Why are the changes needed?

When ImageSpec.exist() returns None (meaning it could not verify whether the image exists, for example because Docker is not running, the AWS CLI is misconfigured, or there are network problems), the previous behavior was to assume the image exists and skip building. This caused ErrImagePull errors at runtime when the image didn't actually exist in the registry.

Example error that triggered this fix:

Warning  Failed  kubelet  Failed to pull image "472386928882.dkr.ecr.us-west-2.amazonaws.com/exa-hephaestus/flytekit:LPFpPiQrpf5Xk_3oyZC58Q": 
  rpc error: code = NotFound desc = failed to pull and unpack image: not found
Warning  Failed  kubelet  Error: ErrImagePull

What changes were proposed in this pull request?

Changed the behavior in ImageSpecBuilder.should_build() when exist() returns None:

  • Before: Assumed image exists, skipped building, printed blue message
  • After: Raises RuntimeError with a clear message explaining how to fix the issue

This fails fast and surfaces the real problem (Docker/AWS not configured) immediately, rather than failing later at pod scheduling time with a confusing ErrImagePull error.
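To make the before/after concrete, here is a minimal, self-contained sketch of the three-way decision described above. It is not the actual flytekit code: `should_build_sketch`, its parameters, and the error wording are hypothetical stand-ins, and the only things taken from this PR are the three outcomes of exist() (True, False, None), the RuntimeError, and the FLYTE_FORCE_PUSH_IMAGE_SPEC=1 hint.

```python
from typing import Optional


def should_build_sketch(exists: Optional[bool], image_name: str) -> bool:
    """Hypothetical stand-in for the decision made in ImageSpecBuilder.should_build().

    `exists` models the result of ImageSpec.exist():
    True = image found, False = image not found, None = could not verify.
    """
    if exists is False:
        # The registry check succeeded and the image is missing: build it.
        return True
    if exists is True:
        # The registry check succeeded and the image is present: skip the build.
        return False
    # exists is None: the check itself failed. Before this PR the image was
    # assumed to exist; the proposed behavior is to fail fast instead.
    # The message text below is illustrative, not the PR's exact wording.
    raise RuntimeError(
        f"Could not verify whether image {image_name} exists. "
        "Ensure Docker and the AWS CLI are configured, or set "
        "FLYTE_FORCE_PUSH_IMAGE_SPEC=1 to force a rebuild."
    )
```

With this shape, a caller sees the failure at registration time rather than as an ErrImagePull when the pod is scheduled.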

How was this patch tested?

Manual verification of the logic change. The fix is minimal and the behavior is straightforward.

Human review checklist

  • Verify this is the desired behavior (fail fast vs. silently assume image exists)
  • Check if the error message is helpful: suggests ensuring Docker/AWS CLI are configured, or using FLYTE_FORCE_PUSH_IMAGE_SPEC=1 to force rebuild

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Link to Devin run: https://app.devin.ai/sessions/2d0f2b4119614e8db4c39e002b676ce8
Requested by: @j93hahn

When ImageSpec.exist() returns None (failed to verify), the previous
behavior was to assume the image exists and skip building. This caused
ErrImagePull errors when the image didn't actually exist in the registry.

Now we build the image when we can't verify its existence, which is safer
than assuming it exists and failing later during pod scheduling.

Co-Authored-By: joshua@exa.ai <joshuajahn2001@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

… fails

Changed behavior to raise RuntimeError with clear message explaining
how to fix (ensure Docker/AWS CLI configured, or use FLYTE_FORCE_PUSH_IMAGE_SPEC=1).

Co-Authored-By: joshua@exa.ai <joshuajahn2001@gmail.com>
@devin-ai-integration devin-ai-integration bot changed the title from "[image_spec]: Build image when existence check fails" to "[image_spec]: Fail when image existence check fails" on Jan 27, 2026