Source.run/1 error handling #180

tor-oscar · 2026-01-14T16:07:17Z

A new attempt at adding error handling to the Source.run/1 call.

This has been attempted twice before, #155 and #153. Both were (implicitly) closed due to me being dragged away for a very long time, sorry about that.

If the Source.run/1 function returns an error tuple, raises an error or takes longer than Dataloader.dataloader_timeout/1 the Source is removed from :sources field and the key :error with the reason as value is added.

This causes a variety of issues down the line, among them the misleading/unhelpful error message:

** (RuntimeError) Source does not exist: <failed_source_name>

Registered sources are:

[<other_source_names...> , :error]

This PR intends to preserve the source name key and replace the source implementation with {:error, reason}instead of replacing the source key value pair with the error key value pair.

There is no reasonable recovery in case of a failure. This is motivated by a two points:

There is no way to clear the failing batches from a Source (currently), any future Source.run/1 will
retry the same batches with potential additional batches. If the issue causing a timeout is not
resolved between runs it will incur severe delays to full response.
If future Source.run/1 on failed Sources would be allowed, it is not possible to distinguish
between failures of future batch/key and the previous Source error without keeping track of
which batch was part of which run. Not distinguishing between different errors could be
very detrimental during debugging.

It works seamlessly for Dataloader.get/4 and Dataloader.get_many/4 as they will simply handle the error tuple as any other error tuple.

It works fine for pending_batches? as source that has returned an error does not have any pending batches, ie false is a reasonable return value.

It is a little icky with Dataloader.put/5 since it is currently no perceivable difference between calling it on an errored source and not. The error will not be triggered until Dataloader.get/4 or Dataloader.get_many/4 is called.

It is a little icky with Dataloader.load/4 and Dataloader.load_many/4 since it is currently no perceivable difference between calling it on an errored source and not. The error will not be triggered until Dataloader.get/4 or Dataloader.get_many/4 is called.

Additional changes

Added test of the timeouts and general error handling
Added test describing the max_timeout feature
Made the added margin to the timeout configurable to support testing without waiting seconds.

Thoughts?

Error aware do_load format Add run timeout test Add timeout value test Add configurable timeout margin Fix crash in timeout handling

tor-oscar · 2026-02-02T10:19:45Z

@benwilson512 Do you have any comments or opinions on this approach to handling Source errors?

tor-oscar added 9 commits February 21, 2024 15:03

Improve error handling

5a1a38c

Error aware do_load format Add run timeout test Add timeout value test Add configurable timeout margin Fix crash in timeout handling

Fix warnings and tests

c5ff82b

Update deps

3bec86c

Handle run with errored sources

1899b87

Merge remote-tracking branch 'upstream/main' into v2

f917e8d

Merge remote-tracking branch 'upstream/main' into HEAD

442d621

comment

ce4b2a6

improve tests

97bf34b

spelling

f7d1bda

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Source.run/1 error handling #180

Source.run/1 error handling #180

Uh oh!

tor-oscar commented Jan 14, 2026

Uh oh!

tor-oscar commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Source.run/1 error handling #180

Are you sure you want to change the base?

Source.run/1 error handling #180

Uh oh!

Conversation

tor-oscar commented Jan 14, 2026

Uh oh!

tor-oscar commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant