Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 43 additions & 0 deletions modules/http_client/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,49 @@ All notable changes to `dream_http_client` will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 5.1.3 - 2026-03-17

### Fixed

- **ETS table ownership no longer causes silent stream process crashes.**
`dream_http_client` is now a proper OTP application. Both ETS tables
(`dream_http_client_ref_mapping` and `dream_http_client_stream_recorders`)
are created in the application's `start/2` callback and owned by the
application master process, which lives for the entire application lifetime.
Previously, the tables were owned by whichever short-lived process first
created them. When that process exited, the tables were destroyed. Concurrent
stream processes crashed with `badarg` inside
`decode_stream_message_for_selector/1`, killing the stream silently — no
`on_stream_error` callback fired.
- **Race condition on ETS table creation eliminated.** Application start is
serialized by the BEAM's application controller. Two processes can no longer
race to create the tables.
- **Removed all `try/catch error:badarg` guards from ETS access.** The previous
fix wrapped every ETS operation in try/catch, which silently swallowed errors
(e.g., returning compressed bytes as raw data). With the table now guaranteed
to exist for the application's lifetime, these guards are unnecessary and
were masking real errors. ETS operations are now direct calls — if the table
is gone, the process crashes loudly, which is correct behavior.

### Added

- **OTP application infrastructure.** `dream_http_client_app` (application
behaviour) and `dream_http_client_sup` (supervisor) provide the standard OTP
lifecycle for the module's ETS tables. This follows the same pattern used by
Ranch (Cowboy's transport layer).
- **7 regression tests** covering ETS table ownership verification and
concurrent streaming scenarios. Integration tests verify concurrent streams
from short-lived callers all complete — the exact scenario from the bug
report.

### Removed

- **Unsupervised holder process.** The bare `spawn` + `ets:give_away` pattern
was replaced by proper OTP application ownership.
- **Lazy ETS table creation.** `ensure_ref_mapping_table/0`,
`ensure_recorder_table/0`, and `ensure_ets_tables/0` are removed. Tables
exist from application start.

## 5.1.2 - 2026-03-03

### Fixed
Expand Down
5 changes: 4 additions & 1 deletion modules/http_client/gleam.toml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
name = "dream_http_client"
version = "5.1.2"
version = "5.1.3"
description = "Type-safe HTTP client for Gleam with streaming support"
licences = ["MIT"]
repository = { type = "github", user = "TrustBound", repo = "dream" }
Expand All @@ -18,6 +18,9 @@ simplifile = ">= 2.3.0 and < 3.0.0"
gleam_otp = ">= 1.2.0 and < 2.0.0"
gleam_crypto = ">= 1.3.0 and < 2.0.0"

[erlang]
application_start_module = "dream_http_client_app"

[dev-dependencies]
dream = { path = "../.." }
dream_mock_server = { path = "../mock_server" }
Expand Down
124 changes: 124 additions & 0 deletions modules/http_client/releases/release-5.1.3.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
# dream_http_client v5.1.3

**Release Date:** March 17, 2026

This patch release fixes a critical bug where concurrent HTTP streams would
silently die when the process that created the shared ETS table exited.
`dream_http_client` is now a proper OTP application — its ETS tables are
owned by the application master process, not by short-lived stream processes.
No API changes.

---

## Bug Fix: ETS table ownership causes silent stream process crashes

### The problem

`dream_httpc_shim` uses two named ETS tables (`dream_http_client_ref_mapping`
and `dream_http_client_stream_recorders`) for runtime state. Three separate
issues combined to produce silent stream failures:

**1. Transient table ownership**

ETS named tables are owned by their creating process. The tables were created
lazily by `ensure_ref_mapping_table()` and `ensure_recorder_table()`, called
from short-lived processes (the caller of `start_stream()`, or the stream
process itself via `request_stream_messages()`). When the creating process
exited, the table was destroyed — taking all ref mappings with it.

**2. Race condition on creation**

If two processes called `ensure_ref_mapping_table()` concurrently when the
table didn't exist, both saw `undefined` from `ets:info/1`, and the second
`ets:new/2` call crashed with `badarg` (named table already exists).

**3. Unprotected ETS access**

Seven functions accessed the ETS table with bare `ets:lookup`/`ets:insert`/
`ets:delete` calls — no error handling. When the table was destroyed, these
threw `badarg` inside `decode_stream_message_for_selector/1`, which runs in
the `receive` block of `gleam_erlang_ffi:select/2`. The crash killed the
stream process, but since it was spawned unlinked (`process.spawn_unlinked`),
no `on_stream_error` callback fired. The stream silently died.

### Observed in production

- Three independent streaming requests hung for exactly 900,001ms (the monitor
timeout), because the stream process crashed silently and no `on_stream_error`
callback fired
- Zero crash reports appeared in CloudWatch logs

### The fix: OTP application (the Ranch pattern)

`dream_http_client` is now a proper OTP application. Both ETS tables are
created in the application's `start/2` callback, owned by the application
master process — a long-lived process managed by the BEAM's application
controller that exists for the entire application lifetime.

```erlang
-module(dream_http_client_app).
-behaviour(application).

start(_Type, _Args) ->
ets:new(dream_http_client_ref_mapping, [set, public, named_table]),
ets:new(dream_http_client_stream_recorders, [set, public, named_table]),
dream_http_client_sup:start_link().
```

This is the same pattern used by Ranch (the transport layer behind Cowboy)
for its `ranch_server` ETS table.

**What this eliminates:**

- **Transient ownership.** The application master never exits during normal
operation. The tables live as long as the application.
- **Race conditions.** Application start is serialized by the BEAM's
application controller. Two processes cannot race to create the tables.
- **Silent failures.** All `try/catch error:badarg` guards have been removed
from ETS access functions. If the table somehow doesn't exist (which means
the application isn't started), the process crashes loudly — visible,
debuggable, and correct.
- **Unsupervised processes.** The bare `spawn` + `ets:give_away` holder
process is gone, replaced by OTP's built-in application lifecycle.
- **Lazy table creation.** `ensure_ref_mapping_table/0`,
`ensure_recorder_table/0`, and `ensure_ets_tables/0` are removed. Tables
exist from application start, not from first use.

### Regression tests (7 new tests, 192 total)

**ETS ownership verification:**

| Test | What it proves |
|---|---|
| `table_is_owned_by_application_not_caller` | Table exists, owner is not the test process, owner is alive |
| `stored_mappings_survive_writer_process_exit` | Data persists after the writing process exits |

**Integration tests through `start_stream()` API:**

| Test | What it proves |
|---|---|
| `stream_from_expired_caller_completes` | Stream started from a short-lived caller completes after caller exits |
| `concurrent_streams_from_expired_callers_both_complete` | Fast stream (1s) + slow stream (10s) from short-lived callers both complete — the exact bug scenario |
| `three_concurrent_streams_all_complete` | Three concurrent streams don't interfere with each other |
| `sequential_streams_after_process_exit` | New stream works after a previous stream's process has exited |
| `five_concurrent_streams_from_expired_callers` | Stress test: 5 concurrent streams from 5 short-lived callers all complete |

---

## Files changed

- `modules/http_client/src/dream_http_client/dream_http_client_app.erl` — New
OTP application behaviour module; creates ETS tables in `start/2`
- `modules/http_client/src/dream_http_client/dream_http_client_sup.erl` — New
minimal supervisor
- `modules/http_client/gleam.toml` — Added `[erlang] application_start_module`
- `modules/http_client/src/dream_http_client/dream_httpc_shim.erl` — Removed
`ensure_ref_mapping_table/0`, `table_holder_loop/0`, and all `try/catch`
guards on ETS access; restored direct ETS calls
- `modules/http_client/src/dream_http_client/client.gleam` — Removed lazy
table creation (`ensure_ets_tables`, `ensure_recorder_table`,
`ensure_ref_mapping_table_wrapper`, `ets_table_exists`, `ets_new`)
- `modules/http_client/test/ets_table_ownership_test.gleam` — 7 regression
tests for the new architecture
- `modules/http_client/test/ets_table_ownership_ffi.erl` — Simplified FFI
helpers for ETS introspection
35 changes: 0 additions & 35 deletions modules/http_client/src/dream_http_client/client.gleam
Original file line number Diff line number Diff line change
Expand Up @@ -2145,24 +2145,11 @@ fn pair_with_name(value: String, name: String) -> #(String, String) {
/// client.cancel_stream_handle(stream)
/// ```
pub fn start_stream(request: ClientRequest) -> Result(StreamHandle, String) {
// Ensure ETS tables exist before spawning
ensure_ets_tables()

// Spawn process to handle the stream
let stream_pid = process.spawn_unlinked(fn() { run_stream_process(request) })

Ok(StreamHandle(pid: stream_pid))
}

// Ensure all required ETS tables exist
fn ensure_ets_tables() -> Nil {
ensure_recorder_table()
ensure_ref_mapping_table_wrapper()
}

@external(erlang, "dream_httpc_shim", "ensure_ref_mapping_table")
fn ensure_ref_mapping_table_wrapper() -> Nil

fn run_stream_process(request: ClientRequest) -> Nil {
// Try playback from recording first
case maybe_replay_from_recording(request) {
Expand Down Expand Up @@ -2445,27 +2432,6 @@ type MessageStreamRecorderState {
// ETS table name for recorder state
const recorder_table_name = "dream_http_client_stream_recorders"

// Ensure ETS table exists (idempotent)
fn ensure_recorder_table() -> Nil {
case ets_table_exists(recorder_table_name) {
True -> Nil
False -> {
ets_new(recorder_table_name, [
atom.create("set"),
atom.create("public"),
atom.create("named_table"),
])
Nil
}
}
}

@external(erlang, "dream_httpc_shim", "ets_table_exists")
fn ets_table_exists(name: String) -> Bool

@external(erlang, "dream_httpc_shim", "ets_new")
fn ets_new(name: String, options: List(atom.Atom)) -> d.Dynamic

@external(erlang, "dream_httpc_shim", "ets_insert")
fn ets_insert(
table: String,
Expand All @@ -2488,7 +2454,6 @@ fn store_message_stream_recorder(
rec: recorder.Recorder,
recorded_req: recording.RecordedRequest,
) -> Nil {
ensure_recorder_table()
let RequestId(id) = request_id
ets_insert(recorder_table_name, id, rec, recorded_req, [], [], None)
}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
-module(dream_http_client_app).
-behaviour(application).
-export([start/2, stop/1]).

start(_Type, _Args) ->
ets:new(dream_http_client_ref_mapping, [set, public, named_table]),
ets:new(dream_http_client_stream_recorders, [set, public, named_table]),
dream_http_client_sup:start_link().

stop(_State) ->
ok.
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
-module(dream_http_client_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
{ok, {#{strategy => one_for_one, intensity => 5, period => 10}, []}}.
30 changes: 7 additions & 23 deletions modules/http_client/src/dream_http_client/dream_httpc_shim.erl
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,7 @@
-export([request_stream/6, fetch_next/2, fetch_start_headers/2, request_stream_messages/6,
cancel_stream/1, cancel_stream_by_string/1, receive_stream_message/1,
decode_stream_message_for_selector/1, normalize_headers/1, request_sync/5,
ets_table_exists/1, ets_new/2, ets_insert/7, ets_lookup/2, ets_delete/2,
ensure_ref_mapping_table/0]).
ets_table_exists/1, ets_new/2, ets_insert/7, ets_lookup/2, ets_delete/2]).

%% @doc Start a streaming HTTP request with pull-based chunk retrieval
%%
Expand Down Expand Up @@ -503,8 +502,6 @@ request_stream_messages(Method, Url, Headers, Body, _ReceiverPid, TimeoutMs) ->
ok = ensure_started(inets),
ok = configure_httpc(),

ensure_ref_mapping_table(),

NUrl = to_list(Url),
NHeaders = maybe_add_accept_encoding(to_headers(Headers)),
Req = build_req(NUrl, NHeaders, Body),
Expand Down Expand Up @@ -732,17 +729,17 @@ decode_stream_message_for_selector({http, InnerMessage}) ->
error(badarg)
end.

%% Get string ID for httpc ref, creating mapping if needed
%% This handles the case where selector receives messages before we stored the mapping
%% Get string ID for httpc ref, creating mapping if needed.
%% This handles the case where selector receives messages before we stored
%% the mapping.
get_or_create_string_id(HttpcRef) ->
case lookup_string_by_ref(HttpcRef) of
{some, StringId} ->
StringId;
none ->
%% First time seeing this ref - create mapping
StringId = ref_to_string(HttpcRef),
store_ref_mapping(StringId, HttpcRef),
StringId
NewId = ref_to_string(HttpcRef),
store_ref_mapping(NewId, HttpcRef),
NewId
end.

%% @doc Normalize HTTP headers to binary tuples for Gleam decoding
Expand Down Expand Up @@ -1091,26 +1088,14 @@ ets_delete(TableName, Key) ->
%% Table for mapping string IDs to httpc refs (for cancellation)
-define(REF_MAPPING_TABLE, dream_http_client_ref_mapping).

%% Ensure ref mapping table exists (created on first use)
ensure_ref_mapping_table() ->
case ets:info(?REF_MAPPING_TABLE) of
undefined ->
ets:new(?REF_MAPPING_TABLE, [set, public, named_table]),
ok;
_ ->
ok
end.

%% Convert httpc ref to unique string ID
%% Uses the ref's string representation which is guaranteed unique
ref_to_string(Ref) ->
ensure_utf8_binary(io_lib:format("~p", [Ref])).

%% Store bidirectional mapping: string <-> ref
store_ref_mapping(StringId, HttpcRef) ->
ensure_ref_mapping_table(),
ets:insert(?REF_MAPPING_TABLE, {StringId, HttpcRef}),
%% Also store reverse mapping for message translation
ets:insert(?REF_MAPPING_TABLE, {HttpcRef, StringId}),
ok.

Expand All @@ -1137,7 +1122,6 @@ maybe_store_stream_zlib(StringId, Headers) ->
case detect_stream_encoding(Headers) of
{_Enc, WindowBits} ->
Z = init_zlib_context(WindowBits),
ensure_ref_mapping_table(),
ets:insert(?REF_MAPPING_TABLE, {{zlib, StringId}, Z}),
ok;
none ->
Expand Down
40 changes: 40 additions & 0 deletions modules/http_client/test/ets_table_ownership_ffi.erl
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
-module(ets_table_ownership_ffi).
-export([table_exists/0, table_owner_is_not_self/0, table_owner_is_alive/0,
store_test_mapping/0, lookup_test_mapping_exists/0,
clear_test_mappings/0]).

-define(TABLE, dream_http_client_ref_mapping).

table_exists() ->
case ets:info(?TABLE) of
undefined -> false;
_ -> true
end.

table_owner_is_not_self() ->
case ets:info(?TABLE, owner) of
undefined -> false;
Pid -> Pid =/= self()
end.

table_owner_is_alive() ->
case ets:info(?TABLE, owner) of
undefined -> false;
Pid -> is_process_alive(Pid)
end.

store_test_mapping() ->
ets:insert(?TABLE, {ets_test_ref_key, <<"ets_test_string_id">>}),
ets:insert(?TABLE, {<<"ets_test_string_id">>, ets_test_ref_key}),
nil.

lookup_test_mapping_exists() ->
case ets:lookup(?TABLE, ets_test_ref_key) of
[{ets_test_ref_key, <<"ets_test_string_id">>}] -> true;
_ -> false
end.

clear_test_mappings() ->
ets:delete(?TABLE, ets_test_ref_key),
ets:delete(?TABLE, <<"ets_test_string_id">>),
nil.
Loading
Loading