Skip to content

Fix CI hang in reconnect tests after STATE_CONNECTED scope change#23

Merged
SteveEasley merged 3 commits intoSteveEasley:mainfrom
robotdan:fix/test-hang-after-reconnect-signal-change
May 6, 2026
Merged

Fix CI hang in reconnect tests after STATE_CONNECTED scope change#23
SteveEasley merged 3 commits intoSteveEasley:mainfrom
robotdan:fix/test-hang-after-reconnect-signal-change

Conversation

@robotdan
Copy link
Copy Markdown
Contributor

@robotdan robotdan commented May 6, 2026

Problem

Every CI run on main since PR #18 merged (#19, #20, #21, v1.1.6 release, #22) was cancelled at the 6-hour job timeout. Job logs show pytest gets through 34 tests, then hangs forever on test #35 (test_reconnect_calls_on_reconnect).

2026-04-05T14:57:33Z  pytest -q
...
2026-04-05T20:57:19Z  ..................................
2026-04-05T20:57:19Z  ##[error]The runner has received a shutdown signal.

This PR fixes the hang and also fixes a separate failing test that the hang was hiding.

Root cause 1: hanging reconnect tests

The two tests added in PR #18 (Refresh device state after auto-reconnect) had two bugs that combined to produce an infinite hang:

Bug A: waiting for STATE_CONNECTED on initial connect

After PR #19 (Dispatch STATE_CONNECTED only on auto-reconnect) moved the dispatch out of _connect(), await connect_signal.wait() after initial connect() waits forever. PR #19 updated other tests to assert state == STATE_CONNECTED directly, but PR #19 was opened before PR #18 merged so the two tests added by #18 were not on its radar.

Bug B: missing sleep before emulator.stop()

Other reconnect tests (test_reconnect_during_event, test_reconnect_cancelled) include await asyncio.sleep(0.1) between the initial connect() and emulator.stop() to let the emulator's _handle_client callback register the client. PR #18's two tests omit it. Without the pause, emulator.stop() finds an empty _clients list, doesn't disconnect anything, the test client never sees EOF, and STATE_DISCONNECTED is never dispatched — disconnect_signal.wait() hangs.

Fix

  1. Drop await connect_signal.wait() after the initial connect; assert state == STATE_CONNECTED directly (matches the pattern used by tests updated in PR Dispatch STATE_CONNECTED only on auto-reconnect #19).
  2. Drop the now-redundant connect_signal.clear() calls — the signal is no longer set during initial connect.
  3. Add await asyncio.sleep(0.1) before emulator.stop() (matches the pattern used by other reconnect tests).

Root cause 2: bad mock.patch target

Once the hang was fixed and CI ran to completion, tests/test_d_device.py::test_send_syslog_identification_uses_dunder_version (added in PR #22) failed:

AttributeError: <module 'kaleidescape.device'...> does not have the attribute '__version__'

The test patched kaleidescape.device.__version__, but device.py does from . import __version__ inside _send_syslog_identification, so it's a function-local, not a module attribute. mock.patch requires the attribute to exist on the target.

This test never executed in CI before merge because PR #22's runs were cancelled by the same hang.

Fix

Patch kaleidescape.__version__ (the actual location). The function's from . import __version__ re-reads it on each call, so the patched sentinel propagates through.

robotdan and others added 3 commits May 6, 2026 12:26
PR SteveEasley#18 (Refresh device state after auto-reconnect) added two tests that
wait on `connect_signal` (STATE_CONNECTED) after the initial connect.
PR SteveEasley#19 (Dispatch STATE_CONNECTED only on auto-reconnect) then moved the
STATE_CONNECTED dispatch out of `_connect()` and into `_reconnect()`.
PR SteveEasley#19 updated other tests with the same pattern, but SteveEasley#19 was opened
before SteveEasley#18 merged, so the two tests added by SteveEasley#18 weren't updated.

Result: `await connect_signal.wait()` after initial connect waits
forever, hanging CI for the full 6h job timeout. Every CI run on main
since SteveEasley#18 has been cancelled this way (SteveEasley#19, SteveEasley#20, SteveEasley#21, v1.1.6, SteveEasley#22).

Match the pattern used by other tests updated in SteveEasley#19: assert
`state == STATE_CONNECTED` directly after the initial connect, and
keep `connect_signal.wait()` only for the post-reconnect signal where
STATE_CONNECTED is actually dispatched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
test_send_syslog_identification_uses_dunder_version (PR SteveEasley#22) patched
`kaleidescape.device.__version__`, which doesn't exist as a module
attribute — `device.py` does `from . import __version__` inside
`_send_syslog_identification`, making it a function-local. mock.patch
requires the attribute to exist on the target, so the test failed with
AttributeError.

Patch `kaleidescape.__version__` (the real location) instead. The
function's `from . import __version__` re-reads it on each call, so
the patched sentinel propagates through.

This test never executed in CI before merge because PR SteveEasley#22's runs were
cancelled at the 6h job timeout (the same hang fixed in this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two tests added in PR SteveEasley#18 omitted the standard
`await asyncio.sleep(0.1)` between the initial `connect()` and
`emulator.stop()`. Without that pause, the emulator's `_handle_client`
callback hasn't registered the new client yet, so `emulator.stop()`
finds an empty `_clients` list, doesn't disconnect anything, and the
test client never sees EOF. STATE_DISCONNECTED is never dispatched and
`disconnect_signal.wait()` hangs.

Match the existing pattern from `test_reconnect_during_event` and
`test_reconnect_cancelled` (both use the same sleep for the same reason).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@SteveEasley SteveEasley merged commit 0c4d1ec into SteveEasley:main May 6, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants