Skip to content

async admin stats refresh#350

Open
lyonsno wants to merge 4 commits intojundot:mainfrom
lyonsno:fix/async-admin-stats-refresh
Open

async admin stats refresh#350
lyonsno wants to merge 4 commits intojundot:mainfrom
lyonsno:fix/async-admin-stats-refresh

Conversation

@lyonsno
Copy link
Copy Markdown
Contributor

@lyonsno lyonsno commented Mar 22, 2026

Summary

This PR moves menubar admin-stats polling off the main thread and hardens the refresh lifecycle so stale results from prior server generations do not leak back into the UI.

It also folds in the menubar import-safety/package-init work that was previously split into #349, so the packaged app modules remain importable in headless/CI environments without pulling in the heavy omlx runtime.

What Changed

  • Moved menubar stats refresh work into a background worker instead of doing HTTP polling on the timer callback.
  • Added single-flight guarding and generation-token based invalidation for async refresh results.
  • Discarded stale callbacks after stop/restart and other non-running state transitions.
  • Added fallback invalidation for STOPPING and a watchdog to recover from stuck in-flight refresh state.
  • Closed abandoned requests.Session objects on invalidation, stale completion, non-running completion, and snapshot failure paths.
  • Kept session adoption on the main thread only after the refresh token is validated.
  • Added comments/docstrings clarifying the main-thread ownership model for refresh state and the serial reuse of requests.Session.
  • Reworked lightweight version loading for the menubar package so it works in both repo/dev and bundled .app layouts without importing the heavy omlx runtime.
  • _is_newer_version now uses packaging.version.Version for PEP 440-aware comparison.
  • _load_version() now logs a warning before falling back to "0.0.0".
  • Synced the branch with current main and resolved the resulting overlap in packaging/omlx_app/app.py.

Test Coverage

Added and expanded focused contract tests for:

  • non-blocking stats polling cadence
  • stale success/failure callback rejection
  • stop/start and status-flap invalidation
  • STOPPING fallback behavior
  • watchdog recovery
  • session isolation across generations
  • session cleanup on abandoned failure paths
  • import-safety for omlx_app and omlx_app.app
  • bundle-layout version loading
  • version comparison via packaging.version.Version
  • warning-on-fallback behavior for missing version files

Validation

Focused menubar/package suites passing locally:

  • /Users/noahlyons/dev/omlx/.venv/bin/python -m pytest tests/test_omlx_app_async_refresh.py tests/test_omlx_app_package_init.py -q

Sanity check:

  • python3 -m py_compile packaging/omlx_app/app.py

Notes

@lyonsno lyonsno changed the title Fix/async admin stats refresh async admin stats refresh Mar 22, 2026
@lyonsno lyonsno marked this pull request as ready for review March 23, 2026 05:16
@jundot jundot force-pushed the main branch 7 times, most recently from bef2aeb to 86720d8 Compare March 23, 2026 19:49
Move menubar admin-stats polling off the main thread into a background
worker. Guard against stale results with single-flight locking and
generation-token invalidation across stop/restart, status-flap, and
STOPPING transitions.

Key changes:
- Background worker for HTTP stats polling with per-generation sessions
- Watchdog recovery for stuck in-flight refresh state
- Session cleanup on invalidation, stale completion, and failure paths
- Lightweight version loader for bundled .app layouts (no omlx import)
- Internal version comparator replacing packaging.version dependency
  (handles stable upgrades, rc→final, .postN)

Includes focused contract tests for all async refresh, invalidation,
session isolation, watchdog, and version comparison paths.
@lyonsno lyonsno force-pushed the fix/async-admin-stats-refresh branch 2 times, most recently from 87885e9 to e1b3a44 Compare March 24, 2026 17:05
Copy link
Copy Markdown
Owner

@jundot jundot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Moving stats polling off the main thread makes sense since the 2s HTTP timeout could freeze the menubar.

A few things:

  • Thread safety: _stats_refresh_in_flight / _stats_refresh_token are accessed from both main thread and background workers without a lock. The 5s polling interval makes races unlikely, but a brief comment or a threading.Lock would clarify the intent.

  • _is_newer_version custom parser: PEP 440 has edge cases (epochs, local versions, dev+pre combos) the regex doesn't cover. packaging ships with pip so it's available everywhere. If the bundled .app is missing it, that's a build config fix, not a code workaround.

  • requests.Session across threads: not documented as thread-safe. Current pattern is probably fine (one thread at a time) but worth a comment.

  • _load_version fallback: silently returning "0.0.0" could mask bundle build issues. A log warning would help.

Overall design is solid and test coverage is thorough. Nothing blocking, mostly hardening suggestions.

@jundot jundot force-pushed the main branch 2 times, most recently from 187e87b to dfc5b20 Compare March 29, 2026 10:44
@lyonsno
Copy link
Copy Markdown
Contributor Author

lyonsno commented Mar 29, 2026

Thanks for the review — all four points addressed in 58898f8:

  • Thread safety: went with comments over a lock. The single-flight guard (_stats_refresh_in_flight) means only one worker runs at a time, so the main thread and workers never race on the token/flag fields. Added comments at each access point explaining the ownership model.

  • _is_newer_version: replaced the custom regex parser with packaging.version.Version. Test updated to cover epochs, dev releases, and invalid version strings.

  • requests.Session across threads: added a docstring note explaining the session is reused serially (the single-flight guard prevents overlapping workers), not shared concurrently.

  • _load_version fallback: now logs a warning with the candidate paths before returning "0.0.0". Added a test asserting the warning fires.

lyonsno and others added 2 commits March 29, 2026 16:38
Make _fetch_stats_snapshot a @staticmethod so the background worker
never touches self — api_key, base_url, and session are snapshotted
by the caller before dispatch.

Drop eager session close from _invalidate_stats_refresh_generation;
the stale-token rejection in the completion handler already closes
orphaned sessions when the worker finishes.

Tighten the stale-token guard: reject when token != current, not only
when token is not None and != current. The old guard silently accepted
results with a missing token field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants