feat: add unified context-aware logging system #2

Open
devin-ai-integration[bot] wants to merge 2 commits into master from devin/1773848994-logging-system

@devin-ai-integration devin-ai-integration bot commented Mar 18, 2026

Summary

Replaces all inconsistent logging approaches (13+ raw print() calls, manual [DEBUG]/[INFO] tags, sys.stderr writes, and the partial log_callback: Callable abstraction) with a unified, context-aware, hierarchical logging system.

New package src/crab/log/:

  • logger.py: CrabLogger with hierarchical context nesting via enter(), thread-safe emit with a shared lock, and live subprocess stdout streaming via background threads
  • formatters.py: RichFormatter (ANSI-colored, tree-style) and PlainFormatter (grep-friendly brackets)
  • handlers.py: StreamHandler (stdout → slurm_output.log under SLURM) and TUIHandler (Rich markup → Textual RichLog widget)
  • __init__.py: get_logger() factory, which reads the CRAB_LOG_LEVEL env var
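
For readers unfamiliar with the pattern, here is a minimal sketch of hierarchical context nesting with a shared write lock. It is not the code in this PR: `SketchLogger`, its constructor arguments, and the bracketed output format are assumptions chosen for illustration, loosely modeled on the `enter()` and PlainFormatter descriptions above.

```python
import io
import threading
from typing import Optional, TextIO, Tuple


class SketchLogger:
    """Hypothetical stand-in for CrabLogger; names and signatures are assumed."""

    def __init__(self, stream: TextIO, context: Tuple[str, ...] = (),
                 lock: Optional[threading.Lock] = None) -> None:
        self._stream = stream
        self._context = context
        # Children reuse the parent's lock so concurrent writes never
        # interleave in the middle of a line.
        self._lock = lock if lock is not None else threading.Lock()

    def enter(self, name: str) -> "SketchLogger":
        """Return a child logger whose context is one level deeper."""
        return SketchLogger(self._stream, self._context + (name,), self._lock)

    def info(self, message: str) -> None:
        # Bracketed, grep-friendly prefix (assumed format).
        prefix = "".join(f"[{part}]" for part in self._context)
        line = f"[INFO]{prefix} {message}\n"
        with self._lock:
            self._stream.write(line)


buffer = io.StringIO()
root = SketchLogger(buffer)
# worker > experiment > run > app nesting, with made-up context names.
app_log = root.enter("worker0").enter("exp1").enter("run0").enter("miniFE")
app_log.info("started")
output = buffer.getvalue()
```

Because `enter()` returns a new object instead of mutating the parent, each thread can hold its own context while all of them share one lock and one stream.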

Integration changes:

  • engine.py: Engine and ExperimentRunner accept a CrabLogger instead of a Callable. run_job() and end_job() accept a logger. Live streaming is enabled for concurrent apps via stream_process() background threads.
  • orchestrator.py: All print() calls replaced. New --log-level CLI flag added. Workers read CRAB_LOG_LEVEL from the environment.
  • controller.py: TUIController builds a CrabLogger with a TUIHandler wired to the RichLog widget callback, replacing raw Rich markup strings in log calls.
  • slurm.py / mpi.py: Debug prints removed; wl_managers stay logger-agnostic.
  • Wrappers (microbench_common.py, miniFE.py, ib_send_lat.py): Stray print() calls removed.
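
The live-streaming idea can be sketched as follows. The function name `stream_process` comes from this PR, but the signature, the `emit` callback, and the daemon-thread choice below are assumptions, not the actual implementation.

```python
import subprocess
import sys
import threading
from typing import Callable, List


def stream_process(proc: subprocess.Popen,
                   emit: Callable[[str], None]) -> threading.Thread:
    """Forward each stdout line of `proc` to `emit` from a background thread."""

    def _pump() -> None:
        if proc.stdout is None:
            return
        # Iterating blocks until a line arrives and ends when the pipe closes.
        for line in proc.stdout:
            emit(line.rstrip("\n"))
        proc.stdout.close()

    thread = threading.Thread(target=_pump, daemon=True)
    thread.start()
    return thread


lines: List[str] = []
proc = subprocess.Popen(
    [sys.executable, "-c", "print('step 1'); print('step 2')"],
    stdout=subprocess.PIPE,
    text=True,
)
thread = stream_process(proc, lines.append)
proc.wait()
thread.join()  # join after wait so no trailing output is lost
```

In the real system `emit` would presumably be a method on a context-scoped logger, so each streamed line carries its app label.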

Review & Testing Checklist for Human

  • Signature compatibility at all call sites: run_job(), end_job(), wait_timed(), Engine(), and ExperimentRunner() all have changed signatures. Verify no call site was missed (search for old patterns like log_callback=, log_fn=, run_job( without logger=). A missed call site will cause a runtime crash on the cluster.
  • Dynamic _stream_thread attribute on job objects: job._stream_thread is set dynamically (not declared on the job/app class). Verify this doesn't conflict with existing attributes and that all hasattr guards are sufficient. Consider whether this should be a proper attribute on the base class.
  • Concurrent app detection logic: concurrent = len(static_schedule) > 1 or len(dependency_map) > 0 determines whether to live-stream stdout. Verify this correctly identifies concurrent vs sequential scenarios for your experiment configurations. A false positive means unnecessary threads; a false negative means no live output for concurrent apps.
  • TUI dual-handler behavior: TUIController now has both a StreamHandler (from get_logger()) AND a TUIHandler, so TUI mode logs to both stdout and the widget. Verify whether this is acceptable or whether the StreamHandler should be omitted in TUI mode.
  • End-to-end test on SLURM: Submit a real benchmark job and verify (a) slurm_output.log contains properly formatted, colored output, (b) tail -f slurm_output.log renders colors correctly, (c) concurrent app output is interleaved correctly with context labels, (d) --log-level DEBUG shows debug output, and (e) the TUI log tab displays records correctly.
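
To make two of the checklist items concrete, the sketch below wraps the quoted concurrency expression in a function and shows the "proper attribute on the base class" alternative to `hasattr` guards. `BaseJob`, `join_stream`, and the argument types are hypothetical; the real job/app classes and schedule structures may differ.

```python
import threading
from typing import Optional, Sized


def is_concurrent(static_schedule: Sized, dependency_map: Sized) -> bool:
    # Same expression as quoted in the checklist above.
    return len(static_schedule) > 1 or len(dependency_map) > 0


class BaseJob:
    """Hypothetical base class that declares the attribute up front."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Declared here, so callers check for None instead of using hasattr().
        self._stream_thread: Optional[threading.Thread] = None


def join_stream(job: BaseJob, timeout: Optional[float] = None) -> bool:
    """Join the job's stream thread if one was started; report whether it was."""
    thread = job._stream_thread
    if thread is None:
        return False
    thread.join(timeout)
    return True


job = BaseJob("miniFE")
joined_before = join_stream(job)  # no thread attached yet

job._stream_thread = threading.Thread(target=lambda: None)
job._stream_thread.start()
joined_after = join_stream(job)

single = is_concurrent(["miniFE"], {})
pair = is_concurrent(["miniFE", "ib_send_lat"], {})
# One scheduled app plus a non-empty dependency map also counts as concurrent.
dependent = is_concurrent(["miniFE"], {"miniFE": ["ib_send_lat"]})
```

The last case is the one worth checking against real experiment configurations: a single scheduled app with any dependency entry is treated as concurrent by this expression.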

Notes

  • traceback.print_exc() calls in fatal error handlers still write directly to stderr — this is intentional since the logger itself may be in a broken state at that point.
  • The workerpool_scheduler.py logging was intentionally left untouched per requirements (it's a separate sub-benchmark with its own scheduler.log).
  • The BenchmarkState import was removed from controller.py — verify it was truly unused before merging.
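
The first note describes a common fallback pattern, sketched here under assumptions: `run_guarded` and `failing_task` are made-up names, and the real fatal error handlers may do more than return an exit code.

```python
import contextlib
import io
import traceback
from typing import Callable


def run_guarded(task: Callable[[], None]) -> int:
    """Run `task`; on any exception print the traceback straight to stderr."""
    try:
        task()
        return 0
    except Exception:
        # Bypass the logger on purpose: it may be the component that failed.
        traceback.print_exc()
        return 1


def failing_task() -> None:
    raise RuntimeError("logger is unusable")


# Capture stderr only to show where the traceback goes.
captured = io.StringIO()
with contextlib.redirect_stderr(captured):
    exit_code = run_guarded(failing_task)
```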

Link to Devin session: https://app.devin.ai/sessions/fc7b895897f54186a4d154c93d00b658
Requested by: @SharkGamerZ

New src/crab/log/ package with:
- CrabLogger: hierarchical context nesting (worker > experiment > run > app)
- RichFormatter: ANSI-colored tree-style output
- PlainFormatter: grep-friendly bracketed output
- StreamHandler: stdout (-> slurm_output.log under SLURM)
- TUIHandler: routes records to Textual RichLog widget
- Live subprocess output streaming via background threads
- Thread-safe concurrent logging with write locks
- CRAB_LOG_LEVEL env var + --log-level CLI flag
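
One way the env var and the CLI flag could interact is sketched below. The precedence (flag over env var over an INFO default), the accepted level names, and the use of stdlib `logging` constants are all assumptions; `resolve_level` is a hypothetical helper, not a function from this PR.

```python
import argparse
import logging
from typing import List, Mapping

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def resolve_level(argv: List[str], environ: Mapping[str, str]) -> int:
    """Pick a level: --log-level first, then CRAB_LOG_LEVEL, then INFO."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", default=None, choices=sorted(LEVELS))
    args, _unknown = parser.parse_known_args(argv)
    name = args.log_level or environ.get("CRAB_LOG_LEVEL", "INFO")
    # Unknown names fall back to INFO instead of raising.
    return LEVELS.get(name.upper(), logging.INFO)


from_flag = resolve_level(["--log-level", "DEBUG"], {"CRAB_LOG_LEVEL": "ERROR"})
from_env = resolve_level([], {"CRAB_LOG_LEVEL": "error"})
default = resolve_level([], {})
```

Passing `environ` explicitly instead of reading `os.environ` inside the function keeps the helper easy to test; a worker would call it with `os.environ`.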

Integration:
- engine.py: CrabLogger replaces log_callback, context nesting per experiment/run/app
- orchestrator.py: all print() replaced, --log-level flag added
- slurm.py/mpi.py: debug prints removed (wl_managers stay logger-agnostic)
- controller.py: TUIHandler wired to RichLog widget
- wrappers: stray prints removed from microbench_common, miniFE, ib_send_lat

Co-Authored-By: Matteo Marcelletti <marcellettimatteo02@gmail.com>
