# feat(proxy): log compressed messages alongside original request #261
gglucass wants to merge 3 commits into chopratejas:main
### Description
Expose the post-compression message list that was actually sent upstream as a new `compressed_messages` field on `RequestLog`, paired with the existing (now consistently pre-compression) `request_messages`. Consumers of `/transformations/feed` — dashboards and any downstream observability — can now diff the two sides of a compression to see exactly what the pipeline stripped, replaced, or kept. Turns an abstract "saved N tokens" into a legible before/after.

Gated by the same `log_full_messages` flag as `request_messages` so the two sides stay in sync; it's pointless to store one without the other.

Also fixes a latent correctness bug: today's `request_messages` field is inconsistent across the four `RequestLog` construction sites — sometimes it's the pre-compression snapshot, sometimes it's the mutated `body["messages"]` (which is the compressed list, because the proxy mutates `body` in place before the log call). After this change, `request_messages` always means pre-compression and `compressed_messages` always means what went upstream.
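To make the before/after concrete, here is a consumer-side sketch of diffing the two fields from `/transformations/feed`. The `diff_compression` helper is hypothetical and not part of this PR; it only illustrates the kind of analysis the paired fields enable.

```python
def diff_compression(original: list[dict], compressed: list[dict]) -> dict:
    """Summarize what a compression pass did to a message list."""
    kept = [m for m in original if m in compressed]
    dropped = [m for m in original if m not in compressed]
    replaced = [m for m in compressed if m not in original]
    return {"kept": len(kept), "dropped": len(dropped), "replaced": len(replaced)}

original = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "tool output " * 500},   # bulky tool result
    {"role": "user", "content": "actual question"},
]
compressed = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "[tool output elided]"},  # pipeline's replacement
    {"role": "user", "content": "actual question"},
]
print(diff_compression(original, compressed))
# {'kept': 2, 'dropped': 1, 'replaced': 1}
```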
### Type of Change

Note on "breaking": strictly speaking this is a semantic correction of an inconsistently populated field, not a schema break. The field name `request_messages` is unchanged and the JSON shape is unchanged; what changes is that the field now consistently holds the pre-compression list. Consumers that treated it as "whatever messages we have" continue to work. Consumers that depended on the accidental post-compression value (if any existed) would shift to `compressed_messages`.
### Changes Made

- `headroom/proxy/models.py`: `RequestLog` gains `compressed_messages: list[dict] | None = None`. Doc comment explains it's paired with `request_messages` and gated by the same `log_full_messages` flag.
- `headroom/proxy/handlers/anthropic.py` (2 sites — Bedrock non-streaming and main non-streaming): `request_messages` now consistently sources from `original_messages` (the pre-compression snapshot at line 724); `compressed_messages` sources from `body["messages"]` (the compressed list after in-place mutation at line 1189). Both gated symmetrically.
- `headroom/proxy/handlers/streaming.py` (2 sites — main streaming in `_finalize_stream_response`, Bedrock streaming in `_stream_response_bedrock`): same treatment. `_stream_response_bedrock` gains a new `original_messages: list[dict] | None = None` parameter so it has access to the pre-compression snapshot; the sole caller in `anthropic.py` now threads it through.
- `headroom/proxy/server.py`: `/transformations/feed` adds `compressed_messages` to the JSON payload alongside the existing `request_messages` / `response_content`. Split into a separate preceding commit is a one-time EOL normalization to LF — the file blob in history carries CRLF but `.gitattributes` declares `*.py text eol=lf`, so any contributor editing `server.py` triggers the same whole-file renormalization. Separating the two commits keeps this feature commit's diff at a single line.
- `headroom/proxy/request_logger.py`: `compressed_messages` is stripped from the JSONL file log and from `get_recent()` alongside the existing `request_messages` / `response_content` stripping. `get_memory_stats()` also counts it toward the deque's byte budget.
### Testing

- All tests pass (`pytest`)
- Lint and formatting clean (`ruff check .` and `ruff format --check .`)
- Type checks pass (`mypy headroom`)
- Manually exercised `/transformations/feed` — confirmed both fields arrive and render

Test coverage added/extended:

- `tests/test_proxy/test_request_logger.py` (new file): round-trip unit tests for `RequestLogger`. Confirms `get_recent` strips both sides (pre + post), `get_recent_with_messages` exposes both, and the JSONL file log drops both when `log_full_messages=False`.
- `tests/test_proxy/test_transformations_feed.py`: extended to assert `compressed_messages` appears in the endpoint payload alongside `request_messages` / `response_content`.
- `tests/test_proxy_streaming_request_logger.py`: existing include/omit tests updated to assert both sides populate when the flag is on and both are `None` when it's off.
### Test Output

### Checklist

- … the `_stream_response_bedrock` parameter addition, and the `get_memory_stats` accounting
### Additional Notes

#### Non-Anthropic backends

`handlers/openai.py` and `handlers/gemini.py` do not currently emit `RequestLog` entries at all — only Anthropic and the shared streaming paths do. This PR therefore only populates `compressed_messages` on Anthropic traffic (which is what `/transformations/feed` shows today). Wiring OpenAI and Gemini into `RequestLogger` end-to-end is a separate, larger gap worth its own PR.

#### `server.py` EOL normalization

The feature change in `server.py` is a single line. To keep the diff readable, the preceding commit is a whitespace-only `chore(proxy): normalize server.py to LF per .gitattributes` — the file blob was stored with CRLF terminators but `.gitattributes` declares `*.py text eol=lf`. Any contributor touching `server.py` triggers this renormalization; isolating it here keeps the feature commit reviewable. Happy to rebase / drop / reshape as preferred.
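For anyone who wants to reproduce the renormalization mechanics in isolation, here is a throwaway-repo sketch. The file contents are illustrative, not the actual `server.py`; only the git commands mirror what the cleanup commit does.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo
# Commit a blob with CRLF terminators before any .gitattributes exists
printf 'print("hi")\r\n' > server.py
git -c core.autocrlf=false add server.py
git commit -qm 'blob with CRLF terminators'
# Now declare the LF policy, as the repo's .gitattributes does
printf '*.py text eol=lf\n' > .gitattributes
git add .gitattributes
git commit -qm 'declare eol=lf'
git ls-files --eol server.py    # index side still shows i/crlf
# The one-time cleanup: rewrite the index copy to LF
git add --renormalize server.py
git commit -qm 'chore: normalize server.py to LF per .gitattributes'
git ls-files --eol server.py    # index side now shows i/lf
```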
#### Downstream desktop compatibility

The Headroom Desktop client I work on now consumes `compressed_messages` and renders the pre/post pair side-by-side on the "Recent large compression" card. The desktop was updated to handle both shapes: proxies without the field render the legacy single "Request" block; proxies with the field render "Request (original, N tokens)" + "Request (compressed, M tokens)" where N/M come from `input_tokens_original` / `input_tokens_optimized`. No changes needed downstream if this PR lands.

🤖 Generated with Claude Code
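The both-shapes handling above reduces to something like this hypothetical consumer-side helper (not the actual desktop source); it treats a missing field and an explicit null the same way, so older proxies keep working.

```python
def render_blocks(log: dict) -> list[str]:
    """Pick card blocks for a feed entry, tolerating proxies without the field."""
    compressed = log.get("compressed_messages")
    if compressed is None:  # legacy proxy, or log_full_messages off
        return ["Request"]
    n = log.get("input_tokens_original")
    m = log.get("input_tokens_optimized")
    return [f"Request (original, {n} tokens)",
            f"Request (compressed, {m} tokens)"]

print(render_blocks({"request_messages": [{"role": "user", "content": "hi"}]}))
# ['Request']
print(render_blocks({"compressed_messages": [],
                     "input_tokens_original": 120,
                     "input_tokens_optimized": 45}))
# ['Request (original, 120 tokens)', 'Request (compressed, 45 tokens)']
```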