fix: consume wire format mismatch and tls server crash skip #8

Merged
vieiralucas merged 1 commit into main from fix/consume-wire-format-tls-skip on Mar 26, 2026

Conversation

vieiralucas (Member) commented Mar 26, 2026

Summary

  • Bug: consume wire format mismatch. decodeConsumeMessage was reading a queue field between fairness_key and attempt_count that the server never sends. The server's encode_consume_push format is msg_id | fairness_key | attempt_count | headers | payload (no queue field). Because of the extra read, every byte from attempt_count onward was misinterpreted, so all consume-based integration tests silently misparsed frames and timed out waiting for messages. The decoder no longer reads the queue field, and buildConsumeFrame in consume_test.go was updated to match the actual wire format.

  • Bug: TLS tests hung for 10s on server panic — When the dev-latest fila-server binary panics on TLS initialisation (missing rustls CryptoProvider), the readiness loop kept retrying until the 10-second timeout, then failed. Replaced waitForFIBPWithTLS with waitForFIBPWithTLSOrSkip, which polls process liveness via signal(0) on each iteration and skips the test immediately if the server exited — avoiding the 10-second hang.

Test plan

  • All unit tests pass locally: go test -run "TestDecode|TestEncodeDecodeEnqueue|TestDecodeError" ./...
  • go build ./... and go vet ./... clean
  • CI run passes TestEnqueueManyExplicit (previously failed with "timeout waiting for messages, received 0/3")
  • CI run shows TestTLSConnection as SKIP (not FAIL) when server binary lacks TLS support

Summary by cubic

Fixes consume frame decoding to match the server wire format and makes TLS tests skip immediately if the server exits, preventing 10s timeouts. Restores correct parsing of pushed messages and removes hangs in CI.

  • Bug Fixes
    • Consume: removed the nonexistent queue field from the decoder and updated tests to match the server push format (msg_id, fairness_key, attempt_count, headers, payload), fixing misparsed frames and timeouts in consume tests.
    • TLS tests: replaced the readiness wait with a liveness-aware check that skips when the server crashes on TLS init, avoiding the 10s wait.

Written for commit 3a5baf5. Summary will update on new commits.

Remove the spurious queue field from the FIBP consume push decoder.
The server (fila-core encode_consume_push) sends: msg_id, fairness_key,
attempt_count, headers, payload; there is no queue field. The Go client
was incorrectly reading a queue field between fairness_key and
attempt_count, causing all bytes from attempt_count onward to be
misinterpreted. This made all consume-based integration tests fail with
"timeout waiting for messages".

Also update buildConsumeFrame in consume_test.go to match the actual
server wire format.
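
To make the field order concrete, here is a minimal sketch of a decoder that reads msg_id, fairness_key, attempt_count, headers, payload with no queue field in between. The field encodings are assumptions made only for illustration, since this PR states the order but not the byte layout: strings carry a big-endian u16 length prefix, attempt_count is a big-endian u32, headers are a u16 count followed by key/value strings, and payload is the remainder of the frame. The names ConsumeMessage, decodeConsumePush and buildConsumePush are hypothetical and are not the client's actual API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// ConsumeMessage follows the field order quoted in this PR.
type ConsumeMessage struct {
	MsgID        string
	FairnessKey  string
	AttemptCount uint32
	Headers      map[string]string
	Payload      []byte
}

// readString reads a u16 big-endian length and then that many bytes.
// ASSUMPTION: the real FIBP string encoding may differ.
func readString(r *bytes.Reader) (string, error) {
	var n uint16
	if err := binary.Read(r, binary.BigEndian, &n); err != nil {
		return "", err
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// decodeConsumePush decodes one push frame body (hypothetical layout).
func decodeConsumePush(frame []byte) (ConsumeMessage, error) {
	r := bytes.NewReader(frame)
	var m ConsumeMessage
	var err error
	if m.MsgID, err = readString(r); err != nil {
		return m, err
	}
	if m.FairnessKey, err = readString(r); err != nil {
		return m, err
	}
	// No queue field is read here: attempt_count follows fairness_key directly.
	if err = binary.Read(r, binary.BigEndian, &m.AttemptCount); err != nil {
		return m, err
	}
	var nHeaders uint16
	if err = binary.Read(r, binary.BigEndian, &nHeaders); err != nil {
		return m, err
	}
	m.Headers = make(map[string]string, nHeaders)
	for i := 0; i < int(nHeaders); i++ {
		k, err := readString(r)
		if err != nil {
			return m, err
		}
		v, err := readString(r)
		if err != nil {
			return m, err
		}
		m.Headers[k] = v
	}
	// ASSUMPTION: payload is whatever remains in the frame.
	m.Payload, err = io.ReadAll(r)
	return m, err
}

func putString(b *bytes.Buffer, s string) {
	binary.Write(b, binary.BigEndian, uint16(len(s)))
	b.WriteString(s)
}

// buildConsumePush encodes a frame in the same assumed layout, for tests.
func buildConsumePush(m ConsumeMessage) []byte {
	var b bytes.Buffer
	putString(&b, m.MsgID)
	putString(&b, m.FairnessKey)
	binary.Write(&b, binary.BigEndian, m.AttemptCount)
	binary.Write(&b, binary.BigEndian, uint16(len(m.Headers)))
	for k, v := range m.Headers {
		putString(&b, k)
		putString(&b, v)
	}
	b.Write(m.Payload)
	return b.Bytes()
}

func main() {
	frame := buildConsumePush(ConsumeMessage{MsgID: "m1", FairnessKey: "tenant-a", AttemptCount: 2, Payload: []byte("hi")})
	msg, err := decodeConsumePush(frame)
	fmt.Println(msg.MsgID, msg.FairnessKey, msg.AttemptCount, string(msg.Payload), err)
}
```

With a layout like this, an extra readString between fairness_key and attempt_count would consume the attempt_count bytes as a length prefix and shift every later field, which matches the misparse described above.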

For TLS tests: replace waitForFIBPWithTLS with waitForFIBPWithTLSOrSkip,
which checks via signal(0) whether the server process exited before
becoming ready. This allows the TLS tests to skip gracefully when the
server binary panics on TLS initialisation (e.g. missing rustls
CryptoProvider) rather than waiting 10 seconds and then failing.
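
The liveness-aware wait can be sketched roughly as follows. This is not the code from this PR: waitReadyOrExit, processAlive and the result constants are hypothetical names, and the readiness probe is passed in as a function because the real FIBP/TLS handshake check is not shown here. The sketch assumes a Unix-like system, where sending signal 0 checks that a process exists without delivering anything. A child that has exited but has not yet been reaped with Wait may still be reported as alive.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

type waitResult int

const (
	serverReady  waitResult = iota // readiness probe succeeded
	serverExited                   // process is gone; caller can skip the test
	waitTimedOut                   // still alive but never became ready
)

// processAlive sends signal 0 to p. No signal is delivered; the call only
// reports whether the process can still be signalled (Unix-only).
func processAlive(p *os.Process) bool {
	return p.Signal(syscall.Signal(0)) == nil
}

// waitReadyOrExit polls until ready() succeeds, alive() fails, or the
// timeout elapses. Liveness is checked first on every iteration so a
// crashed server is reported immediately instead of after the timeout.
func waitReadyOrExit(alive, ready func() bool, timeout, interval time.Duration) waitResult {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if !alive() {
			return serverExited
		}
		if ready() {
			return serverReady
		}
		time.Sleep(interval)
	}
	return waitTimedOut
}

func main() {
	// Demonstration against the current process, which is certainly alive.
	self, err := os.FindProcess(os.Getpid())
	if err != nil {
		fmt.Println("find process:", err)
		return
	}
	res := waitReadyOrExit(
		func() bool { return processAlive(self) },
		func() bool { return true }, // stand-in for a real readiness probe
		time.Second, 10*time.Millisecond,
	)
	fmt.Println(res == serverReady)
}
```

In a test helper, a serverExited result would map to t.Skip and a waitTimedOut result to t.Fatal, which gives the SKIP-not-FAIL behaviour described in the test plan.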

cubic-dev-ai (Bot) left a comment

No issues found across 3 files

vieiralucas merged commit 5232b52 into main on Mar 26, 2026
2 of 3 checks passed