Skip to content

Stabilize git-upload-pack cache key by normalizing volatile fields#113

Merged
kbukum1 merged 9 commits intomainfrom
kbukum1/fix-nix-cache-flakiness
Apr 27, 2026
Merged

Stabilize git-upload-pack cache key by normalizing volatile fields#113
kbukum1 merged 9 commits intomainfrom
kbukum1/fix-nix-cache-flakiness

Conversation

@kbukum1
Copy link
Copy Markdown
Contributor

@kbukum1 kbukum1 commented Apr 27, 2026

What

Stabilize the disk cache key for POST .../git-upload-pack so that
runs differing only in per-process metadata produce a cache hit instead of
a fresh upstream call.

Why

Nix smoke tests are flaky because nix resolves flake dependencies via the git
smart-HTTP protocol rather than a registry. Each fetch POST carries agent=git/x.y.z
and (Git 2.36+) session-id=<uuid> — fields that drift across runs and across
worker images. The cache key is a SHA-256 of the request body, so any drift
forces a real upstream call → rate limit → flake. The fix needs to be narrow
enough that it can't return a stale or incomplete pack.

How

A new internal/gitproto package exposes two helpers:

  • IsUploadPackRequest — gate that requires POST + path suffix /git-upload-pack
    • Content-Type: application/x-git-upload-pack-request (parsed via mime.ParseMediaType
      so RFC 7231 case/whitespace variants match).
  • NormalizeUploadPackBody — parses the pkt-line body, drops volatile capability
    tokens, re-encodes (recomputing length prefixes so different agent string
    lengths still hash equal), and on parse failure returns the input unchanged
    so callers fall back to opaque hashing.

internal/cache/handlers.go adds a single conditional in key() that swaps the
hash input — never the outbound request body — when the gate matches.

What is normalized:

  • Standalone agent= and session-id= pkt-lines (v2 capability section)
  • Inline agent=… and session-id=… tokens on v1 want lines

What is preserved (anything that can change the upstream response):

  • want, have, capabilities, command=, deepen / shallow / filter,
    ref-prefix, object-format, all framing packets

have lines are deliberately preserved: they drive object negotiation, so
collapsing them across requests with different client object sets could serve
an incomplete pack.

Tests

  • internal/gitproto: pkt-line parser/encoder including malformed inputs and a
    round-trip; v1 and v2 normalization including agent length drift, session-id
    drift, ref-prefix preservation, prefix-vs-substring discrimination.
  • internal/cache: integration tests that the gate fires for real upload-pack
    POSTs, that haves do not collapse, that malformed bodies fall back to raw
    hashing, and that lookalike non-git POSTs are not normalized.

Verified locally with go test -race -shuffle=on -count=2 ./....

Copilot AI review requested due to automatic review settings April 27, 2026 19:05
@kbukum1 kbukum1 requested a review from a team as a code owner April 27, 2026 19:05
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses CI flakiness in the proxy disk cache for git-based dependency fetching (notably Nix flakes) by stabilizing cache keys for POST .../git-upload-pack requests whose bodies vary between runs.

Changes:

  • Add git-upload-pack-specific body normalization for cache-key generation by hashing only extracted want OIDs.
  • Add unit tests asserting cache-key stability across differing have lines and agent strings, and scoping normalization to POST .../git-upload-pack.
Show a summary per file
File Description
internal/cache/handlers.go Normalizes cache key body hashing for git-upload-pack POSTs using extracted/sorted want OIDs.
internal/cache/handlers_test.go Adds tests covering normalized vs non-normalized cache-key behavior for git-upload-pack requests.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers_test.go Outdated
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 19:09
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Comments suppressed due to low confidence (1)

internal/cache/handlers.go:90

  • normalizeGitBody currently hashes only the want object IDs and drops all negotiated capabilities (beyond just the variable agent=). In git-upload-pack, capabilities (e.g., side-band/side-band-64k, thin-pack, ofs-delta, filter=blob:none, protocol v2 features) can change the response encoding/content; collapsing different capability sets onto the same cache key can serve an incompatible cached response to a client. Consider including a normalized capability set in the hash input (e.g., keep the first want line’s capabilities but strip only agent= and ignore have/done), so cache entries remain correct across clients/versions.
// normalizeGitBody extracts only the "want" object IDs from a git-upload-pack
// POST body, producing a stable hash input. The "have" lines and capability/agent
// strings vary between runs (different git versions, different local state) and
// are excluded so that the cache key remains consistent across CI runs.
func normalizeGitBody(data []byte) []byte {
	matches := gitUploadPackWantRegex.FindAllSubmatch(data, -1)
	if len(matches) == 0 {
		return nil
	}
	wants := make([]string, len(matches))
	for i, match := range matches {
		wants[i] = string(match[1])
	}
	sort.Strings(wants)
	return []byte(strings.Join(wants, "\n"))
  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers_test.go Outdated
@kbukum1 kbukum1 marked this pull request as draft April 27, 2026 19:22
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 19:29
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 3

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers_test.go Outdated
Comment thread internal/cache/handlers_test.go Outdated
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 19:45
@kbukum1 kbukum1 changed the title Normalize git-upload-pack body in cache key to fix nix smoke test flakiness Stabilize git-upload-pack cache key by stripping volatile have/agent fields (fixes nix smoke-test flakiness) Apr 27, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers_test.go Outdated
@kbukum1 kbukum1 changed the title Stabilize git-upload-pack cache key by stripping volatile have/agent fields (fixes nix smoke-test flakiness) Stabilize git-upload-pack cache key by stripping volatile have/agent fields Apr 27, 2026
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 19:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Comments suppressed due to low confidence (1)

internal/cache/handlers.go:115

  • normalizeGitBody uses bytes.TrimSpace, which can drop trailing/leading newlines and even completely erase an all-flush body after length stripping, increasing the chance of cache-key collisions. Since this output is only fed to SHA-256, it’s safer to avoid trimming and hash the exact normalized bytes after removing the intended fields.
func normalizeGitBody(data []byte) []byte {
	normalized := gitHaveLineRegex.ReplaceAll(data, nil)
	normalized = gitAgentRegex.ReplaceAll(normalized, nil)
	normalized = pktLineLengthRegex.ReplaceAll(normalized, nil)
	return bytes.TrimSpace(normalized)
}
  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers.go Outdated
@kbukum1 kbukum1 force-pushed the kbukum1/fix-nix-cache-flakiness branch from 9258466 to 88a0085 Compare April 27, 2026 20:29
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 20:29
@kbukum1 kbukum1 marked this pull request as ready for review April 27, 2026 20:33
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 3

Comment thread internal/pktline/pktline.go Outdated
Comment thread internal/cache/handlers.go Outdated
Comment thread internal/cache/handlers.go Outdated
@kbukum1 kbukum1 requested review from JamieMagee and Copilot April 27, 2026 20:39
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 4

Comment thread internal/cache/handlers.go Outdated
Comment thread internal/pktline/pktline.go Outdated
Comment thread internal/pktline/pktline.go Outdated
Comment thread internal/pktline/pktline.go Outdated
@kbukum1 kbukum1 marked this pull request as draft April 27, 2026 20:56
- Introduced `pktline.go` and `pktline_test.go` for handling pkt-line format.
- Added functions to parse and encode pkt-lines, supporting special packets like flush, delim, and response-end.
- Created tests to validate parsing of various pkt-line scenarios including empty input, malformed data, and real-world examples from git repositories.
- Moved pkt-line functionality into a dedicated package to streamline the gitproto implementation.
- Implemented `upload_pack.go` and `upload_pack_test.go` to manage upload-pack requests and normalize request bodies, ensuring consistent cache keys by stripping volatile fields.
- Removed legacy `pktline` package to consolidate functionality within `gitproto`.
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 21:15
@kbukum1 kbukum1 changed the title Stabilize git-upload-pack cache key by stripping volatile have/agent fields Stabilize git-upload-pack cache key by stripping volatile have/agent/session-id fields Apr 27, 2026
@kbukum1 kbukum1 changed the title Stabilize git-upload-pack cache key by stripping volatile have/agent/session-id fields Stabilize git-upload-pack cache key by normalizing volatile fields Apr 27, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 3

Comment thread internal/gitproto/upload_pack_test.go Outdated
Comment thread internal/gitproto/upload_pack.go Outdated
Comment thread internal/gitproto/pktline.go
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 21:42
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 1

Comment thread internal/gitproto/pktline.go Outdated
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 21:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 2

Comment thread internal/gitproto/upload_pack.go Outdated
Comment thread internal/gitproto/pktline.go Outdated
…t to handle case-insensitivity and whitespace in Content-Type
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 22:09
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 1

Comment thread internal/gitproto/upload_pack.go Outdated
@kbukum1 kbukum1 requested a review from Copilot April 27, 2026 22:30
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 0 new

@kbukum1 kbukum1 marked this pull request as ready for review April 27, 2026 22:38
@kbukum1 kbukum1 merged commit 2803a0f into main Apr 27, 2026
10 of 12 checks passed
@kbukum1 kbukum1 deleted the kbukum1/fix-nix-cache-flakiness branch April 27, 2026 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants