fix: patch systemic OS-level vulnerabilities, OOM risks, and concurre…#799
fix: patch systemic OS-level vulnerabilities, OOM risks, and concurre…#799mrvenom17 wants to merge 1 commit intoentireio:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens core CLI runtime behavior against security, memory, and concurrency failure modes (symlink/path traversal risk, OS ARG_MAX exhaustion, concurrent map panics, and large-transcript OOM).
Changes:
- Return deep copies of global
PIIConfigmaps to avoid concurrent map read/write panics. - Stream file copies via
io.Copywithos.Root-based traversal-resistant writes to reduce memory usage. - Batch
git fetch origin <hash...>blob-fetch requests to avoid OS argument length limits. - Tighten transcript chunking and session state file handling with additional safety checks.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| redact/pii.go | Deep-copies PII config maps on read to avoid concurrent map access panics. |
| cmd/entire/cli/utils.go | Refactors copyFile to stream data and use os.Root for safer writes. |
| cmd/entire/cli/session/state.go | Adds post-rename validation that session state files are regular files (no symlink). |
| cmd/entire/cli/git_operations.go | Batches hash-based git fetch calls to reduce ARG_MAX risk. |
| cmd/entire/cli/agent/opencode/opencode.go | Moves OpenCode session dir under repo .git and ensures it exists. |
| cmd/entire/cli/agent/geminicli/gemini.go | Adds single-message-size guard during chunking (and base size refactor). |
| cmd/entire/cli/agent/claudecode/claude.go | Reworks transcript position + offset scanning to fixed-buffer reading with a max line limit. |
Comments suppressed due to low confidence (2)
cmd/entire/cli/agent/claudecode/claude.go:306
- The updated doc comment says this uses a fixed-size buffer to handle “arbitrarily long lines safely”, but the implementation enforces
maxLineSize = 10MBand silently skips parsing lines beyond that limit. Please update the comment to describe the actual behavior/limit so callers understand that oversized JSONL lines will be ignored.
// ExtractModifiedFilesFromOffset extracts files modified since a given line number.
// For Claude Code (JSONL format), offset is the starting line number.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns:
// - files: list of file paths modified by Claude (from Write/Edit tools)
// - currentPosition: total number of lines in the file
// - error: any error encountered during reading
cmd/entire/cli/agent/claudecode/claude.go:376
- This change rewrites transcript position counting and offset-based scanning logic in a way that’s easy to regress (EOF without trailing newline, very long single-line JSONL, offset boundaries, maxLineSize behavior). There are no unit tests covering
GetTranscriptPosition/ExtractModifiedFilesFromOffsetfor Claude Code; please add targeted tests for these edge cases to prevent future breakage.
// GetTranscriptPosition returns the current line count of a Claude Code transcript.
// Claude Code uses JSONL format, so position is the number of lines.
// This is a lightweight operation that only counts lines without parsing JSON.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns 0 if the file doesn't exist or is empty.
func (c *ClaudeCodeAgent) GetTranscriptPosition(path string) (int, error) {
if path == "" {
return 0, nil
}
file, err := os.Open(path) //nolint:gosec // Path comes from Claude Code transcript location
if err != nil {
if os.IsNotExist(err) {
return 0, nil
}
return 0, fmt.Errorf("failed to open transcript file: %w", err)
}
defer file.Close()
buf := make([]byte, 32*1024)
lineCount := 0
var lastByte byte = '\n'
for {
n, err := file.Read(buf)
if n > 0 {
lineCount += bytes.Count(buf[:n], []byte{'\n'})
lastByte = buf[n-1]
}
if err != nil {
if err == io.EOF {
if lastByte != '\n' {
lineCount++
}
break
}
return 0, fmt.Errorf("failed to read transcript: %w", err)
}
}
return lineCount, nil
}
// ExtractModifiedFilesFromOffset extracts files modified since a given line number.
// For Claude Code (JSONL format), offset is the starting line number.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns:
// - files: list of file paths modified by Claude (from Write/Edit tools)
// - currentPosition: total number of lines in the file
// - error: any error encountered during reading
func (c *ClaudeCodeAgent) ExtractModifiedFilesFromOffset(path string, startOffset int) (files []string, currentPosition int, err error) {
if path == "" {
return nil, 0, nil
}
file, openErr := os.Open(path) //nolint:gosec // Path comes from Claude Code transcript location
if openErr != nil {
if os.IsNotExist(openErr) {
return nil, 0, nil
}
return nil, 0, fmt.Errorf("failed to open transcript file: %w", openErr)
}
defer file.Close()
var lines []TranscriptLine
lineNum := 0
var lastByte byte = '\n'
buf := make([]byte, 32*1024)
var lineBuf []byte
const maxLineSize = 10 * 1024 * 1024 // 10MB limit
for {
n, readErr := file.Read(buf)
if n > 0 {
chunk := buf[:n]
for len(chunk) > 0 {
idx := bytes.IndexByte(chunk, '\n')
if idx == -1 {
if lineNum >= startOffset && len(lineBuf) < maxLineSize {
lineBuf = append(lineBuf, chunk...)
}
lastByte = chunk[len(chunk)-1]
break
}
if lineNum >= startOffset {
if len(lineBuf) < maxLineSize {
lineBuf = append(lineBuf, chunk[:idx]...)
var line TranscriptLine
if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil {
lines = append(lines, line)
}
}
lineBuf = lineBuf[:0]
}
lineNum++
lastByte = '\n'
chunk = chunk[idx+1:]
}
}
if readErr != nil {
if readErr == io.EOF {
if lastByte != '\n' {
if lineNum >= startOffset && len(lineBuf) > 0 && len(lineBuf) < maxLineSize {
var line TranscriptLine
if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil {
lines = append(lines, line)
}
}
lineNum++
}
break
}
return nil, 0, fmt.Errorf("failed to read transcript: %w", readErr)
}
}
return ExtractModifiedFiles(lines), lineNum, nil
}
| projectDir := SanitizePathForOpenCode(repoPath) | ||
| return filepath.Join(os.TempDir(), "entire-opencode", projectDir), nil | ||
| dir := filepath.Join(repoPath, ".git", "entire-opencode", projectDir) | ||
| if err := os.MkdirAll(dir, 0o700); err != nil { | ||
| return "", fmt.Errorf("failed to create secure opencode session dir: %w", err) | ||
| } |
There was a problem hiding this comment.
GetSessionDir constructs repoPath/.git/… directly. In Git worktrees, .git is a file (gitfile) rather than a directory, so MkdirAll on repoPath/.git/... will fail and OpenCode session handoff will break for worktrees/submodules. Consider resolving the actual git dir via git rev-parse --git-common-dir / --git-dir (or parsing the .git gitfile) and placing entire-opencode under that resolved directory instead of assuming .git is a directory.
| continue | ||
| } | ||
| msgSize := len(msgBytes) + 1 // +1 for comma separator | ||
|
|
There was a problem hiding this comment.
There’s trailing whitespace on the blank line after msgSize := .... This file won’t be gofmt-clean (CI will fail the formatting check). Please remove the whitespace / run gofmt so the blank line is truly empty.
| if idx == -1 { | ||
| if lineNum >= startOffset && len(lineBuf) < maxLineSize { | ||
| lineBuf = append(lineBuf, chunk...) | ||
| } | ||
| lastByte = chunk[len(chunk)-1] | ||
| break | ||
| } | ||
|
|
||
| if len(lineData) > 0 { | ||
| lineNum++ | ||
| if lineNum > startOffset { | ||
| var line TranscriptLine | ||
| if parseErr := json.Unmarshal(lineData, &line); parseErr == nil { | ||
| lines = append(lines, line) | ||
| if lineNum >= startOffset { | ||
| if len(lineBuf) < maxLineSize { | ||
| lineBuf = append(lineBuf, chunk[:idx]...) | ||
| var line TranscriptLine | ||
| if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil { |
There was a problem hiding this comment.
maxLineSize isn’t enforced strictly: the guards only check len(lineBuf) < maxLineSize before append, so a final append can push lineBuf past the intended limit (and it will still be unmarshaled). Consider checking len(lineBuf)+len(toAppend) <= maxLineSize (or truncating/discarding the remainder of the line once the limit is hit) to keep the memory bound predictable.
| return fmt.Errorf("failed to stat session state file after rename: %w", err) | ||
| } | ||
| if info.Mode()&os.ModeSymlink != 0 || !info.Mode().IsRegular() { | ||
| _ = os.Remove(stateFile) |
There was a problem hiding this comment.
If the post-rename Lstat indicates the state file is a symlink / non-regular file, the code ignores any os.Remove failure. For a security hardening path, it’d be better to surface removal failures (or at least wrap/return them) so callers can distinguish “detected and removed” from “detected but could not clean up”.
| _ = os.Remove(stateFile) | |
| if removeErr := os.Remove(stateFile); removeErr != nil { | |
| return fmt.Errorf("security error: state file is not a regular file (symlink / path traversal detected); additionally failed to remove suspicious file: %w", removeErr) | |
| } |
fix: patch critical OS-level, memory, and concurrency vulnerabilities
This commit addresses four systemic failure points in the core CLI runtime
that cause silent failures, panics, and security risks under edge conditions:
.gitdirectory toprevent local symlink attacks and arbitrary file overwrites via
/tmp.git fetcharguments by hash chunks (500 max) toprevent operating system
ARG_MAXexhaustion on heavy repository syncs.PIIConfigmaps ingetPIIConfig()toprevent concurrent map read/write panics across concurrent goroutines.
copyFileto useio.Copyandos.Root.OpenFileforO(1) memory streaming, preventing OOM faults on massive LLM transcripts.