Skip to content

fix: patch systemic OS-level vulnerabilities, OOM risks, and concurre…#799

Open
mrvenom17 wants to merge 1 commit intoentireio:mainfrom
mrvenom17:main
Open

fix: patch systemic OS-level vulnerabilities, OOM risks, and concurre…#799
mrvenom17 wants to merge 1 commit intoentireio:mainfrom
mrvenom17:main

Conversation

@mrvenom17
Copy link
Copy Markdown

fix: patch critical OS-level, memory, and concurrency vulnerabilities

This commit addresses four systemic failure points in the core CLI runtime
that cause silent failures, panics, and security risks under edge conditions:

  • opencode: Scope session temp dirs to the repository .git directory to
    prevent local symlink attacks and arbitrary file overwrites via /tmp.
  • git_operations: Batch git fetch arguments by hash chunks (500 max) to
    prevent operating system ARG_MAX exhaustion on heavy repository syncs.
  • redact: Return deep copies of PIIConfig maps in getPIIConfig() to
    prevent concurrent map read/write panics across concurrent goroutines.
  • utils: Refactor copyFile to use io.Copy and os.Root.OpenFile for
    O(1) memory streaming, preventing OOM faults on massive LLM transcripts.

@mrvenom17 mrvenom17 requested a review from a team as a code owner March 29, 2026 18:25
Copilot AI review requested due to automatic review settings March 29, 2026 18:25
Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens core CLI runtime behavior against security, memory, and concurrency failure modes (symlink/path traversal risk, OS ARG_MAX exhaustion, concurrent map panics, and large-transcript OOM).

Changes:

  • Return deep copies of global PIIConfig maps to avoid concurrent map read/write panics.
  • Stream file copies via io.Copy with os.Root-based traversal-resistant writes to reduce memory usage.
  • Batch git fetch origin <hash...> blob-fetch requests to avoid OS argument length limits.
  • Tighten transcript chunking and session state file handling with additional safety checks.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
redact/pii.go Deep-copies PII config maps on read to avoid concurrent map access panics.
cmd/entire/cli/utils.go Refactors copyFile to stream data and use os.Root for safer writes.
cmd/entire/cli/session/state.go Adds post-rename validation that session state files are regular files (no symlink).
cmd/entire/cli/git_operations.go Batches hash-based git fetch calls to reduce ARG_MAX risk.
cmd/entire/cli/agent/opencode/opencode.go Moves OpenCode session dir under repo .git and ensures it exists.
cmd/entire/cli/agent/geminicli/gemini.go Adds single-message-size guard during chunking (and base size refactor).
cmd/entire/cli/agent/claudecode/claude.go Reworks transcript position + offset scanning to fixed-buffer reading with a max line limit.
Comments suppressed due to low confidence (2)

cmd/entire/cli/agent/claudecode/claude.go:306

  • The updated doc comment says this uses a fixed-size buffer to handle “arbitrarily long lines safely”, but the implementation enforces maxLineSize = 10MB and silently skips parsing lines beyond that limit. Please update the comment to describe the actual behavior/limit so callers understand that oversized JSONL lines will be ignored.
// ExtractModifiedFilesFromOffset extracts files modified since a given line number.
// For Claude Code (JSONL format), offset is the starting line number.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns:
//   - files: list of file paths modified by Claude (from Write/Edit tools)
//   - currentPosition: total number of lines in the file
//   - error: any error encountered during reading

cmd/entire/cli/agent/claudecode/claude.go:376

  • This change rewrites transcript position counting and offset-based scanning logic in a way that’s easy to regress (EOF without trailing newline, very long single-line JSONL, offset boundaries, maxLineSize behavior). There are no unit tests covering GetTranscriptPosition / ExtractModifiedFilesFromOffset for Claude Code; please add targeted tests for these edge cases to prevent future breakage.
// GetTranscriptPosition returns the current line count of a Claude Code transcript.
// Claude Code uses JSONL format, so position is the number of lines.
// This is a lightweight operation that only counts lines without parsing JSON.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns 0 if the file doesn't exist or is empty.
func (c *ClaudeCodeAgent) GetTranscriptPosition(path string) (int, error) {
	if path == "" {
		return 0, nil
	}

	file, err := os.Open(path) //nolint:gosec // Path comes from Claude Code transcript location
	if err != nil {
		if os.IsNotExist(err) {
			return 0, nil
		}
		return 0, fmt.Errorf("failed to open transcript file: %w", err)
	}
	defer file.Close()

	buf := make([]byte, 32*1024)
	lineCount := 0
	var lastByte byte = '\n'

	for {
		n, err := file.Read(buf)
		if n > 0 {
			lineCount += bytes.Count(buf[:n], []byte{'\n'})
			lastByte = buf[n-1]
		}
		if err != nil {
			if err == io.EOF {
				if lastByte != '\n' {
					lineCount++
				}
				break
			}
			return 0, fmt.Errorf("failed to read transcript: %w", err)
		}
	}

	return lineCount, nil
}

// ExtractModifiedFilesFromOffset extracts files modified since a given line number.
// For Claude Code (JSONL format), offset is the starting line number.
// Uses a fixed-size buffer to handle arbitrarily long lines safely.
// Returns:
//   - files: list of file paths modified by Claude (from Write/Edit tools)
//   - currentPosition: total number of lines in the file
//   - error: any error encountered during reading
func (c *ClaudeCodeAgent) ExtractModifiedFilesFromOffset(path string, startOffset int) (files []string, currentPosition int, err error) {
	if path == "" {
		return nil, 0, nil
	}

	file, openErr := os.Open(path) //nolint:gosec // Path comes from Claude Code transcript location
	if openErr != nil {
		if os.IsNotExist(openErr) {
			return nil, 0, nil
		}
		return nil, 0, fmt.Errorf("failed to open transcript file: %w", openErr)
	}
	defer file.Close()

	var lines []TranscriptLine
	lineNum := 0
	var lastByte byte = '\n'
	buf := make([]byte, 32*1024)
	var lineBuf []byte
	const maxLineSize = 10 * 1024 * 1024 // 10MB limit

	for {
		n, readErr := file.Read(buf)
		if n > 0 {
			chunk := buf[:n]
			for len(chunk) > 0 {
				idx := bytes.IndexByte(chunk, '\n')
				if idx == -1 {
					if lineNum >= startOffset && len(lineBuf) < maxLineSize {
						lineBuf = append(lineBuf, chunk...)
					}
					lastByte = chunk[len(chunk)-1]
					break
				}

				if lineNum >= startOffset {
					if len(lineBuf) < maxLineSize {
						lineBuf = append(lineBuf, chunk[:idx]...)
						var line TranscriptLine
						if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil {
							lines = append(lines, line)
						}
					}
					lineBuf = lineBuf[:0]
				}
				lineNum++
				lastByte = '\n'
				chunk = chunk[idx+1:]
			}
		}

		if readErr != nil {
			if readErr == io.EOF {
				if lastByte != '\n' {
					if lineNum >= startOffset && len(lineBuf) > 0 && len(lineBuf) < maxLineSize {
						var line TranscriptLine
						if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil {
							lines = append(lines, line)
						}
					}
					lineNum++
				}
				break
			}
			return nil, 0, fmt.Errorf("failed to read transcript: %w", readErr)
		}
	}

	return ExtractModifiedFiles(lines), lineNum, nil
}

Comment on lines 182 to +186
projectDir := SanitizePathForOpenCode(repoPath)
return filepath.Join(os.TempDir(), "entire-opencode", projectDir), nil
dir := filepath.Join(repoPath, ".git", "entire-opencode", projectDir)
if err := os.MkdirAll(dir, 0o700); err != nil {
return "", fmt.Errorf("failed to create secure opencode session dir: %w", err)
}
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetSessionDir constructs repoPath/.git/… directly. In Git worktrees, .git is a file (gitfile) rather than a directory, so MkdirAll on repoPath/.git/... will fail and OpenCode session handoff will break for worktrees/submodules. Consider resolving the actual git dir via git rev-parse --git-common-dir / --git-dir (or parsing the .git gitfile) and placing entire-opencode under that resolved directory instead of assuming .git is a directory.

Copilot uses AI. Check for mistakes.
continue
}
msgSize := len(msgBytes) + 1 // +1 for comma separator

Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s trailing whitespace on the blank line after msgSize := .... This file won’t be gofmt-clean (CI will fail the formatting check). Please remove the whitespace / run gofmt so the blank line is truly empty.

Suggested change

Copilot uses AI. Check for mistakes.
Comment on lines +334 to +346
if idx == -1 {
if lineNum >= startOffset && len(lineBuf) < maxLineSize {
lineBuf = append(lineBuf, chunk...)
}
lastByte = chunk[len(chunk)-1]
break
}

if len(lineData) > 0 {
lineNum++
if lineNum > startOffset {
var line TranscriptLine
if parseErr := json.Unmarshal(lineData, &line); parseErr == nil {
lines = append(lines, line)
if lineNum >= startOffset {
if len(lineBuf) < maxLineSize {
lineBuf = append(lineBuf, chunk[:idx]...)
var line TranscriptLine
if parseErr := json.Unmarshal(lineBuf, &line); parseErr == nil {
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxLineSize isn’t enforced strictly: the guards only check len(lineBuf) < maxLineSize before append, so a final append can push lineBuf past the intended limit (and it will still be unmarshaled). Consider checking len(lineBuf)+len(toAppend) <= maxLineSize (or truncating/discarding the remainder of the line once the limit is hit) to keep the memory bound predictable.

Copilot uses AI. Check for mistakes.
return fmt.Errorf("failed to stat session state file after rename: %w", err)
}
if info.Mode()&os.ModeSymlink != 0 || !info.Mode().IsRegular() {
_ = os.Remove(stateFile)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the post-rename Lstat indicates the state file is a symlink / non-regular file, the code ignores any os.Remove failure. For a security hardening path, it’d be better to surface removal failures (or at least wrap/return them) so callers can distinguish “detected and removed” from “detected but could not clean up”.

Suggested change
_ = os.Remove(stateFile)
if removeErr := os.Remove(stateFile); removeErr != nil {
return fmt.Errorf("security error: state file is not a regular file (symlink / path traversal detected); additionally failed to remove suspicious file: %w", removeErr)
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants