Skip to content

Conversation

xPartyBear
Copy link
Contributor

@xPartyBear xPartyBear commented Sep 14, 2025

Adding quotes here fixes zoekt-git-index failing when a pipe is in a branch/tag

Fixes #505

Summary by CodeRabbit

  • Bug Fixes
    • Ensures all selected branches are correctly passed to the indexer as a single argument, preventing missed or failed indexing when branch names include spaces or special characters.
    • Improves stability of code search indexing across multiple branches.
    • Users should see more consistent and complete search results across branches.

Adding quotes here fixes zoekt-git-index failing when a pipe is in a branch/tag
Copy link

coderabbitai bot commented Sep 14, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Updates the zoekt-git-index invocation to wrap the comma-separated list of revisions passed to -branches in quotes, changing -branches ${revisions.join(',')} to -branches "${revisions.join(',')}". No other logic or flow changes in packages/backend/src/zoekt.ts.

Changes

Cohort / File(s) Summary
Zoekt index command quoting
packages/backend/src/zoekt.ts
Quote the -branches argument so the joined revisions are passed as a single argument: -branches "${revisions.join(',')}". No other edits.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A hop, a skip, a quote I bring,
To corral branches in a single string.
No pipes to spill, no args astray—
The indexer hums in tidy array.
Thump-thump! goes the merge with cheer,
Safe little carrots, nothing to fear. 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Quote branches argument in zoekt.ts to fix Pipe" succinctly and accurately describes the primary change: quoting the branches argument in zoekt.ts to address issues caused by pipe characters in branch or tag names. It is a short, single sentence without extraneous file lists or emojis, and it directly relates to the changeset. A teammate scanning history will understand the primary change from the title alone.
Linked Issues Check ✅ Passed The PR implements the primary fix requested in issue #505 by adding quotes around the branches/tags argument in the zoekt-git-index command, preventing shell interpretation of pipe characters in branch names. The raw_summary shows the sole code change is switching -branches ${revisions.join(',')} to -branches "${revisions.join(',')}" with no other logic or control-flow changes, which satisfies the issue's main coding requirement. The issue's secondary suggestion to consider using spawn instead of exec was not implemented, but that was presented as an improvement rather than a required change. Therefore the PR meets the primary objectives from [#505].
Out of Scope Changes Check ✅ Passed The changeset appears limited to quoting the revisions/branches argument in backend/zoekt.ts and the raw_summary indicates no other files or unrelated functionality were modified. I detect no out-of-scope or unrelated code changes introduced by this PR.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@brendan-kellam
Copy link
Contributor

@coderabbitai review

Copy link

coderabbitai bot commented Sep 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/backend/src/zoekt.ts (1)

1-1: Replace child_process.exec with spawn/execFile and pass args as an array (prevent RCE in packages/backend/src/zoekt.ts)

exec(command) builds a shell string and is vulnerable to command injection via branch/tag names; switch to spawn/execFile with an args array and shell: false.

- import { exec } from "child_process";
+ import { spawn } from "child_process";
@@
-    const command = [
-        'zoekt-git-index',
-        '-allow_missing_branches',
-        `-index ${ctx.indexPath}`,
-        `-max_trigram_count ${settings.maxTrigramCount}`,
-        `-file_limit ${settings.maxFileSize}`,
-        `-branches "${revisions.join(',')}"`,
-        `-tenant_id ${repo.orgId}`,
-        `-repo_id ${repo.id}`,
-        `-shard_prefix ${shardPrefix}`,
-        repoPath
-    ].join(' ');
-
-    return new Promise<{ stdout: string, stderr: string }>((resolve, reject) => {
-        exec(command, (error, stdout, stderr) => {
-            if (error) {
-                reject(error);
-                return;
-            }
-            
-            if (stdout) {
-                stdout.split('\n').filter(line => line.trim()).forEach(line => {
-                    logger.info(line);
-                });
-            }
-            if (stderr) {
-                stderr.split('\n').filter(line => line.trim()).forEach(line => {
-                    // TODO: logging as regular info here and not error because non error logs are being
-                    // streamed in stderr and incorrectly being logged as errors at a high level
-                    logger.info(line);
-                });
-            }
-            
-            resolve({
-                stdout,
-                stderr
-            });
-        })
-    });
+    const args = [
+        '-allow_missing_branches',
+        '-index', ctx.indexPath,
+        '-max_trigram_count', String(settings.maxTrigramCount),
+        '-file_limit', String(settings.maxFileSize),
+        '-branches', revisions.join(','), // zoekt splits this internally
+        '-tenant_id', String(repo.orgId),
+        '-repo_id', String(repo.id),
+        '-shard_prefix', shardPrefix,
+        repoPath,
+    ];
+
+    return new Promise<{ stdout: string; stderr: string }>((resolve, reject) => {
+        const child = spawn('zoekt-git-index', args, { shell: false });
+        let stdoutBuf = '';
+        let stderrBuf = '';
+
+        child.stdout.setEncoding('utf8');
+        child.stderr.setEncoding('utf8');
+
+        child.stdout.on('data', (chunk: string) => {
+            stdoutBuf += chunk;
+            chunk.split('\n').filter(l => l.trim()).forEach(l => logger.info(l));
+        });
+        child.stderr.on('data', (chunk: string) => {
+            stderrBuf += chunk;
+            // Non-error logs often come via stderr from zoekt; keep at info
+            chunk.split('\n').filter(l => l.trim()).forEach(l => logger.info(l));
+        });
+        child.on('error', reject);
+        child.on('close', (code) => {
+            if (code !== 0) {
+                return reject(new Error(`zoekt-git-index exited with code ${code}`));
+            }
+            resolve({ stdout: stdoutBuf, stderr: stderrBuf });
+        });
+    });
🧹 Nitpick comments (1)
packages/backend/src/zoekt.ts (1)

50-58: Guard against commas in ref names (zoekt splits on commas).

Git refnames may legally contain commas. Because zoekt parses -branches by strings.Split(value, ","), any comma in a ref mis-splits the list. Consider warning and dropping such refs (or documenting that refs with commas are unsupported). (git-scm.com)

Example insertion before building args:

@@
     if (revisions.length > 64) {
@@
     }
+
+    // Skip refs that contain commas, since zoekt splits -branches on ','.
+    const invalid = revisions.filter(r => r.includes(','));
+    if (invalid.length) {
+        logger.warn(`Skipping ${invalid.length} revision(s) containing commas: ${invalid.join(' ')}`);
+        revisions = revisions.filter(r => !r.includes(','));
+    }

Also applies to: 60-66

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aab4a92 and 331de9a.

📒 Files selected for processing (1)
  • packages/backend/src/zoekt.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

brendan-kellam
brendan-kellam previously approved these changes Sep 16, 2025
Copy link
Contributor

@brendan-kellam brendan-kellam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thx for the PR 👍 could you add a changelog entry?

Added a note about fixing zoekt indexing issue with pipe in branch/tag names.
@brendan-kellam brendan-kellam merged commit 4a449da into sourcebot-dev:main Sep 17, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] Pipes in branch names
2 participants