refactor: consolidate PR file handling #27

Merged
digitarald merged 1 commit into main from digitarald/git-cleanup
Feb 17, 2026

digitarald (Owner) commented Feb 17, 2026

Summary

Migrates the VS Code extension PR command from simple-git to the built-in vscode.git API and consolidates shared utilities. Removes dead code, fixes bugs, and updates workspace instructions.

Changes

Extension: simple-git → vscode.git migration

  • vscode-extension/src/commands/pr.ts — Rewritten to use vscode.git API for all git operations (stage, commit, push, ref lookup)
  • vscode-extension/src/git.d.ts — New vendored type definitions for vscode.git API
  • vscode-extension/src/gitUtils.ts — New utility for monorepo-aware git repository discovery
  • vscode-extension/package.json — Removed simple-git dependency, added extensionDependencies: ["vscode.git"]

Shared utility consolidation

  • src/utils/pr.ts — Added isPrimerFile() and PRIMER_FILE_PATTERNS (moved from deleted localPr.ts)
  • src/services/localPr.ts — Deleted (all functions were dead code after extension migration)
  • vscode-extension/src/services.ts — Removed localPr re-exports, added isPrimerFile re-export from primer/utils/pr.js
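A minimal sketch of what a shared isPrimerFile() with backslash normalization might look like. The contents of PRIMER_FILE_PATTERNS are not shown in this PR, so the two patterns below are assumptions based only on the instruction file paths mentioned elsewhere in this description.

```typescript
// Hypothetical patterns: the real PRIMER_FILE_PATTERNS are not shown in this PR.
const PRIMER_FILE_PATTERNS: RegExp[] = [
  /^\.github\/copilot-instructions\.md$/,
  /^\.github\/instructions\/.+\.instructions\.md$/,
];

// Returns true when a repo-relative path matches one of the patterns.
// Backslashes are converted to forward slashes first so that
// Windows-style paths reported by an editor match as well.
function isPrimerFile(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, "/");
  return PRIMER_FILE_PATTERNS.some((pattern) => pattern.test(normalized));
}
```

Keeping the normalization inside the shared helper means the CLI and the extension cannot drift apart on path handling.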

Bug fixes

  • Fixed origin URL regex to handle repos with dots in names: the repo-name capture is now (.+?)(?:\.git)?$ instead of [^/.]+, which stopped at the first dot
  • Added empty-PR guard comparing remote ref commit SHAs before creating PR
  • Deduplicated isPrimerFile — extension now imports shared implementation with backslash normalization
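The first two fixes above can be illustrated with a small sketch. The PR only shows the repo-name capture groups, so the surrounding github.com host and owner parts of both patterns, and the function names, are assumptions made for illustration.

```typescript
// Hypothetical full patterns: only the final capture group comes from the PR.
const OLD_ORIGIN_RE = /github\.com[:/]([^/]+)\/([^/.]+)/;
const NEW_ORIGIN_RE = /github\.com[:/]([^/]+)\/(.+?)(?:\.git)?$/;

// Extracts the repository name (second capture group) from an origin URL.
function repoNameFromOrigin(url: string, pattern: RegExp): string | undefined {
  return pattern.exec(url)?.[2];
}

// Empty-PR guard: if the base and head remote refs resolve to the same
// commit SHA, a pull request between them would contain no changes.
function isEmptyPr(baseSha: string | undefined, headSha: string | undefined): boolean {
  return baseSha !== undefined && baseSha === headSha;
}
```

With these assumed patterns, the old expression truncates my.repo to my, while the lazy capture followed by an optional .git suffix keeps the full name.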

Tests

  • Merged isPrimerFile tests from localPr.test.ts into pr.test.ts, deleted stale test file
  • 275 tests passing, 13 test files

Workspace instructions

  • .github/copilot-instructions.md — Trimmed and restructured (~70 lines)
  • .github/instructions/vscode-extension.instructions.md — New on-demand instruction file with applyTo: "vscode-extension/**"

@digitarald merged commit 74481cb into main on Feb 17, 2026
9 checks passed
@github-actions

❌ Primer Eval: 3/6 pass (50%)

504.1s · model claude-sonnet-4.5 · judge claude-sonnet-4.5

| Case | Verdict | Score | Rationale |
| --- | --- | --- | --- |
| case-1 | ✅ pass | 58 | Response B better matches the expectation. It explicitly lists more services mentioned in the expectation (evaluator, azureDevops, policy) and presents the architecture flow more clearly with numbered… |
| case-2 | ✅ pass | 75 | Response B better matches the expectation. It correctly covers local dev (tsx/npm run dev), all three test commands (test/test:watch/test:coverage), linting, formatting with prettier, type checking, a… |
| case-3 | ❌ fail | 75 | Response B better matches the expectation by explicitly covering shouldLog() for gating stderr output, providing detailed CommandResult examples, and demonstrating the withGlobalOpts pattern. Howev… |
| case-4 | ❌ fail | 75 | Response B slightly edges out Response A but both fail to fully match the expectation. Neither mentions the specific function names buildCriteria(), buildExtras(), loadPolicy(), or resolveChain() from… |
| case-5 | ✅ pass | 72 | Response A better matches the expectation by including the CLI flags (--areas, --areas-only, --area ) explicitly mentioned in the requirement. Both responses correctly describe the Copilot SDK s… |
| case-6 | ⚠️ unknown | 45 | Both responses cover file safety (safeWriteFile, validateCachePath, symlink rejection) and structured output (CommandResult, outputResult, stdout/stderr separation) adequately, but both completely omi… |

Details

case-1 · ✅ 58/100

Prompt: What is Primer's architecture and how are its layers organized?

Expected: Primer is a TypeScript CLI tool for priming repositories for AI-assisted development. It follows a layered architecture: src/index.ts is the entrypoint which defaults to the interactive TUI when no command is given, otherwise delegates to runCli in src/cli.ts. Commander wires subcommands (in src/commands/) to service functions (in src/services/), with shared utilities in src/utils/ and Ink/React TUI components in src/ui/. The CLI layer handles option parsing and output formatting, services contain all core logic (analyzer, instructions, readiness, evaluator, batch, git, github, azureDevops, etc.), and utils provide cross-cutting concerns like safe file I/O, structured output, and working directory management.

Judge: Response B better matches the expectation. It explicitly lists more services mentioned in the expectation (evaluator, azureDevops, policy) and presents the architecture flow more clearly with numbered layers. Both responses correctly identify the layered architecture (index.ts→cli.ts, commands, services, utils, UI), but Response B's service enumeration is more comprehensive and closer to the expectation's list. However, neither response explicitly captures the 'defaults to interactive TUI when no command given' behavior mentioned in the expectation. Response A has slightly better coverage of the copilot service but overall Response B's structure and completeness give it an edge.

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 22.7s | 16.8s |
| Tokens | 15.8k | 16.7k |
| Tool calls | 6 | 4 |

case-2 · ✅ 75/100

Prompt: What is the local development workflow and how does building for distribution differ?

Expected: For local development, run commands directly with npx tsx src/index.ts (or npm run dev) — tsx executes TypeScript without a build step. Linting uses eslint (npm run lint), formatting uses prettier (npm run format / format:check), type checking uses tsc --noEmit (npm run typecheck), and tests run with vitest using v8 coverage (npm run test / test:coverage). Husky and lint-staged enforce linting on pre-commit. For distribution, tsup bundles src/index.ts into ESM-only output targeting Node 20+, with a shebang banner, sourcemaps, and external dependencies not bundled.

Judge: Response B better matches the expectation. It correctly covers local dev (tsx/npm run dev), all three test commands (test/test:watch/test:coverage), linting, formatting with prettier, type checking, and distribution build details (tsup, ESM, shebang, sourcemaps, external deps). Response A omits prettier entirely and only mentions test:watch. Both responses miss Husky/lint-staged enforcement and Response B doesn't explicitly mention Node 20+ target or v8 coverage, but Response B is more complete overall (12/15 key points vs 9/15).

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 23.2s | 15.2s |
| Tokens | 15.9k | 16.2k |
| Tool calls | 6 | 4 |

case-3 · ❌ 75/100

Prompt: What patterns and conventions should I follow when adding new functionality to this codebase?

Expected: Place new CLI commands in src/commands/, core logic in src/services/, and TUI components in src/ui/. All commands must support --json and --quiet flags via the withGlobalOpts wrapper in cli.ts, and return structured results using the CommandResult type from utils/output.ts. Use outputResult() for dual JSON/human output and shouldLog() to gate stderr progress. File writes must use safeWriteFile() which prevents accidental overwrites unless --force is passed. ESM syntax is required everywhere, TypeScript is strict (ES2022 target, ESNext module). Area-specific instructions go in .github/instructions/{name}.instructions.md with YAML frontmatter. The default model for Copilot SDK operations is claude-sonnet-4.5.

Judge: Response B better matches the expectation by explicitly covering shouldLog() for gating stderr output, providing detailed CommandResult examples, and demonstrating the withGlobalOpts pattern. However, both responses fail to mention: (1) TUI components belong in src/ui/, (2) area-specific instructions go in .github/instructions/{name}.instructions.md with YAML frontmatter, and (3) the default model for Copilot SDK operations is claude-sonnet-4.5. Response B scores higher due to superior coverage of output discipline (outputResult + shouldLog), concrete code examples, and more comprehensive testing patterns, but still misses critical architectural and configuration requirements.
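The output conventions named in the expectation for this case can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in utils/output.ts: the status field is assumed to be a string, the injected sinks parameter is hypothetical (added so the sketch is testable without touching real streams), and the exact interaction of --json and --quiet is a guess.

```typescript
// Assumed shape: the eval text names ok/status/data fields but not their types.
interface CommandResult<T = unknown> {
  ok: boolean;
  status: string;
  data?: T;
}

interface GlobalOpts {
  json?: boolean;
  quiet?: boolean;
}

// Hypothetical injection point standing in for stdout/stderr.
interface Sinks {
  stdout: (line: string) => void;
  stderr: (line: string) => void;
}

// Progress messages are shown only in plain interactive mode (assumption).
function shouldLog(opts: GlobalOpts): boolean {
  return !opts.json && !opts.quiet;
}

// JSON mode writes the machine-readable result to stdout;
// otherwise a human-readable summary goes to stderr unless --quiet is set.
function outputResult<T>(result: CommandResult<T>, opts: GlobalOpts, sinks: Sinks): void {
  if (opts.json) {
    sinks.stdout(JSON.stringify(result));
  } else if (!opts.quiet) {
    sinks.stderr(`${result.ok ? "ok" : "error"}: ${result.status}`);
  }
}
```

Routing human text to stderr keeps stdout clean, so a --json consumer can parse it without filtering.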

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 33.3s | 29.6s |
| Tokens | 22.2k | 17.9k |
| Tool calls | 14 | 8 |

case-4 · ❌ 75/100

Prompt: How does the AI readiness assessment work, and how can it be customized with policies?

Expected: The readiness service in src/services/readiness.ts evaluates repositories across 9 pillars (style-validation, build-system, testing, documentation, dev-environment, code-quality, observability, security, ai-tooling) and assigns a maturity level from 1 (Functional) to 5 (Autonomous). Each criterion has a scope — repo, app, or area — determining whether it runs once, per monorepo app, or per detected area. buildCriteria() returns 20+ built-in checks and buildExtras() adds optional ones. Policies loaded via src/services/policy.ts can customize the assessment: loadPolicy() reads JSON/TS/JS configs, and resolveChain() merges a chain of policies that can disable, override, or add criteria and set pass-rate thresholds. Results can be rendered as an interactive HTML report by src/services/visualReport.ts with dark/light theme toggle and expandable per-pillar details.

Judge: Response B slightly edges out Response A but both fail to fully match the expectation. Neither mentions the specific function names buildCriteria(), buildExtras(), loadPolicy(), or resolveChain() from the expectation. Critically, both completely omit any mention of src/services/visualReport.ts, the interactive HTML report generation, or the dark/light theme toggle feature. Response B earns a marginally higher score for explicitly stating the ≥80% default pass rate and providing clearer structure in the 'How It Works' section with analyzeRepo(). Both responses correctly cover the 9 pillars, maturity levels 1-5, scope types (repo/app/area), policy customization (disable/override/add), and security constraints. However, the omission of key implementation details and the entire visualization layer means neither response adequately matches the expectation.
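To make the policy behavior described in the expectation concrete, here is a rough sketch of merging a chain of policies that disable, override, or add criteria and set a pass-rate threshold. All type shapes and the function name resolveChainSketch are hypothetical, and the 0.8 default is taken from the judge's remark about a ≥80% default pass rate rather than from the source code.

```typescript
// Hypothetical shapes: the actual types in readiness.ts and policy.ts are not shown.
interface Criterion {
  id: string;
  pillar: string;
  title: string;
}

interface Policy {
  disable?: string[];
  override?: Record<string, Partial<Criterion>>;
  add?: Criterion[];
  passRate?: number;
}

interface Resolved {
  criteria: Criterion[];
  passRate: number;
}

// Applies policies in order; later policies win over earlier ones.
function resolveChainSketch(base: Criterion[], chain: Policy[], defaultPassRate = 0.8): Resolved {
  let criteria = base.map((c) => ({ ...c }));
  let passRate = defaultPassRate;
  for (const policy of chain) {
    const disabled = new Set(policy.disable ?? []);
    criteria = criteria
      .filter((c) => !disabled.has(c.id))
      .map((c) => ({ ...c, ...(policy.override?.[c.id] ?? {}) }));
    criteria.push(...(policy.add ?? []));
    if (policy.passRate !== undefined) passRate = policy.passRate;
  }
  return { criteria, passRate };
}
```

Applying the chain in order is one plausible merge rule; the real resolveChain() may order or validate policies differently.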

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 49.9s | 44.3s |
| Tokens | 31.4k | 31.1k |
| Tool calls | 17 | 15 |

case-5 · ✅ 72/100

Prompt: How does Primer generate Copilot instructions, including for monorepos with multiple areas?

Expected: The instruction generation pipeline starts with the analyzer (src/services/analyzer.ts) which scans the repo to detect languages, frameworks, monorepo apps, and logical areas (frontend, backend, etc.) with glob patterns. For root-level instructions, generateCopilotInstructions() in src/services/instructions.ts creates a Copilot SDK session that explores the codebase using tools (glob, view, grep) and produces .github/copilot-instructions.md. For area-specific instructions, generateAreaInstructions() generates focused content per area, and buildAreaFrontmatter() creates YAML frontmatter with applyTo glob patterns so VS Code scopes them to the right files. These are written to .github/instructions/{sanitized-name}.instructions.md via writeAreaInstruction(). The instructions command supports --areas to generate all area instructions, --areas-only to skip the root file, and --area for a single area.

Judge: Response A better matches the expectation by including the CLI flags (--areas, --areas-only, --area ) explicitly mentioned in the requirement. Both responses correctly describe the Copilot SDK session exploration with glob/view/grep tools, YAML frontmatter with applyTo patterns, and .github/instructions/ output location. However, neither mentions the analyzer.ts starting point or specific function names (generateCopilotInstructions, generateAreaInstructions, buildAreaFrontmatter, writeAreaInstruction) from the expectation. Response A's inclusion of CLI usage with examples makes it more complete against the stated requirements, while Response B provides richer detection details but omits the CLI interface entirely.
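The area-instruction output described in the expectation (YAML frontmatter with applyTo, written to .github/instructions/{sanitized-name}.instructions.md) could look roughly like the sketch below. The function names, the sanitization rule, and joining several globs with commas are assumptions for illustration; the real buildAreaFrontmatter() and writeAreaInstruction() are not shown in this PR.

```typescript
// Hypothetical sanitizer: lowercases and collapses anything outside
// [a-z0-9] into single hyphens, trimming hyphens at both ends.
function sanitizeAreaName(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Builds the target path for an area's instruction file.
function areaInstructionPath(name: string): string {
  return `.github/instructions/${sanitizeAreaName(name)}.instructions.md`;
}

// Builds YAML frontmatter scoping the instructions to the given globs.
// Multiple globs are joined with commas (assumption).
function buildAreaFrontmatterSketch(applyTo: string[]): string {
  return ["---", `applyTo: "${applyTo.join(",")}"`, "---", ""].join("\n");
}
```

For the instruction file added in this PR, the sketch would produce frontmatter containing applyTo: "vscode-extension/**".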

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 33.6s | 57.4s |
| Tokens | 27.2k | 31.7k |
| Tool calls | 12 | 17 |

case-6 · ❌ 45/100

Prompt: What safety and security patterns does the codebase use for file operations and CLI output?

Expected: For file safety, src/utils/fs.ts provides safeWriteFile() which checks for existing files and only overwrites with an explicit force flag, validateCachePath() which rejects paths containing .. or symlinks to prevent path traversal, and fileExists() with symlink rejection. The repo.ts validators use regexes (GITHUB_REPO_RE, AZURE_REPO_RE) that reject traversal patterns in repo identifiers. For credential safety, git.ts and batch.ts use sanitizeError() to strip tokens from error messages before surfacing them. For structured output, utils/output.ts defines the CommandResult type with ok/status/data fields, outputResult() writes JSON to stdout or human text to stderr based on --json/--quiet flags, and shouldLog() gates progress output. This dual-mode pattern ensures all commands work both interactively and in headless automation pipelines.

Judge: Both responses cover file safety (safeWriteFile, validateCachePath, symlink rejection) and structured output (CommandResult, outputResult, stdout/stderr separation) adequately, but both completely omit critical security patterns explicitly mentioned in the expectation: credential safety via sanitizeError() in git.ts/batch.ts, repo validators using GITHUB_REPO_RE/AZURE_REPO_RE regexes to reject traversal patterns, fileExists() with symlink rejection, and shouldLog() for output gating. The responses are essentially equivalent in coverage (~50% of expected content), structure, and level of detail, making it impossible to determine which better matches the expectation.
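Two of the safety patterns the judge found missing can be sketched in a few lines. These are illustrative stand-ins, not the code in src/utils/fs.ts, git.ts, or batch.ts: the function names are hypothetical, the traversal check works on path segments (the real validateCachePath() may check differently and also rejects symlinks), and the token pattern covers only credentials embedded in URLs.

```typescript
// Returns true when any path segment is exactly "..",
// accepting both forward and backward slashes as separators.
function hasTraversalSegment(p: string): boolean {
  return p.split(/[\\/]+/).includes("..");
}

// Masks credentials embedded in URLs (https://user:token@host)
// before an error message is shown to the user.
function sanitizeErrorSketch(message: string): string {
  return message.replace(/(https?:\/\/)[^/\s@]+@/g, "$1***@");
}
```

Sanitizing at the point where errors are surfaced, rather than where they are thrown, keeps the original message available for internal handling.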

| Metric | Without instructions | With instructions |
| --- | --- | --- |
| Time | 26.3s | 16.5s |
| Tokens | 20.1k | 15.8k |
| Tool calls | 8 | 3 |
