feat: repo-first evidence pipeline (v0.10.0) #52
Open
brianruggieri wants to merge 25 commits into main
Remove candidate_profile_path parameter from export_fit_assessment(). Evidence highlights return empty list until commit evidence (D2) is wired. select_evidence_highlights() function preserved for future reuse.
…t_highlights flag
- ABSTRACT_SKILLS constant and _ABSTRACT_SIGNAL_RULES mapping
- _apply_signal_rules() for repo signal-based inference
- _build_abstract_skill_prompt() and _claude_infer_abstract_skills()
- _build_commit_tagged_skills() stub for tier 2
- resolve_abstract_skills() three-tier orchestrator
Pull request overview
Implements the v0.10.0 “repo-first evidence pipeline” redesign by shifting MergedEvidenceProfile.projects to a repo-derived RepoProject, adding commit evidence extraction primitives (filter + Claude/heuristic highlighter), and updating export/merge paths and tests to reflect session dormancy.
Changes:
- Introduce RepoProject (replacing session ProjectSummary in merged profiles) and update merge/scoring call sites.
- Add commit evidence extraction pipeline: heuristic commit filtering + Claude-powered (with fallback) highlight extraction.
- Update fit export + tests for repo-first project shape, abstract-skill resolution scaffolding, and v0.10.0 version bumps.
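As a rough illustration of the heuristic commit filtering step listed above, the following sketch drops noisy commits, assigns tiers, and keeps a fixed number of slots. The noise patterns, tier mapping, `Commit` fields, and function names are all assumptions for illustration and are not taken from commit_filter.py.

```python
import re
from dataclasses import dataclass

# Hypothetical noise patterns; the real commit_filter.py may use different rules.
NOISE_PATTERNS = [
    re.compile(r"^merge\b", re.IGNORECASE),
    re.compile(r"^(bump|chore\(deps\))", re.IGNORECASE),
    re.compile(r"^(wip|fixup!|squash!)", re.IGNORECASE),
]

# Hypothetical tiers keyed by conventional-commit prefix (lower = stronger evidence).
TIER_BY_PREFIX = {"feat": 1, "fix": 1, "perf": 1, "refactor": 2, "test": 2, "docs": 3}


@dataclass(frozen=True)
class Commit:
    sha: str
    subject: str
    files_changed: int


def is_noise(commit: Commit) -> bool:
    """True when the subject matches any of the assumed noise patterns."""
    return any(p.search(commit.subject) for p in NOISE_PATTERNS)


def tier_of(commit: Commit) -> int:
    """Map the leading word of the subject to a tier; unknown prefixes get tier 3."""
    prefix = re.match(r"^(\w+)", commit.subject)
    return TIER_BY_PREFIX.get(prefix.group(1).lower(), 3) if prefix else 3


def filter_commits(commits: list[Commit], budget: int) -> list[Commit]:
    """Drop noise, rank by tier then by files changed, and keep `budget` slots."""
    kept = [c for c in commits if not is_noise(c)]
    kept.sort(key=lambda c: (tier_of(c), -c.files_changed))
    return kept[:budget]
```

Only the commits that survive this pass would be handed to the highlight extractor, which keeps the Claude prompt small.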
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/test_repo_scanner.py | Adds coverage for raw commit fetching and optional highlight extraction in local repo scans. |
| tests/test_repo_project_schema.py | Adds tests for RepoProject schema and from_repo_evidence() factory behavior. |
| tests/test_repo_profile.py | Adds tests for CommitHighlight schema round-tripping and RepoEvidence defaults. |
| tests/test_merger.py | Updates merger tests to assert projects come from repo profiles (not sessions). |
| tests/test_merged_profile_projects.py | Verifies merged profiles accept RepoProject instances and serialize/deserialize. |
| tests/test_fit_exporter.py | Updates exporter tests for RepoProject-shaped projects, dormancy, repo_url/commit_url wiring, and abstract-skill helpers. |
| tests/test_commit_highlighter.py | Adds unit + slow integration coverage for commit highlight extraction and fallback. |
| tests/test_commit_filter.py | Adds tests for commit noise filtering, tiering, scoring, and slot budgeting. |
| src/claude_candidate/scoring/dimensions.py | Updates project-name usage to RepoProject.name. |
| src/claude_candidate/schemas/repo_profile.py | Adds RepoProject + CommitHighlight, and adds commit_highlights to RepoEvidence. |
| src/claude_candidate/schemas/merged_profile.py | Switches merged projects field type from ProjectSummary to RepoProject. |
| src/claude_candidate/schemas/__init__.py | Re-exports RepoProject from the schemas package. |
| src/claude_candidate/repo_scanner.py | Adds remote URL detection, optional highlight extraction, and raw commit fetching/parser helpers. |
| src/claude_candidate/merger.py | Populates merged projects from repo_profile.repos via RepoProject.from_repo_evidence(). |
| src/claude_candidate/fit_exporter.py | Adds repo quantitative fallback formatter + abstract skill resolution helpers; updates project selection for RepoProject shape; removes candidate_profile dependency from export. |
| src/claude_candidate/extractors/signal_merger.py | Marks session-based project building as dormant with updated docstrings. |
| src/claude_candidate/commit_highlighter.py | New: Claude-powered commit highlight extraction with heuristic fallback. |
| src/claude_candidate/commit_filter.py | New: heuristic pre-filter with tiering/noise detection and scoring. |
| src/claude_candidate/cli.py | Updates export-fit command to no longer pass candidate_profile_path into exporter. |
| src/claude_candidate/__init__.py | Bumps package version to 0.10.0. |
| pyproject.toml | Bumps project version to 0.10.0. |
| extension/manifest.json | Bumps extension version to 0.10.0. |
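To make the merger.py and repo_profile.py rows more concrete, here is a minimal sketch of how a `from_repo_evidence()` factory could turn scanned repos into merged projects. Plain dataclasses stand in for the project's actual schema classes, and every field other than `name`, `repo_url`, and `commit_highlights` (including `highlight_count` and the `build_projects` helper) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RepoEvidence:
    # Simplified stand-in; the real RepoEvidence carries more scan data.
    name: str
    repo_url: Optional[str] = None
    commit_highlights: List[str] = field(default_factory=list)


@dataclass
class RepoProject:
    name: str
    repo_url: Optional[str]
    highlight_count: int  # hypothetical field, for illustration only

    @classmethod
    def from_repo_evidence(cls, evidence: RepoEvidence) -> "RepoProject":
        """Build a project entry from one scanned repo."""
        return cls(
            name=evidence.name,
            repo_url=evidence.repo_url,
            highlight_count=len(evidence.commit_highlights),
        )


def build_projects(repos: List[RepoEvidence]) -> List[RepoProject]:
    """Hypothetical helper mirroring the merge step: one project per repo."""
    return [RepoProject.from_repo_evidence(r) for r in repos]
```

Keeping the conversion in a classmethod means the merge path only iterates over repos and never needs to know the repo schema's internals.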
Comments suppressed due to low confidence (1)
src/claude_candidate/cli.py:338
- The export-fit command no longer checks that candidate_profile.json exists, but it still unconditionally reads it via CandidateProfile.from_json(candidate_path.read_text()). If the file is missing, this will raise a low-level exception/stack trace. Consider restoring an explicit existence check (or catching the error) to emit a clear CLI message and exit code.
data_dir = Path.home() / ".claude-candidate"
db_path = Path(db) if db else data_dir / "assessments.db"
candidate_path = data_dir / "candidate_profile.json"
# Validate paths
if not db_path.exists():
    click.echo(f"Error: Database not found at {db_path}", err=True)
    raise SystemExit(1)
# Load assessment from DB
async def _load():
    store = AssessmentStore(db_path)
    await store.initialize()
    try:
        return await store.get_assessment(assessment_id)
    finally:
        await store.close()
assessment = asyncio.run(_load())
if not assessment:
    click.echo(f"Error: Assessment '{assessment_id}' not found.", err=True)
    raise SystemExit(1)
# Build merged profile on the fly — no merged_profile.json needed
cp = CandidateProfile.from_json(candidate_path.read_text())
merged = _merge_profile(cp, quiet=True)
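One possible shape for the existence check suggested in this comment is sketched below. The helper name `read_candidate_profile_text` is hypothetical, and `print` to stderr stands in for the `click.echo(..., err=True)` calls used elsewhere in the excerpt so that the snippet has no third-party dependency.

```python
import sys
from pathlib import Path


def read_candidate_profile_text(candidate_path: Path) -> str:
    """Return the file contents, or exit with status 1 and a readable message."""
    if not candidate_path.is_file():
        # In the actual command this would presumably be click.echo(..., err=True).
        print(f"Error: Candidate profile not found at {candidate_path}", file=sys.stderr)
        raise SystemExit(1)
    return candidate_path.read_text()
```

The result would then be passed to `CandidateProfile.from_json(...)` in place of the direct `candidate_path.read_text()` call, mirroring the existing database check.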
- fit_exporter.py: repo_url falls back to public_repo_url for legacy shapes
- commit_filter.py: remove unused `field` import
- commit_highlighter.py: catch ClaudeCLIError specifically before broad Exception
- test_commit_filter.py: remove unused pytest import
- test_repo_project_schema.py: remove unused pytest import
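The exception-ordering fix for commit_highlighter.py can be illustrated roughly as follows. The `ClaudeCLIError` class here is a local stand-in for the project's own exception, and `heuristic_highlights`, `extract_highlights`, the `call_claude` parameter, and the source labels are assumptions made for this sketch.

```python
class ClaudeCLIError(Exception):
    """Local stand-in for the project's ClaudeCLIError."""


def heuristic_highlights(subjects):
    # Hypothetical fallback: keep the first three commit subjects, trimmed.
    return [s.strip() for s in subjects[:3]]


def extract_highlights(subjects, call_claude):
    """Try the Claude-backed extractor, then fall back to heuristics.

    The specific handler must come first: Python uses the first matching
    except clause, so a leading `except Exception` would swallow
    ClaudeCLIError and hide which failure mode occurred.
    """
    try:
        return call_claude(subjects), "claude"
    except ClaudeCLIError:
        return heuristic_highlights(subjects), "fallback:cli-error"
    except Exception:
        return heuristic_highlights(subjects), "fallback:unexpected"
```

Returning a label alongside the highlights is one way to let callers or logs distinguish an expected CLI failure from an unexpected bug.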
Summary
Implements the 6-decision repo-first evidence pipeline redesign from the 2026-03-30 grill session:
- Replaces ProjectSummary with RepoProject as the canonical project source in MergedEvidenceProfile.projects. Projects now come from scanned repos via merge_triad(), not sessions.
- A heuristic pre-filter (commit_filter.py) classifies commits into tiers and drops noise, then Claude extracts pithy highlight quotes with skill tags (commit_highlighter.py). A heuristic fallback applies when Claude is unavailable.
- repo_url and commit_url flow through select_projects() and select_evidence_highlights() to Hugo front matter.
- format_skill_repo_fallback() generates hiring-manager-aligned text from repo stats (repo count, timeline, test coverage, frameworks, CI) when no commit highlights exist.
- export_fit_assessment() no longer reads candidate_profile.json. Session scanning infrastructure is preserved but dormant.

New modules
- src/claude_candidate/commit_filter.py: heuristic pre-filter with tier classification
- src/claude_candidate/commit_highlighter.py: Claude-powered highlight extraction with fallback

Stats
Test plan
- .venv/bin/python -m pytest (1544 passed, 28 skipped)
- .venv/bin/python -m pytest --run-slow
- .venv/bin/python tests/golden_set/benchmark_accuracy.py

Review fixes (f316048)
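- fit_exporter.py: repo_url falls back to public_repo_url for legacy ProjectSummary shapes
- commit_filter.py: remove unused field import
- commit_highlighter.py: catch ClaudeCLIError specifically before broad Exception
- test_commit_filter.py: remove unused pytest import
- test_repo_project_schema.py: remove unused pytest import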