Skip to content

Feat/adaptive labels#3

Merged
lex0c merged 16 commits intomainfrom
feat/adaptive-labels
Apr 18, 2026
Merged

Feat/adaptive labels#3
lex0c merged 16 commits intomainfrom
feat/adaptive-labels

Conversation

@lex0c
Copy link
Copy Markdown
Owner

@lex0c lex0c commented Apr 18, 2026

No description provided.

lex0c and others added 5 commits April 18, 2026 18:41
Classification previously used hardcoded 180d age and 0.5 trend
cutoffs. That made the label set misleading for any repo outside a
narrow age range: a 4-year-old file in a 12-year-old codebase was
tagged legacy-hotspot even when it was newer than 70% of tracked
files. Now ChurnRisk derives the "old" and "declining" thresholds
from the dataset's own distribution (age P75, trend P25) when at
least classifyMinSample files are present, and falls back to the
absolute constants for tiny samples.

Every ChurnRiskResult also carries AgePercentile and TrendPercentile
(or -1 when the fallback path runs). The CLI suffixes the label with
"(age P92, trend P08)" and the HTML report shows the same pair as a
caption under the label badge, so readers can see how strongly a
file qualifies for its classification — a barely-legacy file and a
deeply-legacy one are no longer visually indistinguishable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
High-confidence path heuristics (vendor/, node_modules/, dist/,
*.min.js, package-lock.json, *.pb.go, etc.) are scanned after every
stats load. If the matched paths together account for at least 10%
of total repo churn, the CLI prints a stderr warning listing the
matches and a copy-pasteable extract --ignore invocation.

Conservative by design: the warning only fires above the noise
floor, so a single incidental lock file in a clean repo stays
silent. Statistical heuristics (churn/commit outliers, single-author
bulk updates) are deliberately deferred — false-positive risk is
higher than the path-based set and the current signal already
covers the common case.

Observed on WordPress fixture: dist/* alone accounts for ~58% of
total repo churn; running the suggested ignores removes the
distortion from every downstream stat (hotspots, churn-risk, bus
factor).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
METRICS.md: new "Adaptive thresholds" subsection in Churn Risk
explaining per-dataset P75/P25 derivation and the AgePercentile /
TrendPercentile fields; new "Vendor/generated path warning"
subsection describing the heuristic patterns, the 10% noise floor,
and why statistical signals are out of scope. Thresholds table
updated to flag classifyOldAgeDays/classifyDecliningTrend as
fallbacks and list the two new constants.

README.md: churn-risk sample output refreshed with the new
(age PXX, trend PYY) suffix; label rules rewritten around "younger
than most of the repo" rather than "< 180 days"; vendor-filter
section notes the stats warning so users know the tool will point
them at the patterns to add.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four cosmetic and semantic fixes to the adaptive-labels feature:

- AgePercentile and TrendPercentile are now *int with
  json:"age_percentile,omitempty" / json:"trend_percentile,omitempty".
  Small datasets (< classifyMinSample) serialize as omitted fields
  instead of a `-1` sentinel, which was leaking into downstream
  consumers.
- Label suffix format dropped zero-pad: `P%02d` → `P%d`. P0 / P5 /
  P100 read correctly without the P00 / P05 / P100 asymmetry.
- HTML hint text that describes the "Age P__ / Trend P__" caption is
  now guarded on the first row's AgePercentile. Small-dataset reports
  no longer promise captions they don't show.
- Added TestLabelWithPercentile covering all four pointer combos and
  the zero / three-digit rendering.

Template picks up a new derefInt helper in funcMap so
`{{derefInt .AgePercentile}}` resolves cleanly to the int value
without printing a pointer address.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When every file's history fits inside the 3-month trend window (short
repos, narrow --since slices), churnTrend returns the flat-signal 1.0
for all files. The adaptive P25 then lands on 1.0 and the
trend < P25 predicate matches nobody — no file reaches legacy-hotspot
via the trend check, even though the age predicate might qualify many.

That behavior is mathematically defensible (nothing in the
distribution to classify on) but surprising enough that a regression
test is worth having. TestChurnRiskAdaptiveDegenerateTrendDistribution
constructs 12 files with identical flat histories and asserts no file
lands in legacy-hotspot and at least one lands in silo. METRICS.md
gains a callout next to the adaptive-thresholds section so readers
hitting the shape know it's documented rather than a bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70e462388b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/stats/suspect.go Outdated
lex0c and others added 3 commits April 18, 2026 19:06
The directory-segment matchers (hasPathSegment) classify nested paths
like pkg/vendor/foo.go and wp-includes/js/dist/bar.min.js as suspect,
but the warning previously emitted the short bucket label
("vendor/*", "dist/*") as the --ignore suggestion. extract.shouldIgnore
treats those as repo-root prefixes and does not match anything below
a subproject, so users following the suggested fix would re-extract
and see the same warning — the paths that triggered it were never
actually removed.

SuspectPattern gains a Suggest func(path) string and SuspectBucket
gains a Suggestions slice. Directory matchers return the specific
parent glob for each matched path (e.g. "pkg/vendor/*",
"services/auth/vendor/*"); suffix and basename matchers stay
canonical because extract's basename match already covers them at
any depth. DetectSuspectFiles dedupes suggestions per bucket and
printSuspectWarning dedupes across buckets before emitting.

Observed on WordPress: suggestion list now includes
js/dist/*, styles/dist/*, wp-includes/assets/dist/*, and the deeply
nested js/dist/vendor/* — paths that shouldIgnore actually matches.
Two regression tests pin the nested-dir behavior and confirm
suffix/basename patterns still collapse to a single glob.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
METRICS.md now explains why directory-segment buckets can emit
multiple --ignore globs: the matcher catches `dist` wherever it
appears, but extract.shouldIgnore treats `dist/*` as a repo-root
prefix, so the warning enumerates each distinct parent path (e.g.
`wp-includes/js/dist/*`) and tells the reader that suffix/basename
patterns stay canonical. Without the callout, users seeing a long
list of --ignore entries for a single "bucket" might think the
warning is noisy when it's actually being precise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Export extract.ShouldIgnore and add a cross-package test asserting
that, for every bucket DetectSuspectFiles produces, every matched
Path is covered by at least one entry in Suggestions when fed
through ShouldIgnore. Without this invariant, a future refactor of
either side (e.g. ShouldIgnore tightening its prefix semantics, or
SuspectPattern gaining a matcher without an aligned Suggest) would
silently break the warning's whole value proposition: the
copy-pasteable fix would stop matching the paths that triggered it,
and no existing stats or extract test would fire.

The test mixes root-level and deeply nested occurrences of every
pattern class — directory segments, suffix, basename, generated
extensions — so a regression in any single matcher shows up here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8b8b366a3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/stats/suspect.go Outdated
lex0c and others added 2 commits April 18, 2026 19:18
The previous "excludes anything under that prefix" phrasing read as
"catches vendor at any depth", which is wrong — extract treats
`vendor/*` as a root prefix and lets `pkg/vendor/foo.go` through.
Monorepo users following the starter set hit this exact trap.

README now spells out the root-prefix behavior, points at the stats
warning as the right way to discover nested prefixes to add, and
contrasts that with suffix/basename patterns which do match at any
depth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The *.generated.* bucket used strings.Contains against the full path,
catching directory-name occurrences like src/foo.generated.v1/bar.go.
The emitted suggestion is always the literal --ignore '*.generated.*',
and extract.ShouldIgnore only applies that glob via the basename
match, so users would copy-paste the suggested fix and see the same
warning on the next run — bar.go never gets dropped.

Matcher is now basenameContains, which mirrors the semantics
ShouldIgnore actually uses. Real .generated. files still flag;
directories carrying the marker in their name no longer do.
Regression test covers both sides: the basename match fires, the
dir-only path stays silent.

containsSegment removed — it had no other callers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5aa140493f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/stats/stats.go
churnTrend returns 0 for files with earlier-only history (the
strongest legacy-hotspot signal). When ≥25% of a dataset is dormant
— typical for mature repos with long tails of untouched paths — the
adaptive P25 collapses to 0 and the strict `trend < threshold` check
in classifyFile never fires. Every dormant concentrated file falls
through to silo, silently hiding exactly the alarm the rule is
supposed to raise.

computeBands now floors the adaptive threshold at
adaptiveDecliningTrendFloor (0.01): small enough not to widen the
declining band past the fallback constant, strictly positive so
trend=0 files still match. Regression test
TestChurnRiskAdaptiveDormantP25ZeroFlooring reproduces the mature-repo
shape (6 dormant + 6 active files) and pins that dormant
concentrated files land in legacy-hotspot, not silo.

METRICS.md notes the floor next to the P25 derivation and lists the
new constant in the thresholds table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 07916328cc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/stats/stats.go Outdated
lex0c and others added 2 commits April 18, 2026 19:43
The previous formula (p * (len(sorted) - 1)) / 100 is not nearest-
rank — it truncates linear-position addressing and systematically
under-picks the cutoff for many {p, n} pairs. For N=10 at P75 it
returns idx 6 instead of the textbook nearest-rank idx 7, lowering
oldAgeThreshold by one sorted position and letting files in the
intended top quartile slip into active-core.

Extract nearestRankIndex(p, n) = ceil(p*n/100) - 1 as a shared
helper, wire percentileInt/percentileFloat through it, and add a
dedicated test covering the ceiling-driven cases the old formula
got wrong plus the p=0 / p=100 / n=1 edges. The existing
TestComputeBandsAdaptive assertion was updated to reflect the
corrected P75 (80, not 70) on its {10,20,…,1000} fixture.

Empirical impact on pi-hole, praat, WordPress: zero. The real
datasets have enough ties near the quartile boundaries that a
one-index shift lands on equal values, so label counts are
identical. The fix is still worth landing because the mismatch
between the comment claim and the actual method was a trap, and on
datasets without those ties the classification would be quietly off
from the documented semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The adaptive declining threshold is strict less-than (trend <
threshold) against a floor of 0.01. That means:
  - trend 0 (dormant, earlier-only sentinel) → declining
  - trend 0.001 → declining (below floor)
  - trend 0.01 → NOT declining (exactly at floor)
  - trend 0.011 → NOT declining (above floor)

Pinning this boundary explicitly because it's the subtlest corner of
the floor fix: a future refactor flipping < to <= would silently
pull borderline-active files into legacy-hotspot, and no existing
test would fire.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7725c5502

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread cmd/gitcortex/main.go Outdated
Comment thread cmd/gitcortex/main.go Outdated
lex0c and others added 2 commits April 18, 2026 20:01
printSuspectWarning trimmed buckets to the top-6 for display but
also used that trimmed slice to assemble the --ignore remediation
command. The noise-floor check is computed over every bucket in
DetectSuspectFiles, so repos matching more than six bucket types
would copy-paste a command that leaves the unshown buckets in
place — the next extract keeps those paths, their churn still
exceeds the 10% floor, and the warning keeps firing after the
"fix".

Suggestion aggregation moves to stats.CollectAllSuggestions, which
iterates every bucket with dedup. The display loop still trims to
maxShown but now prints "... and N more bucket(s) — see suggestion
below for full set" when buckets were hidden, so the user knows the
long command is intentional.

Regression test: construct a dataset with > 6 buckets and assert
every bucket's Suggestions appear in the aggregated list.

Observed on WordPress: 7 buckets detected, 6 shown, *.lock was
previously dropped from the --ignore line. Now included so the next
extract actually removes them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each suggestion glob was interpolated with naive '%s' single-quote
wrapping. Git paths can legally contain single quotes (and every
other shell metacharacter), so a matched prefix like
`services/foo's vendor/*` would render as
  --ignore 'services/foo's vendor/*'
which POSIX shells parse as two words — the rest of the copy-paste
command breaks after the stray quote. The "copy-pasteable" promise
in the warning silently fails on any repo with such a path.

ShellQuoteSingle wraps a string in POSIX single quotes and escapes
embedded single quotes via the classic `'\''` sequence. Unit tests
cover representative inputs; a round-trip test parses the quoted
form with a mini POSIX-shell unquoter so the pairing of emit / parse
is exercised directly, not just spot-checked.

Cross-verified against real bash: every tricky input (paths with
`'`, `$VAR`, backticks, `"`) round-trips cleanly through the
suggested command form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex0c
Copy link
Copy Markdown
Owner Author

lex0c commented Apr 18, 2026

@codex review

AgePercentile and TrendPercentile were tagged with explicit
snake_case names, so the emitted ChurnRiskResult object mixed
conventions: AgeDays, RecentChurn, Label, ... (PascalCase, from Go
defaults) alongside age_percentile, trend_percentile (snake_case,
from the tag). Downstream consumers parsing both styles in the same
object is avoidable friction I introduced in this branch.

Switched the tags to `json:",omitempty"` (leading comma) so the
fields keep the default PascalCase name while still omitting when
nil. Every key on the object now follows one convention.

The CSV column names stay snake_case — that's a separate schema,
matches the rest of the CSV headers already emitted elsewhere in
format.go, and CSV convention generally favors underscores anyway.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Bravo.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@lex0c lex0c merged commit 29da744 into main Apr 18, 2026
1 check passed
@lex0c lex0c deleted the feat/adaptive-labels branch April 18, 2026 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant