
feat: fight clankers with ai #15

Draft
huang-julien wants to merge 4 commits into MatteoGabriele:main from huang-julien:feat/fight_clanker_with_ai

Conversation

@huang-julien
Contributor

@huang-julien huang-julien commented Mar 29, 2026

Took HEAVY inspiration from MatteoGabriele/agentscan#46

Wondering if we should move the logic into voight-kampff-test.

Summary by CodeRabbit

  • New Features

    • Added optional AI-powered analysis via GitHub Models; enable with ai-analysis: true
    • New ai-model input to select the AI model (default: openai/gpt-4o-mini)
    • AI assessment results now included in PR comments and exposed via ai-assessment output
    • AI analysis is non-blocking; failures emit warnings without halting execution
  • Documentation

    • Updated README with AI Analysis section and workflow examples
    • Added required permissions: { models: read } documentation

@coderabbitai

coderabbitai bot commented Mar 29, 2026

📝 Walkthrough

The PR introduces AI-powered analysis capability to the action using GitHub Models. It adds two new inputs (ai-analysis and ai-model), implements a new AI classification module with request/response handling, outputs an ai-assessment field, and integrates AI results into PR comments as an optional, non-blocking enhancement to the existing heuristic-based classification.

Changes

Cohort / File(s) | Summary
Documentation: README.md | Added documentation for the new ai-analysis and ai-model inputs, including a complete workflow example with required permissions and an explanation of the ai-assessment output behavior.
Configuration: action.yml, package.json | Added ai-analysis and ai-model input definitions with defaults; added the ai-assessment output declaration; introduced the voight-kampff-compactor dependency.
AI Analysis Module: src/ai.ts, src/ai.test.ts | New ai.ts module implementing the AI classification pipeline with types, system prompt, event slimming, user-prompt building, and GitHub Models API integration; comprehensive test suite validating request/response handling, error cases, and data compaction.
Action Integration: src/index.ts, src/index.test.ts | Updated main action logic to conditionally invoke AI analysis when enabled, extract user metadata and events, set the ai-assessment output, and append AI results to PR comments under a "🤖 AI Assessment" section; added integration tests covering invocation gating, success/failure paths, and comment rendering.
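The conditional invocation and comment rendering described for src/index.ts can be sketched dependency-free as follows. The input names ai-analysis and ai-model come from this PR; the helper names, the string-typed inputs, and the 0-1 confidence scale are illustrative assumptions, not the PR's actual code.

```typescript
// Sketch of the gating + comment-appending behavior described above.
// action.yml inputs arrive as strings, so the boolean is compared as "true".
type Inputs = { "ai-analysis": string; "ai-model": string };

function shouldRunAI(inputs: Inputs): boolean {
  return inputs["ai-analysis"] === "true";
}

// Non-blocking: a null assessment leaves the comment unchanged.
function appendAISection(
  comment: string,
  assessment: { classification: string; confidence: number } | null,
): string {
  if (!assessment) return comment;
  return `${comment}\n\n## 🤖 AI Assessment\n${assessment.classification} (${Math.round(
    assessment.confidence * 100,
  )}% confidence)`;
}
```

A failed or disabled AI step simply skips the section, matching the "failures emit warnings without halting execution" behavior in the summary.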

Sequence Diagram

sequenceDiagram
    participant Action as GitHub Action
    participant Compactor as Data Compactor
    participant API as GitHub Models API
    participant Parser as JSON Parser

    Action->>Action: Collect account/event data
    Action->>Compactor: Call compactor() with metadata
    Compactor->>Compactor: Reduce data size
    Compactor-->>Action: Return compacted string
    Action->>Action: Build user prompt with compacted data
    Action->>API: POST /inference/chat/completions<br/>(system_prompt + user_prompt)
    API->>API: Classify account as organic/<br/>mixed/automation
    API-->>Action: Return JSON with classification<br/>& confidence
    Action->>Parser: Strip <think>...</think> tags
    Parser->>Parser: Parse JSON response
    Parser-->>Action: Return AIAnalysisResult
    Action->>Action: Set ai-assessment output<br/>& append to PR comment
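The response-handling steps in the diagram (strip the <think>...</think> tags, then parse the JSON) might look like this minimal sketch; the type and function names are illustrative, not the PR's actual exports.

```typescript
// Hypothetical sketch of the parse step from the sequence diagram:
// remove model "thinking" tags, then parse the remaining JSON payload.
type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

function parseModelResponse(content: string): AIAnalysisResult {
  // Strip any <think>...</think> reasoning blocks the model may emit.
  const stripped = content.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
  return JSON.parse(stripped) as AIAnalysisResult;
}
```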

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hoppy hop, the AI hops in place,
Classifying bots with GitHub grace,
Data compacted, models align,
Organic mixed automation—so divine!
Assessment flows where comments grow,
Non-blocking wisdom steals the show! 🤖✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title 'fight clankers with ai' accurately reflects the main change: adding AI-powered GitHub account classification via the GitHub Models API. It clearly conveys the primary objective while remaining concise.



@huang-julien huang-julien changed the title feat: fight clanker with ai feat: fight clankers with ai Mar 29, 2026
@huang-julien huang-julien marked this pull request as ready for review March 31, 2026 21:01

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/index.test.ts (1)

151-154: ⚠️ Potential issue | 🟠 Major

Replace raw require("fs") calls with ESM imports for consistency.

This test file uses "type": "module" and already imports rmSync from fs via ESM, but mixes in raw require("fs") calls throughout. In Node.js ES modules, require() is not available without module.createRequire()—this pattern is fragile and inconsistent. Import the needed fs functions (mkdirSync, writeFileSync, readFileSync) at the top of the file instead.

Applies to lines: 151-154, 174-177, 198-199, 213-215, 229-232

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.test.ts` around lines 151 - 154, Tests use raw require("fs") which
is incompatible with ESM; replace all usages with ESM imports by adding at the
top: import { mkdirSync, writeFileSync, readFileSync } from 'fs' (rmSync is
already imported) and then update places that call require("fs").readFileSync /
writeFileSync / mkdirSync to use the imported readFileSync, writeFileSync, and
mkdirSync functions (e.g., where cacheFile and cacheData are read, and the other
read/write/mkdir calls later in the file).
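The import swap this comment describes might look like the following runnable sketch; the cache file path and payload here are stand-ins, not the test suite's actual fixtures.

```typescript
// Sketch of the suggested change: import fs functions once via ESM instead
// of calling require("fs") inline, which breaks under "type": "module"
// without module.createRequire().
import { mkdirSync, writeFileSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Before (fragile in an ES module):
//   const cacheData = JSON.parse(require("fs").readFileSync(cacheFile, "utf8"));

// After: use the imported functions directly.
const cacheDir = join(tmpdir(), "vk-cache-demo");
mkdirSync(cacheDir, { recursive: true });
const cacheFile = join(cacheDir, "cache.json");
writeFileSync(cacheFile, JSON.stringify({ cached: true }));
const cacheData = JSON.parse(readFileSync(cacheFile, "utf8"));
```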

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fff4e5f0-0d73-43f4-93a9-f7627bde0b10

📥 Commits

Reviewing files that changed from the base of the PR and between 6b02b79 and 99720c9.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • README.md
  • action.yml
  • package.json
  • src/ai.test.ts
  • src/ai.ts
  • src/index.test.ts
  • src/index.ts

Comment on lines +14 to +18
export type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

⚠️ Potential issue | 🟠 Major

The prompt assumes a score scale that the rest of the PR does not use.

buildUserPrompt() forwards input.analysis.score verbatim, but the fixtures in src/ai.test.ts and src/index.test.ts mark classification: "organic" as score: 20. Here the prompt says organic is >= 70, and it also turns the returned confidence into a human↔automation polarity score rather than confidence in the selected label. At minimum the prompt and the rest of the PR disagree on what these numbers mean, which will skew the model and make the rendered X% confidence misleading. Align the prompt with the real heuristic semantics, or remove the hardcoded numeric bands entirely.

Also applies to: 33-36, 102-110, 156-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai.ts` around lines 14 - 18, The prompt and AIAnalysisResult types
disagree about score semantics: update buildUserPrompt (and any prompt text near
lines referencing numeric bands) to stop hardcoding numeric ranges and instead
either (A) map the model-facing prompt to the repository's semantic (where
input.analysis.score uses the existing 0-100 heuristic used in tests) by
explicitly stating how scores are interpreted and asking the model to return a
confidence that is probability for the chosen classification, or (B) remove all
numeric bands and ask the model to return only classification and a confidence
(0-1 or 0-100) with a clear declared scale; ensure AIAnalysisResult.confidence
represents confidence in the returned label (not a polarity), and update any
code that reads input.analysis.score (e.g., buildUserPrompt, any prompt blocks
around lines 33-36, 102-110, 156-159) to use the agreed-upon mapping so rendered
"X% confidence" reflects that same scale.
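Option (B) from this comment, a prompt with no hardcoded numeric bands and an explicitly declared confidence scale, might be sketched like this; the wording and the buildUserPrompt signature are illustrative, not the PR's actual code.

```typescript
// Sketch of a prompt that declares the confidence scale instead of
// hardcoding score bands the heuristic does not actually use.
function buildUserPrompt(score: number, events: string): string {
  return [
    // The heuristic score is passed through as informational context only;
    // its semantics are left to the action's own heuristic, not redefined here.
    `Heuristic score (0-100, informational only): ${score}`,
    `Recent events:\n${events}`,
    `Return JSON exactly as:`,
    `{"classification": "organic" | "mixed" | "automation",`,
    ` "confidence": <0-1 probability that the chosen classification is correct>,`,
    ` "reasoning": "<one sentence>"}`,
  ].join("\n");
}
```

Declaring confidence as a probability for the chosen label keeps the rendered "X% confidence" meaningful regardless of how the heuristic score is computed.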

Comment on lines +179 to +196
const response = await fetch(
  "https://models.github.ai/inference/chat/completions",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: prompt },
      ],
      temperature: 0.3,
    }),
  },
);

⚠️ Potential issue | 🟠 Major

Add a timeout to the inference request.

This AI step is supposed to be non-blocking, but fetch() here has no cancellation path. If the models endpoint stalls, the whole action stalls instead of falling back to the heuristic result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai.ts` around lines 179 - 196, The fetch to
"https://models.github.ai/inference/chat/completions" (creating const response)
has no cancellation or timeout; wrap the request with an AbortController, start
a timeout timer (e.g., configurable ms) that calls controller.abort(), pass
controller.signal into fetch, and clear the timer on success; catch the
abort/timeout (check for DOMException name === 'AbortError' or similar) and
return/trigger the heuristic fallback path instead of letting the action stall.
Reference the existing symbols used in the call (token, model, prompt,
SYSTEM_PROMPT, and the response variable) so the change plugs into the current
request flow and error handling.
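A wrapper implementing the AbortController timeout this comment asks for might look like the sketch below. The 10-second default and the injectable fetchImpl parameter (which makes the sketch testable) are assumptions; in the action this would wrap the models.github.ai request shown above.

```typescript
// Timeout wrapper: abort the inference request after timeoutMs and return
// null so the caller can fall back to the heuristic result.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs = 10_000,
  fetchImpl: typeof fetch = fetch,
): Promise<Response | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } catch (err) {
    // Real fetch throws a DOMException named "AbortError" on abort;
    // treat that as a timeout and signal fallback instead of stalling.
    if (err instanceof Error && err.name === "AbortError") return null;
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

The caller would check for null and skip the AI section, keeping the step genuinely non-blocking.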

Comment on lines +187 to +194
body: JSON.stringify({
  model,
  messages: [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: prompt },
  ],
  temperature: 0.3,
}),

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the src/ai.ts file, specifically the lines mentioned
cat -n src/ai.ts | head -230 | tail -80

Repository: MatteoGabriele/agentscan-action

Length of output: 2815


🏁 Script executed:

# Also search for AIAnalysisResult definition and usage
rg "AIAnalysisResult" --context 5

Repository: MatteoGabriele/agentscan-action

Length of output: 4172


🏁 Script executed:

# Check the complete structure around lines 203-215 as well
cat -n src/ai.ts | sed -n '200,220p'

Repository: MatteoGabriele/agentscan-action

Length of output: 737


🏁 Script executed:

# Look for any existing validation logic
rg "JSON.parse|validate|schema" src/ai.ts --context 2

Repository: MatteoGabriele/agentscan-action

Length of output: 202


🌐 Web query:

GitHub Models API response_format json_object json_schema

💡 Result:

The GitHub Models API supports structured response formats, including json_object and json_schema, in the /inference/chat/completions endpoint.

Usage in API requests: include response_format in the POST body.

  • Simple JSON: "response_format": { "type": "json_object" }
  • Strict JSON Schema (supported on GPT-4o with api-version "2024-08-01-preview"): "response_format": { "type": "json_schema", "json_schema": { "name": "response_name", "schema": { /* your JSON schema object */ } } }

Example full request (org endpoint):

  curl -X POST https://models.github.ai/orgs/ORG/inference/chat/completions \
    -H "Authorization: Bearer $GITHUB_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o",
      "messages": [{ "role": "user", "content": "Your prompt" }],
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "example",
          "schema": {
            "type": "object",
            "properties": { "field1": { "type": "string" } },
            "required": ["field1"],
            "additionalProperties": false
          }
        }
      }
    }'

Key details:

  • Official docs confirm type can be text, json_object, or json_schema with a nested json_schema object.
  • JSON Schema support was introduced in the playground in January 2025 and extended to the API; it supports a single object or an array and was initially limited to GPT-4o.
  • The format matches the OpenAI API for compatibility.
  • Requests use an Authorization: Bearer token plus the X-GitHub-Api-Version: 2026-03-10 and Accept: application/vnd.github+json headers.
  • Available in the playground UI and in .prompt.yml files for GitHub Actions.

Use response_format with runtime validation for AI responses.

The GitHub Models API supports response_format with both json_object and json_schema, allowing you to enforce structured output at the API level. Currently, the code accepts any syntactically valid JSON without validating the required fields—JSON.parse(content) as AIAnalysisResult provides only TypeScript compile-time type casting, not runtime validation. If the model returns valid JSON missing classification, confidence, or reasoning fields, those will be undefined at runtime and leak into PR comments and outputs.

Add response_format to the request body (lines 187-194) and validate the parsed response against the AIAnalysisResult schema before returning (lines 203-215). The TODO comment on line 214 already acknowledges this gap.
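The runtime validation this comment calls for can be expressed as a type guard over the parsed value; the field names follow the AIAnalysisResult type quoted in this review, while the guard itself is a sketch, not the PR's code.

```typescript
// JSON.parse(content) as AIAnalysisResult is only a compile-time cast,
// so validate the parsed value at runtime before trusting it.
type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

function isAIAnalysisResult(value: unknown): value is AIAnalysisResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v.classification === "organic" ||
      v.classification === "mixed" ||
      v.classification === "automation") &&
    typeof v.confidence === "number" &&
    typeof v.reasoning === "string"
  );
}
```

Pairing this guard with response_format: { "type": "json_object" } in the request body enforces structure at both ends: the API constrains the output, and the guard rejects anything that still slips through.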

Comment on lines +90 to +92
let accountCreatedAt = "";
let publicRepos = 0;
let userEvents: Record<string, unknown>[] = [];

⚠️ Potential issue | 🟠 Major

Cache hits drop the extra AI context.

On the cache-hit path, accountCreatedAt, publicRepos, and userEvents stay at "", 0, and [], because they are only populated in the cache-miss branch. With ai-analysis=true, a warm cache therefore sends a much weaker prompt to getAIAnalysis() than a cold cache for the same user. Either persist those fields in CacheEntry, or fetch them before this call when cached analysis is reused.

Also applies to: 186-197

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 90 - 92, The cache-hit path reuses cached analysis
but leaves accountCreatedAt, publicRepos, and userEvents at their defaults
because those fields are only set in the cache-miss branch; update the
CacheEntry shape to persist these fields (or fetch them before reusing cached
analysis) and ensure the code that handles a cache hit populates
accountCreatedAt, publicRepos, and userEvents before calling getAIAnalysis();
apply the same change to the other occurrence around the code handling lines
186–197 so cached responses include the full AI context.
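Persisting the extra AI context in the cache entry, as this comment suggests, might look like the sketch below; aside from the three new fields named in the comment, the CacheEntry shape is an assumed placeholder.

```typescript
// Sketch: extend the cached entry so cache hits keep the full AI context
// instead of falling back to the defaults ("", 0, []).
type CacheEntry = {
  analysis: { score: number };          // existing cached heuristic result (shape assumed)
  accountCreatedAt: string;             // newly persisted for AI prompts
  publicRepos: number;
  userEvents: Record<string, unknown>[];
};

// On a cache hit, restore the AI context from the entry before
// calling getAIAnalysis().
function restoreAIContext(entry: CacheEntry) {
  const { accountCreatedAt, publicRepos, userEvents } = entry;
  return { accountCreatedAt, publicRepos, userEvents };
}
```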

Comment on lines +198 to +199
core.setOutput("ai-assessment", JSON.stringify(aiAssessment ?? ""));
core.info("AI analysis completed");

⚠️ Potential issue | 🟡 Minor

Treat a null AI result as blank, not "\"\""

If getAIAnalysis() returns null, this writes a JSON-encoded empty string and still logs success. Downstream steps expecting either blank or a JSON object will get the wrong shape.

🔧 Suggested fix
-        core.setOutput("ai-assessment", JSON.stringify(aiAssessment ?? ""));
-        core.info("AI analysis completed");
+        core.setOutput(
+          "ai-assessment",
+          aiAssessment ? JSON.stringify(aiAssessment) : "",
+        );
+        if (aiAssessment) {
+          core.info("AI analysis completed");
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 198 - 199, The current code JSON-encodes
aiAssessment even when null, producing "\"\"" and still logs success; update the
logic around core.setOutput("ai-assessment", ...) so that if aiAssessment ===
null you set the output to an empty string (core.setOutput("ai-assessment", ""))
instead of JSON.stringify(aiAssessment ?? ""), and adjust the logging (the
core.info call after setOutput) to reflect the null case (e.g., log that no AI
analysis was produced) or skip the success message when aiAssessment is null;
locate and change the usage of aiAssessment and the core.setOutput/core.info
calls in this file.

@huang-julien huang-julien marked this pull request as draft April 1, 2026 10:34
@huang-julien
Contributor Author

draft, let's move the GH call to unveil
