
feat: fight clankers with ai #15

Draft
huang-julien wants to merge 4 commits into MatteoGabriele:main from huang-julien:feat/fight_clanker_with_ai

Conversation

@huang-julien
Contributor

@huang-julien huang-julien commented Mar 29, 2026

Took HEAVY inspiration from MatteoGabriele/agentscan#46

Wondering if we should move the logic into voight-kampff-test.

Summary by CodeRabbit

  • New Features

    • Added optional AI-powered analysis via GitHub Models; enable with ai-analysis: true
    • New ai-model input to select the AI model (default: openai/gpt-4o-mini)
    • AI assessment results now included in PR comments and exposed via ai-assessment output
    • AI analysis is non-blocking; failures emit warnings without halting execution
  • Documentation

    • Updated README with AI Analysis section and workflow examples
    • Added required permissions: { models: read } documentation

@coderabbitai

coderabbitai bot commented Mar 29, 2026

📝 Walkthrough

The PR introduces AI-powered analysis capability to the action using GitHub Models. It adds two new inputs (ai-analysis and ai-model), implements a new AI classification module with request/response handling, outputs an ai-assessment field, and integrates AI results into PR comments as an optional, non-blocking enhancement to the existing heuristic-based classification.

Changes

Cohort / File(s) | Summary
Documentation: README.md | Added documentation for the new ai-analysis and ai-model inputs, including a complete workflow example with required permissions and an explanation of the ai-assessment output behavior.
Configuration: action.yml, package.json | Added ai-analysis and ai-model input definitions with defaults; added the ai-assessment output declaration; introduced the voight-kampff-compactor dependency.
AI Analysis Module: src/ai.ts, src/ai.test.ts | New ai.ts module implementing the AI classification pipeline with types, system prompt, event slimming, user-prompt building, and GitHub Models API integration; comprehensive test suite validating request/response handling, error cases, and data compaction.
Action Integration: src/index.ts, src/index.test.ts | Updated main action logic to conditionally invoke AI analysis when enabled, extract user metadata and events, set the ai-assessment output, and append AI results to PR comments under a "🤖 AI Assessment" section; added integration tests covering invocation gating, success/failure paths, and comment rendering.
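The conditional invocation and comment rendering described for src/index.ts can be sketched dependency-free as follows. The input names ai-analysis and ai-model come from this PR; the helper names, the string-typed inputs, and the 0-1 confidence scale are illustrative assumptions, not the PR's actual code.

```typescript
// Sketch of the gating + comment-appending behavior described above.
// action.yml inputs arrive as strings, so the boolean is compared as "true".
type Inputs = { "ai-analysis": string; "ai-model": string };

function shouldRunAI(inputs: Inputs): boolean {
  return inputs["ai-analysis"] === "true";
}

// Non-blocking: a null assessment leaves the comment unchanged.
function appendAISection(
  comment: string,
  assessment: { classification: string; confidence: number } | null,
): string {
  if (!assessment) return comment;
  return `${comment}\n\n## 🤖 AI Assessment\n${assessment.classification} (${Math.round(
    assessment.confidence * 100,
  )}% confidence)`;
}
```

A failed or disabled AI step simply skips the section, matching the "failures emit warnings without halting execution" behavior in the summary.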

Sequence Diagram

sequenceDiagram
    participant Action as GitHub Action
    participant Compactor as Data Compactor
    participant API as GitHub Models API
    participant Parser as JSON Parser

    Action->>Action: Collect account/event data
    Action->>Compactor: Call compactor() with metadata
    Compactor->>Compactor: Reduce data size
    Compactor-->>Action: Return compacted string
    Action->>Action: Build user prompt with compacted data
    Action->>API: POST /inference/chat/completions<br/>(system_prompt + user_prompt)
    API->>API: Classify account as organic/<br/>mixed/automation
    API-->>Action: Return JSON with classification<br/>& confidence
    Action->>Parser: Strip <think>...</think> tags
    Parser->>Parser: Parse JSON response
    Parser-->>Action: Return AIAnalysisResult
    Action->>Action: Set ai-assessment output<br/>& append to PR comment
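The response-handling steps in the diagram (strip the <think>...</think> tags, then parse the JSON) might look like this minimal sketch; the type and function names are illustrative, not the PR's actual exports.

```typescript
// Hypothetical sketch of the parse step from the sequence diagram:
// remove model "thinking" tags, then parse the remaining JSON payload.
type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

function parseModelResponse(content: string): AIAnalysisResult {
  // Strip any <think>...</think> reasoning blocks the model may emit.
  const stripped = content.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
  return JSON.parse(stripped) as AIAnalysisResult;
}
```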

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hoppy hop, the AI hops in place,
Classifying bots with GitHub grace,
Data compacted, models align,
Organic mixed automation—so divine!
Assessment flows where comments grow,
Non-blocking wisdom steals the show! 🤖✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title 'fight clankers with ai' accurately reflects the main change: adding AI-powered GitHub account classification via the GitHub Models API. It clearly conveys the primary objective while remaining concise.



@huang-julien huang-julien changed the title feat: fight clanker with ai feat: fight clankers with ai Mar 29, 2026
@huang-julien huang-julien marked this pull request as ready for review March 31, 2026 21:01

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/index.test.ts (1)

151-154: ⚠️ Potential issue | 🟠 Major

Replace raw require("fs") calls with ESM imports for consistency.

This test file uses "type": "module" and already imports rmSync from fs via ESM, but mixes in raw require("fs") calls throughout. In Node.js ES modules, require() is not available without module.createRequire()—this pattern is fragile and inconsistent. Import the needed fs functions (mkdirSync, writeFileSync, readFileSync) at the top of the file instead.

Applies to lines: 151-154, 174-177, 198-199, 213-215, 229-232

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.test.ts` around lines 151 - 154, Tests use raw require("fs") which
is incompatible with ESM; replace all usages with ESM imports by adding at the
top: import { mkdirSync, writeFileSync, readFileSync } from 'fs' (rmSync is
already imported) and then update places that call require("fs").readFileSync /
writeFileSync / mkdirSync to use the imported readFileSync, writeFileSync, and
mkdirSync functions (e.g., where cacheFile and cacheData are read, and the other
read/write/mkdir calls later in the file).
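The import swap this comment describes might look like the following runnable sketch; the cache file path and payload here are stand-ins, not the test suite's actual fixtures.

```typescript
// Sketch of the suggested change: import fs functions once via ESM instead
// of calling require("fs") inline, which breaks under "type": "module"
// without module.createRequire().
import { mkdirSync, writeFileSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Before (fragile in an ES module):
//   const cacheData = JSON.parse(require("fs").readFileSync(cacheFile, "utf8"));

// After: use the imported functions directly.
const cacheDir = join(tmpdir(), "vk-cache-demo");
mkdirSync(cacheDir, { recursive: true });
const cacheFile = join(cacheDir, "cache.json");
writeFileSync(cacheFile, JSON.stringify({ cached: true }));
const cacheData = JSON.parse(readFileSync(cacheFile, "utf8"));
```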

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fff4e5f0-0d73-43f4-93a9-f7627bde0b10

📥 Commits

Reviewing files that changed from the base of the PR and between 6b02b79 and 99720c9.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • README.md
  • action.yml
  • package.json
  • src/ai.test.ts
  • src/ai.ts
  • src/index.test.ts
  • src/index.ts

Comment on lines +14 to +18
export type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

⚠️ Potential issue | 🟠 Major

The prompt assumes a score scale that the rest of the PR does not use.

buildUserPrompt() forwards input.analysis.score verbatim, but the fixtures in src/ai.test.ts and src/index.test.ts mark classification: "organic" as score: 20. Here the prompt says organic is >= 70, and it also turns the returned confidence into a human↔automation polarity score rather than confidence in the selected label. At minimum the prompt and the rest of the PR disagree on what these numbers mean, which will skew the model and make the rendered X% confidence misleading. Align the prompt with the real heuristic semantics, or remove the hardcoded numeric bands entirely.

Also applies to: 33-36, 102-110, 156-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai.ts` around lines 14 - 18, The prompt and AIAnalysisResult types
disagree about score semantics: update buildUserPrompt (and any prompt text near
lines referencing numeric bands) to stop hardcoding numeric ranges and instead
either (A) map the model-facing prompt to the repository's semantic (where
input.analysis.score uses the existing 0-100 heuristic used in tests) by
explicitly stating how scores are interpreted and asking the model to return a
confidence that is probability for the chosen classification, or (B) remove all
numeric bands and ask the model to return only classification and a confidence
(0-1 or 0-100) with a clear declared scale; ensure AIAnalysisResult.confidence
represents confidence in the returned label (not a polarity), and update any
code that reads input.analysis.score (e.g., buildUserPrompt, any prompt blocks
around lines 33-36, 102-110, 156-159) to use the agreed-upon mapping so rendered
"X% confidence" reflects that same scale.
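Option (B) from this comment, a prompt with no hardcoded numeric bands and an explicitly declared confidence scale, might be sketched like this; the wording and the buildUserPrompt signature are illustrative, not the PR's actual code.

```typescript
// Sketch of a prompt that declares the confidence scale instead of
// hardcoding score bands the heuristic does not actually use.
function buildUserPrompt(score: number, events: string): string {
  return [
    // The heuristic score is passed through as informational context only;
    // its semantics are left to the action's own heuristic, not redefined here.
    `Heuristic score (0-100, informational only): ${score}`,
    `Recent events:\n${events}`,
    `Return JSON exactly as:`,
    `{"classification": "organic" | "mixed" | "automation",`,
    ` "confidence": <0-1 probability that the chosen classification is correct>,`,
    ` "reasoning": "<one sentence>"}`,
  ].join("\n");
}
```

Declaring confidence as a probability for the chosen label keeps the rendered "X% confidence" meaningful regardless of how the heuristic score is computed.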

Comment on lines +179 to +196
const response = await fetch(
  "https://models.github.ai/inference/chat/completions",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: prompt },
      ],
      temperature: 0.3,
    }),
  },
);

⚠️ Potential issue | 🟠 Major

Add a timeout to the inference request.

This AI step is supposed to be non-blocking, but fetch() here has no cancellation path. If the models endpoint stalls, the whole action stalls instead of falling back to the heuristic result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ai.ts` around lines 179 - 196, The fetch to
"https://models.github.ai/inference/chat/completions" (creating const response)
has no cancellation or timeout; wrap the request with an AbortController, start
a timeout timer (e.g., configurable ms) that calls controller.abort(), pass
controller.signal into fetch, and clear the timer on success; catch the
abort/timeout (check for DOMException name === 'AbortError' or similar) and
return/trigger the heuristic fallback path instead of letting the action stall.
Reference the existing symbols used in the call (token, model, prompt,
SYSTEM_PROMPT, and the response variable) so the change plugs into the current
request flow and error handling.
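A wrapper implementing the AbortController timeout this comment asks for might look like the sketch below. The 10-second default and the injectable fetchImpl parameter (which makes the sketch testable) are assumptions; in the action this would wrap the models.github.ai request shown above.

```typescript
// Timeout wrapper: abort the inference request after timeoutMs and return
// null so the caller can fall back to the heuristic result.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs = 10_000,
  fetchImpl: typeof fetch = fetch,
): Promise<Response | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } catch (err) {
    // Real fetch throws a DOMException named "AbortError" on abort;
    // treat that as a timeout and signal fallback instead of stalling.
    if (err instanceof Error && err.name === "AbortError") return null;
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

The caller would check for null and skip the AI section, keeping the step genuinely non-blocking.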

Comment on lines +187 to +194
body: JSON.stringify({
  model,
  messages: [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: prompt },
  ],
  temperature: 0.3,
}),

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the src/ai.ts file, specifically the lines mentioned
cat -n src/ai.ts | head -230 | tail -80

Repository: MatteoGabriele/agentscan-action

Length of output: 2815


🏁 Script executed:

# Also search for AIAnalysisResult definition and usage
rg "AIAnalysisResult" --context 5

Repository: MatteoGabriele/agentscan-action

Length of output: 4172


🏁 Script executed:

# Check the complete structure around lines 203-215 as well
cat -n src/ai.ts | sed -n '200,220p'

Repository: MatteoGabriele/agentscan-action

Length of output: 737


🏁 Script executed:

# Look for any existing validation logic
rg "JSON.parse|validate|schema" src/ai.ts --context 2

Repository: MatteoGabriele/agentscan-action

Length of output: 202


🌐 Web query:

GitHub Models API response_format json_object json_schema

💡 Result:

The GitHub Models API supports structured response formats, including json_object and json_schema, in the /inference/chat/completions endpoint.

Usage in API requests: include response_format in the POST body.

  • Simple JSON: "response_format": { "type": "json_object" }
  • Strict JSON Schema (supported on GPT-4o with api-version "2024-08-01-preview"): "response_format": { "type": "json_schema", "json_schema": { "name": "response_name", "schema": { /* your JSON schema object */ } } }

Example full request (org endpoint):

  curl -X POST https://models.github.ai/orgs/ORG/inference/chat/completions \
    -H "Authorization: Bearer $GITHUB_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o",
      "messages": [{ "role": "user", "content": "Your prompt" }],
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "example",
          "schema": {
            "type": "object",
            "properties": { "field1": { "type": "string" } },
            "required": ["field1"],
            "additionalProperties": false
          }
        }
      }
    }'

Key details:

  • Official docs confirm type can be text, json_object, or json_schema with a nested json_schema object.
  • JSON Schema support was introduced in the playground in January 2025 and extended to the API; it supports a single object or an array and was initially limited to GPT-4o.
  • The format matches the OpenAI API for compatibility.
  • Requests use an Authorization: Bearer token plus the X-GitHub-Api-Version: 2026-03-10 and Accept: application/vnd.github+json headers.
  • Available in the playground UI and in .prompt.yml files for GitHub Actions.

Use response_format with runtime validation for AI responses.

The GitHub Models API supports response_format with both json_object and json_schema, allowing you to enforce structured output at the API level. Currently, the code accepts any syntactically valid JSON without validating the required fields—JSON.parse(content) as AIAnalysisResult provides only TypeScript compile-time type casting, not runtime validation. If the model returns valid JSON missing classification, confidence, or reasoning fields, those will be undefined at runtime and leak into PR comments and outputs.

Add response_format to the request body (lines 187-194) and validate the parsed response against the AIAnalysisResult schema before returning (lines 203-215). The TODO comment on line 214 already acknowledges this gap.
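The runtime validation this comment calls for can be expressed as a type guard over the parsed value; the field names follow the AIAnalysisResult type quoted in this review, while the guard itself is a sketch, not the PR's code.

```typescript
// JSON.parse(content) as AIAnalysisResult is only a compile-time cast,
// so validate the parsed value at runtime before trusting it.
type AIAnalysisResult = {
  classification: "organic" | "mixed" | "automation";
  confidence: number;
  reasoning: string;
};

function isAIAnalysisResult(value: unknown): value is AIAnalysisResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v.classification === "organic" ||
      v.classification === "mixed" ||
      v.classification === "automation") &&
    typeof v.confidence === "number" &&
    typeof v.reasoning === "string"
  );
}
```

Pairing this guard with response_format: { "type": "json_object" } in the request body enforces structure at both ends: the API constrains the output, and the guard rejects anything that still slips through.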

Comment on lines +90 to +92
let accountCreatedAt = "";
let publicRepos = 0;
let userEvents: Record<string, unknown>[] = [];

⚠️ Potential issue | 🟠 Major

Cache hits drop the extra AI context.

On the cache-hit path, accountCreatedAt, publicRepos, and userEvents stay at "", 0, and [], because they are only populated in the cache-miss branch. With ai-analysis=true, a warm cache therefore sends a much weaker prompt to getAIAnalysis() than a cold cache for the same user. Either persist those fields in CacheEntry, or fetch them before this call when cached analysis is reused.

Also applies to: 186-197

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 90 - 92, The cache-hit path reuses cached analysis
but leaves accountCreatedAt, publicRepos, and userEvents at their defaults
because those fields are only set in the cache-miss branch; update the
CacheEntry shape to persist these fields (or fetch them before reusing cached
analysis) and ensure the code that handles a cache hit populates
accountCreatedAt, publicRepos, and userEvents before calling getAIAnalysis();
apply the same change to the other occurrence around the code handling lines
186–197 so cached responses include the full AI context.
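Persisting the extra AI context in the cache entry, as this comment suggests, might look like the sketch below; aside from the three new fields named in the comment, the CacheEntry shape is an assumed placeholder.

```typescript
// Sketch: extend the cached entry so cache hits keep the full AI context
// instead of falling back to the defaults ("", 0, []).
type CacheEntry = {
  analysis: { score: number };          // existing cached heuristic result (shape assumed)
  accountCreatedAt: string;             // newly persisted for AI prompts
  publicRepos: number;
  userEvents: Record<string, unknown>[];
};

// On a cache hit, restore the AI context from the entry before
// calling getAIAnalysis().
function restoreAIContext(entry: CacheEntry) {
  const { accountCreatedAt, publicRepos, userEvents } = entry;
  return { accountCreatedAt, publicRepos, userEvents };
}
```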

Comment on lines +198 to +199
core.setOutput("ai-assessment", JSON.stringify(aiAssessment ?? ""));
core.info("AI analysis completed");

⚠️ Potential issue | 🟡 Minor

Treat a null AI result as blank, not "\"\""

If getAIAnalysis() returns null, this writes a JSON-encoded empty string and still logs success. Downstream steps expecting either blank or a JSON object will get the wrong shape.

🔧 Suggested fix
-        core.setOutput("ai-assessment", JSON.stringify(aiAssessment ?? ""));
-        core.info("AI analysis completed");
+        core.setOutput(
+          "ai-assessment",
+          aiAssessment ? JSON.stringify(aiAssessment) : "",
+        );
+        if (aiAssessment) {
+          core.info("AI analysis completed");
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 198 - 199, The current code JSON-encodes
aiAssessment even when null, producing "\"\"" and still logs success; update the
logic around core.setOutput("ai-assessment", ...) so that if aiAssessment ===
null you set the output to an empty string (core.setOutput("ai-assessment", ""))
instead of JSON.stringify(aiAssessment ?? ""), and adjust the logging (the
core.info call after setOutput) to reflect the null case (e.g., log that no AI
analysis was produced) or skip the success message when aiAssessment is null;
locate and change the usage of aiAssessment and the core.setOutput/core.info
calls in this file.

@huang-julien huang-julien marked this pull request as draft April 1, 2026 10:34
@huang-julien
Contributor Author

draft, let's move the GH call to unveil
