RepoGuard

One more line of defense (or an attempt at one) before you git clone.

In 2026, stars are fake, READMEs are polished, and malware hides in postinstall scripts. RepoGuard uses an OpenAI-compatible LLM (xAI Grok by default) to analyze a repository before you clone it — giving you a risk score, red flags, and a plain-English explanation in seconds.

This is an experimental ouroboros.

It can catch malicious repos… but a clever attacker can also prompt-inject it.

Not production-ready. Pure experiment.

Hate something? Fork it and send a PR. Empty criticism is just noise.

Philosophy

  • One file. No magic. The whole tool is a ~900-line bash script you can read end-to-end in a single sitting before trusting it.
  • No unusual dependencies. The only tools you may need to install are curl and jq; everything else (awk, sed, tr, mktemp, stat, grep, getconf) is standard Unix tooling already present on any developer machine.
  • Transparent trust boundary. Every byte that leaves your machine is documented below. No telemetry, no analytics.
  • CI-friendly. Exit codes are a stable contract, so repoguard check ... || exit $? does exactly what you expect.

What it actually does

Given a GitHub repository URL, repoguard check:

  1. Fetches the repo's public metadata and security settings, open Dependabot PRs, and public security advisories via the GitHub API.
  2. Fetches README.md, SECURITY.md, and common manifest/install files (package.json, setup.py, pyproject.toml, requirements.txt, install.sh, postinstall.js, and a couple of workflow files) from raw.githubusercontent.com.
  3. Sends that bundle to an OpenAI-compatible chat completions endpoint with a security-auditor system prompt.
  4. Parses the model's JSON verdict and prints a risk score (0–100), a verdict (Safe / Warning / Danger), red flags, and a recommended action.
  5. Exits with a code derived from the score so CI pipelines and the optional git wrapper can gate on it.
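Step 2 above can be sketched as a small URL builder. This is an illustration only, not the code inside repoguard: the helper name raw_urls is hypothetical, the HEAD ref is an assumption, and the workflow files are left out because their names are not listed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one raw.githubusercontent.com URL per candidate file.
# Assumptions: the HEAD ref, and a file list copied from step 2 above.
raw_urls() {
  local owner="$1" repo="$2" f
  for f in README.md SECURITY.md package.json setup.py pyproject.toml \
           requirements.txt install.sh postinstall.js; do
    printf 'https://raw.githubusercontent.com/%s/%s/HEAD/%s\n' "$owner" "$repo" "$f"
  done
}

# Fetching is left to the caller; missing files are simply skipped, for example:
#   raw_urls suspicious repo | while read -r u; do curl -fsSL "$u" || true; done
```

Keeping URL construction separate from the network calls makes the file list easy to audit at a glance.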

Provider support

GitHub only at the moment. GitLab and Bitbucket URLs are rejected with a clear error rather than silently producing a noise-only analysis. Provider parity is a small, welcome PR.
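A minimal sketch of what rejecting non-GitHub URLs could look like. The function name parse_github_url is hypothetical, and returning 64 (the usage-error code listed under CI / pipelines) is an assumption about how such a rejection might be reported, not a statement of what repoguard does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "owner repo" for a GitHub HTTPS URL, fail otherwise.
parse_github_url() {
  local url path
  url="${1%/}"        # tolerate one trailing slash
  url="${url%.git}"   # tolerate an optional .git suffix
  path="${url#https://github.com/}"
  case "$path" in
    # Wrong host (prefix not stripped), empty segment, or extra path components.
    "$url" | /* | */ | */*/*) path="" ;;
  esac
  case "$path" in
    */*)
      printf '%s %s\n' "${path%%/*}" "${path#*/}"
      ;;
    *)
      printf 'unsupported or malformed repository URL: %s\n' "$1" >&2
      return 64   # assumption: treated as a usage error
      ;;
  esac
}
```

With this sketch, parse_github_url https://github.com/suspicious/repo prints the owner and repository name separated by a space, while a GitLab or Bitbucket URL produces an error on stderr and a non-zero status.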

Requirements

  • bash 4+ (macOS ships with 3.2 — brew install bash if needed)
  • curl and jq
  • An API key for any OpenAI-compatible chat completion endpoint. The default base URL is xAI's https://api.x.ai/v1, but you can point it at any compatible endpoint during repoguard setup.
  • Optional: a GITHUB_TOKEN environment variable to lift GitHub's anonymous rate limit from 60 req/hour to 5000 req/hour. A single check spends roughly a dozen requests, so unauthenticated use hits the ceiling after about five checks per hour.
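To illustrate the optional token, here is a hedged sketch of a GitHub API call that adds the Authorization header only when GITHUB_TOKEN is set. The function name gh_api and the CURL override variable are hypothetical and exist only to make the sketch testable; repoguard's own request code may look different.

```shell
#!/usr/bin/env bash
# Hypothetical helper: GET a GitHub REST API path, authenticated if a token exists.
gh_api() {
  local path="$1"
  # Rebuild the argument list step by step so the header is truly optional.
  set -- -fsSL -H 'Accept: application/vnd.github+json'
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    set -- "$@" -H "Authorization: Bearer ${GITHUB_TOKEN}"
  fi
  # CURL is an illustrative override hook, not a repoguard setting.
  "${CURL:-curl}" "$@" "https://api.github.com${path}"
}

# Example: inspect the remaining quota before running several checks.
#   gh_api /rate_limit | jq '.resources.core.remaining'
```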

Install

Recommended: audit first

curl -fsSL https://raw.githubusercontent.com/entropyvortex/repoguard/HEAD/install.sh -o install.sh
less install.sh        # read it
bash install.sh        # then run it

One-liner (if you trust the source)

curl -fsSL https://raw.githubusercontent.com/entropyvortex/repoguard/HEAD/install.sh | bash

Both paths download the main repoguard script into ~/.local/bin/repoguard (mode 0755), create ~/.config/repoguard (mode 0700), and add ~/.local/bin to your PATH in ~/.bashrc or ~/.zshrc if it isn't already there. For fish, the installer prints a manual set -gx PATH line to add to ~/.config/fish/config.fish.

After install, open a new shell (or run hash -r / rehash) so the command-hash cache picks up the new binary.

Configure

repoguard setup

You'll be prompted for a base URL, model, API key, and the warn/abort risk thresholds. The config file is written with mode 600 so other users on the machine cannot read your API key.
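One common way to get those permissions is to create the file under a restrictive umask, so it is never world-readable even for an instant. The sketch below assumes that approach; the function name write_config and the JSON field names are hypothetical, and repoguard setup may store its settings differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a config file that only the owner can read.
# The field names (base_url, model, api_key) are illustrative assumptions.
write_config() {
  local file="$1" base_url="$2" model="$3" api_key="$4"
  (
    umask 077                        # new dirs get 0700, new files 0600
    mkdir -p "$(dirname "$file")"
    # Values are written verbatim; inputs containing quotes would need escaping.
    printf '{"base_url":"%s","model":"%s","api_key":"%s"}\n' \
      "$base_url" "$model" "$api_key" > "$file"
  )
  chmod 600 "$file"                  # also tighten a pre-existing file
}

# Example:
#   write_config "$HOME/.config/repoguard/config.json" https://api.x.ai/v1 example-model "$KEY"
```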

Usage

repoguard check https://github.com/suspicious/repo

Flags:

Flag               Effect
--verbose, -v      Print every step to stderr
--non-interactive  No colors, no prompts; ideal for CI
--json             Emit the raw model verdict on stdout for piping to jq

CI / pipelines

Exit codes are a stable contract:

Code  Meaning
0     Safe (score below warn threshold)
1     Warning (score ≥ warn threshold and below abort threshold)
2     Abort (score ≥ abort threshold, or a hard error)
64    Usage error
78    Configuration error
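The mapping from score to the first three codes can be sketched as follows. The function name score_to_exit is hypothetical, and the thresholds 40 and 70 in the example are placeholders rather than repoguard defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a 0-100 risk score into exit status 0, 1, or 2.
score_to_exit() {
  local score="$1" warn="$2" abort="$3"
  if [ "$score" -ge "$abort" ]; then
    return 2      # abort threshold reached
  elif [ "$score" -ge "$warn" ]; then
    return 1      # warn threshold reached, abort threshold not reached
  fi
  return 0        # below both thresholds
}

# Example with placeholder thresholds (warn=40, abort=70):
#   score_to_exit 55 40 70; echo $?
```

Checking the abort threshold first keeps the branches mutually exclusive even though every aborting score also exceeds the warn threshold.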

Typical pipeline usage:

repoguard check --non-interactive --json "$TARGET" || exit $?
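The one-liner above fails the job on any non-zero code, including warnings. If a pipeline should tolerate warnings and block only on code 2 and above, a wrapper along these lines is one option. This is a sketch: the function name gate_on_repoguard and the VERDICT_FILE variable are hypothetical, and letting warnings pass is a policy choice, not a recommendation from the tool.

```shell
#!/usr/bin/env bash
# Hypothetical CI helper: save the JSON verdict, fail only on abort or errors.
gate_on_repoguard() {
  local target="$1" rc=0
  local out="${VERDICT_FILE:-verdict.json}"   # illustrative output location
  repoguard check --non-interactive --json "$target" > "$out" || rc=$?
  case "$rc" in
    0) echo "repoguard: safe" ;;
    1) echo "repoguard: warning, manual review suggested" >&2 ;;  # job continues
    *) echo "repoguard: blocked (exit $rc)" >&2; return "$rc" ;;
  esac
}

# Example:
#   gate_on_repoguard "$TARGET" || exit $?
```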

Git wrapper (optional)

repoguard enable-git-wrapper

This installs a ~30-line shim at ~/.local/bin/git that intercepts git clone <url>, runs repoguard check first, and refuses the clone if the score crosses your abort threshold. Every other git command (git status, git commit, git push, ...) passes straight through to the real git binary. The absolute path of that real git is discovered at install time and hardcoded into the shim, so the wrapper works identically on macOS (Apple Silicon / Intel Homebrew), Linux, and Nix — no runtime PATH games, no self-recursion risk.
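The intercept-then-pass-through behavior described above could look roughly like this. It is a sketch written for illustration, not the shim that enable-git-wrapper installs: the function name git_shim, the default REAL_GIT path, and the decision to treat every exit code of 2 or higher as a refusal are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a clone-gating shim. REAL_GIT stands in for the absolute path
# that the installer is described as recording; /usr/bin/git is a placeholder.
REAL_GIT="${REAL_GIT:-/usr/bin/git}"

git_shim() {
  if [ "${1:-}" = "clone" ]; then
    local arg url="" rc=0
    # Take the first HTTPS URL among the arguments as the clone target.
    for arg in "$@"; do
      case "$arg" in
        https://*) url="$arg"; break ;;
      esac
    done
    if [ -n "$url" ]; then
      repoguard check --non-interactive "$url" || rc=$?
      if [ "$rc" -ge 2 ]; then
        echo "repoguard: clone refused (exit $rc)" >&2
        return "$rc"
      fi
    fi
  fi
  # Everything else passes straight through; a standalone script would
  # typically use exec here so no extra process lingers.
  "$REAL_GIT" "$@"
}
```

Calling the real binary by absolute path is what prevents the shim from finding itself on PATH and recursing.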

To remove it:

repoguard disable-git-wrapper

Check status and see which git is active:

repoguard status

Privacy and trust boundary

RepoGuard is a security tool, so the data flow needs to be explicit.

What leaves your machine when you run repoguard check:

  1. Public metadata, README, SECURITY.md, and a handful of manifest/install files for the target repository, fetched from GitHub's public endpoints (or authenticated endpoints if you set GITHUB_TOKEN).
  2. That bundle, plus the system prompt, sent to the LLM base URL you configured during setup. The URL of the repository you are checking is therefore disclosed to that LLM endpoint. If you enable the git wrapper, every repo you git clone is disclosed this way.
  3. Your API key, sent to the LLM endpoint in the Authorization: Bearer header. It is never transmitted anywhere else.

What stays on your machine:

  • Your API key and configuration (~/.config/repoguard/config.json, mode 600).
  • Any private repository contents — RepoGuard only fetches public GitHub endpoints.
  • The LLM's response, which is printed locally and never persisted or forwarded.

If you do not want a third-party LLM provider to see the names of the repositories you clone, do not enable the git wrapper — use repoguard check explicitly on URLs you choose to submit.

Known limitations

RepoGuard's verdict is a signal, not a proof. The tool sends an attacker-controlled README and manifests to an LLM and trusts its JSON response. A sufficiently persuasive malicious README can prompt-inject the model into returning a low risk score. Use the verdict as one input into your decision — do not bypass manual review on the strength of a green result alone. See SECURITY.md for the full threat model.

Security reporting

Found a flaw that compromises the trust boundary above, or a way to get RepoGuard to execute attacker-controlled content? Please open a GitHub security advisory rather than a public issue. See SECURITY.md for scope, disclosure timeline, and the list of known out-of-scope limitations.

License

MIT — see LICENSE.

Contributions welcome: provider parity (GitLab, Bitbucket, Gitea), better prompt engineering, CI workflows, and tests are all high-value PRs.

About

Stop malware and supply-chain attacks before you git clone. A Grok-powered repo guard that flags fake stars, hidden risks, and shady repos.
