ai-codebase-explorer ingests a GitHub repository and lets you ask questions about the codebase from a single interactive command.
Pipeline:
- Clone or update repository.
- Parse code files into chunks.
- Build a dependency graph from Python + JS/TS imports.
- Embed chunks and persist them in SQLite.
- Retrieve relevant chunks for a user question.
- Ask the LLM to answer strictly from retrieved evidence.
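As an illustration of the dependency-graph step, here is a minimal sketch of extracting Python imports with the standard-library `ast` module. The function names `extract_python_imports` and `build_graph` and the sample file contents are hypothetical and not taken from this project; the actual implementation may work differently (for example, it also covers JS/TS).

```python
import ast
from typing import Dict, Set


def extract_python_imports(source: str) -> Set[str]:
    """Return the module names imported by a piece of Python source."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b, c" yields "a.b" and "c"
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # "from pkg.mod import x" yields "pkg.mod";
            # relative imports keep their leading dots ("from . import x" yields ".")
            modules.add("." * node.level + (node.module or ""))
    return modules


def build_graph(files: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each file path to the set of modules it imports."""
    return {path: extract_python_imports(src) for path, src in files.items()}


# Hypothetical two-file repository used only for illustration.
example = {
    "app.py": "import os\nfrom db.store import save\n",
    "db/store.py": "import sqlite3\nfrom . import models\n",
}
graph = build_graph(example)
```

Resolving module names such as `db.store` back to file paths like `db/store.py` is left out of this sketch.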
The tool supports questions such as:
- "Where is authentication implemented?"
- "How does the API talk to the database?"
- "Which files control user login?"
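To illustrate the last pipeline step, the sketch below shows one way a question like those above might be combined with retrieved chunks into an evidence-only prompt. The `Chunk` dataclass, the `build_grounded_prompt` function, and the prompt wording are assumptions made for this example, not the project's actual prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A retrieved piece of source code (hypothetical shape)."""
    path: str
    start_line: int
    text: str


def build_grounded_prompt(question: str, chunks: List[Chunk]) -> str:
    """Number each chunk so the model can cite it, then append the question."""
    evidence = "\n\n".join(
        f"[{i}] {c.path}:{c.start_line}\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using ONLY the evidence below. "
        "Cite evidence by its [number]. "
        "If the evidence is insufficient, say so.\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Question: {question}\n"
    )


prompt = build_grounded_prompt(
    "Where is authentication implemented?",
    [Chunk("auth/login.py", 10, "def login(user, password): ...")],
)
```

Numbering the evidence gives the model a compact way to reference sources, which makes it easier to check an answer against the retrieved chunks.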
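Install dependencies:
    python -m pip install -r requirements.txt
Create .env from the example file (copy is the Windows command; on macOS/Linux use cp), then set GEMINI_API_KEY in it:
    copy .env.example .env
Run:
    python app.py
The app will prompt for: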
- repo URL/path
- repo name
- questions in a loop (type exit to quit)
- Embeddings are stored in SQLite as JSON arrays for portability.
- Vector retrieval currently computes cosine similarity in-process.
- Dependency graph extraction is implemented for:
  - Python (import, from ... import ...)
  - JavaScript/TypeScript (import, export ... from, require)
- The LLM is prompted to give source-grounded answers that include evidence references.
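The storage and retrieval notes above can be sketched roughly as follows: embeddings are serialized as JSON text in SQLite and ranked by cosine similarity in plain Python. The table schema, the `add_chunk` and `top_k` helpers, and the three-dimensional toy vectors are assumptions for illustration; real embeddings would come from the embedding model and the project's schema may differ.

```python
import json
import math
import sqlite3
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# In-memory database for the example; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (id INTEGER PRIMARY KEY, path TEXT, text TEXT, embedding TEXT)"
)


def add_chunk(path: str, text: str, embedding: List[float]) -> None:
    """Store a chunk with its embedding serialized as a JSON array."""
    conn.execute(
        "INSERT INTO chunks (path, text, embedding) VALUES (?, ?, ?)",
        (path, text, json.dumps(embedding)),
    )


def top_k(query_embedding: List[float], k: int = 2) -> List[Tuple[float, str]]:
    """Load every stored embedding, score it against the query, return the best k."""
    rows = conn.execute("SELECT path, embedding FROM chunks").fetchall()
    scored = [(cosine(query_embedding, json.loads(emb)), path) for path, emb in rows]
    scored.sort(reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings.
add_chunk("auth/login.py", "def login(): ...", [1.0, 0.0, 0.0])
add_chunk("db/store.py", "def save(): ...", [0.0, 1.0, 0.0])
add_chunk("api/routes.py", "def route(): ...", [0.7, 0.7, 0.0])

results = top_k([1.0, 0.1, 0.0], k=2)
```

Scanning every row keeps the setup dependency-free and portable, at the cost of query time that grows linearly with the number of chunks.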