CodeSentinel uses the OpenAI Python SDK, making it compatible with any API that adheres to the OpenAI chat completions specification.
Running a local LLM ensures your code never leaves your machine.
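Because CodeSentinel only needs an endpoint that speaks the chat-completions protocol, switching providers is just a matter of changing the base URL and model name. As a rough illustration (not CodeSentinel's internal code), here is how the standard `openai` package talks to a local server; the URL, key, and model name are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at any OpenAI-compatible endpoint.
# The base_url and model below are placeholders; use the values from config.yaml.
client = OpenAI(
    base_url="http://localhost:1234/v1",  # local server, e.g. LM Studio
    api_key="not-needed-for-local",       # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="your-model-id",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)
```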
To use LM Studio:

- Load a model (e.g., `gpt-oss-20b`).
- Go to the Local Server tab and click Start Server.
- In `config.yaml`, set:

  ```yaml
  openai_base_url: "http://localhost:1234/v1"
  ai_model: "your-model-id"  # Copy the model ID from LM Studio
  ```
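If you are unsure which model ID to copy into `ai_model`, the server's `/v1/models` endpoint reports it. A quick check with the `openai` package, assuming LM Studio's default port 1234:

```python
from openai import OpenAI

# List the models LM Studio is currently serving; the IDs printed here
# are what goes into ai_model in config.yaml. Assumes the default port 1234.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
for model in client.models.list():
    print(model.id)
```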
To use llama.cpp:

- Start a simple service with `llama-server`, using the following recommended parameters:

  ```bash
  llama-server -m ~/model.gguf --ctx-size 32768 --parallel 1 --port 8900
  ```

- In `config.yaml`, set:

  ```yaml
  openai_base_url: "http://localhost:8900/v1"
  ai_model: "your-model-id"
  ```

  For a `llama-server` instance started this way, you can enter any model name you like. However, the field cannot be left blank; the system will refuse to run with an empty model name.
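To confirm the `llama-server` endpoint is reachable before wiring it into CodeSentinel, a minimal smoke test could look like the following; it assumes the `--port 8900` from the command above and uses an arbitrary placeholder model name:

```python
from openai import OpenAI

# Smoke-test the llama-server endpoint started with --port 8900.
# llama-server accepts any non-empty model name for the loaded model.
client = OpenAI(base_url="http://localhost:8900/v1", api_key="none")
response = client.chat.completions.create(
    model="local-model",  # placeholder; must not be empty
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=16,
)
print(response.choices[0].message.content)
```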
To use the official OpenAI API:

- Get an API key from the OpenAI Dashboard.
- Set your environment variable: `export OPENAI_API_KEY="sk-..."`.
- In `config.yaml`, set:

  ```yaml
  openai_base_url: "https://api.openai.com/v1"
  ai_model: "gpt-4o"
  ```
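As a quick sanity check that the key is visible to the SDK (the `openai` client reads `OPENAI_API_KEY` from the environment when no key is passed explicitly), something like the following can be run; the model name mirrors the config above:

```python
import os
from openai import OpenAI

# The client picks up OPENAI_API_KEY from the environment automatically.
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

client = OpenAI(base_url="https://api.openai.com/v1")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=8,
)
print(response.choices[0].message.content)
```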
For any other OpenAI-compatible provider, simply update `openai_base_url` and `ai_model` in `config.yaml` to match the provider's documentation.