
Commit 592b565

feat: add support for local LLMs
Adds Ollama support (Gemma 3 models) to the web-codegen-scorer.
1 parent 074eb08 commit 592b565

12 files changed: +438 −128 lines

README.md

Lines changed: 39 additions & 32 deletions
@@ -5,21 +5,21 @@ Models (LLMs).
 
 You can use this tool to make evidence-based decisions relating to AI-generated code. For example:
 
-* 🔄 Iterate on a system prompt to find the most effective instructions for your project.
-* ⚖️ Compare the quality of code produced by different models.
-* 📈 Monitor generated code quality over time as models and agents evolve.
+- 🔄 Iterate on a system prompt to find the most effective instructions for your project.
+- ⚖️ Compare the quality of code produced by different models.
+- 📈 Monitor generated code quality over time as models and agents evolve.
 
 Web Codegen Scorer is different from other code benchmarks in that it focuses specifically on _web_
 code and relies primarily on well-established measures of code quality.
 
 ## Features
 
-* ⚙️ Configure your evaluations with different models, frameworks, and tools.
-* ✍️ Specify system instructions and add MCP servers.
-* 📋 Use built-in checks for build success, runtime errors, accessibility, security, LLM rating, and
+- ⚙️ Configure your evaluations with different models, frameworks, and tools.
+- ✍️ Specify system instructions and add MCP servers.
+- 📋 Use built-in checks for build success, runtime errors, accessibility, security, LLM rating, and
   coding best practices. (More built-in checks coming soon!)
-* 🔧 Automatically attempt to repair issues detected during code generation.
-* 📊 View and compare results with an intuitive report viewer UI.
+- 🔧 Automatically attempt to repair issues detected during code generation.
+- 📊 View and compare results with an intuitive report viewer UI.
 
 ## Setup
 
@@ -40,6 +40,13 @@ export OPENAI_API_KEY="YOUR_API_KEY_HERE" # If you're using OpenAI models
 export ANTHROPIC_API_KEY="YOUR_API_KEY_HERE" # If you're using Anthropic models
 ```
 
+> [!NOTE]
+> Web Codegen Scorer also supports local models via Ollama. To use them, you must have a running Ollama server with the respective model(s) installed. By default, the tool expects the server on port `11434`; you can change this by setting the `OLLAMA_PORT` environment variable.
+>
+> Be aware that local models may sometimes cause execution errors because their output does not conform to the expected format. This is a present-day limitation of these models, so treat the feature as experimental.
+>
+> Currently supported models: `gemma3:4b`, `gemma3:12b`, `codegemma:7b`
+
 3. **Run an eval:**
 
    You can run your first eval using our Angular example with the following command:
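
To complement the local-model note above, a minimal setup might look like the following. This is an illustrative sketch, not part of the diff; it assumes the Ollama CLI is installed and on your `PATH`.

```bash
# Illustrative sketch: prepare a local model for Web Codegen Scorer.

# Start the Ollama server if it is not already running (default port: 11434).
ollama serve &

# Pull one of the currently supported models.
ollama pull gemma3:4b

# Sanity check: list the locally installed models via the Ollama HTTP API.
curl http://localhost:11434/api/tags

# If your server listens on a non-default port, tell the scorer about it.
export OLLAMA_PORT=11434
```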
@@ -63,11 +70,11 @@ You can customize the `web-codegen-scorer eval` script with the following flags:
 
 - `--env=<path>` (alias: `--environment`): (**Required**) Specifies the path from which to load the
   environment config.
-  - Example: `web-codegen-scorer eval --env=foo/bar/my-env.mjs`
+  - Example: `web-codegen-scorer eval --env=foo/bar/my-env.mjs`
 
 - `--model=<name>`: Specifies the model to use when generating code. Defaults to the value of
   `DEFAULT_MODEL_NAME`.
-  - Example: `web-codegen-scorer eval --model=gemini-2.5-flash --env=<config path>`
+  - Example: `web-codegen-scorer eval --model=gemini-2.5-flash --env=<config path>`
 
 - `--runner=<name>`: Specifies the runner to use to execute the eval. Supported runners are
   `genkit` (default) or `gemini-cli`.
@@ -77,47 +84,47 @@ You can customize the `web-codegen-scorer eval` script with the following flags:
   `.web-codegen-scorer/llm-output` directory (e.g., `.web-codegen-scorer/llm-output/todo-app.ts`).
   This is useful for re-running assessments or debugging the build/repair process without incurring
   LLM costs for the initial generation.
-  - **Note:** You typically need to run `web-codegen-scorer eval` once without `--local` to
-    generate the initial files in `.web-codegen-scorer/llm-output`.
-  - The `web-codegen-scorer eval:local` script is a shortcut for
-    `web-codegen-scorer eval --local`.
+  - **Note:** You typically need to run `web-codegen-scorer eval` once without `--local` to
+    generate the initial files in `.web-codegen-scorer/llm-output`.
+  - The `web-codegen-scorer eval:local` script is a shortcut for
+    `web-codegen-scorer eval --local`.
 
 - `--limit=<number>`: Specifies the number of application prompts to process. Defaults to `5`.
-  - Example: `web-codegen-scorer eval --limit=10 --env=<config path>`
+  - Example: `web-codegen-scorer eval --limit=10 --env=<config path>`
 
 - `--output-directory=<name>` (alias: `--output-dir`): Specifies which directory to output the
   generated code under, which is useful for debugging. By default, the code will be generated in a
   temporary directory.
-  - Example: `web-codegen-scorer eval --output-dir=test-output --env=<config path>`
+  - Example: `web-codegen-scorer eval --output-dir=test-output --env=<config path>`
 
 - `--concurrency=<number>`: Sets the maximum number of concurrent AI API requests. Defaults to `5` (
   as defined by `DEFAULT_CONCURRENCY` in `src/config.ts`).
-  - Example: `web-codegen-scorer eval --concurrency=3 --env=<config path>`
+  - Example: `web-codegen-scorer eval --concurrency=3 --env=<config path>`
 
 - `--report-name=<name>`: Sets the name for the generated report directory. Defaults to a
   timestamp (e.g., `2023-10-27T10-30-00-000Z`). The name will be sanitized (non-alphanumeric
   characters replaced with hyphens).
-  - Example: `web-codegen-scorer eval --report-name=my-custom-report --env=<config path>`
+  - Example: `web-codegen-scorer eval --report-name=my-custom-report --env=<config path>`
 
 - `--rag-endpoint=<url>`: Specifies a custom RAG (Retrieval-Augmented Generation) endpoint URL. The
   URL must contain a `PROMPT` substring, which will be replaced with the user prompt.
-  - Example:
-    `web-codegen-scorer eval --rag-endpoint="http://localhost:8080/my-rag-endpoint?query=PROMPT" --env=<config path>`
+  - Example:
+    `web-codegen-scorer eval --rag-endpoint="http://localhost:8080/my-rag-endpoint?query=PROMPT" --env=<config path>`
 
 - `--prompt-filter=<name>`: String used to filter which prompts should be run. By default, a random
   sample (controlled by `--limit`) will be taken from the prompts in the current environment.
   Setting this can be useful for debugging a specific prompt.
-  - Example: `web-codegen-scorer eval --prompt-filter=tic-tac-toe --env=<config path>`
+  - Example: `web-codegen-scorer eval --prompt-filter=tic-tac-toe --env=<config path>`
 
 - `--skip-screenshots`: Whether to skip taking screenshots of the generated app. Defaults to
   `false`.
-  - Example: `web-codegen-scorer eval --skip-screenshots --env=<config path>`
+  - Example: `web-codegen-scorer eval --skip-screenshots --env=<config path>`
 
 - `--labels=<label1> <label2>`: Metadata labels that will be attached to the run.
-  - Example: `web-codegen-scorer eval --labels my-label another-label --env=<config path>`
+  - Example: `web-codegen-scorer eval --labels my-label another-label --env=<config path>`
 
 - `--mcp`: Whether to start an MCP for the evaluation. Defaults to `false`.
-  - Example: `web-codegen-scorer eval --mcp --env=<config path>`
+  - Example: `web-codegen-scorer eval --mcp --env=<config path>`
 
 - `--help`: Prints out usage information about the script.
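
Putting several of these flags together with the new local-model support, a run might look like the sketch below. The `--model=gemma3:4b` syntax for Ollama models and the particular flag combination are assumptions; `<config path>` is a placeholder, as in the examples above.

```bash
# Hypothetical invocation combining the flags above with a local Ollama model.
web-codegen-scorer eval \
  --env=<config path> \
  --model=gemma3:4b \
  --limit=3 \
  --concurrency=2 \
  --report-name=gemma3-local-run \
  --labels local-llm experimental
```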

@@ -132,11 +139,11 @@ If you've cloned this repo and want to work on the tool, you have to install its
 running `pnpm install`.
 Once they're installed, you can run the following commands:
 
-* `pnpm run release-build` - Builds the package in the `dist` directory for publishing to npm.
-* `pnpm run eval` - Runs an eval from source.
-* `pnpm run report` - Runs the report app from source.
-* `pnpm run init` - Runs the init script from source.
-* `pnpm run format` - Formats the source code using Prettier.
+- `pnpm run release-build` - Builds the package in the `dist` directory for publishing to npm.
+- `pnpm run eval` - Runs an eval from source.
+- `pnpm run report` - Runs the report app from source.
+- `pnpm run init` - Runs the init script from source.
+- `pnpm run format` - Formats the source code using Prettier.
 
 ## FAQ
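
As a sketch of the contributor workflow described by the development scripts in the hunk above, assuming `pnpm run eval` forwards extra flags to the underlying CLI:

```bash
# Contributor workflow sketch: install dependencies, then run an eval from source.
pnpm install
pnpm run eval --env=<config path> --model=gemini-2.5-flash
```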

@@ -166,7 +173,7 @@ Yes! We plan to both expand the number of built-in checks and the variety of cod
 
 Our roadmap includes:
 
-* Including _interaction testing_ in the rating, to ensure the generated code performs any requested
+- Including _interaction testing_ in the rating, to ensure the generated code performs any requested
   behaviors.
-* Measuring Core Web Vitals.
-* Measuring the effectiveness of LLM-driven edits on an existing codebase.
+- Measuring Core Web Vitals.
+- Measuring the effectiveness of LLM-driven edits on an existing codebase.

package.json

Lines changed: 2 additions & 1 deletion
@@ -70,6 +70,7 @@
     "file-type": "^21.0.0",
     "genkit": "^1.19.1",
     "genkitx-anthropic": "0.23.1",
+    "genkitx-ollama": "^1.19.2",
     "gpt-tokenizer": "^3.0.1",
     "handlebars": "^4.7.8",
     "limiter": "^3.0.0",
@@ -78,8 +79,8 @@
     "p-queue": "^8.1.0",
     "puppeteer": "^24.10.1",
     "sass": "^1.89.2",
-    "stylelint": "^16.21.1",
     "strict-csp": "^1.1.1",
+    "stylelint": "^16.21.1",
     "stylelint-config-recommended-scss": "^16.0.0",
     "tinyglobby": "^0.2.14",
     "tsx": "^4.20.3",
