ci: add eval tasks #13

Workflow file for this run

.github/workflows/eval-tasks.yaml at dcbebb7

	# Copyright 2025 Google LLC
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software
	# distributed under the License is distributed on an "AS IS" BASIS,
	# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	# See the License for the specific language governing permissions and
	# limitations under the License.

	# Generated by dev/tasks/generate-github-actions

	name: eval-tasks
	on:
	  pull_request:
	    types: [opened, synchronize, reopened]
	    branches: [main]
	  schedule:
	    - cron: "0 */4 * * *"
	jobs:
	  run-eval-tasks:
	    runs-on: ubuntu-latest
	    timeout-minutes: 60
	    steps:
	      - uses: actions/checkout@v4
	      - name: Create k8s Kind Cluster
	        uses: helm/kind-action@a1b0e391336a6ee6713a0583f8c6240d70863de3 # @v1.12.0
	        with:
	          cluster_name: ${{ github.workflow }}
	      - name: Build kubectl-ai
	        run: go build && chmod +x kubectl-ai
	      - name: Install ollama
	        run: curl -fsSL https://ollama.com/install.sh | sh
	      - name: Run ollama ${{ vars.OLLAMA_MODEL }}
	        run: |
	          # ./ollama serve &
	          ollama pull ${{ vars.OLLAMA_MODEL }}
	          ollama run ${{ vars.OLLAMA_MODEL }}
	      - name: Run prompt on k8s-bench
	        run: |
	          cd k8s-bench
	          go build
	          chmod +x k8s-bench
	          ./k8s-bench run --agent-bin=../cmd/kubectl-ai --llm-provider="ollama" --models="${{ vars.OLLAMA_MODEL }}" --task-pattern=scale --output-dir=./
	      - name: Analyse results
	        run: |
	          cd k8s-bench
	          ./k8s-bench analyze --input-dir=./ --results-filepath=./results.md
	          cat ./results.md >> $GITHUB_STEP_SUMMARY

	concurrency:
	  group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
	  cancel-in-progress: false
