Initial self hosted runner provision script, to be tested #543

Open
lukhsaankumar wants to merge 4 commits into main from provision-runner-script
lukhsaankumar (Collaborator) commented Nov 12, 2025

Provisioning Script for Self-Hosted GitHub Actions Runner (Ubuntu)

Purpose

Install and configure a hardened GitHub Actions runner for SIL (software-in-the-loop) testing, with optional SocketCAN support. The script is idempotent, so it can be re-run on the same host.

Relates to #307

Key Features

  • Installs runner binaries and registers with GH_RUNNER_TOKEN
  • Creates least-privileged user: github-runner (no sudo)
  • Security tooling:
    • ufw, fail2ban, apparmor, auditd, unattended-upgrades
    • Systemd service hardening
  • Optional SocketCAN:
    • CAN_TYPE=virtual → creates vcan0
    • CAN_TYPE=physical → installs can-utils, serial CAN tools
  • Installs Python tools: python-can, cantools
  • Applies AppArmor profile + systemd limits (anti-fork bomb, disk protection)
  • Idempotent: Safe to re-run
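
One common way to make a provisioning step safe to re-run is to check for the desired state before changing anything. The sketch below is illustrative only and is not taken from the script in this PR; ensure_line is a hypothetical helper name.

```shell
# Append a line to a file only if that exact line is not already present.
# Hypothetical helper: shows the check-then-change pattern, nothing more.
ensure_line() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Calling it twice leaves a single copy of the line.
demo_file="$(mktemp)"
ensure_line "vcan" "$demo_file"
ensure_line "vcan" "$demo_file"
grep -cxF "vcan" "$demo_file"   # prints 1
```

The same pattern applies to users, packages, and services: test first (for example with id or dpkg -s), then act only when the state is missing.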

Note: The script still needs to be tested on a Linux-based OS that supports SocketCAN.

SIL Self-Hosted Runner & Workflow Setup

This section explains:

  • How to configure GitHub repository settings (branch protection, labels, required checks).
  • How to attach the SIL runner correctly.
  • What security guarantees our workflow + provisioning script actually provide.

1. GitHub Repo & Branch Protection Setup

1.1 Protected branches

In Settings → Branches → Branch protection rules, add a rule for your main integration branch (e.g. main):

Branch name pattern

  • main (or release/* etc. as needed)

Require a pull request before merging

  • Require pull request reviews before merging
    • Minimum number of approvals: 1 (or more, depending on your policy)
    • Optionally: Require review from Code Owners

Require status checks to pass before merging

  • Enable "Require status checks to pass before merging"
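  • Add the following required check:
    • SIL check (GitHub lists required checks by job name, so this workflow's job may appear as sil unless the job sets its own name:)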
  • Enable "Require branches to be up to date before merging"
    (prevents merging if new commits land on main before the PR is re-tested)

Restrict who can push to matching branches

  • Enable "Restrict who can push to matching branches"
  • Allow only:
    • Your core team (e.g., a "Maintainers" team)
    • GitHub Apps such as Dependabot if needed

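With these rules, changes reach main only through a reviewed pull request whose required checks are green. Repository administrators can still bypass the rules unless "Do not allow bypassing the above settings" is also enabled.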
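
The same settings can also be applied through GitHub's REST endpoint for updating branch protection. The following is a minimal sketch, not part of this PR: branch_protection_payload is a hypothetical helper, OWNER/REPO are placeholders, and push restrictions are left as null because team slugs are repository-specific.

```shell
# Build the JSON body for the "update branch protection" REST endpoint.
# $1 = name of the required status check as it appears on the PR.
branch_protection_payload() {
  local check="$1"
  cat <<EOF
{
  "required_status_checks": { "strict": true, "contexts": ["${check}"] },
  "enforce_admins": true,
  "required_pull_request_reviews": { "required_approving_review_count": 1 },
  "restrictions": null
}
EOF
}

# Usage (requires an authenticated gh CLI with admin rights on the repo):
#   branch_protection_payload "SIL check" |
#     gh api --method PUT repos/OWNER/REPO/branches/main/protection --input -
```

Here "strict": true corresponds to "Require branches to be up to date before merging", and "enforce_admins": true applies the rule to administrators as well.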

1.2 Labels & CI gate (ci:run-sil)

We gate the expensive / sensitive SIL job behind a label.

In your repo, go to Issues → Labels and create:

  • Name: ci:run-sil
  • Description: "Allow SIL job to run on self-hosted runner for this PR"

Who can use it:

  • Only maintainers / trusted reviewers should add this label to PRs.
  • Applying labels requires at least Triage access to the repository, so grant Triage or higher only to people who are trusted to start SIL runs.

Recommended process:

  1. Contributor opens PR.
  2. Reviewer looks at changes.
  3. If it's safe to run SIL on the self-hosted runner, reviewer adds ci:run-sil.
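
For maintainers who prefer the command line, the label steps can be sketched with the gh CLI. This is an assumption about tooling rather than something the PR requires; run, create_sil_label, and approve_sil_run are hypothetical helper names, and the helpers only print the commands unless DRY_RUN=0 is set.

```shell
# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the gate label; --force updates it if it already exists.
create_sil_label() {
  run gh label create "ci:run-sil" \
    --description "Allow SIL job to run on self-hosted runner for this PR" \
    --force
}

# Add the gate label to a reviewed PR. $1 = PR number.
approve_sil_run() {
  run gh pr edit "$1" --add-label "ci:run-sil"
}

# Example: DRY_RUN=0 approve_sil_run 543
```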

Effect on workflow:

  • The SIL workflow only runs when:
    • Manually triggered via workflow_dispatch, OR
    • The PR has the label ci:run-sil.
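  • Unlabeled PRs therefore skip the SIL job, even if they change the scripts that the runner executes.
  • The label stays on the PR, so commits pushed after labeling also trigger SIL (via synchronize). Remove the label if new commits need a fresh review first.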
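
The gate can be reasoned about as a small decision function. The sketch below mirrors the workflow's if: condition in plain shell for illustration; sil_gate is a hypothetical name and is not used by the workflow itself.

```shell
# Decide whether the SIL job would run.
# $1 = event name, remaining arguments = labels currently on the PR.
sil_gate() {
  local event="$1"; shift
  if [ "$event" = "workflow_dispatch" ]; then
    echo run; return
  fi
  if [ "$event" = "pull_request" ]; then
    local label
    for label in "$@"; do
      if [ "$label" = "ci:run-sil" ]; then
        echo run; return
      fi
    done
  fi
  echo skip
}

sil_gate pull_request bug              # prints: skip
sil_gate pull_request bug ci:run-sil   # prints: run
sil_gate workflow_dispatch             # prints: run
```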

1.3 Status checks on PRs

Once the workflow has run at least once, GitHub will start displaying:

  • SIL check as a status on pull requests.

For each PR to a protected branch:

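  • Before ci:run-sil label:
    • SIL check → state: skipped
    • GitHub reports a job skipped by its if: condition as successful, so a skipped SIL check does not by itself block the merge. Reviewers must confirm that SIL actually ran.
  • After ci:run-sil label:
    • SIL check runs on the self-hosted runner.
    • Must end in success to allow merge (due to branch protection).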

2. Workflow Overview

Current workflow:

name: SIL check

on:
  pull_request:
    types: [opened, synchronize, reopened, labeled]
  workflow_dispatch:

jobs:
  sil:
    # Only run when:
    # - Manually triggered (workflow_dispatch), OR
    # - PR has the label "ci:run-sil"
    if: ${{ 
          github.event_name == 'workflow_dispatch' ||
          (github.event_name == 'pull_request' &&
          contains(github.event.pull_request.labels.*.name, 'ci:run-sil'))
        }}


    # Target only your hardened runner
    runs-on: [self-hosted, linux, sil, socketcan]

    # Minimal required permissions
    permissions:
      contents: read

    # Prevent runaway SIL jobs
    timeout-minutes: 2

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        
      - name: Run dummy SIL (0 = pass, 1 = fail)
        run: |
          # Per-step resource guardrails
          ulimit -c 0                  # disable core dumps
          ulimit -Sn "$(ulimit -Hn)"   # raise the soft fd limit to the hard limit (LimitNOFILE=4096 on the service caps it)

          ./scripts/run_sil_dummy.sh

Current SIL dummy script: scripts/run_sil_dummy.sh

#!/usr/bin/env bash
set -euo pipefail

# For demo: pass
echo "[dummy-sil] OK"

# Simulate Failing SIL: exit 1
# Simulate Passing SIL: exit 0
exit 0

Key points:

  • Uses runner labels: self-hosted, linux, sil, socketcan → ensures only your hardened SIL runner picks this job up.
  • Uses minimal GitHub token permissions:
    • contents: read only.
  • Uses timeout-minutes: 2 to stop long-running or stuck SIL jobs.
  • Uses label logic to gate execution on the self-hosted runner.

3. Runner Provisioning Overview

The self-hosted runner is created by a provisioning script (intended for a fresh Ubuntu VM and safe to re-run):

Creates a dedicated user: github-runner

  • No password, no sudo, no privileged groups.

Installs:

  • GitHub Actions runner binaries under /opt/actions-runner
  • Python + CAN utilities (python3, python-can, cantools, can-utils)

Registers the runner:

  • Scope: single repository (not organization-wide)
  • Runner labels: self-hosted, linux, sil, socketcan

Runs as a systemd service:

  • actions.runner.<owner>-<repo>.sil-runner-01.service
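
For orientation, registration with the runner's own config.sh might look like the sketch below. It is an assumption about how the provisioning script invokes it, not a copy of it; runner_config_cmd is a hypothetical helper that only prints the command. The self-hosted and linux labels are added by the runner automatically, so only the custom labels are passed.

```shell
# Print the registration command for a repository-scoped runner.
# $1 = repository URL, $2 = registration token (e.g. the value of GH_RUNNER_TOKEN).
runner_config_cmd() {
  local repo_url="$1" token="$2"
  local -a cmd=(
    ./config.sh --unattended --replace
    --url "$repo_url" --token "$token"
    --name sil-runner-01
    --labels sil,socketcan
  )
  echo "${cmd[*]}"
}

# Usage sketch (OWNER/REPO is a placeholder):
#   cd /opt/actions-runner
#   runner_config_cmd "https://github.com/OWNER/REPO" "$GH_RUNNER_TOKEN"
#   # run the printed command as github-runner (config.sh refuses to run as root), then:
#   sudo ./svc.sh install github-runner && sudo ./svc.sh start
```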

Sets up virtual SocketCAN if CAN_TYPE=virtual:

  • vcan0 interface via vcan0.service
  • CAN kernel modules: can, can_raw, vcan, can_dev

Applies system hardening:

  • ufw firewall: deny incoming by default, limit SSH.
  • fail2ban to ban repeated SSH brute force attempts.
  • unattended-upgrades for security updates.
  • auditd + rules for monitoring runner directory and runner execs.
  • AppArmor profile for the runner binary.
  • systemd service sandboxing and resource limits.
  • Per-user limits via /etc/security/limits.d/github-runner.conf.

4. Threats Mitigated by This Design

4.1 Untrusted PRs compromising the runner

Threat:
A random contributor (or fork) opens a PR with malicious code in scripts that the SIL job runs.

Mitigations:

  • SIL job only runs when:
    • PR is explicitly labeled ci:run-sil by a maintainer, OR
    • Workflow is manually triggered (workflow_dispatch).
  • For fork PRs, by default:
    • No repo secrets are exposed.
    • GITHUB_TOKEN has read-only permissions.
  • The self-hosted runner is scoped only to this repo:
    • Workflows from other repositories cannot schedule jobs on it, and a compromised job is not handed credentials for other repos.

4.2 Secret exfiltration from GitHub / runner

Threat:
Malicious workflow steps or PR code attempt to steal:

  • Repo secrets
  • GITHUB_TOKEN with excessive permissions
  • Host-level credentials (SSH keys, sudo tokens, etc.)

Mitigations:

  • Workflow uses minimal permissions:
    • contents: read only.
  • The SIL workflow does not reference secrets.*, so no repository secrets are passed to the job.
  • Self-hosted runner user has no sudo:
    • sudo -l -U github-runner shows no commands allowed.
  • AppArmor + systemd deny access to:
    • /etc/shadow
    • /etc/sudoers*
    • /root/**
    • Other system locations
  • ufw does not block outbound traffic by default, so network exfiltration remains possible. However, with:
    • No secrets in the job
    • No sudo/root filesystem access
    • → there is little of value left to exfiltrate beyond ordinary environment variables.

4.3 File system boundary and privilege escalation

Threat:
Malicious job attempts:

  • tee /root/pwned
  • cat /etc/shadow
  • Writing into /etc or /boot
  • Modifying runner service/system configs.

Mitigations:

  • Runner process runs as github-runner, not root.
  • No sudo access for github-runner.
  • AppArmor profile denies access to:
    • /etc/shadow
    • /etc/sudoers*
    • /root/**
    • Write access to /etc/**, /boot/**, /sys/**.
  • systemd service has:
    • ProtectSystem=strict
    • ProtectHome=yes
    • ReadWritePaths=/opt/actions-runner/_work /opt/actions-runner/_diag
  • /opt/actions-runner is owned by github-runner, with only _work/_diag writable.

Result: Scripts can only meaningfully write inside the runner's _work/_diag directories, not the OS.
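
As an illustration of where these directives live, the sketch below writes them into a systemd drop-in file. The helper name write_sandbox_dropin and the file name 10-sandbox.conf are assumptions; only the directive values come from the list above.

```shell
# Write a sandboxing drop-in for the runner service.
# $1 = drop-in directory, e.g. /etc/systemd/system/<unit>.service.d
write_sandbox_dropin() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/10-sandbox.conf" <<'EOF'
[Service]
User=github-runner
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/opt/actions-runner/_work /opt/actions-runner/_diag
EOF
}

# Usage sketch (as root; <unit> is the runner's service name):
#   write_sandbox_dropin /etc/systemd/system/<unit>.service.d
#   systemctl daemon-reload && systemctl restart <unit>
```

With ProtectSystem=strict the file system is mounted read-only for the service, and ReadWritePaths re-opens only the listed directories for writing.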

4.4 Denial-of-service (DoS) via runaway jobs

Threat:

  • Infinite loops consuming CPU.
  • Massive process forks.
  • Excessive file creation or disk usage.

Mitigations:

  • Workflow:
    • timeout-minutes: 2 → GitHub forcibly cancels jobs taking too long.
  • systemd unit overrides:
    • CPUQuota=150%
    • MemoryMax=2G
    • TasksMax=256
    • LimitNPROC=512
    • LimitNOFILE=4096
    • TimeoutStopSec=30
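  • Per-user limits in /etc/security/limits.d/github-runner.conf (enforced by pam_limits for PAM sessions such as su or SSH logins as github-runner; the service itself is bounded by the systemd limits above):
    • soft nproc 256, hard nproc 512
    • soft cpu 60, hard cpu 120 (CPU time in minutes)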
  • Cron cleanup:
    • Weekly cleaning of old work dirs in /opt/actions-runner/_work.

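Result: Malicious SIL code can waste some resources temporarily, but CPU, memory, and process counts are capped and each job is cancelled after 2 minutes, so it cannot fork-bomb the host or exhaust the VM permanently. Leftover files in _work remain until the weekly cleanup removes them.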
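
The limits above can be expressed as two small config files. The following sketch writes both into caller-supplied directories; write_resource_limits and the file name 20-limits.conf are hypothetical, while the values are the ones listed in this section.

```shell
# Write the systemd resource drop-in and the per-user limits file.
# $1 = systemd drop-in directory, $2 = limits directory (e.g. /etc/security/limits.d)
write_resource_limits() {
  local unit_dir="$1" limits_dir="$2"
  mkdir -p "$unit_dir" "$limits_dir"

  cat > "$unit_dir/20-limits.conf" <<'EOF'
[Service]
CPUQuota=150%
MemoryMax=2G
TasksMax=256
LimitNPROC=512
LimitNOFILE=4096
TimeoutStopSec=30
EOF

  # Format: <user> <soft|hard> <item> <value>; cpu is in minutes.
  cat > "$limits_dir/github-runner.conf" <<'EOF'
github-runner soft nproc 256
github-runner hard nproc 512
github-runner soft cpu 60
github-runner hard cpu 120
EOF
}
```

TasksMax caps the number of tasks in the service's control group, which is what stops a fork bomb started by a job from spreading to the rest of the host.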

4.5 Brute-force attacks or external SSH probing

Threat:
Internet-wide scanning or bots trying to brute-force SSH into the VM.

Mitigations:

  • ufw:
    • Default deny incoming, allow outgoing.
    • ufw limit 22/tcp: rate-limits SSH attempts.
  • fail2ban + sshd jail:
    • Monitors SSH logs.
    • Bans IPs with repeated failed login attempts.
  • Only SSH is intentionally exposed; no extra ports.
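
A sketch of the corresponding firewall and fail2ban setup follows. The ufw commands match the policy listed above; the fail2ban thresholds (maxretry, findtime, bantime) are assumed example values, and both helper names are hypothetical.

```shell
# Print the ufw commands that implement the policy described above.
ufw_policy_commands() {
  cat <<'EOF'
ufw default deny incoming
ufw default allow outgoing
ufw limit 22/tcp
ufw --force enable
EOF
}

# Write a minimal sshd jail for fail2ban.
# $1 = jail directory, e.g. /etc/fail2ban/jail.d
write_sshd_jail() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/sshd.local" <<'EOF'
[sshd]
enabled = true
maxretry = 5
findtime = 10m
bantime = 1h
EOF
}

# Usage sketch (as root):
#   ufw_policy_commands | sh -e
#   write_sshd_jail /etc/fail2ban/jail.d && systemctl restart fail2ban
```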

4.6 Silent long-term compromise / tampering

Threat:

An attacker gets code execution on the runner and tries to persist or hide.

Mitigations:

  • auditd rules:
    • Watches /opt/actions-runner/ for write activity (-w /opt/actions-runner -p wa -k runner_changes).
    • Audits all execve syscalls run by uid of github-runner (runner_exec key).
  • AppArmor profile + systemd restrictions:
    • Make it hard to alter system binaries or configs.
  • unattended-upgrades:
    • Keeps system security patches up-to-date automatically.

This doesn't guarantee no compromise, but makes persistence and stealth significantly harder.
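
The two audit rules can be written to a rules file as sketched below. The watch rule is quoted from the list above; the execve rule is one plausible form of the runner_exec rule and assumes a 64-bit host (arch=b64). write_runner_audit_rules and the file name are hypothetical.

```shell
# Write audit rules for the runner.
# $1 = numeric uid of github-runner, $2 = rules directory (e.g. /etc/audit/rules.d)
write_runner_audit_rules() {
  local uid="$1" dir="$2"
  mkdir -p "$dir"
  cat > "$dir/github-runner.rules" <<EOF
-w /opt/actions-runner -p wa -k runner_changes
-a always,exit -F arch=b64 -S execve -F uid=${uid} -k runner_exec
EOF
}

# Usage sketch (as root):
#   write_runner_audit_rules "$(id -u github-runner)" /etc/audit/rules.d
#   augenrules --load
#   ausearch -k runner_exec    # review what the runner user executed
```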

4.7 CAN / SIL environment isolation

Threat:

Malicious SIL code reconfigures CAN interfaces in a way that affects other services or future runs.

Mitigations:

  • For CAN_TYPE=virtual:
    • Only vcan0 is brought up by vcan0.service.
    • AppArmor grants read-only access to /sys/class/net/vcan0/**.
  • CAN tools are installed, but:
    • Runner user is still constrained by AppArmor + filesystem boundaries.
    • Virtual CAN (vcan0) is safe for testing logic without touching physical hardware.
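
A unit such as vcan0.service could look like the sketch below. The unit body is an assumption about what the provisioning script installs, not a copy of it; write_vcan_unit is a hypothetical helper, and the /sbin paths assume a standard Ubuntu layout.

```shell
# Write a oneshot unit that creates and brings up vcan0.
# $1 = unit directory, e.g. /etc/systemd/system
write_vcan_unit() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/vcan0.service" <<'EOF'
[Unit]
Description=Virtual CAN interface vcan0

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/sbin/modprobe vcan
# The leading "-" tolerates the interface already existing on re-runs.
ExecStart=-/sbin/ip link add dev vcan0 type vcan
ExecStart=/sbin/ip link set up vcan0
ExecStop=/sbin/ip link del vcan0

[Install]
WantedBy=multi-user.target
EOF
}

# Usage sketch (as root):
#   write_vcan_unit /etc/systemd/system
#   systemctl daemon-reload && systemctl enable --now vcan0.service
```

Because the runner user has no sudo and the unit runs as root at boot, jobs can use vcan0 but cannot create or delete interfaces themselves.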

5. Operational Checklist

Anyone being onboarded as a maintainer or setting up a new SIL runner should know:

  • How to run the provisioning script on a fresh Ubuntu VM.
  • That the runner appears under repo Settings → Actions → Runners with labels:
    • self-hosted, linux, sil, socketcan
  • How to:
    • Add ci:run-sil label to PRs when they're ready for SIL.
    • Check that SIL check passes before merging.
  • That:
    • Runner user github-runner has no sudo.
    • Firewall (ufw), fail2ban, auditd, unattended-upgrades are all active.
    • vcan0 exists and is up for virtual CAN SIL tests.
