Initial self hosted runner provision script, to be tested by lukhsaankumar · Pull Request #543 · macformula/racecar

lukhsaankumar · 2025-11-12T18:13:44Z

Provisioning Script for Self-Hosted GitHub Actions Runner (Ubuntu)

Purpose

Install and configure a hardened, idempotent GitHub Actions runner for SIL testing, with optional SocketCAN support.

Relates to #307

Key Features

Installs runner binaries and registers with GH_RUNNER_TOKEN
Creates least-privileged user: github-runner (no sudo)
Security tooling:
- ufw, fail2ban, apparmor, auditd, unattended-upgrades
- Systemd service hardening
Optional SocketCAN:
- CAN_TYPE=virtual → creates vcan0
- CAN_TYPE=physical → installs can-utils, serial CAN tools
Installs Python tools: python-can, cantools
Applies AppArmor profile + systemd limits (anti-fork bomb, disk protection)
Idempotent: Safe to re-run

Note: To be tested on a Linux-based OS that supports SocketCAN.

SIL Self-Hosted Runner & Workflow Setup

This section explains:

How to configure GitHub repository settings (branch protection, labels, required checks).
How to attach the SIL runner correctly.
What security guarantees our workflow + provisioning script actually provide.

1. GitHub Repo & Branch Protection Setup

1.1 Protected branches

In Settings → Branches → Branch protection rules, add a rule for your main integration branch (e.g. main):

Branch name pattern

main (or release/* etc. as needed)

Require a pull request before merging

Require pull request reviews before merging
- Minimum number of approvals: 1 (or more, depending on your policy)
- Optionally: Require review from Code Owners

Require status checks to pass before merging

Enable "Require status checks to pass before merging"
Add the following required check:
- SIL check
Enable "Require branches to be up to date before merging"
(prevents merging if new commits land on main before the PR is re-tested)

Restrict who can push to matching branches

Enable "Restrict who can push to matching branches"
Allow only:
- Your core team (e.g., a "Maintainers" team)
- GitHub Apps such as Dependabot if needed

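With these rules, changes reach main only through a reviewed pull request, and the SIL check must report success on the PR before merge. Administrators can still bypass branch protection unless bypassing is explicitly disallowed in the rule.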
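The same rules can also be applied through GitHub's REST API instead of the settings UI. The sketch below only builds the JSON body for the branch protection endpoint. The team slug maintainers, the choice to enforce the rule for admins, and the check name passed in are assumptions, not settings taken from this PR.

```shell
#!/usr/bin/env bash
set -eu

# Print a request body for GitHub's branch protection endpoint:
#   PUT /repos/{owner}/{repo}/branches/{branch}/protection
# $1 = required status check name (assumption: the name shown on the PR)
# $2 = team slug allowed to push (hypothetical default: "maintainers")
branch_protection_payload() {
  local check_name="$1" team_slug="${2:-maintainers}"
  cat <<EOF
{
  "required_status_checks": { "strict": true, "contexts": ["${check_name}"] },
  "enforce_admins": true,
  "required_pull_request_reviews": {
    "required_approving_review_count": 1,
    "require_code_owner_reviews": false
  },
  "restrictions": { "users": [], "teams": ["${team_slug}"], "apps": [] }
}
EOF
}

# Usage (needs an authenticated gh CLI with admin rights on the repo):
#   branch_protection_payload "SIL check" |
#     gh api -X PUT repos/OWNER/REPO/branches/main/protection --input -
```

Here "strict": true corresponds to "Require branches to be up to date before merging", and enforce_admins is an assumption that goes beyond the rules listed above.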

1.2 Labels & CI gate (ci:run-sil)

We gate the expensive / sensitive SIL job behind a label.

In your repo, go to Issues → Labels and create:

Name: ci:run-sil
Description: "Allow SIL job to run on self-hosted runner for this PR"

Who can use it:

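Only maintainers / trusted reviewers should add this label to PRs.
On GitHub, applying labels requires at least triage access to the repository, so limit triage and write access to people you trust to make this decision.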

Recommended process:

Contributor opens PR.
Reviewer looks at changes.
If it's safe to run SIL on the self-hosted runner, reviewer adds ci:run-sil.

Effect on workflow:

The SIL workflow only runs when:
- Manually triggered via workflow_dispatch, OR
- The PR has the label ci:run-sil.
As a result, the SIL job is skipped on unlabeled PRs, even if they change the scripts that the runner executes.
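To make the gate easier to reason about, here is a minimal local restatement of the condition in shell. It is not part of the workflow; the function name sil_should_run is hypothetical, and it only mirrors the two cases listed above.

```shell
#!/usr/bin/env bash
set -eu

# Local restatement of the SIL gate.
# $1 = event name, remaining arguments = labels currently on the PR.
# Returns 0 when the SIL job would run, 1 when it would be skipped.
sil_should_run() {
  local event="$1" label lowered
  shift
  if [ "$event" = "workflow_dispatch" ]; then
    return 0
  fi
  if [ "$event" = "pull_request" ]; then
    for label in "$@"; do
      # GitHub's contains() is documented as case-insensitive, so compare
      # lowercased label names.
      lowered="$(printf '%s' "$label" | tr '[:upper:]' '[:lower:]')"
      if [ "$lowered" = "ci:run-sil" ]; then
        return 0
      fi
    done
  fi
  return 1
}

# Example:
#   sil_should_run pull_request bug ci:run-sil && echo "SIL would run"
```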

1.3 Status checks on PRs

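Once the workflow has run at least once, GitHub displays SIL check as a status on pull requests.

For each PR to a protected branch:

Before ci:run-sil label:
- SIL check → state: skipped
- GitHub reports a job skipped by its if: condition as successful, so a skipped SIL check does not block merging by itself. Reviewers should add the label and wait for a real run before approving.
After ci:run-sil label:
- SIL check runs on the self-hosted runner.
- Must end in success to allow merge (due to branch protection).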

2. Workflow Overview

Current workflow:

name: SIL check

on:
  pull_request:
    types: [opened, synchronize, reopened, labeled]
  workflow_dispatch:

jobs:
  sil:
    # Only run when:
    # - Manually triggered (workflow_dispatch), OR
    # - PR has the label "ci:run-sil"
    if: ${{ 
          github.event_name == 'workflow_dispatch' ||
          (github.event_name == 'pull_request' &&
          contains(github.event.pull_request.labels.*.name, 'ci:run-sil'))
        }}


    # Target only your hardened runner
    runs-on: [self-hosted, linux, sil, socketcan]

    # Minimal required permissions
    permissions:
      contents: read

    # Prevent runaway SIL jobs
    timeout-minutes: 2

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        
      - name: Run dummy SIL (0 = pass, 1 = fail)
        run: |
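          # Per-step resource guardrails
          ulimit -c 0        # disable core dumps
          # Raise the soft file descriptor limit to the hard cap. The step cannot
          # exceed the service's LimitNOFILE, so a fixed larger value would fail.
          ulimit -S -n "$(ulimit -H -n)"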

          ./scripts/run_sil_dummy.sh

Current SIL Dummy Script: run_sil_dummy.sh

#!/usr/bin/env bash
set -euo pipefail

# For demo: pass
echo "[dummy-sil] OK"

# Simulate Failing SIL: exit 1
# Simulate Passing SIL: exit 0
exit 0

Key points:

Uses runner labels: self-hosted, linux, sil, socketcan → ensures only your hardened SIL runner picks this job up.
Uses minimal GitHub token permissions:
- contents: read only.
Uses timeout-minutes: 2 to stop long-running or stuck SIL jobs.
Uses label logic to gate execution on the self-hosted runner.

3. Runner Provisioning Overview

The self-hosted runner is created by a provisioning script, intended to be run on a fresh Ubuntu VM and safe to re-run:

Creates a dedicated user: github-runner

No password, no sudo, no privileged groups.

Installs:

GitHub Actions runner binaries under /opt/actions-runner
Python + CAN utilities (python3, python-can, cantools, can-utils)

Registers the runner:

Scope: single repository (not organization-wide)
Runner labels: self-hosted, linux, sil, socketcan
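The registration step could look roughly like the sketch below. It is not the script from this PR: the function names are hypothetical, and the idempotency check assumes that config.sh leaves a .runner file in the install directory after a successful registration. The labels self-hosted and linux are not passed because GitHub adds its default labels on its own.

```shell
#!/usr/bin/env bash
set -eu

# True if the runner in directory $1 is already registered.
# Assumption: config.sh writes a ".runner" file on success.
runner_is_configured() {
  [ -f "$1/.runner" ]
}

# Register a repo-scoped runner unless it is already configured.
# $1 = install dir, $2 = owner/repo, $3 = registration token,
# $4 = runner name (default: sil-runner-01)
# Run this as github-runner; config.sh refuses to run as root by default.
register_runner() {
  local dir="$1" owner_repo="$2" token="$3" name="${4:-sil-runner-01}"
  if runner_is_configured "$dir"; then
    echo "runner already configured in $dir, skipping"
    return 0
  fi
  ( cd "$dir" && ./config.sh --unattended \
      --url "https://github.com/${owner_repo}" \
      --token "$token" \
      --name "$name" \
      --labels "sil,socketcan" )
}

# Example:
#   register_runner /opt/actions-runner OWNER/REPO "$GH_RUNNER_TOKEN"
#   sudo ./svc.sh install github-runner && sudo ./svc.sh start
```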

Runs as a systemd service:

actions.runner.<owner>-<repo>.sil-runner-01.service

Sets up virtual SocketCAN if CAN_TYPE=virtual:

vcan0 interface via vcan0.service
CAN kernel modules: can, can_raw, vcan, can_dev
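For reference, bringing up a virtual CAN interface by hand takes three commands. The sketch below wraps them in a hypothetical setup_vcan function that can be called repeatedly; how vcan0.service in this PR actually does it may differ.

```shell
#!/usr/bin/env bash
set -eu

# Bring up a virtual CAN interface; safe to call repeatedly. Needs root.
# $1 = interface name (default: vcan0)
setup_vcan() {
  local ifname="${1:-vcan0}"
  # Load the virtual CAN driver (a no-op if it is already loaded).
  modprobe vcan
  # Create the interface only if it does not exist yet.
  if ! ip link show "$ifname" >/dev/null 2>&1; then
    ip link add dev "$ifname" type vcan
  fi
  ip link set dev "$ifname" up
}

# Example (as root):
#   setup_vcan vcan0
```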

Applies system hardening:

ufw firewall: deny incoming by default, limit SSH.
fail2ban to ban repeated SSH brute force attempts.
unattended-upgrades for security updates.
auditd + rules for monitoring runner directory and runner execs.
AppArmor profile for the runner binary.
systemd service sandboxing and resource limits.
Per-user limits via /etc/security/limits.d/github-runner.conf.

4. Threats Mitigated by This Design

4.1 Untrusted PRs compromising the runner

Threat:
A random contributor (or fork) opens a PR with malicious code in scripts that the SIL job runs.

Mitigations:

SIL job only runs when:
- PR is explicitly labeled ci:run-sil by a maintainer, OR
- Workflow is manually triggered (workflow_dispatch).
For fork PRs, by default:
- No repo secrets are exposed.
- GITHUB_TOKEN has read-only permissions.
The self-hosted runner is scoped only to this repo:
- Jobs from other repositories cannot be scheduled on it, so a compromise cannot be used to silently build/deploy on behalf of other repos.

4.2 Secret exfiltration from GitHub / runner

Threat:
Malicious workflow steps or PR code attempt to steal:

Repo secrets
GITHUB_TOKEN with excessive permissions
Host-level credentials (SSH keys, sudo tokens, etc.)

Mitigations:

Workflow uses minimal permissions:
- contents: read only.
The SIL workflow does not reference secrets.*, so no repository secrets are passed to the job.
Self-hosted runner user has no sudo:
- sudo -l -U github-runner shows no commands allowed.
AppArmor + systemd deny access to:
- /etc/shadow
- /etc/sudoers*
- /root/**
- Other system locations
Firewall (ufw) doesn't block outbound traffic by default, but combined with:
- No secrets
- No sudo/root filesystem access
this greatly reduces the value of anything a job could exfiltrate, such as environment variables.

4.3 File system boundary and privilege escalation

Threat:
Malicious job attempts:

tee /root/pwned
cat /etc/shadow
Writing into /etc or /boot
Modifying runner service/system configs.

Mitigations:

Runner process runs as github-runner, not root.
No sudo access for github-runner.
AppArmor profile denies access to:
- /etc/shadow
- /etc/sudoers*
- /root/**
- Write access to /etc/**, /boot/**, /sys/**.
systemd service has:
- ProtectSystem=strict
- ProtectHome=yes
- ReadWritePaths=/opt/actions-runner/_work /opt/actions-runner/_diag
/opt/actions-runner is owned by github-runner, but inside the service sandbox only _work and _diag are writable.

Result: Scripts can only meaningfully write inside the runner's _work/_diag directories, not the OS.
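The three systemd settings listed above would normally live in a drop-in file for the runner unit. As a sketch, the hypothetical helper below prints such a drop-in; the file name hardening.conf in the usage comment is an assumption, and the real script may set more options than these.

```shell
#!/usr/bin/env bash
set -eu

# Print a systemd drop-in containing the filesystem sandbox settings.
# $1 = runner install dir (default: /opt/actions-runner)
runner_sandbox_dropin() {
  local dir="${1:-/opt/actions-runner}"
  cat <<EOF
[Service]
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=${dir}/_work ${dir}/_diag
EOF
}

# Example (as root, with UNIT set to the runner's service name):
#   mkdir -p "/etc/systemd/system/${UNIT}.d"
#   runner_sandbox_dropin > "/etc/systemd/system/${UNIT}.d/hardening.conf"
#   systemctl daemon-reload && systemctl restart "$UNIT"
```

ProtectSystem=strict mounts the whole file system hierarchy read-only for the service, which is why the two work directories have to be listed explicitly in ReadWritePaths.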

4.4 Denial-of-service (DoS) via runaway jobs

Threat:

Infinite loops consuming CPU.
Massive process forks.
Excessive file creation or disk usage.

Mitigations:

Workflow:
- timeout-minutes: 2 → GitHub forcibly cancels jobs taking too long.
systemd unit overrides:
- CPUQuota=150%
- MemoryMax=2G
- TasksMax=256
- LimitNPROC=512
- LimitNOFILE=4096
- TimeoutStopSec=30
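Per-user limits in /etc/security/limits.d/github-runner.conf (applied by pam_limits to PAM sessions of github-runner; the systemd limits above cover the service itself):
- soft nproc 256, hard nproc 512
- soft cpu 60, hard cpu 120 (CPU time, in minutes)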
Cron cleanup:
- Weekly cleaning of old work dirs in /opt/actions-runner/_work.

Result: Malicious SIL code can waste some resources temporarily, but the limits above are meant to keep it from fork-bombing the host or exhausting CPU and memory, and the weekly cleanup keeps leftover files from accumulating indefinitely.
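The weekly cleanup can be as small as a single find call. The following is a sketch under assumptions: the function name is hypothetical, the 7-day threshold is a guess at what "old" means, and age is judged by each top-level entry's modification time.

```shell
#!/usr/bin/env bash
set -eu

# Delete top-level entries of a work directory whose modification time is
# older than N days. $1 = work dir, $2 = age in days (default: 7).
cleanup_old_workdirs() {
  local work_dir="$1" max_age_days="${2:-7}"
  # Nothing to do if the directory does not exist yet.
  [ -d "$work_dir" ] || return 0
  find "$work_dir" -mindepth 1 -maxdepth 1 -mtime +"$max_age_days" \
    -exec rm -rf -- {} +
}

# Example, e.g. from a script placed in /etc/cron.weekly:
#   cleanup_old_workdirs /opt/actions-runner/_work 7
```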

4.5 Brute-force attacks or external SSH probing

Threat:
Internet-wide scanning or bots trying to brute-force SSH into the VM.

Mitigations:

ufw:
- Default deny incoming, allow outgoing.
- ufw limit 22/tcp: rate-limits SSH attempts (ufw denies an address that has attempted 6 or more connections in the last 30 seconds).
fail2ban + sshd jail:
- Monitors SSH logs.
- Bans IPs with repeated failed login attempts.
Only SSH is intentionally exposed; no extra ports.
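Expressed as commands, the firewall policy above comes down to four ufw calls. This is a sketch with hypothetical function names; the fail2ban snippet only enables the sshd jail and leaves retry counts and ban times at their defaults, since the PR description does not state them.

```shell
#!/usr/bin/env bash
set -eu

# Apply the firewall policy: deny inbound, allow outbound, rate-limit SSH.
# Needs root.
configure_firewall() {
  ufw default deny incoming
  ufw default allow outgoing
  ufw limit 22/tcp
  ufw --force enable   # --force skips the interactive confirmation
}

# Print a minimal fail2ban jail definition that enables the sshd jail.
sshd_jail_config() {
  cat <<EOF
[sshd]
enabled = true
EOF
}

# Example (as root; the target file name is an assumption):
#   configure_firewall
#   sshd_jail_config > /etc/fail2ban/jail.d/sshd.local
#   systemctl restart fail2ban
```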

4.6 Silent long-term compromise / tampering

Threat:

An attacker gets code execution on the runner and tries to persist or hide.

Mitigations:

auditd rules:
- Watches /opt/actions-runner/ for write activity (-w /opt/actions-runner -p wa -k runner_changes).
- Audits all execve syscalls run by uid of github-runner (runner_exec key).
AppArmor profile + systemd restrictions:
- Make it hard to alter system binaries or configs.
unattended-upgrades:
- Keeps system security patches up-to-date automatically.

This doesn't guarantee no compromise, but makes persistence and stealth significantly harder.
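The two audit rules described above can be written out as follows. This is a sketch: the helper name and the rules file name in the usage comment are hypothetical, and the execve rule assumes a 64-bit host (32-bit syscalls would need a matching arch=b32 rule).

```shell
#!/usr/bin/env bash
set -eu

# Print audit rules for the runner.
# $1 = numeric uid of the runner user
# $2 = runner install dir (default: /opt/actions-runner)
runner_audit_rules() {
  local uid="$1" dir="${2:-/opt/actions-runner}"
  cat <<EOF
-w ${dir} -p wa -k runner_changes
-a always,exit -F arch=b64 -S execve -F uid=${uid} -k runner_exec
EOF
}

# Example (as root):
#   runner_audit_rules "$(id -u github-runner)" \
#     > /etc/audit/rules.d/github-runner.rules
#   augenrules --load
#   ausearch -k runner_exec    # review recorded executions later
```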

4.7 CAN / SIL environment isolation

Threat:

Malicious SIL code reconfigures CAN interfaces in a way that affects other services or future runs.

Mitigations:

For CAN_TYPE=virtual:
- Only vcan0 is brought up by vcan0.service.
- AppArmor grants read-only access to /sys/class/net/vcan0/**.
CAN tools are installed, but:
- Runner user is still constrained by AppArmor + filesystem boundaries.
- Virtual CAN (vcan0) is safe for testing logic without touching physical hardware.

5. Operational Checklist

When onboarding a new maintainer or setting up a new SIL runner, they should know:

How to run the provisioning script on a fresh Ubuntu VM.
That the runner appears under repo Settings → Actions → Runners with labels:
- self-hosted, linux, sil, socketcan
How to:
- Add ci:run-sil label to PRs when they're ready for SIL.
- Check that SIL check passes before merging.
That:
- Runner user github-runner has no sudo.
- Firewall (ufw), fail2ban, auditd, unattended-upgrades are all active.
- vcan0 exists and is up for virtual CAN SIL tests.
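The last group of checklist items can be verified with a short script. The sketch below is one possible way to do it, run from an admin account with sudo rights; the function name is hypothetical and the systemd unit names are assumptions about a stock Ubuntu install.

```shell
#!/usr/bin/env bash
set -eu

# Print one PASS/FAIL line per check; return non-zero if any check failed.
verify_runner_host() {
  local failed=0 unit
  # Assumption: these are the unit names on the target Ubuntu release.
  for unit in fail2ban auditd unattended-upgrades; do
    if systemctl is-active --quiet "$unit"; then
      echo "PASS service $unit is active"
    else
      echo "FAIL service $unit is not active"
      failed=1
    fi
  done
  if sudo ufw status | grep -q '^Status: active'; then
    echo "PASS firewall ufw is active"
  else
    echo "FAIL firewall ufw is not active"
    failed=1
  fi
  # vcan0 should exist and carry the UP flag.
  if ip link show vcan0 2>/dev/null | grep -q '[<,]UP[,>]'; then
    echo "PASS interface vcan0 is up"
  else
    echo "FAIL interface vcan0 is missing or down"
    failed=1
  fi
  if sudo -l -U github-runner 2>&1 | grep -q 'not allowed to run sudo'; then
    echo "PASS user github-runner has no sudo rights"
  else
    echo "FAIL user github-runner may have sudo rights"
    failed=1
  fi
  return "$failed"
}

# Example:
#   verify_runner_host || echo "runner host needs attention"
```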

Initial self hosted runner provision script, to be tested

817d223

lukhsaankumar requested a review from BlakeFreer November 12, 2025 18:13

lukhsaankumar self-assigned this Nov 12, 2025

lukhsaankumar requested review from CameronBeneteau, TylerStAmour, langei and samparent97 as code owners November 12, 2025 18:13

lukhsaankumar added 3 commits November 12, 2025 13:14

Rename file

a8e6022

Fix runner script, works on Linux VM, idempotent, add comments

fd83b9d

Add current workflow and dummy script for SIL

f93f34f

lukhsaankumar wants to merge 4 commits into main from provision-runner-script
