Move ARM64 (linux/arm64) support from Alpha to Beta #4521

@markmandel

Summary

ARM64 support was introduced in v1.23.0 (May 2022) and has remained in Alpha for nearly three years. All Agones components produce ARM64 container images via Docker buildx, and pre-built SDK binaries are shipped for linux.arm64 and darwin.arm64. The missing piece is continuous integration on real ARM hardware — without it we cannot assert correctness or catch regressions.

This issue tracks the work needed to graduate ARM64 to Beta.

Criteria for Beta

Per the feature stage definitions:

  • No known critical bugs.
  • Automated testing covers the feature.
  • Documentation is up to date.

Proposed CI Approach

Add a small GKE Standard cluster with ARM64 (T2A) nodes to the existing e2e test infrastructure, running a targeted subset of tests on every PR/merge — the same cadence as the existing x86 clusters.

Cluster spec (suggested):

  • Machine type: t2a-standard-4 (Ampere Altra, ARM64)
  • Node count: 3 (the full suite uses 10 nodes; 3 is sufficient for the small smoke suite)
  • Region: one of the existing e2e regions (e.g. us-central1 — T2A is available there)
  • GKE release channel: RAPID (same as other e2e clusters)
  • Provisioned via Terraform alongside the existing modules in build/terraform/e2e/

Test subset (smoke suite):

Rather than running the full e2e suite (~45 min per cluster), run a curated smoke suite that validates every major component is functional on ARM64:

| Test | What it validates |
|------|-------------------|
| TestCreateConnect | GameServer reaches Ready, SDK sidecar works, UDP connectivity |
| TestAllocatorWithSelectors | Allocator gRPC service end-to-end |
| TestRestAllocatorWithSelectors | Allocator HTTP/REST service end-to-end |
| TestGameServerShutdown | Graceful shutdown / SDK Shutdown() call |
| TestGameServerReserve | Reserve state transitions |
| TestFleetScaleUpAndDown (minimal) | Fleet controller and rolling updates |
| Ping test | ping component responds on ARM64 |

The subset can be specified via a -run regex flag in the e2e runner, keeping the ARM CI slot fast (target: under 20 minutes).
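As a rough illustration of the -run approach, the sketch below builds an anchored alternation from the test names in the table and checks which names it would select. It assumes the e2e runner passes the value straight through to `go test -run`. The helper names (`runPattern`, `selected`) and the name `TestGameServerReserveFoo` are hypothetical, and the ping check is omitted because no Go test name is given for it.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// smokeTests lists the Go test functions named in the smoke-suite table.
// The ping check is left out because the issue does not give it a test name.
var smokeTests = []string{
	"TestCreateConnect",
	"TestAllocatorWithSelectors",
	"TestRestAllocatorWithSelectors",
	"TestGameServerShutdown",
	"TestGameServerReserve",
	"TestFleetScaleUpAndDown",
}

// runPattern (hypothetical helper) builds an anchored alternation for -run.
// Anchoring keeps a listed name from also selecting a longer test name
// that merely contains it.
func runPattern(names []string) string {
	quoted := make([]string, len(names))
	for i, n := range names {
		quoted[i] = regexp.QuoteMeta(n)
	}
	return "^(" + strings.Join(quoted, "|") + ")$"
}

// selected reports whether a top-level test name matches the pattern.
func selected(pattern, testName string) bool {
	return regexp.MustCompile(pattern).MatchString(testName)
}

func main() {
	p := runPattern(smokeTests)
	fmt.Println("-run value:", p)
	// TestGameServerReserveFoo is a made-up name used only to show anchoring.
	for _, name := range []string{"TestGameServerReserve", "TestGameServerReserveFoo"} {
		fmt.Printf("%s selected: %v\n", name, selected(p, name))
	}
}
```

Keeping the list in one place would let the same names drive both the CI invocation and the documentation of the smoke suite, though a build tag (the other option listed in the tasks) would work as well.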

Tasks

  • Add Terraform module for a GKE ARM64 (T2A) e2e cluster
  • Add ARM64 entry to ci/e2e-test-cloudbuild.yaml child-build matrix
  • Define and document the ARM64 smoke-test suite (regex or build tag)
  • Confirm simple-game-server test image builds and runs correctly on ARM64
  • Run the smoke suite consistently green across ≥ 2 weeks / several PRs
  • Update site/content/en/docs/Installation/_index.md to change ARM64 stage from Alpha to Beta
  • Add a blog/changelog entry noting the graduation
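One possible way to evaluate the "consistently green across ≥ 2 weeks / several PRs" task is to measure the trailing streak of passing runs. The sketch below is only an illustration: the `Run` type, the `minRuns` threshold standing in for "several PRs", and the sample history are all assumptions, not part of the Agones CI tooling.

```go
package main

import (
	"fmt"
	"time"
)

// Run is a hypothetical record of one ARM64 smoke-suite execution.
type Run struct {
	Finished time.Time
	Passed   bool
}

// greenStreak returns how many consecutive passing runs end the history
// and the time span they cover. Runs must be sorted oldest first.
func greenStreak(runs []Run) (count int, span time.Duration) {
	for i := len(runs) - 1; i >= 0 && runs[i].Passed; i-- {
		count++
	}
	if count > 0 {
		first := runs[len(runs)-count]
		last := runs[len(runs)-1]
		span = last.Finished.Sub(first.Finished)
	}
	return count, span
}

// readyForBeta applies the two-week criterion to the trailing streak.
// minRuns is an assumed stand-in for "several PRs".
func readyForBeta(runs []Run, minRuns int) bool {
	count, span := greenStreak(runs)
	return count >= minRuns && span >= 14*24*time.Hour
}

// sampleHistory returns made-up data: one failure, then 16 daily passes.
func sampleHistory() []Run {
	start := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	runs := []Run{{Finished: start, Passed: false}}
	for day := 1; day <= 16; day++ {
		runs = append(runs, Run{Finished: start.AddDate(0, 0, day), Passed: true})
	}
	return runs
}

func main() {
	runs := sampleHistory()
	count, span := greenStreak(runs)
	fmt.Printf("streak: %d runs over %v, ready: %v\n", count, span, readyForBeta(runs, 5))
}
```

Any failure resets the streak in this sketch, which is a deliberately strict reading of "consistently green"; a real policy might tolerate known flakes.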

Notes

  • T2A nodes are not available in all GCP regions; us-central1 is a region with confirmed T2A availability.
  • GKE Autopilot does not currently support ARM64 node pools, so Standard is the right choice here.
  • The ARM64 smoke suite intentionally excludes controller-failure and HA tests — those remain x86-only until full Beta parity is established.
