feat: track allocations per fleet #4513
Conversation
Force-pushed from 5dea128 to 38ae676
/gcbrun

Build Failed 😭 Build Id: 0200e9ed-b8cb-4f9c-82a3-22a40ca1032e Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group.

/gcbrun
Build Succeeded 🥳 Build Id: 52edf034-8467-427b-90f9-765fd733d9c6 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
markmandel left a comment
Thanks for doing this -- sorry it took a bit to get to it.
We'll need some docs at https://agones.dev/site/docs/guides/metrics/ -- but also will need to be feature shortcoded -- https://agones.dev/site/docs/contribute/documentation-editing-contribution/
    gs.Status.State = agonesv1.GameServerStateAllocated
    ctrl.gsWatch.Modify(gs)

    require.Eventually(t, func() bool {
If you use https://pkg.go.dev/github.com/stretchr/testify@v1.11.1/require#EventuallyWithT you can check multiple assertions, and also get better error messages.
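For illustration, a minimal self-contained sketch of that pattern; `getAllocations` here is a hypothetical stand-in for whatever the real test polls:

```go
package example

import (
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// getAllocations is a hypothetical stand-in for whatever the real test polls.
func getAllocations() (int64, error) { return 1, nil }

func TestAllocationsEventually(t *testing.T) {
	// Each tick runs every assertion against the *assert.CollectT; on
	// timeout the failure output lists each assertion that failed on the
	// last tick, instead of just "condition never satisfied".
	require.EventuallyWithT(t, func(c *assert.CollectT) {
		n, err := getAllocations()
		assert.NoError(c, err)
		assert.Equal(c, int64(1), n)
	}, 10*time.Second, 100*time.Millisecond)
}
```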
    ReservedReplicas int32 `json:"reservedReplicas"`
    // AllocatedReplicas are the number of Allocated GameServer replicas
    AllocatedReplicas int32 `json:"allocatedReplicas"`
    // Allocations is a counter of the number of allocations observed.
New docs will need feature shortcodes please:
https://agones.dev/site/docs/contribute/documentation-editing-contribution/
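For reference, a sketch of what the feature shortcode from that guide looks like wrapped around the new docs; the version number is a placeholder, not the actual release this lands in:

```
{{% feature publishVersion="X.Y.0" %}}
New documentation for the per-Fleet allocation counter goes here.
{{% /feature %}}
```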
    },
    })

    _, _ = gsInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
You'll want to sync GameServers here, since you are now watching their events:
I thought about that. I decided not to in the end as it does not stop the controller from working. We do not handle Add on the EventHandler (just Update), so the sync would basically block for no reason. Will gladly add it if you still want though.
Yeah, I'd recommend it - I've seen some weirdness when not syncing -- it'll probably mean nothing much, but also means that if other people edit this later on, or do other controller things they aren't left wondering "when do I sync, or not sync" - better to always sync 👍🏻
Fair. My general rule is: if it's touched anywhere in the processNextItem loop, it gets synced. But I will add it in.
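A minimal sketch of the agreed change, assuming a typical client-go controller `Run` method; the `Controller` shape and the `gameServerSynced` field name are illustrative:

```go
package example

import (
	"context"
	"errors"

	"k8s.io/client-go/tools/cache"
)

// Controller holds the GameServer informer's HasSynced func; names here are
// illustrative, not the PR's actual structure.
type Controller struct {
	gameServerSynced cache.InformerSynced
}

// Run blocks until the GameServer informer cache is primed before starting
// workers, since the controller now consumes GameServer events.
func (c *Controller) Run(ctx context.Context) error {
	if !cache.WaitForCacheSync(ctx.Done(), c.gameServerSynced) {
		return errors.New("failed to wait for GameServer cache to sync")
	}
	// ... start the processNextItem worker loop ...
	return nil
}
```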
    defer cancel()

    c.allocsMu.Lock()
    defer c.allocsMu.Unlock()
Nice one on making sure it gets processed on shutdown.
Probably doesn't make a huge difference, but holding the whole mutex over the network calls may cause some interesting long-term locking over shutdown. Might be worth taking a snapshot, or at the very least moving to an RWMutex, to avoid some potential contention.
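A sketch of the snapshot idea, assuming the `allocsMu` and `allocs` fields visible in the diff (the counter's value type is not shown, so `int64` is an assumption):

```go
package example

import (
	"context"
	"sync"
)

// Controller mirrors the fields visible in the diff; everything else here
// is illustrative.
type Controller struct {
	allocsMu sync.Mutex
	allocs   map[string]int64
}

// flushOnShutdown swaps the counter map out while holding the lock, then
// performs the slow network calls with the lock released, so in-flight
// allocation events are never blocked behind the shutdown flush.
func (c *Controller) flushOnShutdown(ctx context.Context, update func(ctx context.Context, key string, n int64) error) {
	c.allocsMu.Lock()
	snapshot := c.allocs
	c.allocs = make(map[string]int64, len(snapshot))
	c.allocsMu.Unlock()

	for key, n := range snapshot {
		// update stands in for the UpdateStatus call; errors are ignored
		// here only to keep the sketch short.
		_ = update(ctx, key, n)
	}
}
```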
    count, ok := c.allocs[key]
    if !ok {
        c.allocs[key] = 1
        return
    }
    c.allocs[key] = count + 1
Suggested change, replacing the block above with a single line:

    c.allocs[key]++
Maps return the zero value for a missing key, so the increment alone is enough; just a suggestion.
        }
    }

    fCopy.Status.Allocations += c.getAllocations(fleet.ObjectMeta.Namespace, fCopy.ObjectMeta.Name)
If UpdateStatus fails, the worker queue re-enqueues the fleet, but on retry the counter is already 0 — those allocations are permanently lost.
My suggestion would be to capture the delta, attempt the update, and only zero out the counter on success
Good catch, had not seen that failure path.
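A sketch of that fix, reusing the same illustrative fields as the earlier sketch; `takeDelta`, `restoreDelta`, and `updateStatus` are hypothetical helpers, not the PR's actual API:

```go
package example

import (
	"context"
	"sync"
)

// Controller reuses the illustrative fields from the earlier sketch.
type Controller struct {
	allocsMu sync.Mutex
	allocs   map[string]int64
}

// takeDelta captures and zeroes the counter for key in one locked step.
func (c *Controller) takeDelta(key string) int64 {
	c.allocsMu.Lock()
	defer c.allocsMu.Unlock()
	d := c.allocs[key]
	delete(c.allocs, key)
	return d
}

// restoreDelta merges a failed delta back so the retry still sees it.
func (c *Controller) restoreDelta(key string, d int64) {
	c.allocsMu.Lock()
	defer c.allocsMu.Unlock()
	c.allocs[key] += d
}

// updateStatus is a hypothetical stand-in for the Fleets().UpdateStatus call.
func (c *Controller) updateStatus(ctx context.Context, key string, delta int64) error {
	return nil
}

// syncFleet zeroes the counter only on success: if UpdateStatus fails, the
// delta is restored and the error propagates so the workqueue retries with
// the allocations still counted.
func (c *Controller) syncFleet(ctx context.Context, key string) error {
	delta := c.takeDelta(key)
	if err := c.updateStatus(ctx, key, delta); err != nil {
		c.restoreDelta(key, delta)
		return err
	}
	return nil
}
```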
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
/gcbrun

Build Succeeded 🥳 Build Id: 6178bfd6-30e7-4060-ba43-0e3864f400b8 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
/gcbrun
Build Succeeded 🥳 Build Id: c09603eb-c5f9-4bac-84cf-41668d315953 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
What type of PR is this?
What this PR does / Why we need it:
This PR adds allocation tracking at the Fleet level, to provide real allocation metrics to the Fleet Autoscaler. This gives custom autoscaling better metrics to base decisions on, while also providing a good metric for the system.
Which issue(s) this PR fixes:
Closes #4452
Special notes for your reviewer:
Initially I thought to use an `RWMutex` and `atomic.Int64`, but the locking logic got really verbose, and I don't think, even in active clusters, that contention on a `Mutex` is likely to be a real issue, given how short the actual lock time will be. If this proves to no longer be true, it is simple to change.

It also occurred to me that queuing the Fleet when an Allocation is observed is likely not needed, as this will always result in a change to either AllocatedReplicas, ReadyReplicas, or Replicas on the `GameServerSet`, which will itself queue the `Fleet`. I will run some manual tests to prove this.