133 changes: 101 additions & 32 deletions README.md
@@ -2,14 +2,15 @@

[![Go Reference](https://pkg.go.dev/badge/github.com/newcloudtechnologies/memlimiter.svg)](https://pkg.go.dev/github.com/newcloudtechnologies/memlimiter)
[![Go Report Card](https://goreportcard.com/badge/github.com/newcloudtechnologies/memlimiter)](https://goreportcard.com/report/github.com/newcloudtechnologies/memlimiter)
![Coverage](https://img.shields.io/badge/Coverage-82.8%25-brightgreen)
![CI](https://github.com/newcloudtechnologies/memlimiter/actions/workflows/CI.yml/badge.svg)

`memlimiter` helps a Go service avoid OOM by combining adaptive GC tuning and request throttling under memory pressure.

It observes process memory (`RSS`) and Go heap pressure (`runtime.MemStats.NextGC`) and turns that into:

- dynamic `debug.SetGCPercent` tuning,
- optional `debug.SetMemoryLimit` application on service start,
- request shedding / backpressure via middleware.

By default, stats come from:
@@ -62,7 +63,9 @@ where:
- $RSS_{limit}$ is a hard limit on the service's physical memory (`RSS`) consumption (exceeding this limit will very likely result in OOM);
- $CGO$ is the total size of heap allocations made across the `Cgo` boundary (within `C`/`C++`/... libraries).

A few notes about the $CGO$ component. Allocations made outside the Go allocator are, of course, not controlled by the Go runtime in any way. At the same time, the memory consumption limit is shared by Go and non-Go allocators alike. Therefore, if non-Go allocations grow, all we can do is shrink the memory budget for Go allocations (which is why we subtract $CGO$ from the denominator of the expression above). If your service uses `Cgo`, you need to figure out how much memory is allocated "on the other side" - **otherwise MemLimiter won't be able to save your service from OOM**.

When the reported $CGO \ge RSS_{limit}$, MemLimiter treats the Go budget as exhausted and immediately switches to a conservative control mode.

If the service doesn't use `Cgo`, the $Utilization$ formula is simplified to:
$$Utilization = \frac {NextGC} {RSS_{limit}}$$
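As a rough sketch (the function and parameter names here are illustrative, not the library's API), the utilization figure can be computed like this; the real controller obtains `NextGC`, the RSS limit, and the Cgo estimate from its stats subscription:

```go
package main

import (
	"fmt"
	"runtime"
)

// utilization mirrors the formula above: NextGC / (rssLimit - cgo).
func utilization(nextGC, rssLimit, cgo uint64) float64 {
	if cgo >= rssLimit {
		// Go budget exhausted: report full utilization (conservative mode).
		return 1.0
	}

	return float64(nextGC) / float64(rssLimit-cgo)
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	const rssLimit = 1 << 30 // hypothetical 1 GiB hard limit

	fmt.Printf("utilization: %.4f\n", utilization(ms.NextGC, rssLimit, 0))
}
```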
@@ -80,29 +83,29 @@ You can adjust the proportional component control signal strength using a coefficient
The control signal is always saturated to prevent extremal values:

$$ Output = \begin{cases}
\displaystyle 99 \ \ \ K_{p} \gt 99 \\
\displaystyle 0 \ \ \ \ \ \ \ K_{p} \lt 0 \\
\displaystyle K_{p} \ \ \ \ otherwise \\
\end{cases}$$
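The saturation above is a plain clamp to $[0, 99]$; a minimal sketch (names are ours, not MemLimiter's):

```go
package main

import "fmt"

// saturate clamps the raw proportional term to [0, 99], matching the
// piecewise definition above.
func saturate(kp float64) float64 {
	switch {
	case kp > 99:
		return 99
	case kp < 0:
		return 0
	default:
		return kp
	}
}

func main() {
	for _, kp := range []float64{-10, 42, 150} {
		fmt.Printf("saturate(%.0f) = %.0f\n", kp, saturate(kp))
	}
}
```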

Finally, we convert the dimensionless quantity $Output$ into concrete $GOGC$ (for further use in [`debug.SetGCPercent`](https://pkg.go.dev/runtime/debug#SetGCPercent)) and $Throttling$ (percentage of suppressed requests) values - but only if $Utilization$ exceeds the configured thresholds:

$$ GOGC = \begin{cases}
\displaystyle max(MinGOGC, 100 - round(Output)) \ \ \ Utilization \ge DangerZoneGOGC \\
\displaystyle 100 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ otherwise \\
\end{cases}$$

$$ Throttling = \begin{cases}
\displaystyle round(Output) \ \ \ Utilization \ge DangerZoneThrottling \\
\displaystyle 0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ otherwise \\
\end{cases}$$
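Putting the two case expressions together (with the thresholds expressed in percent and `minGOGC` as the safety floor; the function shape is a sketch, not the library's actual signature):

```go
package main

import (
	"fmt"
	"math"
)

// controlOutputs converts the saturated Output into GOGC and throttling
// values, following the case expressions above. utilizationPct is
// Utilization expressed in percent.
func controlOutputs(output, utilizationPct float64,
	dangerZoneGOGC, dangerZoneThrottling, minGOGC int,
) (gogc, throttling int) {
	gogc = 100
	if utilizationPct >= float64(dangerZoneGOGC) {
		gogc = 100 - int(math.Round(output))
		if gogc < minGOGC {
			gogc = minGOGC // safety floor against overly aggressive GC
		}
	}

	throttling = 0
	if utilizationPct >= float64(dangerZoneThrottling) {
		throttling = int(math.Round(output))
	}

	return gogc, throttling
}

func main() {
	// Both danger zones active: GC tightened and requests shed.
	gogc, throttling := controlOutputs(80, 95, 50, 90, 10)
	fmt.Printf("GOGC=%d Throttling=%d%%\n", gogc, throttling)
}
```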

## Architecture

MemLimiter comprises two main parts:

1. **Core** implementing the memory budget controller and backpressure subsystems. The core relies on up-to-date statistics provided by `stats.ServiceStatsSubscription`.
2. **Middleware** providing the request throttling feature for various web frameworks. Every time the server receives a request, it uses the middleware to ask MemLimiter's core for permission to process it. Currently, only `gRPC` is supported, but `Middleware` is an easily extensible interface, and PRs are welcome.

![Architecture](docs/architecture.png)

Expand All @@ -122,44 +125,110 @@ You must also provide your own `stats.ServiceStatsSubscription` and `stats.Servi

### Tuning

There are several key settings in the MemLimiter configuration (see the [top-level config](config.go) and the [controller config](controller/nextgc/config.go)):

- `go_memory_limit` (optional, top-level)
- `controller_nextgc.rss_limit`
- `controller_nextgc.danger_zone_gogc`
- `controller_nextgc.danger_zone_throttling`
- `controller_nextgc.min_gogc`
- `controller_nextgc.period`
- `controller_nextgc.component_proportional.window_size`
- `controller_nextgc.component_proportional.coefficient` ($C_{p}$)

Example:

```json
{
  "go_memory_limit": "800M",
  "controller_nextgc": {
    "rss_limit": "1G",
    "danger_zone_gogc": 50,
    "danger_zone_throttling": 90,
    "min_gogc": 10,
    "period": "100ms",
    "component_proportional": {
      "coefficient": 1,
      "window_size": 20
    }
  }
}
```

You have to pick these values empirically. The settings must match the business logic of your particular service and the expected workload.

We ran a series of performance tests with [Allocator](test/allocator) - an example service that does nothing but make allocations which reside in memory for some time. We used different settings, applied the same load, and tracked runtime behavior.

Current `make allocator-analyze` scenario matrix:
- One unlimited baseline (`memlimiter` disabled).
- One limited baseline without Go soft limit (`go_memory_limit = 0`).
- Several limited cases with `go_memory_limit = 800MiB`, including a stricter safety floor (`min_gogc = 30`) case.

Common settings in this matrix:
- $RSS_{limit} = {1G}$
- $DangerZoneGC = 50%$
- $DangerZoneThrottling = 90%$
- $DangerZoneGOGC = 50\%$
- $DangerZoneThrottling = 90\%$
- $Period = 100ms$
- $WindowSize = 20$

Scenario-specific values:
- $go\_memory\_limit \in \{0, 800MiB\}$
- $MinGOGC \in \{10, 30\}$
- $C_{p} \in \{0.5, 5, 10, 50\}$

Load profile (same for all scenarios):
- $RPS = 120$
- $AllocationSize = 1MiB$
- $PauseDuration = 6s$
- $RequestTimeout = 1m$
- $LoadDuration = 60s$

Current analyzer run outputs are generated under `/tmp/allocator/allocator_<HHMMSS>/` (images below are curated examples from `docs/`):

![Control params](docs/control_params.png)

And the summary RSS plot across tested scenarios:

![RSS](docs/rss.png)

Observed OOM behavior in this run:
- Without MemLimiter (`unlimited=true`), the process is terminated at ~16s under the 1GiB container limit.
- With MemLimiter enabled, all limited scenarios sustain the full 60s load window.

Additional plots for new controls (`go_memory_limit` and `min_gogc`) are generated by `make allocator-analyze` in the same run directory. Curated examples are stored under `docs/`:

`gogc_floor_hits.png`:

![GOGC floor hits](docs/gogc_floor_hits.png)

What it means:
- It shows, per scenario, the share of samples where `GOGC` is clamped by `min_gogc`.
- Higher values mean the safety floor is actively protecting the process from dropping to overly aggressive GC values.
- In this run, the strict case (`C_p=50`, `min_gogc=30`) hits the floor for ~78% of samples.

`memory_limits_overlay.png`:

![Memory limits overlay](docs/memory_limits_overlay.png)

What it means:
- It shows `RSS` and `Go runtime memory` (tracked as `MemStats.Sys - MemStats.HeapReleased`) with configured limits over time.
- `go_memory_limit` is a soft limit, so short-term overshoot is possible under bursty/high-allocation load.
- If overshoot is large and persistent, allocation pressure is stronger than GC control for this workload.
- If `RSS` stays high while `Go runtime memory` is low, pressure likely comes from non-Go allocations (`Cgo`/external memory), so better external accounting and/or stronger throttling is needed.

General observations from these experiments:
- In the latest stress run, disabling MemLimiter (`unlimited` baseline) terminates around 16s under the 1GiB container limit, while limited scenarios complete the full 60s load.
- `go_memory_limit=800MiB` adds extra GC pressure as a soft target; in this stress test it is not a hard ceiling for `Go runtime memory`.
- `min_gogc` protects against extreme GC aggressiveness by clamping controller output in red-zone periods.
- A stricter floor (`min_gogc=30`) with aggressive `C_p=50` shifts control toward stronger throttling (up to 99%) instead of further GC tightening.

Runtime settings changed by MemLimiter are restored on `Service.Quit()`:
- `GOGC` (`debug.SetGCPercent`)
- `go_memory_limit` (if configured via `debug.SetMemoryLimit`)

## TODO

- Extend middleware.Middleware to support more frameworks.
- Add GOGC limitations to prevent death spirals.
- Support popular Cgo allocators like Jemalloc or TCMalloc, parse their stats to provide information about Cgo memory consumption.

Your PRs are welcome!
2 changes: 2 additions & 0 deletions backpressure/interface.go
@@ -28,4 +28,6 @@ type Operator interface {
	AllowRequest() bool
	// GetStats returns statistics of Backpressure subsystem.
	GetStats() (*stats.BackpressureStats, error)
	// Quit gracefully terminates backpressure subsystem and restores runtime settings.
	Quit()
}
22 changes: 22 additions & 0 deletions backpressure/mock.go
@@ -23,3 +23,25 @@ func (m *OperatorMock) SetControlParameters(value *stats.ControlParameters) error

	return args.Error(0)
}

func (m *OperatorMock) AllowRequest() bool {
	args := m.Called()

	return args.Bool(0)
}

func (m *OperatorMock) GetStats() (*stats.BackpressureStats, error) {
	args := m.Called()

	raw := args.Get(0)
	if raw == nil {
		return nil, args.Error(1)
	}

	//nolint:forcetypeassert // Mocked method.
	return raw.(*stats.BackpressureStats), args.Error(1)
}

func (m *OperatorMock) Quit() {
	m.Called()
}
14 changes: 13 additions & 1 deletion backpressure/operator.go
@@ -23,6 +23,8 @@ type operatorImpl struct {

	notificationChan      chan<- *stats.MemLimiterStats
	lastControlParameters atomic.Value
	initialGOGC           atomic.Int64
	initialGOGCStored     atomic.Bool
	logger                logr.Logger
}

@@ -85,7 +87,10 @@ func (b *operatorImpl) SetControlParameters(value *stats.ControlParameters) error
	}

	// Tune GC pace.
	oldGOGC := debug.SetGCPercent(value.GOGC)
	if b.initialGOGCStored.CompareAndSwap(false, true) {
		b.initialGOGC.Store(int64(oldGOGC))
	}

	b.logger.Info("control parameters changed", value.ToKeysAndValues()...)

@@ -110,3 +115,10 @@

	return nil
}

// Quit gracefully terminates backpressure subsystem.
func (b *operatorImpl) Quit() {
	if b.initialGOGCStored.Load() {
		debug.SetGCPercent(int(b.initialGOGC.Load()))
	}
}
22 changes: 22 additions & 0 deletions backpressure/operator_test.go
@@ -7,6 +7,7 @@
package backpressure

import (
	"runtime/debug"
	"testing"

	"github.com/go-logr/logr/testr"
@@ -32,3 +33,24 @@ func TestOperator(t *testing.T) {

	require.Equal(t, params, notification.Backpressure.ControlParameters)
}

func TestOperatorQuitRestoresGOGC(t *testing.T) {
	const expectedInitialGOGC = 73

	originalBeforeTest := debug.SetGCPercent(expectedInitialGOGC)
	defer debug.SetGCPercent(originalBeforeTest)

	logger := testr.New(t)
	op := NewOperator(logger)

	err := op.SetControlParameters(&stats.ControlParameters{
		GOGC:                 21,
		ThrottlingPercentage: NoThrottling,
	})
	require.NoError(t, err)

	op.Quit()

	prev := debug.SetGCPercent(expectedInitialGOGC)
	require.Equal(t, expectedInitialGOGC, prev)
}
9 changes: 9 additions & 0 deletions config.go
@@ -8,12 +8,17 @@ package memlimiter

import (
	"errors"
	"math"

	"github.com/newcloudtechnologies/memlimiter/controller/nextgc"
	"github.com/newcloudtechnologies/memlimiter/utils/config/bytes"
)

// Config - high-level MemLimiter config.
type Config struct {
	// GoMemoryLimit optionally sets Go runtime soft memory limit via debug.SetMemoryLimit.
	// Zero means disabled.
	GoMemoryLimit bytes.Bytes `json:"go_memory_limit"`
	// ControllerNextGC - NextGC-based controller
	ControllerNextGC *nextgc.ControllerConfig `json:"controller_nextgc"` //nolint:tagliatelle
	// TODO:
@@ -32,5 +37,9 @@ func (c *Config) Prepare() error {
		return errors.New("empty ControllerNextGC")
	}

	if c.GoMemoryLimit.Value > uint64(math.MaxInt64) {
		return errors.New("GoMemoryLimit exceeds int64 range")
	}

	return nil
}
19 changes: 19 additions & 0 deletions config_test.go
@@ -7,8 +7,11 @@
package memlimiter

import (
	"math"
	"testing"

	"github.com/newcloudtechnologies/memlimiter/controller/nextgc"
	"github.com/newcloudtechnologies/memlimiter/utils/config/bytes"
	"github.com/stretchr/testify/require"
)

@@ -22,4 +25,20 @@ func TestConfig(t *testing.T) {
		c := &Config{ControllerNextGC: nil}
		require.Error(t, c.Prepare())
	})

	t.Run("go memory limit in range", func(t *testing.T) {
		c := &Config{
			ControllerNextGC: &nextgc.ControllerConfig{},
			GoMemoryLimit:    bytes.Bytes{Value: uint64(math.MaxInt64)},
		}
		require.NoError(t, c.Prepare())
	})

	t.Run("go memory limit out of range", func(t *testing.T) {
		c := &Config{
			ControllerNextGC: &nextgc.ControllerConfig{},
			GoMemoryLimit:    bytes.Bytes{Value: uint64(math.MaxInt64) + 1},
		}
		require.Error(t, c.Prepare())
	})
}