133 changes: 101 additions & 32 deletions README.md
@@ -2,14 +2,15 @@

[![Go Reference](https://pkg.go.dev/badge/github.com/newcloudtechnologies/memlimiter.svg)](https://pkg.go.dev/github.com/newcloudtechnologies/memlimiter)
[![Go Report Card](https://goreportcard.com/badge/github.com/newcloudtechnologies/memlimiter)](https://goreportcard.com/report/github.com/newcloudtechnologies/memlimiter)
![Coverage](https://img.shields.io/badge/Coverage-82.8%25-brightgreen)
![CI](https://github.com/newcloudtechnologies/memlimiter/actions/workflows/CI.yml/badge.svg)

`memlimiter` helps a Go service avoid OOM by combining adaptive GC tuning and request throttling under memory pressure.

It observes process memory (`RSS`) and Go heap pressure (`runtime.MemStats.NextGC`) and turns that into:

- dynamic `debug.SetGCPercent` tuning,
- optional `debug.SetMemoryLimit` application on service start,
- request shedding / backpressure via middleware.

By default, stats come from:
@@ -62,7 +63,9 @@ where:
- $RSS_{limit}$ is a hard limit on the service's physical memory (`RSS`) consumption (exceeding this limit will very likely result in OOM);
- $CGO$ is the total size of heap allocations made across the `Cgo` boundary (within `C`/`C++`/... libraries).

A few notes about the $CGO$ component. Allocations made outside the Go allocator are, of course, not controlled by the Go runtime in any way. At the same time, the memory consumption limit is shared by Go and non-Go allocators alike. Therefore, if non-Go allocations grow, all we can do is shrink the memory budget for Go allocations (which is why we subtract $CGO$ from the denominator of the expression above). If your service uses `Cgo`, you need to figure out how much memory is allocated "on the other side" - **otherwise MemLimiter won't be able to save your service from OOM**.

When the reported $CGO \ge RSS_{limit}$, MemLimiter treats the Go budget as exhausted and immediately switches to a conservative control mode.

If the service doesn't use `Cgo`, the $Utilization$ formula is simplified to:
$$Utilization = \frac {NextGC} {RSS_{limit}}$$
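As a rough sketch (the function and parameter names here are illustrative, not the library's API), the utilization figure can be computed like this; the real controller obtains `NextGC`, the RSS limit, and the Cgo estimate from its stats subscription:

```go
package main

import (
	"fmt"
	"runtime"
)

// utilization mirrors the formula above: NextGC / (rssLimit - cgo).
func utilization(nextGC, rssLimit, cgo uint64) float64 {
	if cgo >= rssLimit {
		// Go budget exhausted: report full utilization (conservative mode).
		return 1.0
	}

	return float64(nextGC) / float64(rssLimit-cgo)
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	const rssLimit = 1 << 30 // hypothetical 1 GiB hard limit

	fmt.Printf("utilization: %.4f\n", utilization(ms.NextGC, rssLimit, 0))
}
```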
@@ -80,29 +83,29 @@ You can adjust the proportional component control signal strength using a coefficient
The control signal is always saturated to prevent extremal values:

$$ Output = \begin{cases}
\displaystyle 99 \ \ \ K_{p} \gt 99 \\
\displaystyle 0 \ \ \ \ \ \ \ K_{p} \lt 0 \\
\displaystyle K_{p} \ \ \ \ otherwise \\
\end{cases}$$
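The saturation above is a plain clamp to $[0, 99]$; a minimal sketch (names are ours, not MemLimiter's):

```go
package main

import "fmt"

// saturate clamps the raw proportional term to [0, 99], matching the
// piecewise definition above.
func saturate(kp float64) float64 {
	switch {
	case kp > 99:
		return 99
	case kp < 0:
		return 0
	default:
		return kp
	}
}

func main() {
	for _, kp := range []float64{-10, 42, 150} {
		fmt.Printf("saturate(%.0f) = %.0f\n", kp, saturate(kp))
	}
}
```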

Finally, we convert the dimensionless quantity $Output$ into concrete $GOGC$ (for further use in [`debug.SetGCPercent`](https://pkg.go.dev/runtime/debug#SetGCPercent)) and $Throttling$ (percentage of suppressed requests) values - but only if $Utilization$ exceeds the configured thresholds:

$$ GOGC = \begin{cases}
\displaystyle max(MinGOGC, 100 - round(Output)) \ \ \ Utilization \ge DangerZoneGOGC \\
\displaystyle 100 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ otherwise \\
\end{cases}$$

$$ Throttling = \begin{cases}
\displaystyle round(Output) \ \ \ Utilization \ge DangerZoneThrottling \\
\displaystyle 0 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ otherwise \\
\end{cases}$$
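Putting the two case expressions together (with the thresholds expressed in percent and `minGOGC` as the safety floor; the function shape is a sketch, not the library's actual signature):

```go
package main

import (
	"fmt"
	"math"
)

// controlOutputs converts the saturated Output into GOGC and throttling
// values, following the case expressions above. utilizationPct is
// Utilization expressed in percent.
func controlOutputs(output, utilizationPct float64,
	dangerZoneGOGC, dangerZoneThrottling, minGOGC int,
) (gogc, throttling int) {
	gogc = 100
	if utilizationPct >= float64(dangerZoneGOGC) {
		gogc = 100 - int(math.Round(output))
		if gogc < minGOGC {
			gogc = minGOGC // safety floor against overly aggressive GC
		}
	}

	throttling = 0
	if utilizationPct >= float64(dangerZoneThrottling) {
		throttling = int(math.Round(output))
	}

	return gogc, throttling
}

func main() {
	// Both danger zones active: GC tightened and requests shed.
	gogc, throttling := controlOutputs(80, 95, 50, 90, 10)
	fmt.Printf("GOGC=%d Throttling=%d%%\n", gogc, throttling)
}
```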

## Architecture

MemLimiter comprises two main parts:

1. **Core** implementing the memory budget controller and backpressure subsystems. The core relies on up-to-date statistics provided by `stats.ServiceStatsSubscription`.
2. **Middleware** providing the request throttling feature for various web frameworks. Every time the server receives a request, it uses the middleware to ask MemLimiter's core for permission to process it. Currently, only `gRPC` is supported, but `Middleware` is an easily extensible interface, and PRs are welcome.

![Architecture](docs/architecture.png)

Expand All @@ -122,44 +125,110 @@ You must also provide your own `stats.ServiceStatsSubscription` and `stats.Servi

### Tuning

There are several key settings in the MemLimiter configuration (see the [top-level config](config.go) and the [controller config](controller/nextgc/config.go)):

- `go_memory_limit` (optional, top-level)
- `controller_nextgc.rss_limit`
- `controller_nextgc.danger_zone_gogc`
- `controller_nextgc.danger_zone_throttling`
- `controller_nextgc.min_gogc`
- `controller_nextgc.period`
- `controller_nextgc.component_proportional.window_size`
- `controller_nextgc.component_proportional.coefficient` ($C_{p}$)

Example:

```json
{
  "go_memory_limit": "800M",
  "controller_nextgc": {
    "rss_limit": "1G",
    "danger_zone_gogc": 50,
    "danger_zone_throttling": 90,
    "min_gogc": 10,
    "period": "100ms",
    "component_proportional": {
      "coefficient": 1,
      "window_size": 20
    }
  }
}
```

You have to pick these values empirically. The settings must match the business logic of your particular service and the expected workload.

We ran a series of performance tests with [Allocator](test/allocator) - an example service that does nothing but make allocations which reside in memory for some time. We used different settings, applied the same load, and tracked runtime behavior.

Current `make allocator-analyze` scenario matrix:
- One unlimited baseline (`memlimiter` disabled).
- One limited baseline without Go soft limit (`go_memory_limit = 0`).
- Several limited cases with `go_memory_limit = 800MiB`, including a stricter safety floor (`min_gogc = 30`) case.

Common settings in this matrix:
- $RSS_{limit} = {1G}$
- $DangerZoneGC = 50%$
- $DangerZoneThrottling = 90%$
- $DangerZoneGOGC = 50\%$
- $DangerZoneThrottling = 90\%$
- $Period = 100ms$
- $WindowSize = 20$

Scenario-specific values:
- $go\_memory\_limit \in \{0, 800MiB\}$
- $MinGOGC \in \{10, 30\}$
- $C_{p} \in \{0.5, 5, 10, 50\}$

Load profile (same for all scenarios):
- $RPS = 120$
- $AllocationSize = 1MiB$
- $PauseDuration = 6s$
- $RequestTimeout = 1m$
- $LoadDuration = 60s$

Current analyzer run outputs are generated under `/tmp/allocator/allocator_<HHMMSS>/` (images below are curated examples from `docs/`):

![Control params](docs/control_params.png)

And the summary RSS plot across tested scenarios:

![RSS](docs/rss.png)

Observed OOM behavior in this run:
- Without MemLimiter (`unlimited=true`), the process is terminated at ~16s under the 1GiB container limit.
- With MemLimiter enabled, all limited scenarios sustain the full 60s load window.

Additional plots for new controls (`go_memory_limit` and `min_gogc`) are generated by `make allocator-analyze` in the same run directory. Curated examples are stored under `docs/`:

`gogc_floor_hits.png`:

![GOGC floor hits](docs/gogc_floor_hits.png)

What it means:
- It shows, per scenario, the share of samples where `GOGC` is clamped by `min_gogc`.
- Higher values mean the safety floor is actively protecting the process from dropping to overly aggressive GC values.
- In this run, the strict case (`C_p=50`, `min_gogc=30`) hits the floor for ~78% of samples.

`memory_limits_overlay.png`:

![Memory limits overlay](docs/memory_limits_overlay.png)

What it means:
- It shows `RSS` and `Go runtime memory` (tracked as `MemStats.Sys - MemStats.HeapReleased`) with configured limits over time.
- `go_memory_limit` is a soft limit, so short-term overshoot is possible under bursty/high-allocation load.
- If overshoot is large and persistent, allocation pressure is stronger than GC control for this workload.
- If `RSS` stays high while `Go runtime memory` is low, pressure likely comes from non-Go allocations (`Cgo`/external memory), so better external accounting and/or stronger throttling is needed.

General observations from these experiments:
- In the latest stress run, disabling MemLimiter (`unlimited` baseline) terminates around 16s under the 1GiB container limit, while limited scenarios complete the full 60s load.
- `go_memory_limit=800MiB` adds extra GC pressure as a soft target; in this stress test it is not a hard ceiling for `Go runtime memory`.
- `min_gogc` protects against extreme GC aggressiveness by clamping controller output in red-zone periods.
- A stricter floor (`min_gogc=30`) with aggressive `C_p=50` shifts control toward stronger throttling (up to 99%) instead of further GC tightening.

Runtime settings changed by MemLimiter are restored on `Service.Quit()`:
- `GOGC` (`debug.SetGCPercent`)
- `go_memory_limit` (if configured via `debug.SetMemoryLimit`)

## TODO

- Extend middleware.Middleware to support more frameworks.
- Add GOGC limitations to prevent death spirals.
- Support popular Cgo allocators like Jemalloc or TCMalloc, parse their stats to provide information about Cgo memory consumption.

Your PRs are welcome!
2 changes: 2 additions & 0 deletions backpressure/interface.go
@@ -28,4 +28,6 @@ type Operator interface {
	AllowRequest() bool
	// GetStats returns statistics of Backpressure subsystem.
	GetStats() (*stats.BackpressureStats, error)
	// Quit gracefully terminates backpressure subsystem and restores runtime settings.
	Quit()
}
22 changes: 22 additions & 0 deletions backpressure/mock.go
@@ -23,3 +23,25 @@ func (m *OperatorMock) SetControlParameters(value *stats.ControlParameters) error

	return args.Error(0)
}

func (m *OperatorMock) AllowRequest() bool {
	args := m.Called()

	return args.Bool(0)
}

func (m *OperatorMock) GetStats() (*stats.BackpressureStats, error) {
	args := m.Called()

	raw := args.Get(0)
	if raw == nil {
		return nil, args.Error(1)
	}

	//nolint:forcetypeassert // Mocked method.
	return raw.(*stats.BackpressureStats), args.Error(1)
}

func (m *OperatorMock) Quit() {
	m.Called()
}
14 changes: 13 additions & 1 deletion backpressure/operator.go
@@ -23,6 +23,8 @@ type operatorImpl struct {

	notificationChan      chan<- *stats.MemLimiterStats
	lastControlParameters atomic.Value
	initialGOGC           atomic.Int64
	initialGOGCStored     atomic.Bool
	logger                logr.Logger
}

@@ -85,7 +87,10 @@ func (b *operatorImpl) SetControlParameters(value *stats.ControlParameters) error
	}

	// Tune GC pace.
	oldGOGC := debug.SetGCPercent(value.GOGC)
	if b.initialGOGCStored.CompareAndSwap(false, true) {
		b.initialGOGC.Store(int64(oldGOGC))
	}

	b.logger.Info("control parameters changed", value.ToKeysAndValues()...)

@@ -110,3 +115,10 @@

	return nil
}

// Quit gracefully terminates backpressure subsystem.
func (b *operatorImpl) Quit() {
	if b.initialGOGCStored.Load() {
		debug.SetGCPercent(int(b.initialGOGC.Load()))
	}
}
22 changes: 22 additions & 0 deletions backpressure/operator_test.go
@@ -7,6 +7,7 @@
package backpressure

import (
	"runtime/debug"
	"testing"

	"github.com/go-logr/logr/testr"
@@ -32,3 +33,24 @@ func TestOperator(t *testing.T) {

	require.Equal(t, params, notification.Backpressure.ControlParameters)
}

func TestOperatorQuitRestoresGOGC(t *testing.T) {
	const expectedInitialGOGC = 73

	originalBeforeTest := debug.SetGCPercent(expectedInitialGOGC)
	defer debug.SetGCPercent(originalBeforeTest)

	logger := testr.New(t)
	op := NewOperator(logger)

	err := op.SetControlParameters(&stats.ControlParameters{
		GOGC:                 21,
		ThrottlingPercentage: NoThrottling,
	})
	require.NoError(t, err)

	op.Quit()

	prev := debug.SetGCPercent(expectedInitialGOGC)
	require.Equal(t, expectedInitialGOGC, prev)
}
9 changes: 9 additions & 0 deletions config.go
@@ -8,12 +8,17 @@ package memlimiter

import (
	"errors"
	"math"

	"github.com/newcloudtechnologies/memlimiter/controller/nextgc"
	"github.com/newcloudtechnologies/memlimiter/utils/config/bytes"
)

// Config - high-level MemLimiter config.
type Config struct {
	// GoMemoryLimit optionally sets Go runtime soft memory limit via debug.SetMemoryLimit.
	// Zero means disabled.
	GoMemoryLimit bytes.Bytes `json:"go_memory_limit"`
	// ControllerNextGC - NextGC-based controller
	ControllerNextGC *nextgc.ControllerConfig `json:"controller_nextgc"` //nolint:tagliatelle
	// TODO:
@@ -32,5 +37,9 @@ func (c *Config) Prepare() error {
		return errors.New("empty ControllerNextGC")
	}

	if c.GoMemoryLimit.Value > uint64(math.MaxInt64) {
		return errors.New("GoMemoryLimit exceeds int64 range")
	}

	return nil
}
19 changes: 19 additions & 0 deletions config_test.go
@@ -7,8 +7,11 @@
package memlimiter

import (
	"math"
	"testing"

	"github.com/newcloudtechnologies/memlimiter/controller/nextgc"
	"github.com/newcloudtechnologies/memlimiter/utils/config/bytes"
	"github.com/stretchr/testify/require"
)

@@ -22,4 +25,20 @@ func TestConfig(t *testing.T) {
		c := &Config{ControllerNextGC: nil}
		require.Error(t, c.Prepare())
	})

	t.Run("go memory limit in range", func(t *testing.T) {
		c := &Config{
			ControllerNextGC: &nextgc.ControllerConfig{},
			GoMemoryLimit:    bytes.Bytes{Value: uint64(math.MaxInt64)},
		}
		require.NoError(t, c.Prepare())
	})

	t.Run("go memory limit out of range", func(t *testing.T) {
		c := &Config{
			ControllerNextGC: &nextgc.ControllerConfig{},
			GoMemoryLimit:    bytes.Bytes{Value: uint64(math.MaxInt64) + 1},
		}
		require.Error(t, c.Prepare())
	})
}