document recommended resources for operator snapshot agent #27902

@tgross

Nomad Enterprise has the nomad operator snapshot agent command, which you can use to periodically snapshot your cluster and ship backups to external object storage such as AWS S3. We recommend deploying nomad operator snapshot agent as a Nomad job, but we don't have a recommended resource configuration that has actually been benchmarked.

CPU probably doesn't matter much because at worst an undersized allocation just slows things down, but memory is harder to handwave because too small a limit gets the task OOM-killed. Even if we assume we're streaming the snapshot efficiently, the RSS is going to include the paged-in parts of the Nomad binary. And the snapshot agent hasn't been specifically optimized for memory the way we've done for logmon, etc. The bulk of the functionality is over in https://github.com/hashicorp/raft-snapshotagent (internal repo), which in turn pulls in Azure/AWS/GCP clients, which are all pretty large.

Let's benchmark this on a real cluster with real external object storage, so that we can make accurate recommendations here.
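
One possible way to collect numbers for that benchmark, sketched under assumptions: on a Linux client, poll /proc/<pid>/status for the running snapshot agent and record the peak RSS (VmHWM) along with the anonymous vs. file-backed split (RssAnon / RssFile), since the file-backed part is where paged-in pieces of the binary would show up. The helper names (parse_status, sample, watch) and the 5 second interval are made up for illustration, and the pid is whatever the agent process happens to be. Nomad enforces the memory limit through the cgroup, whose accounting can differ from per-process RSS, so treat this as a first approximation rather than the final answer.

```python
import re
import time
from pathlib import Path

# Fields from /proc/<pid>/status (Linux only); the kernel reports them in kB.
# VmHWM is the peak resident set size; RssAnon/RssFile split the current RSS
# into anonymous memory and file-backed mappings.
FIELDS = ("VmHWM", "VmRSS", "RssAnon", "RssFile")


def parse_status(text: str) -> dict[str, int]:
    """Extract the memory fields of interest from /proc/<pid>/status text."""
    out = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if m and m.group(1) in FIELDS:
            out[m.group(1)] = int(m.group(2))
    return out


def sample(pid: int) -> dict[str, int]:
    """Read one memory sample (values in kB) for the given pid."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())


def watch(pid: int, interval_s: float = 5.0) -> None:
    """Print a timestamped sample every interval_s until the process exits."""
    while True:
        try:
            s = sample(pid)
        except (FileNotFoundError, ProcessLookupError):
            break  # process is gone
        print(time.strftime("%H:%M:%S"), s, flush=True)
        time.sleep(interval_s)
```

Calling watch(pid) with the agent's pid across several snapshot intervals, on clusters with different state sizes, would give a VmHWM figure to compare against candidate memory limits for the job.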

Ref: #27890
Ref (internal): https://hashicorp.atlassian.net/browse/NMD-1426
