NexusFlow interacts with Linux kernel primitives (sched_setaffinity, perf_event_open, mmap, Unix SCM_RIGHTS) and may expose an HTTP dashboard that can spawn allow-listed subprocesses. This document is the trust anchor for security architects, platform engineers, and compliance reviewers.
| Asset | Description |
|---|---|
| Host CPU / memory | Correctness and performance of co-located workloads |
| Secrets in env / files | Inherit to child processes spawned via run, dag, plasma, dashboard /api/exec |
| POSIX shm segments | Fast IPC blobs under /dev/shm |
| Unix sockets | Plasma control; file mode defines who can connect on-disk |
| Telemetry | Prometheus text files; perf counter metadata |
| Actor | Objective |
|---|---|
| Local unprivileged user | Read/write peers’ shm if mis-permissioned; connect to loose Unix sockets; abuse dashboard |
| Workload in same UID | Map same /dev/shm name; inherit fds if passed recklessly |
| Network attacker | Reach dashboard, gRPC daemon, or misbound services |
Mechanism:
pkg/shmcreates files with0600(owner read/write only).- Random suffixes or exclusive named paths reduce accidental collisions.
- No encryption at rest inside shm; treat contents as trusted process memory only.
Guarantee: Process A (UID X) cannot mmap NexusFlow shm of process B (UID Y) unless the OS file permissions are weakened by the operator (e.g. chmod, shared user namespace misuse).
Non-goals: NexusFlow does not implement cryptographic segmentation between peers on the same UID. Use separate users, containers, or VMs for tenant isolation.
NUMA pinning reduces migration noise but does not:
- Flush micro-architectural buffers between untrusted peers on sibling hyperthreads
- Replace confidential computing (SEV, TDX) or scheduler policies for classified mixed workloads
Guidance: For adversarial multi-tenant CPUs, combine physical core reservation, nosmt, cgroups, and organizational segmentation; NexusFlow is a performance tool first.
- Build from tagged releases and verify checksums
- Pin container base images and rebuild on CVE
- Python SDK:
pip install -efrom a vetted path;NEXUSFLOW_BINmust point to a knownnexusflowbinary
| Feature | Typical requirement |
|---|---|
nexusflow run (sched_setaffinity + optional setpriority) |
CPU mask: typically allowed for same user; may be restricted by cgroups or systemd AllowedCPUs. --priority high sets a lower nice (e.g. −10) and often needs CAP_SYS_NICE or superuser. |
taskset (DAG) |
Same as above for target PIDs |
perf sample / perf_event_open |
Often needs kernel.perf_event_paranoid <= 1 (site policy) or CAP_PERFMON / CAP_SYS_ADMIN depending on kernel |
/dev/shm create |
Writable /dev/shm for the effective UID |
| Plasma Unix listen | Create socket in path owned / writable by service user; avoid world-writable run dirs on shared login nodes |
| Dashboard HTTP(S) | Bind to port <1024 may need CAP_NET_BIND_SERVICE; prefer ≥1024 or reverse proxy |
nexusflow daemon (gRPC) |
TCP listener: anyone who can connect can call CreateCell, AttachPID, RunInCell, WatchL3, SetHugepages, EvictForeign as the daemon’s UID. Treat like root-equivalent on the box unless fronted by mTLS, Unix forward, or localhost + strict firewall. |
cgroup v2 cells (pkg/cgv2) |
Writing cpuset.cpus, cpuset.mems, cgroup.procs under /sys/fs/cgroup typically requires superuser or delegated cgroup subtree owned by the service user. Misconfiguration can disrupt other workloads on the host. |
EvictForeign (pkg/evict) |
kill(2) (default SIGSTOP) on PIDs whose affinity intersects the target CPU set. Needs CAP_KILL for foreign processes (or same user), and can cause loss of service for other tenants. |
SetHugepages / hugepages set |
Writes /sys/kernel/mm/hugepages/.../nr_hugepages: normally root or CAP_SYS_ADMIN. Can affect global memory pressure and boot-time policy. |
LLC perf (WatchL3) |
Same constraints as perf_event_open (see perf sample row); samples on a lead CPU for the cell. |
- Run as dedicated service user (non-root) with no shell login
- Dashboard:
127.0.0.1OR TLS +Authorization: Bearer+--allow-cidr - systemd:
NoNewPrivileges=yes,PrivateTmp=yes,ProtectSystem=strict, drop capabilities not needed - Containers: read-only root where possible; explicit
/dev/shmsize limit - Never expose Plasma socket to global-writable directories on multi-tenant hosts
- Set
timeout_secon dashboard exec to bound runaway commands - Daemon (gRPC): bind
127.0.0.1, or front with mTLS / SSH tunnel / Unix forward; treat RPC surface like host admin access - Eviction / hugepages: grant
CAP_KILL/ sysctl/sysfs rights only to a narrow automation identity; document SIGCONT for stopped foreign workloads
POST /api/exec executes argv arrays drawn from an allow-list of command roots (topology, run, dag, shm, perf, plasma, hugepages, …).
| Risk | Control |
|---|---|
| Remote code execution | Equivalent to local CLI access—treat as privileged |
| Token bypass | Always set --auth-token for non-loopback |
| SSRF / relay | Restrict --allow-cidr to admin networks |
| DoS | Subprocess wall-clock timeout; HTTP IdleTimeout |
| Area | NexusFlow behavior | Enterprise mapping |
|---|---|---|
| Who changed affinity? | OS auditd can trace execve of nexusflow; dashboard can be fronted by auth proxy with request logs |
|
| Pipeline timings | DAG Prometheus metrics → SIEM / TSDB | |
| Plasma branch / fd | Coordinator logs significant events; centralize via journald / sidecar | |
| SOC2 / ISO | NexusFlow does not auto-produce control evidence; you map controls to logging, access, change tickets |
We take security reports seriously.
Preferred: GitHub Private vulnerability reporting (enable in repo Settings → Security).
Alternative: Email the project maintainers at the address listed in the repository MAINTAINERS file or organization contact page. If none is published, open a non-public support channel your organization uses for OSS intake.
Include:
- Affected component (CLI, dashboard, Plasma, shm, kubermetrics, daemon gRPC, cgroup cells, eviction, hugepages, etc.)
- Reproduction steps on a stock Linux distro
- Impact (confidentiality / integrity / availability)
- Optional: patch idea or diff
If your policy requires encrypted mail, publish a security.asc key in-repo and reference the fingerprint here. Placeholder: add security@yourdomain.com + key ID when available.
| Phase | Target |
|---|---|
| Acknowledgement | ≤ 5 business days |
| Triage | Severity + affected versions |
| Fix | Patched release or documented mitigation |
| Credit | CVE / advisory + reporter credit if desired |
- Shared UID shm peers can read each other’s segments if they guess names—use random shm from CLI or unguessable secrets in filenames under your policy.
- kubermetrics manifests deploy workloads with strong defaults (non-root, readOnlyRootFilesystem, NetworkPolicy); review before forking for your tenancy model.
- hwloc / sysfs parsers assume non-malicious kernel output; do not run untrusted fake sysfs in the same trust domain as production secrets.
- Daemon NUMA memory policy is implemented via cgroup v2
cpuset.mems(and CPU masks), not a rawmbind(2)syscall in-process; callers who need explicitmbindmust combine NexusFlow with their own memory policy or future API extensions. - Eviction is best-effort: races, immutable affinities, and namespaced PIDs can limit effectiveness; SIGSTOP can leave processes frozen until
SIGCONT—document ops procedures before enabling automation.
ARCHITECTURE.md— data plane and placementREADME.md— benchmarks and entry pointsnexusflow/README.md— operational hardening (dashboard + CLI surfaces)