@ytsssun ytsssun commented Nov 6, 2025

Issue number:

Closes #713

Description of changes:
Adds apiclient network configure <URI> command to configure network settings at runtime. The implementation includes:

  • API Server: New POST /actions/network/configure endpoint that validates and writes net.toml content to /.bottlerocket/net.toml
  • API Client: New network configure subcommand supporting file:// and base64: URI schemes
  • Input validation: UTF-8 validation, base64 decoding, and file access with proper error handling
  • Documentation: Updated README template and module documentation
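
To illustrate the client-side input handling listed above, here is a rough shell sketch that resolves an input source into raw net.toml content. It is not the apiclient code (which is Rust): the function name resolve_source is hypothetical, reading stdin when no argument is given follows the later commits in this PR, and the UTF-8 validation step is left out.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of apiclient: turn an input source
# into raw net.toml content on stdout.
resolve_source() {
    local src="${1:-}"
    case "$src" in
        "")
            # No argument: read the configuration from stdin.
            cat ;;
        file://*)
            # Strip the scheme and read the remaining path.
            cat -- "${src#file://}" ;;
        base64:*)
            # Strip the scheme and decode the payload.
            printf '%s' "${src#base64:}" | base64 -d ;;
        *)
            echo "unsupported input source: $src" >&2
            return 1 ;;
    esac
}
```

For example, resolve_source file:///tmp/network-config.toml prints the file's content, and any other scheme fails with a non-zero status.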

Testing done:

1. Tested the change via bootstrap-commands using base64 scheme
Started metal-dev with command

./tools/start-local-vm --variant metal-dev --arch $(uname -m) --inject-file net.toml --inject-file user-data.toml

My initial net.toml only has configurations for a single interface

version = 3

[enp0s16]
dhcp4 = true
primary = true

My user-data.toml uses bootstrap-commands

[settings.bootstrap-commands.setup-network-inline]
commands = [
  ["apiclient", "network", "configure", "base64:dmVyc2lvbiA9IDMKCltlbnAwczE2XQpkaGNwNCA9IHRydWUKZGhjcDYgPSBmYWxzZQpwcmltYXJ5ID0gdHJ1ZQoKW2VucDBzMTddCmRoY3A0ID0gdHJ1ZQpkaGNwNiA9IGZhbHNlCg=="],
  ["apiclient", "reboot"]
]
mode = "once"
essential = true

The base64-encoded net.toml decodes to:

echo "dmVyc2lvbiA9IDMKCltlbnAwczE2XQpkaGNwNCA9IHRydWUKZGhjcDYgPSBmYWxzZQpwcmltYXJ5ID0gdHJ1ZQoKW2VucDBzMTddCmRoY3A0ID0gdHJ1ZQpkaGNwNiA9IGZhbHNlCg==" | 
base64 -d

version = 3

[enp0s16]
dhcp4 = true
dhcp6 = false
primary = true

[enp0s17]
dhcp4 = true
dhcp6 = false
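
For anyone reproducing this, a base64: argument like the one above can be generated from a local file. A minimal sketch, assuming base64 and tr are available; the helper name to_base64_uri is hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a "base64:<payload>" argument for a file.
to_base64_uri() {
    # Strip line wraps from the encoder output so the value stays on
    # a single line, as required inside a TOML string.
    printf 'base64:%s\n' "$(base64 < "$1" | tr -d '\n')"
}
```

Running to_base64_uri net.toml prints a value that can be pasted into the bootstrap-commands entry.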

Confirmed that the node automatically rebooted and the net.toml was written to the desired path /.bottlerocket/net.toml

bash-5.2# cat /.bottlerocket/net.toml 
version = 3

[enp0s16]
dhcp4 = true
dhcp6 = false
primary = true

[enp0s17]
dhcp4 = true
dhcp6 = false

And both interfaces were brought up

ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
3: enp0s17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default 
    link/ether 02:42:f1:1e:86:ce brd ff:ff:ff:ff:ff:ff
bash-5.2# cat /etc/systemd/network/10-enp0s16.network

2. Tested the change via bootstrap-container using file:// scheme
Used the same metal-dev setup as in test 1. The only difference is the user-data.toml:

[settings.bootstrap-containers.setup-network-file]
source = "public.ecr.aws/bottlerocket/bottlerocket-bootstrap:v0.2.8"
mode = "once"
user-data = "IyEvdXNyL2Jpbi9lbnYgYmFzaApzZXQgLWV1CgojIENoZWNrIHJlYm9vdCBtYXJrZXIgZmlyc3QgLSBza2lwIGlmIGFscmVhZHkgcHJvY2Vzc2VkCklOU1RBTkNFX1JFQk9PVEVEPS8uYm90dGxlcm9ja2V0L2Jvb3RzdHJhcC1jb250YWluZXJzL2N1cnJlbnQvcmVib290ZWQKaWYgWyAtZiAiJElOU1RBTkNFX1JFQk9PVEVEIiBdOyB0aGVuCiAgICBlY2hvICJDb25maWd1cmF0aW9uIGFscmVhZHkgcHJvY2Vzc2VkIgogICAgZXhpdCAwCmZpCgplY2hvICJTZXR0aW5nIHVwIG5ldHdvcmsgY29uZmlndXJhdGlvbiB2aWEgZmlsZTovLyBzY2hlbWUuLi4iCgojIENyZWF0ZSB0aGUgZXhhY3QgbmV0d29yayBjb25maWd1cmF0aW9uCmNhdCA+IC90bXAvbmV0d29yay1jb25maWcudG9tbCA8PCAnSU5ORVJFT0YnCnZlcnNpb24gPSAzCgpbZW5wMHMxNl0KZGhjcDQgPSB0cnVlCmRoY3A2ID0gZmFsc2UKcHJpbWFyeSA9IHRydWUKCltlbnAwczE3XQpkaGNwNCA9IHRydWUKZGhjcDYgPSBmYWxzZQpJTk5FUkVPRgoKZWNobyAiTmV0d29yayBjb25maWd1cmF0aW9uIGNyZWF0ZWQ6IgpjYXQgL3RtcC9uZXR3b3JrLWNvbmZpZy50b21sCgplY2hvICJBcHBseWluZyBuZXR3b3JrIGNvbmZpZ3VyYXRpb24gdXNpbmcgZmlsZTovLyBzY2hlbWUuLi4iCmFwaWNsaWVudCBuZXR3b3JrIGNvbmZpZ3VyZSBmaWxlOi8vL3RtcC9uZXR3b3JrLWNvbmZpZy50b21sCgplY2hvICJOZXR3b3JrIGNvbmZpZ3VyYXRpb24gYXBwbGllZCBzdWNjZXNzZnVsbHkiCgojIE1hcmsgYXMgcHJvY2Vzc2VkIHRvIHByZXZlbnQgcmVib290IGxvb3AKdG91Y2ggIiRJTlNUQU5DRV9SRUJPT1RFRCIKCmVjaG8gIlJlYm9vdGluZyB0byBhcHBseSBuZXR3b3JrIGNoYW5nZXMuLi4iCmFwaWNsaWVudCByZWJvb3QK"

The encoded user-data decodes to:

#!/usr/bin/env bash
set -eu

# Check reboot marker first - skip if already processed
INSTANCE_REBOOTED=/.bottlerocket/bootstrap-containers/current/rebooted
if [ -f "$INSTANCE_REBOOTED" ]; then
    echo "Configuration already processed"
    exit 0
fi

echo "Setting up network configuration via file:// scheme..."

# Create the exact network configuration
cat > /tmp/network-config.toml << 'INNEREOF'
version = 3

[enp0s16]
dhcp4 = true
dhcp6 = false
primary = true

[enp0s17]
dhcp4 = true
dhcp6 = false
INNEREOF

echo "Network configuration created:"
cat /tmp/network-config.toml

echo "Applying network configuration using file:// scheme..."
apiclient network configure file:///tmp/network-config.toml

echo "Network configuration applied successfully"

# Mark as processed to prevent reboot loop
touch "$INSTANCE_REBOOTED"

echo "Rebooting to apply network changes..."
apiclient reboot

With this approach, the script creates /tmp/network-config.toml inside the container and then runs apiclient network configure file:///tmp/network-config.toml to write net.toml.

Note that for bootstrap-containers, a marker file is needed to prevent a reboot loop. The reboot happens while the bootstrap script is still running, so the subsequent step that switches the mode from once to off never runs.

The systemd-inhibit command we used for bootstrap-commands is worth looking into as a possible way to avoid the marker file.
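
The marker-file guard from the script above can be pulled into a small helper. This is only a sketch: apply_once and its return codes are hypothetical, and the wrapped command stands in for the real configure step.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command only if the marker file is absent.
# Returns 0 if the command ran and the marker was written,
#         1 if the marker already existed (nothing to do),
#         2 if the command failed (marker not written).
apply_once() {
    local marker="$1"; shift
    if [ -f "$marker" ]; then
        echo "Configuration already processed"
        return 1
    fi
    "$@" || return 2
    # Write the marker before the caller reboots, since nothing after
    # the reboot call is guaranteed to run.
    mkdir -p "$(dirname "$marker")"
    touch "$marker"
}
```

A caller would reboot only on success, for example: if apply_once "$INSTANCE_REBOOTED" apiclient network configure file:///tmp/network-config.toml; then apiclient reboot; fi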

@KCSesh - Contribution:

1. Re-ran the bootstrap-commands test using the base64 scheme

2. Re-ran the bootstrap-container test using the file:// scheme
Note: Running this test multiple times required --force-extract to clear the marker file on the PRIVATE partition.

3. Validated STDIN

apiclient network configure <<EOF
version = 2
[eth0]
dhcp4 = true
primary = true
EOF

4. Tested invalid input

Invalid Version:

bash-5.2# journalctl -u bootstrap-commands
Nov 11 23:35:45 localhost systemd[1]: Starting Bootstrap Commands...
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1351]: 23:35:46 [INFO] Processing bootstrap command 'setup-network-invalid-version' ...
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1351]: 23:35:46 [WARN] Bootstrap command failed to execute 'apiclient' failed - stderr: Failed to configure network: Failed to POST network configuration to '/actions/network/configure': Status 400 when POSTing /actions/network/configure: Invalid network configuration: Failed to parse network config from stdin: Invalid network configuration: Unknown network config version: 99
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1351]: Bootstrap command 'setup-network-invalid-version' failed.
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal systemd-inhibit[1345]: /usr/bin/bootstrap-commands failed with exit status 1.
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: bootstrap-commands.service: Main process exited, code=exited, status=1/FAILURE
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: bootstrap-commands.service: Failed with result 'exit-code'.
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: Failed to start Bootstrap Commands.

bash-5.2# journalctl -u apiserver
...
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal apiserver[1251]: 23:35:46 [INFO] Invoking netdog commit to validate and write network configuration
Nov 11 23:35:46 ip-10-0-2-15.us-west-2.compute.internal apiserver[1251]: 23:35:46 [ERROR] Netdog commit failed: Network configuration validation failed: Failed to parse network config from stdin: Invalid network configuration: Unknown network config version: 99

Unknown field:

bash-5.2# journalctl -u bootstrap-commands
Nov 11 23:37:54 localhost systemd[1]: Starting Bootstrap Commands...
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: 23:37:54 [INFO] Processing bootstrap command 'setup-network-unknown-field' ...
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: 23:37:54 [WARN] Bootstrap command failed to execute 'apiclient' failed - stderr: Failed to configure network: Failed to POST network configuration to '/actions/network/configure': Status 400 when POSTing /actions/network/configure: Invalid network configuration: Failed to parse network config from stdin: Failed to parse network config: TOML parse error at line 2, column 7
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]:   |
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: 2 | hello=hi
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]:   |       ^
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: invalid string
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: expected `"`, `'`
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal bootstrap-commands[1336]: Bootstrap command 'setup-network-unknown-field' failed.
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal systemd-inhibit[1330]: /usr/bin/bootstrap-commands failed with exit status 1.
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: bootstrap-commands.service: Main process exited, code=exited, status=1/FAILURE
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: bootstrap-commands.service: Failed with result 'exit-code'.
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal systemd[1]: Failed to start Bootstrap Commands.

bash-5.2# journalctl -u apiserver
...
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]: 23:37:54 [INFO] Invoking netdog commit to validate and write network configuration
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]: 23:37:54 [ERROR] Netdog commit failed: Network configuration validation failed: Failed to parse network config from stdin: Failed to parse network config: TOML parse error at line 2, column 7
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]:   |
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]: 2 | hello=hi
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]:   |       ^
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]: invalid string
Nov 11 23:37:54 ip-10-0-2-15.us-west-2.compute.internal apiserver[1238]: expected `"`, `'`
bash-5.2#

Invalid stdin:

# Starting point: 
bash-5.2# cat /.bottlerocket/net.toml
version = 3

[enp0s16]
dhcp4 = true
dhcp6 = false
primary = true

[enp0s17]
dhcp4 = true
dhcp6 = false

# Apply incorrect net.toml:
bash-5.2# apiclient network configure <<EOF
version = 3

[enp0s17]
dhcp5 = true
dhcp6 = false
EOF

[  191.895006] apiserver[1238]: 16:06:56 [ERROR] netdog commit failed: network configuration validation failed: Failed to parse network config from stdin: Failed to parse network config: data did not match any variant of untagged enum NetworkDeviceV1
Failed to configure network: Failed to POST network configuration to '/actions/network/configure': Status 400 when POSTing /actions/network/configure: Invalid network configuration: Failed to parse network config from stdin: Failed to parse network config: data did not match any variant of untagged enum NetworkDeviceV1

# No change to file: 
bash-5.2# cat /.bottlerocket/net.toml
version = 3

[enp0s16]
dhcp4 = true
dhcp6 = false
primary = true

[enp0s17]
dhcp4 = true
dhcp6 = false
bash-5.2#
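
The manual before/after comparison above could be scripted. A sketch under stated assumptions: assert_unchanged_on_failure is a hypothetical helper that runs a command expected to fail and checks that the file's checksum did not change.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the command fails AND the file
# has the same checksum before and after.
assert_unchanged_on_failure() {
    local file="$1"; shift
    local before after
    before="$(cksum < "$file")"
    if "$@"; then
        echo "command unexpectedly succeeded" >&2
        return 1
    fi
    after="$(cksum < "$file")"
    [ "$before" = "$after" ]
}
```

For instance, assert_unchanged_on_failure /.bottlerocket/net.toml apiclient network configure file:///tmp/bad.toml, where /tmp/bad.toml is an assumed path holding the invalid configuration.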

5. Verified the new temp-file approach with the admin container and inotify on metal:

[root@admin]# apiclient network configure <<EOF
> version = 3
> [enp0s16]
> dhcp4 = true
> dhcp6 = false
> primary = true
> EOF
IN_CREATE: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_OPEN: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_MODIFY: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_OPEN: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_MODIFY: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_CLOSE_WRITE: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_MOVED_FROM: /.bottlerocket/rootfs/.bottlerocket/.tmpZkeAI5
IN_MOVED_TO: /.bottlerocket/rootfs/.bottlerocket/net.toml
IN_CLOSE_WRITE: /.bottlerocket/rootfs/.bottlerocket/net.toml
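
The event sequence above (a temporary file created in the target directory, written, then moved to net.toml) is the usual write-then-rename pattern. A minimal shell sketch of that pattern, not the netdog code itself; atomic_write is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: write stdin to a destination via a temp file
# in the same directory, then rename it into place.
atomic_write() {
    local dest="$1" dir tmp
    dir="$(dirname "$dest")"
    # Keep the temp file in the destination directory so that mv is a
    # rename on the same filesystem rather than a copy.
    tmp="$(mktemp "$dir/.tmpXXXXXX")" || return 1
    if cat > "$tmp" && mv -f "$tmp" "$dest"; then
        return 0
    fi
    # Clean up the temp file if writing or renaming failed.
    rm -f "$tmp"
    return 1
}
```

Readers of the destination path see either the old file or the complete new one, never a partially written file. Note that mktemp creates the file with owner-only permissions, which the renamed file keeps.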

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@ytsssun ytsssun marked this pull request as draft November 6, 2025 07:11
@ytsssun ytsssun marked this pull request as ready for review November 6, 2025 20:39
@ytsssun ytsssun force-pushed the apiclient-network-configure branch 2 times, most recently from f5579da to f9a988d Compare November 7, 2025 23:32

ytsssun commented Nov 7, 2025

Force pushed to address comments from @cbgbt and @arnaldo2792

@ytsssun ytsssun force-pushed the apiclient-network-configure branch from f9a988d to 0ad1427 Compare November 7, 2025 23:41

@cbgbt cbgbt left a comment

Nice work! just a few suggestions.

Comment on lines +11 to +12
///
/// The configuration will be applied at the next boot - a reboot is required for changes to take effect.
Contributor

nit: I wouldn't put this here, as it is relevant to the user (who won't read this); it should rather be in the website docs or the main apiclient documentation.

Contributor

I personally found it helpful, without this comment I spent time looking around after seeing the apiserver implementation to try to understand if we were missing the code to actually apply the config after saving it.

But I've also been on the unpopular side of this same tension many times, so I'm curious to see how others think about it.

Contributor Author

I don't have a strong opinion here and I can see the tradeoffs, but this change was made to address the comments in #714 (comment).

So I would like to keep it as is for now to avoid extra churn.

@ytsssun ytsssun force-pushed the apiclient-network-configure branch 2 times, most recently from cc67f0e to d5a8454 Compare November 8, 2025 07:34

ytsssun commented Nov 8, 2025

Force pushed to address comments from @cbgbt and @arnaldo2792:

  1. Removed comments that were redundant where the code is self-explanatory.
  2. Updated documentation wording.

@ytsssun ytsssun requested a review from cbgbt November 8, 2025 07:37

KCSesh commented Nov 12, 2025

^ This is a decent diff URL:
https://github.com/bottlerocket-os/bottlerocket-core-kit/compare/d5a84546284dd157011259800a3debbeac32ecde..4facba9d46b4f48afc45a1d75471fbda13139cf4

Above I added 2 minor changes:

  • Added netdog commit, which apiserver invokes after apiclient does its initial parsing. netdog commit now performs the write to /.bottlerocket/net.toml
  • Added stdin support to apiclient network configure

I retested both valid and invalid net.toml files on metal-dev (will update the description).

@KCSesh KCSesh force-pushed the apiclient-network-configure branch from 4facba9 to 2b89a4a Compare November 12, 2025 02:51

KCSesh commented Nov 12, 2025

^ The above push is just a rebase to get back on/near the tip of develop

@KCSesh KCSesh requested a review from bcressey November 12, 2025 02:53
@KCSesh KCSesh force-pushed the apiclient-network-configure branch from 2b89a4a to 12e3f9c Compare November 12, 2025 15:23

KCSesh commented Nov 12, 2025

^ Fixed a comment typo, renamed one error, and created a const.

@KCSesh KCSesh force-pushed the apiclient-network-configure branch from 12e3f9c to cfe4cee Compare November 12, 2025 18:02

@cbgbt cbgbt left a comment

Apologies, I reviewed this yesterday but never hit submit.

Do you think you could fix the PartialEq comment left earlier?

ytsssun and others added 3 commits November 12, 2025 19:16

Implements 'apiclient network configure <input-source>' to enable runtime
network configuration for Bottlerocket. stdin is the default input, but also
supports file://, base64: input sources with client-side URI processing.
Content is sent to the API server for further processing.

Signed-off-by: Yutong Sun <yutongsu@amazon.com>
Signed-off-by: Kyle Sessions <kssessio@amazon.com>

Adds 'netdog commit' subcommand that validates and writes network
configuration from stdin to /.bottlerocket/net.toml.

Signed-off-by: Kyle Sessions <kssessio@amazon.com>

Adds POST /actions/network/configure endpoint to write network configuration
content directly to /.bottlerocket/net.toml. Includes error handling for
UTF-8 validation, directory creation, and file writing operations.

The endpoint integrates with netdog commit functionality to apply network
configuration changes.

Signed-off-by: Yutong Sun <yutongsu@amazon.com>
Signed-off-by: Kyle Sessions <kssessio@amazon.com>

@KCSesh KCSesh force-pushed the apiclient-network-configure branch from cfe4cee to d6e763e Compare November 12, 2025 19:17

Add PartialEq derivation to Subcommand and all related argument structs
to enable direct equality comparison in tests. Simplifies network configure
tests by replacing match statements with assert_eq comparisons.

Signed-off-by: Kyle Sessions <kssessio@amazon.com>


KCSesh commented Nov 12, 2025

@cbgbt

> Do you think you could fix the PartialEq comment left earlier?

Sorry, missed this. Fixed in the latest push, along with the other nits (updated the docstring and removed the "what" comments from netdog commit).

@KCSesh KCSesh merged commit 7bc89c5 into bottlerocket-os:develop Nov 12, 2025
3 of 4 checks passed