- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13
Container Lifecycle and Upgrade Improvements #1151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e6be413    to
    a2cf0bc      
    Compare
  
    3467ea9    to
    9a6b5c1      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
I left some comments. Mostly cosmetic code stuff and some theoretical security issues. Nothing that you really need to address if you don't want to.
Also. I don't understand why the indentation is totally broken in the container script. Most lines contains a mix of tabs and spaces. It's hard to read on GH and my editor is contemplating kill -9 $$ ;)
| 
 Thanks, awesome review! 
 I've addressed most of them and pushed a separate commit for easy review. The branch also includes a few new commits to fix issues in the 'copy' command helper for the shell and CLI. Stuff we forgot about for a year almost. Would be great if you could take a look at that too. 
 Well, Emacs default indentation for shell scripts, as most of us use, have a four space base indent, but replace leading 8-char spaces with tabs. I think one of the main reasons for your review issues is that GitHub has changed (!) the tab width since 2021 from 8 to 4 bytes, but this is possible to change in your https://github.com/settings/appearance settings   | 
| 
 Great. 
 Sure thing! 
 I strongly disagree towards mixing tabs and spaces for indentation. Personally I prefer spaces, which most documentation suggest (google et al). But growing up in #bash on IRC getting yelled at by GreyCat, I can live with tabs too. If you look at the file in the current state you will see that there's a mix of everything. Some lines are pure space (8+), some are pure tabs and some are a mix of tabs and spaces. | 
| 
 I'm sorry this turned into a tabs-vs-spaces discussion, that was not my intention. I was just stating a fact that most of our scripts have this mix of tabs vs spaces because of us Emacs users. 
 Yes, and I'd rather not reformat the entire file at this point because that would make  | 
| Great work! Just have a small question. | 
Fixes #1146 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This rectifies an omission from the initial yang model. Not all charachters are supported in container and volume names. E.g., simply attempting to create a volume or container with a space in the name causes this error message from podman: podman: Error: running volume create option: names must match [a-zA-Z0-9][a-zA-Z0-9_.-]*: invalid argument In addition to the regexp, the new 'ident' type also enforces a minimum and maximum length. Sure, technically a single char is allowed, but let's be reasonable, and who in their right mind wants an identifier > 64 chars? We have description for that. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Not only great for debugging, but also allows users to start their containers manually in another way. But yeah, mostly for debug. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
The default timeout for 'podman stop foo' is 10 seconds, which for heavily loaded systems with intricate shutdown process is *waaaay* too short. Increase it to the container script default 30s, which coincidentally is also the container@.conf template's kill delay. Fixes #1149 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Fixes #1148 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This commit adds metadata to track loaded OCI archives to allow skipping 'delete + load' of OCI images when restarting either the container or the system as a whole. The sha256 of all loaded OCI archives is stored in a sidecar file in our downloads directory. Then we verify the checksum of the OCI archives against their same-named sidecar to determine if the OCI archive is already loaded or not. Additionally, the instance using the image is labled with metadata to detect changes in the container configuration. This in turn allow skipping the delete + create phase also of the instance. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This commit reverts 477f7ae and bb19d06, which intended to fix an issue with lingering old images, see #1098. However, as detailed in #1147, this caused severe side effects while working with multiple larger containers. Basically, the prune operation of one container removed images of other containers that are just being created in parallel. Instead of using the podman prune command we can use the meta datain the start script to pinpoint exactly which image(s) to remove, including any downloaded OCI archives when the container instance is removed. Fixes #1147 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
When loading big OCI images at boot the `podman load` process completely monopolizes all cores of an Arm Cortex-A72. It blocks on I/O, sure, but with 'nice' we can get some attention at least to more critical services at boot. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Usually, when your system is up and running properly, you want to clean up anything unused from your previous experiments. This change alllows that by calling the interactive 'podman image prune -a -f' command from the CLI command 'container remove all' Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Container instances that run with mutable images, e.g., tagged with `:latest` or similar non-versioned tags, can be upgraded without changing the config. This commit fixes two issues found with this support: - force container image re-fetch on upgrade, even if the file exists locally - surgically remove old image from container store after upgrade Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
One unique feature of Infix OS is that you can embed your OCI archive in the rootfs.squashfs. This way your container instance does not need to download anything from the network. To upgrade you drop in a new image and rebuild Infix, when the new Infix boots the container script 'setup' command will recognize that the OCI archive has changed and will reload it into the container store. This patch is an improvement of the way too generic 'podman image prune' command used previously. Instead of looking for any dangling image, we now surgically remove the old image. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Shell script: - Factor out big portions of code into more logical helper functions - Simplify calling setup script by checking for remote image first - Simplify meta/sha up-to-date handling and clarify terminology - Consistent use of -f instead of -e in file-exists checks - Fix unsafe use of 'mktemp -u' C code: - Clarify meta/sha terminology: rename meta-sha256 -> meta-image-sha256 - Refactor weird archive_offset() function to local_path() helper - Factor out helper function calc_sha() - Check len of sha256 >= 64 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Already supported in the CLI. This makes it official, and quite handy for users that run mutable containers. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This commit adds four (small) container images to the Infamy test container which are used in the new container upgrade test. The test verifies that a mutable container can be upgraded and that old images are properly cleaned up from the container store. Fixes #624 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Rename test directories in infix_containers/ to remove the redundant 'container_' prefix since they already live under infix_containers/: container_basic -> basic container_bridge -> bridge container_enabled -> enabled container_environment -> environment container_firewall_basic -> firewall_basic container_host_commands -> host_commands container_phys -> phys container_veth -> veth container_volume -> volume Also update references in all.yaml and Readme.adoc files. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
When copying to the running datastore we cannot use sr_copy_config(), instead we must use sr_replace_config(). This fix covers both the case of 'copy startup-config running-config' and 'copy FILE running-config'. Fixes #1203 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This commit adds config file validation to the copy command, discussed in #373. Allowing users to test their config files before restoring a backup. The feature could also be used for the automatic rollback when downgrading to an earlier version of the OS. Fixes #373 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Fixes #981 Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
This is a follow-up to PR #717 where path traversal protection was discussed. A year later and it's clear that having a user-friendly copy tool in the shell is a good thing, but that we proably want to restrict what it can do when called from the CLI. A sanitize flag (-s) is added to control the behavior, when used in the shell without -s, both commands act like traditional UNIX tools and do assume . for relative paths, and allow ../, whereas when running from the CLI only /media/ is allowed and otherwise files are assumed to be in $HOME or /cfg Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
The regular file-to-file copy, was missing calls to cfg_adjust(), this commit fixes that and adds some helpful comments for each use-case. Also, drop insecure mktemp() in favor of our own version which uses the basename of the remote source file. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Add bash completion for the common datastores, like we already do in the CLI, and update the usage text accordingly. Also, make sure to install to /usr/bin, not /bin since we've now merged the hierarchies since a while back. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Since there is no safe way (unique hash/id) to identify unused named volumes, we cannot automate removal of them. Since volume data, when compared to a container image, is quite small, it was decided that we document this instead. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Much of the content was made obsolete by recent changes, we now optimize the experience for users, avoiding recreate at boot unless checksums for configuration or base image have changed. This was also a good time to explain the difference between mutable and immutable tags, which sometimes is a cause for great confusion. Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Description
This PR introduces significant improvements to container lifecycle management, focusing on optimization, safety, and ease of upgrades. A long-missing container upgrade test has also been added to verify both the functionality and that old images are removed. Not just any old downloaded OCI archives but also old/unused images loaded into the container store.
Key Features
Container Lifecycle Optimization
containers: possible to set longer instance name than the system supports #1146
Container Removal and Cleanup
containers: large containers sometimes never start up #1147
.sha256sidecar files when images are removedFailing containers with a local OCI archive always waits for network connectivity #1148
Container Upgrade Support
:latest,:edge)admin@example:/> container upgrade NAMEDocumentation
Test
container_prefix from test directoriesV=1support formake test-specto enable debug modetest: verify container upgrade #624
Technical Details
.sha256sidecar files in/var/lib/containers/oci//var/lib/containers/cleanup/<name>/<container-id>, issuecontainers: large containers sometimes never start up #1147
podman stoptimeout to prevent abrupt terminations, issueStopping containers sometimes fails #1149
podman load/createoperations to reduce system impact (at boot mostly)Checklist
Tick relevant boxes, this PR is-a or has-a: