Tweaks for FOSDEM demo #1186
Closed
The scripts are tiny, but it is nice to verify them with flake8, pylint, and black. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add "path" config, so test config looks like OpenShift UI:
repo: https://github.com/RamenDR/ocm-ramen-samples.git
path: subscription
branch: main
name: busybox-sample
namespace: busybox-sample
With this we can use the basic test to test any subscription-based
application in ocm-ramen-samples[1] and ocm-kubevirt-samples[2].
[1] https://github.com/RamenDR/ocm-ramen-samples
[2] https://github.com/aglitke/ocm-kubevirt-samples
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
It is now possible to enable Kubernetes feature gates[1] using the
minikube --feature-gates option[2]. We will use this to enable the
StatefulSetAutoDeletePVC feature gate.
Example config:
profiles:
- name: featured
feature_gates:
- StatefulSetAutoDeletePVC=true
[1] https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
[2] https://minikube.sigs.k8s.io/docs/handbook/config/#enabling-feature-gates
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
With this we can create a stateful set with
persistentVolumeClaimRetentionPolicy[1] to have the PVCs deleted when
the stateful set is deleted. This policy is required for relocate;
otherwise ramen gets stuck waiting for the VRs to become secondary.
[1] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#persistentvolumeclaim-retention
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
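A stateful set using this retention policy can be sketched as follows; the names, image, and sizes are illustrative, not taken from the PR:

```yaml
# Illustrative stateful set with persistentVolumeClaimRetentionPolicy.
# Requires the StatefulSetAutoDeletePVC feature gate mentioned above.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: busybox
spec:
  serviceName: busybox
  replicas: 1
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete   # delete PVCs when the stateful set is deleted
    whenScaled: Retain    # keep PVCs when scaling down
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: quay.io/nirsof/busybox:stable
        command: ["sleep", "infinity"]
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```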
We use `--namespace ramen-system`, but these resources are deployed at cluster scope. I guess the `--namespace` is ignored in this case, since this code works as is. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add `pvc_label` configuration so we can test any application. With this we can run basic-test with VMs from ocm-kubevirt-samples[1]. [1] https://github.com/aglitke/ocm-kubevirt-samples Signed-off-by: Nir Soffer <nsoffer@redhat.com>
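A test config using the new option might look like this; the key placement and label value are assumptions based on the config format shown earlier in this PR:

```yaml
repo: https://github.com/aglitke/ocm-kubevirt-samples.git
path: subscription
branch: main
name: kubevirt-sample
namespace: kubevirt-sample
pvc_label: appname=kubevirt-sample   # label selector, illustrative
```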
The basic-test can now be used with OpenShift clusters (using the new ocm-ramen-samples providing subscription and dr kustomizations). The only issue is the dr policy: basic-test is hard-coded to use the `dr-policy` installed by ramenctl, which is not available in our OpenShift test clusters. Fix by using a dr policy owned by the test, created when deploying the application and removed when undeploying it. The name of the policy must be configured in the test `config.yaml`. To be able to run concurrent tests, each test config must have its own dr policy. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Since we can test any application now (e.g. busybox, kubevirt), we don't want to mention busybox in the logs. Use config['name'] where we can to make the logs clearer. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
The channel is part of the subscription kustomization in ocm-ramen-samples, so we don't need to deploy or undeploy it. The basic config now uses the new deployment from my repo. We will update the repo when the ocm-ramen-samples PR is merged. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
basic-test can now be run with a custom configuration file. This can be
used to run multiple tests concurrently.
test/basic-test/run --config rbd-deploy.yaml $env 2>rbd.log &
test/basic-test/run --config cephfs-deploy.yaml $env 2>cephfs.log &
wait
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add test configurations for multiple applications for OpenShift and
Kubernetes using the ocm-ramen-samples repo.
Currently using my own repo until the new applications are merged.
To run a test using a custom configuration use:
basic-test/run \
--config configs/odr/busybox-regional-rbd-deploy.yaml \
env.yaml
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Using my repo until the relevant PR[1] is merged. [1] aglitke/ocm-kubevirt-samples#6 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This tiny tool reads a test suite YAML and runs the tests in parallel,
logging each test to a separate file.
A test suite binds tests (e.g. basic-test) to application configurations
(e.g. busybox-regional-rbd-deploy).
We have 2 test suites:
$ tree suites/
suites/
├── basic-k8s.yaml
└── basic-odr.yaml
Example run with a drenv-created environment:
$ ./drtest --outdir /tmp/k8s-logs suites/basic-k8s.yaml envs/regional-dr.yaml
2023-11-28 00:55:54,099 INFO Running 'Basic Kubernetes Regional DR tests'
2023-11-28 00:55:54,099 INFO Storing output to '/tmp/k8s-logs'
2023-11-28 00:55:54,101 INFO Starting test 'deploymnet'
2023-11-28 00:55:54,101 INFO Starting test 'statefulset'
2023-11-28 00:55:54,102 INFO Starting test 'daemonset'
2023-11-28 01:04:23,274 INFO Test 'daemonset' PASS
2023-11-28 01:04:24,161 INFO Test 'deploymnet' PASS
2023-11-28 01:04:53,600 INFO Test 'statefulset' PASS
2023-11-28 01:04:53,600 INFO PASS (3 pass, 0 fail)
Each test logs to a separate file:
$ tree /tmp/k8s-logs
/tmp/k8s-logs
├── daemonset.log
├── deploymnet.log
└── statefulset.log
To test with OpenShift we need to create a tiny environment file:
$ cat env.yaml
ramen:
hub: hub
clusters: [cluster1, cluster2]
topology: regional-dr
And use a kubeconfig file with the clusters. The file can be created
with `oc login` and some `oc config` commands, or using the
oc-clusterset plugin:
$ cat config.yaml
clusters:
- name: cluster1
url: perf1.example.com:6443
username: kubeadmin
password: PeSkM-R6YcH-LyPZa-oTOO1
- name: cluster2
url: perf2.example.com:6443
username: kubeadmin
password: ZjIZn-SFUyR-aE4gI-fJcfL
- name: hub
url: perf3.example.com:6443
username: kubeadmin
password: 7C700-oVS3Q-25rtx-YMew5
current-context: hub
$ oc clusterset login --config config.yaml --kubeconfig kubeconfig
$ oc config get-contexts --kubeconfig kubeconfig
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
cluster1 perf1-example-com:6443 kube:admin/perf1-example-com:6443 default
cluster2 perf2-example-com:6443 kube:admin/perf2-example-com:6443 default
* hub perf3-example-com:6443 kube:admin/perf3-example-com:6443 default
Example run with the OpenShift environment:
$ ./drtest --kubeconfig kubeconfig --outdir /tmp/odr-logs suites/basic-odr.yaml env.yaml
2023-11-29 23:45:14,849 INFO Running 'Basic OpenShift Regional DR tests'
2023-11-29 23:45:14,849 INFO Storing output to '/tmp/odr-logs'
2023-11-29 23:45:14,850 INFO Starting test 'rbd'
2023-11-29 23:45:14,850 INFO Starting test 'cephfs'
2023-11-29 23:54:24,599 INFO Test 'rbd' PASS
2023-11-29 23:54:51,461 INFO Test 'cephfs' PASS
2023-11-29 23:54:51,461 INFO PASS (2 pass, 0 fail)
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
https://github.com/kubevirt/containerized-data-importer/releases/tag/v1.58.0 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
https://github.com/kubevirt/kubevirt/releases/tag/v1.1.1 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
There is no point in using two versions of the same image. Using this image in the CDI test can save time in the kubevirt tests later, using the cached image. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
CDI may become available before it is ready to use. If we try to use it
while it is progressing, we may fail with errors about missing CRDs.
Wait until the progressing condition becomes false.
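With kubectl this two-step wait can be sketched as follows; the resource name follows the log below, and the timeout is illustrative (this is not necessarily how the addon script does it):

```shell
# Wait until CDI reports it is available ...
kubectl wait cdi/cdi --for=condition=Available --timeout=300s

# ... and then until it has actually finished progressing,
# so all CRDs are registered before we use them.
kubectl wait cdi/cdi --for=condition=Progressing=False --timeout=300s
```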
Example run showing the issue:
2024-01-10 21:42:24,080 DEBUG [kubevirt/1] Deploying cdi cr
2024-01-10 21:42:25,674 DEBUG [kubevirt/1] Waiting until cdi cr is available
2024-01-10 21:42:26,005 DEBUG [kubevirt/1] cdi.cdi.kubevirt.io/cdi condition met
We stopped waiting here...
2024-01-10 21:42:26,007 DEBUG [kubevirt/1] Waiting until cdi cr finished progressing
2024-01-10 21:42:39,472 DEBUG [kubevirt/1] cdi.cdi.kubevirt.io/cdi condition met
But CDI finished progressing 13 seconds later.
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
We cannot use volsync with ramen yet, and the kubevirt environment is already too big. Without volsync we can remove the volumesnapshot addon and submariner, which does not handle suspending the machine running the minikube VMs well. With this change we should be able to start an environment, suspend the laptop, and resume it in an environment with an unreliable network or no network access. This will be useful for live demos at conferences. Keep volsync enabled in `regional-dr` and `regional-dr-hubless` to keep the submariner and volsync addons functional. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
This is useful for quickly starting a stopped working environment
without trying to redeploy everything. The main motivation is using a
pre-created environment in a location with a weak network, like a
conference. Other use cases are working around bugs in addons that do
not work well when starting a stopped cluster, for example clusteradm.
With `--skip-addons` we skip the `start` and `stop` hooks, but we do run
the `test` hooks. This is useful for starting a stopped environment
faster while still testing that the environment works. To skip all hooks
run with both `--skip-addons` and `--skip-tests`.
Example run:
$ drenv start --skip-addons --skip-tests $env
2023-11-20 00:59:25,341 INFO [rdr-kubevirt] Starting environment
2023-11-20 00:59:25,464 INFO [dr1] Starting minikube cluster
2023-11-20 00:59:29,566 INFO [hub] Starting minikube cluster
2023-11-20 00:59:29,578 INFO [dr2] Starting minikube cluster
2023-11-20 01:00:23,402 INFO [dr1] Cluster started in 57.94 seconds
2023-11-20 01:00:23,402 INFO [dr1] Configuring containerd
2023-11-20 01:00:24,936 INFO [dr1] Waiting until all deployments are available
2023-11-20 01:00:28,749 INFO [hub] Cluster started in 59.18 seconds
2023-11-20 01:00:28,750 INFO [hub] Waiting until all deployments are available
2023-11-20 01:00:53,834 INFO [dr2] Cluster started in 84.26 seconds
2023-11-20 01:00:53,834 INFO [dr2] Configuring containerd
2023-11-20 01:00:55,042 INFO [dr2] Waiting until all deployments are available
2023-11-20 01:01:01,063 INFO [hub] Deployments are available in 32.31 seconds
2023-11-20 01:01:09,482 INFO [dr1] Deployments are available in 44.55 seconds
2023-11-20 01:01:34,661 INFO [dr2] Deployments are available in 39.62 seconds
2023-11-20 01:01:34,661 INFO [rdr-kubevirt] Dumping ramen e2e config to '/home/nsoffer/.config/drenv/rdr-kubevirt'
2023-11-20 01:01:34,827 INFO [rdr-kubevirt] Environment started in 129.49 seconds
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Configure CDI to allow pulling from a local insecure registry. This is useful for demos in an environment with an unreliable network, or for CI environments where we want to avoid random failures due to a flaky network. The image must be pushed to the local registry; this is easy using the standard podman push command. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
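The CDI resource accepts a list of insecure registries in its config; a sketch of such a configuration, with the registry address being illustrative:

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  config:
    # Registries CDI may pull from without TLS verification.
    insecureRegistries:
    - host.minikube.internal:5000
```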
Using a local git server we can deploy ocm applications without network access to github. This is useful for demos when the network is unreliable, for example at a conference. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
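One way to serve a local clone of the samples repo is `git daemon`; the paths are illustrative and this is not necessarily how the addon implements it:

```shell
# Create a bare mirror of the samples repo while the network is good.
git clone --mirror https://github.com/RamenDR/ocm-ramen-samples.git \
    /srv/git/ocm-ramen-samples.git

# Export everything under /srv/git over the anonymous git protocol.
git daemon --base-path=/srv/git --export-all --reuseaddr
```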
Using a local registry is useful for demos when the network is unreliable, for example at a conference. It can also be used to avoid random failures when the network is flaky, by caching remote images locally. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
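Caching a remote image in a local registry might look like this with podman; the registry port and image name are illustrative:

```shell
# Run a local registry (no TLS, for demo use only).
podman run -d --name registry -p 5000:5000 registry:2

# Pull the remote image once, then cache it in the local registry.
podman pull quay.io/nirsof/cirros:0.6.2-1
podman push --tls-verify=false quay.io/nirsof/cirros:0.6.2-1 \
    localhost:5000/cirros:0.6.2-1
```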
Suspend or resume the underlying virtual machines. Assuming the kvm2 driver to keep it simple for now; this needs a better implementation later so it also works with the qemu2 driver. The use case is building the environment with a good network, suspending it, and resuming it in an environment with a flaky network for a demo. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
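With the kvm2 driver the minikube nodes are libvirt domains, so the operation can be sketched with virsh; the profile names are illustrative and this is not the tool's actual implementation:

```shell
# Suspend the VMs backing the clusters.
for vm in dr1 dr2 hub; do
    virsh -c qemu:///system suspend "$vm"
done

# Later, resume them in the demo environment.
for vm in dr1 dr2 hub; do
    virsh -c qemu:///system resume "$vm"
done
```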
Using the local server to verify that we can demo kubevirt dr flows in an environment with an unreliable network. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
To avoid certificate renewals during testing.
Without this I experienced this error:
drenv.commands.Error: Command failed:
command: ('kubectl', 'apply', '--context', 'dr1', '--kustomize=cr')
exitcode: 1
error:
Error from server (InternalError): error when applying patch:
{"spec":{"configuration":{"developerConfiguration":{"featureGates":[]}}}}
to:
Resource: "kubevirt.io/v1, Resource=kubevirts", GroupVersionKind: "kubevirt.io/v1, Kind=KubeVirt"
Name: "kubevirt", Namespace: "kubevirt"
for: "cr": error when patching "cr": Internal error occurred: failed calling webhook
"kubevirt-update-validator.kubevirt.io": failed to call webhook: Post
"https://kubevirt-operator-webhook.kubevirt.svc:443/kubevirt-validate-update?timeout=10s":
tls: failed to verify certificate: x509: certificate has expired or is not yet valid:
current time 2024-01-26T19:05:52Z is after 2024-01-26T16:24:46Z
Thanks: Michael Henriksen <mhenriks@redhat.com>
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
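KubeVirt exposes certificate rotation settings in its CR, which can be used to extend the certificate lifetime; a sketch, with the durations being illustrative:

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  certificateRotateStrategy:
    selfSigned:
      ca:
        duration: 87600h    # ~10 years, avoids renewal during testing
      server:
        duration: 87600h
```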
To avoid certificate renewals during testing.
Without this I experienced this error when starting a stopped
environment after a day:
drenv.commands.Error: Command failed:
command: ('kubectl', 'apply', '--context', 'dr2', '--kustomize=disk')
exitcode: 1
error:
Error from server (InternalError): error when creating "disk": Internal
error occurred: failed calling webhook "populator-validate.cdi.kubevirt.io":
failed to call webhook: Post "https://cdi-api.cdi.svc:443/populator-validate?timeout=30s":
tls: failed to verify certificate: x509: certificate has expired or is not yet valid:
current time 2024-01-28T14:08:01Z is after 2024-01-27T19:15:20Z
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
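The CDI CR exposes certificate durations via `certConfig`; a sketch of extending them, with the values being illustrative:

```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  certConfig:
    ca:
      duration: 87600h    # ~10 years, avoids renewal during testing
    server:
      duration: 87600h
```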
Instead of patching the installed resource, patch it via kustomization.
With this we can check the correctness using:
kustomize build addons/cdi/cr
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
With this you can run the local registry as a systemd service starting at boot, instead of starting the registry manually when you want to use it. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
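One way to set this up is a podman-generated systemd user unit; the container name and port are illustrative, not taken from the PR:

```shell
# Create the registry container once.
podman create --name registry -p 5000:5000 registry:2

# Generate a unit file and enable it so the registry starts at boot.
podman generate systemd --new --files --name registry
mkdir -p ~/.config/systemd/user
mv container-registry.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now container-registry.service

# Allow user services to run without an active login session.
loginctl enable-linger "$USER"
```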
Explain why we need Go 1.20 and how to maintain multiple Go versions so ramen can be built and tested while using a newer default Go version. Signed-off-by: Nir Soffer <nsoffer@redhat.com>
When comparing PVs, skip comparing an unset "Spec.ClaimRef.Kind". This
breaks validation when using a KubeVirt VM, where the actual resources
in the system do not match the backed-up resources in the s3 store. It
is correct to ignore an unset kind since this is an optional field[1].
Previously we failed with:
Failed to restore PVs: failed to restore ClusterData for VolRep
(failed to restore PVs and PVCs using profile list
([s3profile-perf8-ocs-storagecluster]): failed to restore all
[]v1.PersistentVolume. Total/Restored 1/0)
And then the VRG does not make any progress. Now we consider an unset
"kind" as equal and continue the flow normally.
[1] https://github.com/kubernetes/api/blob/f3648a53522eb60ea75d70d36a50c799f7e4e23b/core/v1/types.go#L6381
Bug: https://bugzilla.redhat.com/2262455
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Not needed now, replaced by #1213
Various tweaks for FOSDEM demo:
Based on #1140