feat: split airgap RKE2 infra pipeline into setup/destroy jobs #595
floatingman wants to merge 18 commits into rancher:main
…-infra)
Create a single Declarative Pipeline that replaces both Jenkinsfile.setup.airgap.rke2 and Jenkinsfile.destroy.airgap.rke2, controlled by an ACTION parameter (setup/destroy).
Consumes shared functions from qa-jenkins-library (rancher#589):
- airgap.standardCheckout, airgap.configureAnsible, airgap.deployRKE2, airgap.deployRancher, airgap.teardownInfrastructure
- s3.uploadArtifact, s3.downloadArtifact, s3.deleteArtifact
- tofu.initBackend, tofu.createWorkspace, tofu.apply, tofu.getOutputs
Original Jenkinsfiles remain untouched for parallel coexistence.
Refs rancher#590
Pass TESTS_REPO_URL and QA_INFRA_REPO_URL to airgap.standardCheckout() so the pipeline can be pointed at a fork for testing. Refs rancher#590
The per-stage useWithProperties wrapping caused AWS_SSH_PEM_KEY_NAME to be unavailable when only AWS_SSH_PEM_KEY was loaded. Now all credentials are loaded in a single wrapper around the stages block, matching the original pipeline's approach. Refs rancher#590
AWS credentials were unavailable inside Docker containers because the property.useWithProperties wrapper was removed. Restructured to use a parent stage with nested stages inside a single credentials wrapper, ensuring all credentials are available throughout the pipeline. Refs rancher#590
env.WORKSPACE_NAME was null inside nested closures (script > useWithProperties > stage). Use a local variable which persists correctly across all nested stages within the same script block. Refs rancher#590
The s3.groovy shared functions default to config.groovy values (rancher-qa-artifacts, us-east-1) but the pipeline uses S3_BUCKET_NAME and S3_BUCKET_REGION from Jenkins job parameters. Pass these as explicit overrides to uploadArtifact, downloadArtifact, and deleteArtifact. Also move env.WORKSPACE_NAME assignment after tofu.createWorkspace succeeds so the post-failure cleanup block has a valid value.
The feature/decouple-tofu branch removed the local_file resource that previously wrote inventory.yml during tofu apply. The pipeline now generates the inventory by fetching the airgap_inventory_json output, parsing it, and rendering the YAML inventory file before Ansible stages.
Replace the hand-rolled Groovy inventory generation with qa-infra-automation's scripts/generate_inventory.py, which is the canonical inventory renderer. Runs inside the Docker container that already has Python and PyYAML.
Instead of uploading terraform.tfvars to S3 during setup and downloading during destroy, reconstruct the tfvars file from TERRAFORM_CONFIG environment variable (set via Jenkins folder properties) in both the destroy path and post-failure cleanup.
Removes:
- Setup: Upload Terraform Variables to S3
- Destroy: Download Terraform Variables from S3
- Destroy: Delete S3 Artifacts
Adds:
- Destroy: Configure Tofu Variables (reconstructs tfvars)
- Post-failure cleanup now writes tfvars before teardown
The bastion node needs the SSH key to scp bundles to airgap nodes. The original setup pipeline had this stage but it was missing from the unified pipeline. Runs setup-ssh-keys.yml playbook after Ansible configuration and before RKE2 deployment.
Add echo statements showing which conditions pass/fail so we can diagnose why teardown isn't triggering when the build fails.
env.WORKSPACE_NAME doesn't persist from inside the property.useWithProperties closure to the post block (shows 'null'). Instead, read workspace_name.txt which is written by infrastructure.archiveWorkspaceName during the setup stages.
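The fallback described in this commit can be sketched roughly as below. This is a minimal Declarative Pipeline sketch, not the PR's actual code: it assumes `workspace_name.txt` sits at the workspace root, it stands in for `infrastructure.archiveWorkspaceName` with a plain `writeFile`, and the `echo` in the `post` block is a placeholder for the real teardown call.

```groovy
pipeline {
    agent any
    stages {
        stage('Setup') {
            steps {
                script {
                    // Placeholder for the real setup stages. In the PR this file
                    // is written by infrastructure.archiveWorkspaceName.
                    writeFile file: 'workspace_name.txt', text: 'example-workspace\n'
                    // Force a failure so the post { failure } block runs.
                    error 'Simulated failure to exercise the post block'
                }
            }
        }
    }
    post {
        failure {
            script {
                // Read the name from disk instead of env.WORKSPACE_NAME,
                // which was observed to be null at this point.
                def wsFile = 'workspace_name.txt'
                if (fileExists(wsFile)) {
                    def wsName = readFile(wsFile).trim()
                    echo "Would tear down tofu workspace '${wsName}'"
                } else {
                    echo "No ${wsFile} found; nothing to tear down"
                }
            }
        }
    }
}
```

Reading the file keeps the cleanup path independent of how environment variables propagate out of nested closures.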
Split unified Jenkinsfile.airgap-rke2-infra into separate setup and destroy pipelines. Setup uses make.runTarget() for cluster/registry/rancher Makefile targets; destroy uses Jenkins library directly. Also adds make to Dockerfile.infra and archives Rancher admin token.
In Jenkinsfiles, shared library vars are global variables — use make.runTarget() directly, not new make(). The new X() pattern is only needed inside other vars/*.groovy files.
…troy Jenkinsfile The `text` parameter `default: |` is YAML multiline syntax, not valid Groovy. Replaced with `defaultValue: '''...'''` to fix the compilation error: "unexpected token: | @ line 84".
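For illustration, a `text` parameter with a multi-line default might look like the sketch below. The parameter name `ANSIBLE_VARIABLES` appears in the review comments on this PR; the default content, the description, and the `Show` stage are assumptions made up for this example.

```groovy
pipeline {
    agent any
    parameters {
        // Declarative `text` parameters take `defaultValue`; the YAML-style
        // `default: |` form is not valid Groovy. A triple-single-quoted
        // string holds the multi-line default.
        text(
            name: 'ANSIBLE_VARIABLES',
            defaultValue: '''# Deploy Rancher
deploy_rancher: true
install_helm: true''',
            description: 'Ansible variables in YAML form (illustrative)'
        )
    }
    stages {
        stage('Show') {
            steps {
                // Print the parameter value to confirm the default was parsed.
                echo params.ANSIBLE_VARIABLES
            }
        }
    }
}
```

Because the string uses single quotes, Groovy does not interpolate `${...}` placeholders inside it; any substitution has to happen later in the pipeline.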
…t jobs)
The setup and destroy pipelines are now separate:
- Jenkinsfile.setup.airgap-rke2-infra
- Jenkinsfile.destroy.airgap-rke2-infra
Pull request overview
Splits the airgap RKE2 infrastructure Jenkins pipeline into dedicated setup and destroy jobs to better isolate provisioning vs teardown flows, while updating the infra tools image to support Makefile-driven automation.
Changes:
- Add `Jenkinsfile.setup.airgap-rke2-infra` to provision infra, deploy RKE2, optionally configure registry, and optionally deploy Rancher (via Makefile targets).
- Add `Jenkinsfile.destroy.airgap-rke2-infra` to tear down infra for a specified workspace using shared Jenkins library functions.
- Update `Dockerfile.infra` to install `make` for Makefile-driven automation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| validation/pipeline/Jenkinsfile.setup.airgap-rke2-infra | New setup pipeline: checkout, build tools image, provision infra via tofu, configure Ansible, run Make targets, archive token, and teardown-on-failure logic |
| validation/pipeline/Jenkinsfile.destroy.airgap-rke2-infra | New destroy pipeline: checkout, build tools image, init tofu backend, write tfvars, teardown infra, and cleanup artifacts |
| validation/pipeline/Jenkinsfile.airgap-rke2-infra | Unified pipeline file present in the PR despite the PR’s “split” goal |
| validation/pipeline/Dockerfile.infra | Adds make to the infra tools image packages |
    string(
        name: 'PRIVATE_REGISTRY_USERNAME',
        defaultValue: '',
        description: 'Private registry username'
    )
    string(
        name: 'PRIVATE_REGISTRY_PASSWORD',
        defaultValue: '',
        description: 'Private registry password'
    )
PRIVATE_REGISTRY_PASSWORD is defined as a plain string parameter, which makes it visible in Jenkins job configuration/build parameters and potentially logs. This should be a masked password parameter (or sourced from Jenkins credentials/folder properties) to avoid credential exposure.
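Both options mentioned in this comment can be sketched as follows. This is a hedged illustration, not the PR's final code: the credentials ID `private-registry-creds`, the variable names `REGISTRY_USER` and `REGISTRY_PASS`, and the stage itself are hypothetical, and `withCredentials` assumes the Credentials Binding plugin is installed.

```groovy
pipeline {
    agent any
    parameters {
        string(
            name: 'PRIVATE_REGISTRY_USERNAME',
            defaultValue: '',
            description: 'Private registry username'
        )
        // Option 1: a password parameter, which Jenkins masks in the
        // build-parameters UI instead of showing it as plain text.
        password(
            name: 'PRIVATE_REGISTRY_PASSWORD',
            defaultValue: '',
            description: 'Private registry password'
        )
    }
    stages {
        stage('Use registry credentials') {
            steps {
                // Option 2: bind a stored Jenkins credential.
                // 'private-registry-creds' is a hypothetical credentials ID.
                withCredentials([usernamePassword(
                    credentialsId: 'private-registry-creds',
                    usernameVariable: 'REGISTRY_USER',
                    passwordVariable: 'REGISTRY_PASS'
                )]) {
                    // Single quotes: the shell expands the variable, so the
                    // value is not interpolated into the Groovy string.
                    sh 'echo "Logging in as $REGISTRY_USER"'
                }
            }
        }
    }
}
```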
    # Whether to enable private registry configuration
    enable_private_registry: true

    # Deploy Rancher
    deploy_rancher: true
    install_helm: true

    rancher_hostname: "${HOSTNAME_PREFIX}.qa.rancher.space"
    rancher_bootstrap_password: "rancherrocks"
    rancher_admin_password: "rancherrocks"
    rancher_image_tag: ${RANCHER_VERSION}
    rancher_use_bundled_system_charts: true'''
The default ANSIBLE_VARIABLES hard-codes enable_private_registry: true and deploy_rancher: true, but the pipeline behavior makes both optional (PRIVATE_REGISTRY_URL gating + DEPLOY_RANCHER parameter). This mismatch can lead to Ansible/Make behavior diverging from the selected pipeline options; consider templating these values from parameters (or removing them if they’re not used).
    stage('Deploy RKE2 Cluster') {
        make.runTarget(
            target: 'cluster',
            dir: 'qa-infra-automation',
            makeArgs: 'ENV=airgap',
            passAwsCreds: false
        )
Several steps use hard-coded paths like dir: 'qa-infra-automation' / qa-infra-automation/tmp/... even though checkout returns env.INFRA_DIR. If airgap.standardCheckout ever changes its target directory (or the workspace layout differs), these stages will break; prefer using env.INFRA_DIR consistently.
    stage('Archive Rancher Admin Token') {
        if (params.DEPLOY_RANCHER) {
            def tokenPath = 'qa-infra-automation/tmp/rancher-admin-token.json'
            if (fileExists(tokenPath)) {
                archiveArtifacts artifacts: tokenPath, fingerprint: true
                echo "Archived Rancher admin token from ${tokenPath}"
            } else {
                echo "Warning: Rancher admin token file not found at ${tokenPath}"
            }
Archiving rancher-admin-token.json as a build artifact (with fingerprinting) exposes a highly privileged credential to anyone who can access artifacts (and potentially to downstream systems via fingerprinting). Consider storing it in a credentials store/secret manager instead, or at minimum avoid fingerprinting and ensure artifact access is tightly restricted and short-lived.
    try {
        property.useWithProperties([
            'AWS_ACCESS_KEY_ID',
            'AWS_SECRET_ACCESS_KEY'
        ]) {
            def tofuDir = "${env.INFRA_DIR}/tofu/aws/modules/airgap"

            def terraformConfig = infrastructure.parseAndSubstituteVars(
                content: env.TERRAFORM_CONFIG,
                envVars: [
                    'AWS_ACCESS_KEY_ID': env.AWS_ACCESS_KEY_ID,
                    'AWS_SECRET_ACCESS_KEY': env.AWS_SECRET_ACCESS_KEY,
                    'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
                    'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
                ]
            )
            infrastructure.writeConfig(
                path: "${tofuDir}/terraform.tfvars",
                content: terraformConfig
            )

            airgap.teardownInfrastructure(
                dir: tofuDir,
                name: wsName,
                varFile: 'terraform.tfvars'
            )
In post { failure { ... } }, the teardown path rebuilds terraform.tfvars using env.AWS_SSH_PEM_KEY_NAME, but that variable is not loaded in the useWithProperties block used for cleanup (only AWS access/secret are). If TERRAFORM_CONFIG references ${AWS_SSH_PEM_KEY_NAME}, substitution may produce an incorrect tfvars and prevent cleanup; include the needed property in the cleanup useWithProperties list or avoid substituting it here.
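The first option above (loading the property in the cleanup wrapper) might look like the fragment below. It is a sketch meant to sit inside the existing `post { failure { script { ... } } }` block, and it reuses only the shared-library calls quoted in the excerpt under review; it is not runnable on its own, and whether `AWS_SSH_PEM_KEY_NAME` is available as a folder property for this job is an assumption.

```groovy
// Fragment for the post-failure cleanup; relies on the shared library
// functions shown in the reviewed excerpt (property, infrastructure).
property.useWithProperties([
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'AWS_SSH_PEM_KEY_NAME'   // added so the substitution below sees a real value
]) {
    def terraformConfig = infrastructure.parseAndSubstituteVars(
        content: env.TERRAFORM_CONFIG,
        envVars: [
            'AWS_ACCESS_KEY_ID': env.AWS_ACCESS_KEY_ID,
            'AWS_SECRET_ACCESS_KEY': env.AWS_SECRET_ACCESS_KEY,
            'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
            'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
        ]
    )
    // Writing tfvars and calling airgap.teardownInfrastructure would follow
    // here, unchanged from the excerpt above.
}
```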
     * rancher — Deploy Rancher via Helm
     */

    def libraryBranch = params.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
This Jenkinsfile loads the shared library branch using params.QA_JENKINS_LIBRARY_BRANCH at top-level. Elsewhere in this repo the convention is to use env.QA_JENKINS_LIBRARY_BRANCH for early Jenkinsfile evaluation (e.g. validation/pipeline/Jenkinsfile.setup.airgap.rke2:3); using params here can be brittle depending on when parameters are bound. Consider switching to env.QA_JENKINS_LIBRARY_BRANCH for consistency and reliability.
Suggested change:
    - def libraryBranch = params.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
    + def libraryBranch = env.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
     * infrastructure.writeConfig, infrastructure.cleanupArtifacts
     */

    def libraryBranch = params.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
This Jenkinsfile loads the shared library branch using params.QA_JENKINS_LIBRARY_BRANCH at top-level. The established pattern in this repo is env.QA_JENKINS_LIBRARY_BRANCH (e.g. validation/pipeline/Jenkinsfile.destroy.airgap.rke2:3), which is safer during Jenkinsfile evaluation. Consider switching to env for consistency/reliability.
Suggested change:
    - def libraryBranch = params.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
    + def libraryBranch = env.QA_JENKINS_LIBRARY_BRANCH ?: 'main'
    'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
    'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
Configure Tofu Variables substitutes AWS_SSH_PEM_KEY_NAME, but this pipeline never loads/provides AWS_SSH_PEM_KEY_NAME (it’s not in useWithProperties and not a parameter). If TERRAFORM_CONFIG references ${AWS_SSH_PEM_KEY_NAME}, substitution may yield an incorrect tfvars and break destroy; either load the property/parameter or drop it from the substitution map.
Suggested change:
    - 'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
    - 'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
    + 'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX
    rancher_bootstrap_password: "rancherrocks"
    rancher_admin_password: "rancherrocks"
ANSIBLE_VARIABLES defaults hard-code the Rancher bootstrap/admin password (rancherrocks). Even for ephemeral environments this is a weak, well-known credential and is stored in cleartext in the Jenkinsfile/job config. Prefer sourcing these from Jenkins credentials (or at least a masked parameter) and generating per-run values if possible.
Suggested change:
    - rancher_bootstrap_password: "rancherrocks"
    - rancher_admin_password: "rancherrocks"
    + rancher_bootstrap_password: "${RANCHER_BOOTSTRAP_PASSWORD}"
    + rancher_admin_password: "${RANCHER_ADMIN_PASSWORD}"
- Use env instead of params for library branch resolution
- Change PRIVATE_REGISTRY_PASSWORD to password type parameter
- Template rancher bootstrap/admin passwords from parameters
- Template enable_private_registry and deploy_rancher from params
- Use env.INFRA_DIR consistently in make.runTarget and archiveArtifacts
- Remove fingerprint from archiveArtifacts for token file
- Add AWS_SSH_PEM_KEY_NAME to cleanup useWithProperties block
- Remove unused AWS_SSH_PEM_KEY_NAME from destroy substitution map
Summary
- Split unified `Jenkinsfile.airgap-rke2-infra` into separate setup (`Jenkinsfile.setup.airgap-rke2-infra`) and destroy (`Jenkinsfile.destroy.airgap-rke2-infra`) pipelines
- Setup uses `make.runTarget()` for RKE2 cluster, private registry, and Rancher deployment via the qa-infra-automation Makefile
- Destroy uses the Jenkins library directly because of the Makefile `infra-down` prompt and lack of workspace management
- Add `make` to `Dockerfile.infra` Alpine packages
- Archive the Rancher admin token (`rancher-admin-token.json`) as a build artifact
- … `airgap.teardownInfrastructure()`
Test Plan
- Run `airgap-rke2-infra-setup` job in Jenkins: verify it provisions infrastructure, deploys the RKE2 cluster, and optionally deploys Rancher
- Run `airgap-rke2-infra-destroy` job: verify infrastructure is torn down and the workspace is deleted
Depends on:
- … (`make.groovy` library function)