Commit 252e0b0

Update PR description: remove second diagram and document recent improvements
1 parent b22bb3a commit 252e0b0

1 file changed

+213
-0
lines changed

scripts/pr-description.md

Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
# Add comprehensive OpenShift cluster destroyer script

## Summary

This PR introduces a robust bash script for safely destroying OpenShift clusters on AWS. The script handles multiple cluster states, including properly installed clusters, orphaned clusters without state files, and partially created clusters that failed during installation.

## Key Features

### Core Capabilities
- **Multi-method destruction**: Attempts `openshift-install` first, falls back to manual AWS cleanup
- **Comprehensive resource cleanup**: Handles EC2, VPC, ELB, Route53, S3, and all associated resources
- **Auto-detection**: Automatically discovers infrastructure IDs from cluster names
- **Orphaned cluster support**: Can destroy clusters even without metadata/state files
- **Reconciliation loop**: Multiple attempts with intelligent retry logic for stubborn resources

### Safety Features
- **Dry-run mode**: Preview all resources before deletion with `--dry-run`
- **Confirmation prompts**: Requires explicit confirmation before destructive actions
- **Input validation**: Prevents injection attacks with strict input sanitization
- **Detailed logging**: Local file logging + optional CloudWatch integration
- **Resource verification**: Post-destruction verification to ensure complete cleanup
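
As an illustration of the input validation bullet above, here is a minimal sketch of an allow-list check. The function name `validate_identifier` and the exact pattern are assumptions made for this example; the script's actual rules may differ.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not taken from the script): accept only lowercase
# letters, digits, and hyphens, starting with an alphanumeric character,
# with a maximum length of 63 characters.
validate_identifier() {
    local LC_ALL=C   # make the [a-z] range independent of the locale
    local value="$1"
    local pattern='^[a-z0-9][a-z0-9-]{0,62}$'
    [[ "$value" =~ $pattern ]]
}

for candidate in "my-cluster" "my-cluster-abc12" 'bad;rm -rf /' '$(whoami)'; do
    if validate_identifier "$candidate"; then
        printf 'accepted: %s\n' "$candidate"
    else
        printf 'rejected: %s\n' "$candidate"
    fi
done
```

Rejecting anything outside a small allow-list is usually simpler to reason about than trying to escape dangerous characters after the fact.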

### Operational Features
- **List clusters**: Discover all OpenShift clusters in a region with `--list`
- **Flexible targeting**: Destroy by cluster name, infra-id, or metadata file
- **Parallel operations**: Optimized API calls for faster resource counting
- **Progress tracking**: Real-time status updates during destruction
- **S3 state management**: Automatic cleanup of cluster state files
- **Flexible logging**: Custom log paths with `--log-file`, disable with `--no-log`
- **Color control**: Disable colors with `--no-color` for CI/CD pipelines
- **No jq dependency**: Uses native Unix tools for JSON parsing
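
To make the "no jq dependency" point concrete, the sketch below pulls a string value out of a JSON file with `grep` and `sed` only. The helper name `json_get_string` and the sample file contents are assumptions for illustration. The approach only handles simple `"key": "value"` pairs that sit on a single line and contain no escaped quotes, so it is a narrow substitute for a real JSON parser.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the first string value stored under a key.
# Limitations: key and value must be on one line, no escaped quotes.
json_get_string() {
    local key="$1" file="$2"
    grep -o "\"${key}\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" "$file" \
        | head -n 1 \
        | sed 's/^"[^"]*"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

# Sample file with an assumed shape, used only for this demonstration.
sample="$(mktemp)"
printf '%s\n' '{"clusterName":"my-cluster","infraID":"my-cluster-abc12","aws":{"region":"us-east-1"}}' > "$sample"

json_get_string infraID "$sample"
json_get_string region "$sample"
rm -f "$sample"
```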

## Architecture Overview

```mermaid
flowchart TD
    A[Start: User runs script] --> B[Setup logging<br/>+ CloudWatch if available]
    B --> C{--list?}

    %% List mode
    C -- yes --> L1[List clusters]
    L1 --> L2[Collect EC2/VPC tags]
    L2 --> L3[List S3 prefixes]
    L3 --> L4[Merge + deduplicate]
    L4 --> L5{--detailed?}
    L5 -- yes --> L6[Count resources in parallel]
    L5 -- no --> L7[Quick VPC status check]
    L6 --> L8[Print cluster list]
    L7 --> L8
    L8 --> Z[End]

    %% Destroy mode
    C -- no --> D[Parse args + validate inputs]
    D --> E{metadata-file?}
    E -- yes --> E1[Extract infraID, clusterName, region]
    E -- no --> F{infra-id provided?}
    E1 --> G
    F -- yes --> G[Use provided infra-id]
    F -- no --> H{cluster-name provided?}
    H -- yes --> H1[Detect infra-id via VPC tag or S3]
    H -- no --> X[Exit: missing identifier]
    H1 --> G

    G --> I[Count resources in parallel]
    I --> J{resources == 0?}
    J -- yes --> J1[Cleanup S3 state] --> Z
    J -- no --> K[Show detailed resources]

    K --> Q{--force or --dry-run?}
    Q -- no --> Q1[Prompt confirm] --> Q2{confirmed?}
    Q2 -- no --> Z
    Q2 -- yes --> R
    Q -- yes --> R[Proceed]

    R --> S{openshift-install + metadata?}
    S -- yes --> S1[Run openshift-install destroy]
    S1 --> S2{success?}
    S2 -- yes --> S3[Clean Route53 records] --> T
    S2 -- no --> U
    S -- no --> U[Manual cleanup]

    subgraph Reconciliation Loop
        direction TB
        U --> M1[1. Terminate EC2 instances]
        M1 --> M2[2. Delete Classic ELBs + ALB/NLBs<br/>by name and by VPC]
        M2 --> M3[3. Delete NAT Gateways]
        M3 --> M4[4. Release Elastic IPs]
        M4 --> M5[5. Delete orphan ENIs]
        M5 --> M6[6. Delete VPC Endpoints]
        M6 --> M7[7. Delete Security Groups<br/>remove rules first]
        M7 --> M8[8. Delete Subnets]
        M8 --> M9[9. Delete Route Tables + associations]
        M9 --> M10[10. Detach & Delete Internet Gateway]
        M10 --> M11[11. Delete VPC]
        M11 --> M12[12. Cleanup Route53: api and *.apps]
        M12 --> V[Recount resources]
        V --> W{remaining > 0 and attempts < MAX_ATTEMPTS?}
        W -- yes --> U
        W -- no --> T[Proceed]
    end

    T --> Y[Cleanup S3 state<br/>resolve by cluster or infra-id]
    Y --> V2[Final verification count]
    V2 --> CW[Send summary to CloudWatch if enabled]
    CW --> Z
```
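
The reconciliation loop in the diagram can be sketched roughly as follows. `count_resources` and `cleanup_pass` are hypothetical stand-ins that simulate AWS state with a counter so the example runs without credentials; in the real script they would correspond to the recount step and the twelve cleanup steps.

```bash
#!/usr/bin/env bash

# Simulated state: number of resources still present (assumption for the demo).
REMAINING=5

# Hypothetical stand-in for the recount step.
count_resources() { echo "$REMAINING"; }

# Hypothetical stand-in for one full cleanup pass; here each pass removes
# at most two resources to imitate deletions blocked by dependencies.
cleanup_pass() {
    if (( REMAINING > 2 )); then
        REMAINING=$(( REMAINING - 2 ))
    else
        REMAINING=0
    fi
}

# Repeat cleanup passes until nothing is left or the attempt budget is spent.
# Returns 0 only when no resources remain.
reconcile() {
    local max_attempts="${1:-5}"
    local attempt=0
    local remaining
    remaining="$(count_resources)"
    while (( remaining > 0 && attempt < max_attempts )); do
        attempt=$(( attempt + 1 ))
        cleanup_pass
        remaining="$(count_resources)"
        echo "attempt ${attempt}/${max_attempts}: ${remaining} resources remaining"
    done
    (( remaining == 0 ))
}

reconcile 5 && echo "cleanup complete"
```

Recounting after every pass, instead of tracking individual failures, keeps the loop simple: a resource that could not be deleted because of a dependency just shows up again in the next count.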

## Usage Examples

### List all clusters in a region
```bash
./scripts/destroy-openshift-cluster.sh --list
./scripts/destroy-openshift-cluster.sh --list --detailed  # With resource counts
```

### Destroy a cluster
```bash
# By cluster name (auto-detects infra-id)
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster

# By infrastructure ID
./scripts/destroy-openshift-cluster.sh --infra-id my-cluster-abc12

# Using metadata file
./scripts/destroy-openshift-cluster.sh --metadata-file /path/to/metadata.json
```
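
According to the architecture diagram, the three targeting options are checked in a fixed order: metadata file first, then infra-id, then cluster name. A minimal sketch of that precedence follows; the function name `resolve_target` and its output format are invented for this example.

```bash
#!/usr/bin/env bash

# Hypothetical helper: report which identifier would be used, following the
# precedence metadata file > infra-id > cluster name. Empty strings mean
# "option not given". Fails when no identifier is provided.
resolve_target() {
    local metadata_file="$1" infra_id="$2" cluster_name="$3"
    if [[ -n "$metadata_file" ]]; then
        echo "metadata-file:${metadata_file}"
    elif [[ -n "$infra_id" ]]; then
        echo "infra-id:${infra_id}"
    elif [[ -n "$cluster_name" ]]; then
        echo "cluster-name:${cluster_name}"
    else
        echo "error: provide --metadata-file, --infra-id, or --cluster-name" >&2
        return 1
    fi
}

resolve_target "" "my-cluster-abc12" "my-cluster"
```

In this sketch, passing both an infra-id and a cluster name selects the infra-id, which avoids the extra lookup through VPC tags or S3.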

### Preview destruction (dry-run)
```bash
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster --dry-run
```

### Force deletion without prompts
```bash
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster --force
```
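
The interaction between `--force`, `--dry-run`, and the confirmation prompt could look roughly like the sketch below. The variable names `FORCE` and `DRY_RUN`, the function name, and the choice to require typing the cluster name are all assumptions; the PR description only states that explicit confirmation is required.

```bash
#!/usr/bin/env bash

# Hypothetical helper: skip the prompt for --force or --dry-run, otherwise
# require the user to type the cluster name exactly. Returns 0 to proceed.
confirm_destruction() {
    local cluster="$1"
    local reply
    if [[ "${FORCE:-false}" == "true" || "${DRY_RUN:-false}" == "true" ]]; then
        return 0
    fi
    printf 'Type the cluster name (%s) to confirm destruction: ' "$cluster" >&2
    IFS= read -r reply || return 1   # EOF (for example in CI) counts as "no"
    [[ "$reply" == "$cluster" ]]
}
```

A caller would wrap the destructive steps in something like `confirm_destruction "$CLUSTER_NAME" || exit 0`, where `CLUSTER_NAME` is again a placeholder name.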

### Customize reconciliation attempts
```bash
./scripts/destroy-openshift-cluster.sh --cluster-name stubborn-cluster --max-attempts 10
```

### Logging options
```bash
# Custom log file location
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster --log-file /var/log/destroy.log

# Disable file logging (console only)
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster --no-log

# Disable colored output for CI/CD
./scripts/destroy-openshift-cluster.sh --cluster-name my-cluster --no-color
```

## Resource Deletion Order

The script follows a carefully designed deletion order to handle AWS dependencies:

1. **EC2 Instances** - Terminate all instances first
2. **Load Balancers** - Delete ELBs/ALBs/NLBs (releases public IPs)
3. **NAT Gateways** - Remove NAT gateways
4. **Elastic IPs** - Release allocated IPs
5. **Network Interfaces** - Clean orphaned ENIs
6. **VPC Endpoints** - Remove endpoints
7. **Security Groups** - Delete after removing dependencies
8. **Subnets** - Delete VPC subnets
9. **Route Tables** - Remove custom route tables
10. **Internet Gateway** - Detach and delete IGW
11. **VPC** - Finally delete the VPC itself
12. **Route53** - Clean DNS records
13. **S3 State** - Remove cluster state files
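
As a small illustration of why the order matters, the sketch below covers only steps 10 and 11: an internet gateway has to be detached and deleted before its VPC can be removed. The `run` wrapper and `delete_vpc_tail` are hypothetical names; the `aws ec2` subcommands are standard AWS CLI calls, and the wrapper prints them instead of executing them while `DRY_RUN` is `true`.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper: print the command in dry-run mode, execute otherwise.
# Defaults to dry-run so the sketch is safe to run as-is.
run() {
    if [[ "${DRY_RUN:-true}" == "true" ]]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

# Steps 10 and 11 from the list above, in dependency order.
delete_vpc_tail() {
    local vpc_id="$1" igw_id="$2"
    run aws ec2 detach-internet-gateway --internet-gateway-id "$igw_id" --vpc-id "$vpc_id"
    run aws ec2 delete-internet-gateway --internet-gateway-id "$igw_id"
    run aws ec2 delete-vpc --vpc-id "$vpc_id"
}

# Placeholder IDs, for demonstration only.
DRY_RUN=true delete_vpc_tail vpc-123 igw-456
```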

## Error Handling

- **Timeout protection**: Commands time out after 30 seconds to prevent hanging (uses the optional `timeout` command)
- **Graceful degradation**: Falls back to manual cleanup if `openshift-install` fails
- **Reconciliation loop**: Automatically retries failed deletions
- **Dependency resolution**: Removes security group rules before deletion to break circular dependencies
- **State verification**: Post-destruction check ensures complete cleanup
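
Timeout protection with an optional `timeout` binary can be sketched as below. `run_with_timeout` is a hypothetical name, and running the command without a limit when `timeout` is missing is an assumption about the fallback behaviour, not a description of the script.

```bash
#!/usr/bin/env bash

# Hypothetical helper: run a command with a time limit when the coreutils
# `timeout` command is installed, otherwise run it without a limit.
run_with_timeout() {
    local seconds="$1"
    shift
    if command -v timeout >/dev/null 2>&1; then
        timeout "$seconds" "$@"
    else
        "$@"
    fi
}

run_with_timeout 30 echo "command finished within the limit"
```

With GNU `timeout`, a command that exceeds the limit is terminated and the wrapper returns exit status 124, which a caller can use to tell a timeout apart from an ordinary failure.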

## Requirements

- AWS CLI configured with appropriate credentials
- Standard Unix tools (`grep`, `sed`, `awk`; pre-installed on most systems)
- Optional: `openshift-install` binary for metadata-based destruction
- Optional: `timeout` command (coreutils) for operation timeouts

## Security Considerations

- Input validation prevents injection attacks
- Restricted file permissions on log files (600)
- No sensitive data logged to CloudWatch
- AWS profile validation before operations
- Confirmation prompts prevent accidental deletions
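
One way to get log files with mode 600 is to create them under a restrictive umask and then set the mode explicitly, as in the sketch below. The helper name `init_log_file` is an assumption, and the script may achieve the same result differently.

```bash
#!/usr/bin/env bash

# Hypothetical helper: create the log file (if missing) so that it is never
# group- or world-readable, then enforce mode 600 for pre-existing files too.
init_log_file() {
    local log_file="$1"
    # The subshell keeps the umask change from leaking into the caller.
    ( umask 077 && : >> "$log_file" ) || return 1
    chmod 600 "$log_file"
}
```

Setting the umask before creating the file closes the short window in which a freshly created file could otherwise be readable with default permissions.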

## Files Changed

- `scripts/destroy-openshift-cluster.sh` - New comprehensive destroyer script (2000+ lines)

## Testing Recommendations

1. Test with `--dry-run` first to verify resource detection
2. Test on a small test cluster before production use
3. Verify S3 state cleanup for your bucket naming convention
4. Test reconciliation with partially deleted clusters
5. Validate CloudWatch logging if using in CI/CD

## Related Documentation

- [OpenShift on AWS Documentation](https://docs.openshift.com/container-platform/latest/installing/installing_aws/installing-aws-default.html)
- [AWS Resource Tagging](https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html)
- Script includes comprehensive inline documentation and help text
