Cloud resource assessment collectors for AWS, Azure, GCP, and Microsoft 365.
Collects cloud resource inventory including:
- Compute (VMs, containers, serverless)
- Storage (block, object, file)
- Databases (managed SQL, NoSQL)
- Snapshots and backups
- Protection status analysis
- Backup/snapshot cost analysis
- Data change rate metrics (optional)
# Download and setup
curl -sL https://github.com/LFigg/cca-cloudshell/archive/refs/heads/main.tar.gz | tar xz
cd cca-cloudshell-main && ./setup.sh
# Run the unified collector (recommended)
python3 collect.py # Auto-detects cloud credentials and runs
python3 collect.py --setup # Interactive setup wizard for first-time users
python3 collect.py --cloud aws # Specify cloud explicitly
# Or run individual collectors directly
python3 aws_collect.py # AWS
python3 azure_collect.py # Azure
python3 gcp_collect.py # GCP
python3 m365_collect.py # Microsoft 365
python3 cost_collect.py --aws # Backup/snapshot costs (run from management account)
The collect.py entry point provides:
- Auto-Detection: Finds configured cloud credentials automatically
- Permission Verification: Validates credentials before collection
- Setup Wizard: Interactive setup for first-time users (--setup)
# Auto-detect and run (single cloud detected = runs automatically)
python3 collect.py
# Setup wizard - configure credentials and test permissions
python3 collect.py --setup
# Specify cloud explicitly
python3 collect.py --cloud aws
python3 collect.py --cloud azure --skip-check # Skip permission verification
# Pass arguments to underlying collector
python3 collect.py --cloud aws -- --org-role CCARole --regions us-east-1
# Show collector-specific help
python3 collect.py --cloud aws --help-collector
Each collector generates:
- cca_<cloud>_inv_<time>.json: Full resource inventory
- cca_<cloud>_sum_<time>.json: Aggregated summary
- cca_log_<time>.log: Collection log for troubleshooting
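When post-processing a directory of collector output, it can help to split each file name back into its parts. The following is a minimal sketch, not code from this repository: `parse_output_name` and `OutputFile` are hypothetical names, and because the format of `<time>` is not specified here, it is treated as an opaque string.

```python
import re
from typing import NamedTuple, Optional

# Pattern assumed from the file names listed above. The <time> part is
# kept as an opaque string because its exact format is not documented here.
_OUTPUT_RE = re.compile(
    r"^cca_(?P<cloud>[A-Za-z0-9]+)_(?P<kind>inv|sum)_(?P<time>.+)\.json$"
)


class OutputFile(NamedTuple):
    cloud: str  # e.g. "aws", "azure", "gcp"
    kind: str   # "inv" (full inventory) or "sum" (aggregated summary)
    time: str   # timestamp portion, unparsed


def parse_output_name(name: str) -> Optional[OutputFile]:
    """Return the parts of an inventory/summary file name, or None if it does not match."""
    match = _OUTPUT_RE.match(name)
    if match is None:
        return None
    return OutputFile(match["cloud"], match["kind"], match["time"])
```

Log files and other outputs (for example cost or change rate files) do not match this pattern and return `None`, so the helper can be used as a filter over a directory listing.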
- Progress Tracking: Rich terminal UI with spinners, progress bars, and resource counts (falls back to plain text when piping output)
- Retry Logic: Automatic retry with exponential backoff for transient API failures
- Multi-Account/Project: Collect across all accessible accounts, subscriptions, or projects
- Cloud Shell Ready: Works out of the box in AWS, Azure, and Google Cloud Shell environments
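For readers unfamiliar with the retry behavior mentioned above, here is a generic sketch of retry with exponential backoff. It is an illustration of the technique only, not the collectors' actual implementation: the function name, the default exception types, and the delay values are all assumptions.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on retryable errors with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...) up to
            # max_delay; random jitter keeps parallel workers from retrying
            # in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(ceiling * random.uniform(0.5, 1.0))
    raise RuntimeError("unreachable")  # satisfies type checkers
```

Errors that are not listed in `retryable` propagate immediately, which keeps permission or validation failures from being retried pointlessly. The `sleep` parameter exists so the delays can be replaced in tests.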
| Document | Description |
|---|---|
| Getting Started | Installation and first run |
| Admin Machine Setup | Running from local workstation |
| AWS Collector | Multi-account, regions, options |
| Azure Collector | Subscriptions, resources |
| GCP Collector | Projects, regions, resources |
| M365 Collector | App registration, Graph API |
| Cost Collector | Backup/snapshot spending |
| Required Permissions | IAM policies for each cloud |
| AWS CloudFormation & StackSets | IAM role deployment for 100+ accounts |
| Permission Setup Scripts | Setup scripts for Azure/GCP |
| Config Examples | YAML config file examples |
| Output Formats | JSON schema, CSV fields |
| Troubleshooting | Common errors and solutions |
# AWS - multi-account via Organizations
python3 aws_collect.py --org-role CCARole
# AWS - specific regions
python3 aws_collect.py --regions us-east-1,us-west-2
# Azure - specific subscription
python3 azure_collect.py --subscription-id xxx
# GCP - all projects
python3 gcp_collect.py --all-projects
# Custom output directory
python3 aws_collect.py -o ./my_output/
# Include full resource IDs/ARNs in output (default: redact for privacy)
python3 aws_collect.py --include-resource-ids
# Azure - include individual recovery points (slow for large environments)
python3 azure_collect.py --include-recovery-points
# Skip change rate metrics (faster collection)
python3 aws_collect.py --skip-change-rate
# Analyze backup/snapshot costs (requires management account for AWS Organizations)
python3 cost_collect.py --aws --org-costs # Break down by linked account
python3 cost_collect.py --aws --start-date 2026-01-01
When using python3 collect.py in interactive mode, you'll be prompted for cost collection options after selecting a cloud platform. This enables collecting both inventory and backup/snapshot costs in a single workflow.
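Change rate metrics are collected by default from each cloud's monitoring service (CloudWatch, Azure Monitor, or Cloud Monitoring). Use --skip-change-rate to disable collection, or --change-rate-days to change the sample window: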
python3 aws_collect.py --change-rate-days 14 # Use 14-day sample instead of 7
python3 azure_collect.py --skip-change-rate # Skip for faster collection
python3 gcp_collect.py --skip-change-rate
This outputs a separate cca_*_change_rates_*.json file with estimated daily change rates by service family. Use these values to override default DCR assumptions in sizing tools.
Note: Requires additional monitoring permissions (CloudWatch for AWS, Azure Monitor for Azure, Cloud Monitoring for GCP). See PERMISSIONS.md for details.
PersistentVolumeClaims (PVCs) are automatically collected when managed Kubernetes clusters are discovered. Use --skip-pvc to disable this:
python3 aws_collect.py --skip-pvc # Skip PVC collection from EKS clusters
python3 azure_collect.py --skip-pvc # Skip PVC collection from AKS clusters
python3 gcp_collect.py --skip-pvc # Skip PVC collection from GKE clusters
When enabled, PVC collection gathers:
- PVC name, namespace, and storage class
- Requested and actual storage sizes
- Access modes (ReadWriteOnce, ReadWriteMany, etc.)
- Bound PersistentVolume information
- Pods using each PVC
Requirements:
- kubernetes Python package: pip install kubernetes
- K8s RBAC permissions to list PVCs, PVs, and Pods in the cluster
- Network connectivity to cluster API endpoints
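To show roughly what listing PVCs with the kubernetes Python package involves, here is a minimal sketch. It is not the collectors' code: `summarize_pvcs` and `main` are hypothetical names, the output keys are illustrative, and the mapping of pods to PVCs is left out for brevity. The client call and attribute names follow the kubernetes package's `CoreV1Api` and `V1PersistentVolumeClaim` model.

```python
from typing import Any, Dict, List


def summarize_pvcs(core_v1: Any) -> List[Dict[str, Any]]:
    """Return one dict per PVC visible to the given CoreV1Api-like client."""
    rows = []
    for pvc in core_v1.list_persistent_volume_claim_for_all_namespaces().items:
        resources = pvc.spec.resources
        requests = (resources.requests or {}) if resources else {}
        capacity = (pvc.status.capacity or {}) if pvc.status else {}
        rows.append({
            "name": pvc.metadata.name,
            "namespace": pvc.metadata.namespace,
            "storage_class": pvc.spec.storage_class_name,
            "access_modes": list(pvc.spec.access_modes or []),
            "requested": requests.get("storage"),   # size asked for in the claim
            "actual": capacity.get("storage"),      # size of the bound volume, if any
            "bound_volume": pvc.spec.volume_name,   # name of the bound PersistentVolume
        })
    return rows


def main() -> None:
    # Needs `pip install kubernetes` and a kubeconfig whose user may list PVCs.
    from kubernetes import client, config

    config.load_kube_config()
    for row in summarize_pvcs(client.CoreV1Api()):
        print(row)
```

Calling `main()` runs the listing against whichever cluster the current kubeconfig context points to. Passing the client in as a parameter keeps the summarizing logic testable without a live cluster.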
Use YAML config files for repeated runs or complex configurations:
# Generate a sample config
python3 collect.py --generate-config aws > cca-config.yaml
# Edit the config, then run with it
python3 collect.py --config cca-config.yaml
# Config is auto-discovered if named cca-config.yaml in current directory
Config files support environment variable substitution (${VAR} or ${VAR:-default}).
See config-examples/ for samples.
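The ${VAR} and ${VAR:-default} syntax follows the familiar shell convention. The sketch below shows one way such substitution can be done; it is illustrative rather than the project's implementation. `substitute_env` is a hypothetical name, and leaving an unset variable without a default untouched is an assumption, since the behavior in that case is not documented here.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${VAR} and ${VAR:-default}.
_VAR_RE = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${VAR} and ${VAR:-default} references in text."""
    values = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name, default = match["name"], match["default"]
        value = values.get(name)
        if default is not None:
            # Shell semantics: the default applies when VAR is unset or empty.
            return value if value else default
        if value is None:
            # Assumption: unset variables without a default are left as-is.
            return match.group(0)
        return value

    return _VAR_RE.sub(replace, text)
```

For example, `substitute_env("region: ${AWS_REGION:-us-east-1}", {})` yields `region: us-east-1`, while the same call with `{"AWS_REGION": "eu-west-1"}` yields `region: eu-west-1`.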
Generate an Excel report with protection status analysis:
python scripts/generate_protection_report.py inventory.json report.xlsx
Generate a comprehensive multi-tab Excel report combining inventory and cost data:
# Single inventory file
python scripts/generate_assessment_report.py cca_aws_inv_*.json assessment.xlsx
# Multiple inventory files (multi-cloud)
python scripts/generate_assessment_report.py cca_*_inv_*.json --cost cca_cost_*.json -o assessment.xlsx
The assessment report includes:
- Executive summary with sizing overview
- Regional distribution for cluster placement
- Protection analysis and unprotected resources
- TCO inputs for Cohesity sizing calculator
- Multi-account breakdown
Verify your environment has the required dependencies:
python3 tests/test_cloudshell_compat.py
Deploy the IAM role with required permissions:
aws cloudformation create-stack \
--stack-name cca-collector \
--template-body file://setup/aws-iam-role.yaml \
--capabilities CAPABILITY_NAMED_IAM
For organizations with 100+ accounts, use CloudFormation StackSets for automated deployment. See AWS CloudFormation & StackSets for detailed instructions.
See setup/ for Azure/GCP permission setup scripts.
cca-cloudshell/
├── collect.py # Unified collector entry point
├── aws_collect.py # AWS collector
├── azure_collect.py # Azure collector
├── cost_collect.py # Cost analyzer
├── gcp_collect.py # GCP collector
├── m365_collect.py # M365 collector
├── pyproject.toml # Project config (mypy, pytest, ruff)
├── setup/ # IAM/permission setup scripts
├── config-examples/ # YAML config file examples
├── lib/ # Shared models and utilities
│ ├── constants.py # Centralized constants
│ ├── models.py # Resource data models
│ ├── utils.py # Common utilities
│ └── ...
├── scripts/ # Report generators
│ ├── generate_assessment_report.py
│ ├── generate_protection_report.py
│ ├── generate_cost_report.py
│ └── merge_batch_outputs.py
├── docs/ # Documentation
# Run tests
pytest tests/
# Type checking
mypy aws_collect.py azure_collect.py gcp_collect.py
# Linting
ruff check .
MIT License