A comprehensive batch processing script for running ATX CLI transformations on multiple repositories with support for both serial and parallel execution modes.
Get up and running in 4 simple steps:
git clone https://github.com/venuvasu/atxclibatch.git
cd atxclibatch

Follow the installation instructions in the official ATX CLI documentation:
curl -fsSL https://desktop-release.transform.us-east-1.api.aws/install.sh | bash

Edit the sample CSV file with your repositories and transformations:
# Edit sample-repos.csv with your repositories
nano sample-repos.csv
# or
vim sample-repos.csv

For private GitHub repositories, you have two options:

Option 1: SSH keys (recommended)
Advantages:
- ✅ Set up once, works forever
- ✅ No tokens to manage/expire
- ✅ More secure (no credentials in files)
- ✅ No prompts during git clone operations
- ✅ No script modifications needed
Setup:
# 1. Generate SSH key WITHOUT passphrase (press Enter when prompted)
ssh-keygen -t ed25519 -C "your_email@example.com"
# 2. Add to ssh-agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
# 3. Copy public key to GitHub
cat ~/.ssh/id_ed25519.pub
# Go to GitHub → Settings → SSH and GPG keys → New SSH key
# Paste the public key content
# 4. Test SSH connection
ssh -T git@github.com
# Should show: "Hi username! You've successfully authenticated..."

Usage: Use SSH URLs in your CSV file:
git@github.com:user/private-repo.git

Option 2: Personal access token

Setup:
- Go to GitHub → Settings → Developer settings → Personal access tokens
- Generate a new token with `repo` permissions
- Configure git credentials or use HTTPS URLs with embedded tokens
Note: SSH is recommended as it's simpler and more secure.
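If your CSV already lists GitHub HTTPS URLs, a small helper can rewrite them into the SSH form shown above. This is a minimal sketch: `to_ssh_url` is a hypothetical name and is not part of `atx-batch-launcher.sh`. It only handles `https://github.com/...` URLs and passes everything else through unchanged.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of atx-batch-launcher.sh): rewrite a GitHub
# HTTPS clone URL as the equivalent SSH URL; other inputs pass through as-is.
to_ssh_url() {
  local url="$1"
  case "$url" in
    https://github.com/*)
      # Strip the HTTPS prefix and keep "owner/repo.git".
      printf 'git@github.com:%s\n' "${url#https://github.com/}"
      ;;
    *)
      # Local paths and URLs that are already SSH are left alone.
      printf '%s\n' "$url"
      ;;
  esac
}

to_ssh_url "https://github.com/user/private-repo.git"
# prints: git@github.com:user/private-repo.git
```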
# Basic execution using sample files
./sample-execution.sh
# Or run directly
./atx-batch-launcher.sh --csv-file sample-repos.csv
# Parallel execution with custom settings
./atx-batch-launcher.sh \
--csv-file "sample-repos.csv" \
--mode "parallel" \
--max-jobs 10 \
--output-dir "./batch_results" \
--clone-dir "./batch_repos"

That's it! Check batch_results/summary.log for execution results.
The sample-execution.sh script demonstrates common usage patterns and serves as a template for your own execution scripts:
#!/bin/bash
./atx-batch-launcher.sh \
--csv-file "sample-repos.csv" \
--mode "parallel" \
--max-jobs 8 \
--output-dir "./batch_results" \
--clone-dir "./batch_repos"

Key Features Demonstrated:
- Parallel execution with 8 concurrent jobs
- Sample CSV file with realistic repository examples
- Standard output directories for results and cloned repos
Usage Examples:
# Run the sample script directly
./sample-execution.sh
# Copy and customize for your needs
cp sample-execution.sh my-java-upgrades.sh
# Edit my-java-upgrades.sh with your parameters
./my-java-upgrades.sh
# Create custom execution scripts
cat > my-custom-execution.sh << 'EOF'
#!/bin/bash
./atx-batch-launcher.sh \
--csv-file "sample-repos.csv" \
--mode "serial" \
--max-jobs 4 \
--build-command "mvn clean install" \
--output-dir "./results-$(date +%Y%m%d)"
EOF
chmod +x my-custom-execution.sh

Customization Options:
- Change `--csv-file` to your repository list
- Adjust `--max-jobs` based on system resources
- Modify `--mode` (serial/parallel) for your workflow
- Set custom `--output-dir` and `--clone-dir` paths
- Add `--build-command` for default build instructions
The sample-repos.csv file demonstrates the CSV format with realistic examples:
repo_path,build_command,transformation_name,validation_commands,additional_plan_context
https://github.com/spring-projects/spring-petclinic.git,./mvnw clean test,java-version-upgrade,"Use JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64/bin/java and run all tests","Java 8 to 21 transformation with Spring Boot 3.4.5 and dependency migrations"
https://github.com/eugenp/tutorials.git,./gradlew clean build test,aws-sdk-migration,"Build with Java 21 and validate AWS SDK v2 usage","Migrate from AWS SDK v1 to v2 with proper error handling"
./local-spring-app,mvn clean install,spring-boot-upgrade,"Run integration tests with TestContainers","Spring Boot 2.7 to 3.4 migration with security updates"
https://github.com/Netflix/eureka.git,./gradlew build,modernization-package,"Use Java 21 and run all unit tests","Modernize to latest Spring Cloud and remove deprecated APIs"

Repository Types Demonstrated:
- Public GitHub repos (Spring PetClinic, Baeldung Tutorials, Netflix Eureka)
- Local repositories (./local-spring-app)
- Different build systems (Maven, Gradle)
- Various transformations (Java upgrades, AWS SDK migration, Spring Boot upgrades)
For Private Repositories: Use SSH URLs like git@github.com:user/private-repo.git after setting up SSH keys.
- CSV-based input - Define repositories and parameters in a simple CSV format
- Flexible repository sources - Support for local paths, GitHub HTTPS URLs, and SSH URLs
- Execution modes - Serial or parallel processing with configurable job limits
- Comprehensive logging - Individual logs per repository plus summary statistics
- Error handling - Retry mechanisms, timeout handling, and graceful cleanup
- Trust management - Trust-all-tools enabled by default for automation
- Progress tracking - Real-time status updates and completion statistics
- Resume capability - Retry failed repositories from previous runs
- Production ready - File locking, signal handling, disk space validation
- ATX version management - Automatic version checking and update notifications
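The disk space validation mentioned above can be approximated with `df`. This is a sketch under stated assumptions, not the launcher's actual code: `has_free_space` is a hypothetical name, and the 1 GB threshold is taken from the minimum this README quotes.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a free-space check.
# Usage: has_free_space <dir> <min_kb>
has_free_space() {
  local dir="$1" min_kb="$2" avail_kb
  # df -P forces single-line POSIX output; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# 1 GB = 1048576 KB
if has_free_space . 1048576; then
  echo "enough free space to start the batch"
else
  echo "less than 1 GB free in the current directory" >&2
fi
```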
- Make the script executable:
  chmod +x atx-batch-launcher.sh
- Ensure ATX CLI is installed and accessible in your PATH
# Basic serial execution (trust-all-tools enabled by default, non-interactive by default)
./atx-batch-launcher.sh --csv-file repos.csv --build-command "mvn clean install"
# Parallel execution with 8 jobs
./atx-batch-launcher.sh --csv-file repos.csv --mode parallel --max-jobs 8
# Disable trust-all-tools for manual approval
./atx-batch-launcher.sh --csv-file repos.csv --no-trust-tools
# Dry run to see what would be executed
./atx-batch-launcher.sh --csv-file repos.csv --dry-run

| Option | Description | Default |
|---|---|---|
| `--csv-file <file>` | CSV file containing repository information | Required |
| `--mode <serial\|parallel>` | Execution mode | serial |
| `--no-trust-tools` | Disable trust-all-tools (default: enabled) | false |
| `--max-jobs <number>` | Max parallel jobs (must be positive integer) | 4 |
| `--max-retries <number>` | Max retry attempts per repo (must be non-negative integer) | 1 |
| `--output-dir <dir>` | Output directory for logs | ./batch_results |
| `--clone-dir <dir>` | Directory for cloning GitHub repos | ./batch_repos |
| `--build-command <cmd>` | Default build command | None |
| `--additional-params <params>` | Additional ATX CLI parameters | --non-interactive |
| `--dry-run` | Show what would be executed without running | false |
| `--retry-failed` | Retry previously failed repositories | false |
| `--help` | Show help message | - |
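The numeric constraints on `--max-jobs` and `--max-retries` listed above can be checked with two small regex tests. This is a minimal sketch: the function names are hypothetical, and the launcher's own validation may be written differently.

```bash
#!/usr/bin/env bash
# Hypothetical validators mirroring the documented constraints.
is_positive_int()     { [[ "$1" =~ ^[1-9][0-9]*$ ]]; }   # 1, 2, 3, ...
is_non_negative_int() { [[ "$1" =~ ^[0-9]+$ ]]; }        # 0, 1, 2, ...

max_jobs="8"
max_retries="0"

is_positive_int "$max_jobs" \
  || { echo "--max-jobs must be a positive integer" >&2; exit 1; }
is_non_negative_int "$max_retries" \
  || { echo "--max-retries must be a non-negative integer" >&2; exit 1; }

echo "options look valid"
```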
The CSV file should have the following columns:
repo_path,build_command,transformation_name,validation_commands,additional_plan_context

Column Descriptions:
- `repo_path`: Local path or GitHub URL (HTTPS/SSH)
- `build_command`: Build command (optional, uses default if empty)
- `transformation_name`: Transformation to use (required)
- `validation_commands`: Validation commands for this transformation (optional)
- `additional_plan_context`: Additional context for transformation planning (optional)
Note: Repository names are automatically generated from the path (directory name for local paths, repository name for URLs).
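The naming rule in the note above can be sketched with shell parameter expansion. This is an illustration only: `repo_name_from_path` is a hypothetical name, and the launcher's actual logic may differ in edge cases.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; the launcher's actual naming logic may differ.
repo_name_from_path() {
  local path="${1%/}"       # drop one trailing slash, if any
  local name="${path##*/}"  # keep the last path component
  name="${name##*:}"        # SSH URLs without a slash, e.g. git@host:repo.git
  printf '%s\n' "${name%.git}"
}

repo_name_from_path "https://github.com/spring-projects/spring-petclinic.git"  # spring-petclinic
repo_name_from_path "git@github.com:company/microservice.git"                  # microservice
repo_name_from_path "./local-spring-app"                                       # local-spring-app
```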
You can use --additional-params to pass any ATX CLI parameters. Common options include:
Execution Control:
- `--non-interactive` - Run without user interaction (default)
- `--trust-all-tools` - Trust all tools without prompting (enabled by default)
- `--configuration <config>` - Use configuration file (e.g., `file://config.yaml`)

Knowledge Management:
- `--do-not-use-knowledge-items` - Disable knowledge items from previous transformations
- `--do-not-learn` - Prevent extracting knowledge items from execution

Conversation Management:
- `--conversation-id <id>` - Resume specific conversation
- `--resume` - Resume most recent conversation
Examples of additional parameters:
# Disable learning and knowledge items
--additional-params "--non-interactive --do-not-learn --do-not-use-knowledge-items"
# Resume specific conversation
--additional-params "--conversation-id abc123def456"

Each repository can have its own validation commands and additional plan context specified directly in the CSV file. This allows for transformation-specific customization:
- Validation Commands: Commands or instructions for validating the transformation success
- Additional Plan Context: Extra context to help the transformation agent understand the specific requirements
Examples:
# Java transformations with specific JDK requirements
repo_path,build_command,transformation_name,validation_commands,additional_plan_context
./java8-app,mvn clean test,java-version-upgrade,"Use JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64/bin/java","Java 8 to 21 transformation with Spring Boot 3.4.5"
# Python transformations with version requirements
./python-app,pytest,python-upgrade,"Run tests with Python 3.11","Django 3.2 to 4.2 migration with async support"
# Node.js transformations
./node-app,npm test,node-upgrade,"Use Node.js 20 LTS","Express 4 to 5 migration with TypeScript"

Benefits of per-repository configuration:
- Different transformations can have different validation requirements
- Specific context can be provided for each codebase
- More flexible than global parameters
- Better suited for mixed-technology batch processing
repo_path,build_command,transformation_name,validation_commands,additional_plan_context
./local-java-app,mvn clean install,aws-sdk-v1-to-v2-java-migration,"Use JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64/bin/java","Java 8 to 21 transformation with Spring Boot 3.4.5"
https://github.com/example/spring-boot-app.git,./gradlew build,spring-boot-upgrade,"Run all tests with Java 21","Migrate to Spring Boot 3.4.5 and AWS SDK 2.31.40"
/home/user/projects/legacy-app,npm run build,modernization-package,"","Node.js 16 to 20 migration"
git@github.com:company/microservice.git,make build,java-version-upgrade,"Use JAVA_HOME=/usr/lib/jvm/java-21-openjdk-amd64/bin/java","Include all dependency migrations"

| Variable | Description | Default |
|---|---|---|
| `ATX_SHELL_TIMEOUT` | Shell command timeout in seconds | 10800 (3 hours) |
Note: The script automatically checks for ATX CLI updates at startup and displays notifications if newer versions are available.
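A wrapper script can resolve `ATX_SHELL_TIMEOUT` the same way, falling back to the documented default of 10800 seconds. This is a sketch: `effective_timeout` is a hypothetical helper, and the positive-integer check is an added safeguard that the launcher is not known to perform.

```bash
#!/usr/bin/env bash
# Hypothetical helper: resolve the timeout, defaulting to 10800 seconds.
effective_timeout() {
  local t="${ATX_SHELL_TIMEOUT:-10800}"
  # Assumption: reject anything that is not a positive integer.
  if ! [[ "$t" =~ ^[1-9][0-9]*$ ]]; then
    echo "ATX_SHELL_TIMEOUT must be a positive integer, got: $t" >&2
    return 1
  fi
  printf '%s\n' "$t"
}

# Print the timeout in seconds and as hours/minutes.
secs=$(effective_timeout) \
  && echo "timeout: ${secs}s ($((secs / 3600))h $((secs % 3600 / 60))m)"
```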
The script creates the following output structure:
batch_results/
├── summary.log # Complete execution summary with statistics
├── <repo_name>_execution.log # Individual repository logs
├── <repo_name>_config.yaml # Generated config files (if validation/context provided)
├── results.txt # Machine-readable results
├── failed_repos.csv # CSV of failed repositories (if any)
└── parsed_repos.txt # Internal parsed repository data
batch_repos/ # Cloned GitHub repositories
├── <repo_name>/ # Auto-generated from repository URLs
└── ...
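After a run, a wrapper can check for `failed_repos.csv` to decide whether a retry is worthwhile. The sketch below assumes the file begins with one header row like the input CSV and holds one repository per line. That layout is an assumption, and `count_failed_repos` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Assumption: failed_repos.csv starts with one header row, like the input CSV,
# followed by one line per failed repository.
count_failed_repos() {
  local file="$1/failed_repos.csv"
  # The file only exists if something failed.
  [ -f "$file" ] || { echo 0; return; }
  # Count non-empty lines after the header.
  awk 'NR > 1 && NF { n++ } END { print n + 0 }' "$file"
}

failed=$(count_failed_repos "./batch_results")
if [ "$failed" -gt 0 ]; then
  echo "$failed repositories failed; see ./batch_results/summary.log"
fi
```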
The summary report includes:
- Statistics Table: Total repositories, success/failure counts, success rate
- Execution Details: Wall time, execution time, mode, parameters
- Failed Repositories: List of failed repos with error messages
- Detailed Results: Complete status table for all repositories
- Log File Locations: Paths to all generated logs
Example summary output:
STATISTICS TABLE
==================
Metric | Value
---------------------+----------
Total Repositories | 6
Successful | 4
Failed | 2
Success Rate | 66%
Execution Mode | parallel
Trust All Tools | true
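The 66% in the example is what integer arithmetic gives for 4 successes out of 6 (rounding would give 67%). A minimal sketch of that calculation, assuming the launcher truncates; `success_rate` is a hypothetical name:

```bash
#!/usr/bin/env bash
# Integer percentage, truncated: bash arithmetic has no floating point.
success_rate() {
  local ok="$1" total="$2"
  if [ "$total" -eq 0 ]; then
    echo 0                         # avoid division by zero on an empty batch
  else
    echo $(( ok * 100 / total ))
  fi
}

echo "Success Rate | $(success_rate 4 6)%"   # 400 / 6 truncates to 66
```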
Several example scripts are provided:
Basic Execution:
# Simple execution with default parameters
./execute-batch.sh

Documentation Analysis:
# Run comprehensive codebase analysis with 8 parallel jobs
./run-doc-analysis.sh
# Dry run with additional parameters
./run-doc-analysis-dryrun.sh

These scripts demonstrate:
- Basic serial and parallel execution
- Custom output and clone directories
- Dry run mode with validation commands
- Additional plan context usage
After a batch execution, you can retry only the failed repositories:
./atx-batch-launcher.sh --retry-failed --output-dir ./previous_batch_results

Use environment variables for advanced configuration:
# Set 5-hour timeout
export ATX_SHELL_TIMEOUT=18000
# Run with custom parameters
./atx-batch-launcher.sh \
--csv-file repos.csv \
--mode parallel \
--max-jobs 6 \
--trust-all-tools \
--transformation-name "my-custom-transformation" \
--additional-params "--non-interactive --do-not-learn"

- CPU-bound transformations: Set --max-jobs to the number of CPU cores
- I/O-bound transformations: Can use 2-4x CPU cores
- Memory considerations: Monitor memory usage with high job counts
- Network limitations: Consider bandwidth when cloning multiple repos
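These rules of thumb can be turned into a starting value for `--max-jobs`. The sketch below uses hypothetical helper names and picks 2x cores for I/O-bound work, the conservative end of the 2-4x range. Treat the result as a starting point and adjust for memory and bandwidth.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: suggest a --max-jobs value from the core count.
detect_cores() {
  # nproc is GNU coreutils; getconf is the more portable fallback.
  nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Usage: suggest_max_jobs cpu|io [cores]
suggest_max_jobs() {
  local kind="$1" cores="${2:-$(detect_cores)}"
  case "$kind" in
    cpu) echo "$cores" ;;
    io)  echo $(( cores * 2 )) ;;   # lower end of the 2-4x range
    *)   echo "usage: suggest_max_jobs cpu|io [cores]" >&2; return 1 ;;
  esac
}

echo "suggested --max-jobs for I/O-bound work: $(suggest_max_jobs io)"
```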
The script includes comprehensive error handling:
- Repository preparation failures: Logged and skipped
- ATX CLI execution failures: Retried up to --max-retries times (default: 1) with detailed logging
- Timeout handling: Uses system timeout command if available
- Signal handling: Graceful cleanup on SIGINT/SIGTERM
- Parallel job management: Proper cleanup of background processes
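The retry and timeout behavior described above can be sketched as a small wrapper. This is not the launcher's implementation: `run_with_retry` is a hypothetical name, and it assumes a `timeout` command that accepts `timeout <seconds> <command>`, as the GNU coreutils version does. Because `timeout` executes a program, the wrapped command must be an executable and not a shell function.

```bash
#!/usr/bin/env bash
# Sketch only; run_with_retry is a hypothetical name, not the launcher's code.
# Usage: run_with_retry <max_retries> <timeout_secs> <command...>
run_with_retry() {
  local max_retries="$1" timeout_secs="$2" attempt=0 status=0
  shift 2
  while :; do
    status=0
    if command -v timeout >/dev/null 2>&1; then
      timeout "$timeout_secs" "$@" || status=$?
    else
      "$@" || status=$?              # no timeout command: run unbounded
    fi
    [ "$status" -eq 0 ] && return 0
    # Give up once the retry budget is spent, keeping the last exit code.
    [ "$attempt" -ge "$max_retries" ] && return "$status"
    attempt=$(( attempt + 1 ))
    echo "attempt $attempt failed (exit $status), retrying" >&2
  done
}

run_with_retry 1 10 true && echo "ok"
```

With `max_retries` set to 1, a command runs at most twice, which matches the documented default of one retry.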
- ATX CLI not found
  - Ensure ATX CLI is installed and in PATH
  - Check with `which atx` or `atx --version`
- Permission denied on repositories
  - Ensure proper SSH keys for Git repositories
  - Check file permissions for local paths
- Timeout issues
  - Increase the `ATX_SHELL_TIMEOUT` environment variable
  - Check individual repository logs for specific issues
- Memory issues with parallel execution
  - Reduce the `--max-jobs` parameter
  - Monitor system resources during execution
- `summary.log`: Overall execution status and statistics
- `<repo_name>_execution.log`: Detailed ATX CLI output for each repository
- `results.txt`: Machine-readable results for scripting
The batch launcher is production-ready with the following features:
- Comprehensive error handling and configurable retry mechanisms (via `--max-retries`)
- Signal handling for graceful cleanup (SIGINT/SIGTERM)
- Timeout protection with configurable limits
- Automatic cleanup of failed clones and temporary files
- File locking for concurrent writes in parallel mode (prevents race conditions)
- Early validation of ATX CLI availability and version checking before processing
- Disk space validation (requires minimum 1GB free space)
- Detailed logging per repository with timestamps
- Comprehensive summary reports with statistics tables
- Machine-readable results for automation integration
- Progress tracking for long-running operations
- Automatic generation of failed repositories CSV for retry capability
- Configurable parallel job limits to prevent resource exhaustion
- Input validation for all numeric parameters (prevents invalid configurations)
- Automatic cleanup of cloned repositories between runs
- Memory-efficient CSV parsing for large repository lists
- Proper process management for parallel execution
- Trust-all-tools enabled by default for automation but configurable via `--no-trust-tools`
- No hardcoded credentials or sensitive data
- Secure temporary file handling
- Git clone error handling to prevent malicious repositories
- No built-in limit on the number of repositories in the CSV input
- Parallel execution with configurable job limits
- Efficient workspace management for large codebases
- Resume capability for interrupted executions
- Atomic file operations for safe concurrent processing
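One portable way to serialize concurrent writes, as the file-locking item above describes, is a lock directory: `mkdir` is atomic, so only one process can create it at a time. This is a sketch of the technique and not the launcher's actual mechanism, which may use something else such as `flock`. `append_locked` and the `repo-N|SUCCESS` line format are made up for illustration, and the fractional `sleep` assumes a GNU or BSD `sleep`.

```bash
#!/usr/bin/env bash
# Sketch of a mkdir-based lock; not the launcher's actual mechanism.
# Usage: append_locked <file> <line>
append_locked() {
  local file="$1" line="$2" lock="$1.lock"
  until mkdir "$lock" 2>/dev/null; do
    sleep 0.1                        # another writer holds the lock
  done
  printf '%s\n' "$line" >> "$file"
  rmdir "$lock"                      # release the lock
}

results=$(mktemp)
for i in 1 2 3 4 5 6 7 8; do
  append_locked "$results" "repo-$i|SUCCESS" &   # eight concurrent writers
done
wait
wc -l < "$results"                   # one line per writer
rm -f "$results"
```

A process killed between `mkdir` and `rmdir` leaves a stale lock behind, so a real implementation would pair this with cleanup in a signal handler.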
The batch launcher can be integrated into CI/CD pipelines:
# Example CI/CD usage
./atx-batch-launcher.sh \
--csv-file "$WORKSPACE/repos.csv" \
--mode parallel \
--max-jobs 4 \
--trust-all-tools \
--output-dir "$WORKSPACE/atx-results"
# Check exit code
if [ $? -eq 0 ]; then
echo "Batch transformation completed successfully"
else
echo "Batch transformation failed"
exit 1
fi

This script is provided under the same license as the ATX CLI.