Copies one DynamoDB table to another

This script will copy one DynamoDB table to another one, along with its metadata.
It uses `multiprocessing.Process` to run multiple parallel scanners on the source table, which significantly improves performance (and decreases the runtime).
- This script should be run in an environment where `awscli` is already configured. Otherwise, you need to export these environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION`.
- Python 3 (3.7+) is required to run this script.
- The IAM role corresponding to `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` MUST have the privileges to read from the source table and create new tables.
Create and activate a virtual environment:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

or

```bash
pip3 install -r requirements.txt
```

Go to the src directory:

```bash
cd src
```

Get help about the script:

```bash
python copy_dynamodb_table.py [-h]
```

or

```bash
python3 copy_dynamodb_table.py [-h]
```

Perform copy:
```bash
python copy_dynamodb_table.py -s <source_table_name> -t <target_table_name> [-c] [-v]
```

or

```bash
python3 copy_dynamodb_table.py -s <source_table_name> -t <target_table_name> [-c] [-v]
```

| Param/Flag | Purpose |
|---|---|
| `-s` or `--source` | Name of the source DynamoDB table (required) |
| `-t` or `--target` | Name of the target DynamoDB table (required) |
| `-n` or `--num-threads` | Number of parallel threads/processes to scan the source table (default: 5) |
| `-c` or `--create-table` | Whether to create the target table if it does not exist (false if not passed) |
| `-v` or `--verbose-copy` | Whether to copy additional information, i.e. tags, encryption and stream settings (false if not passed) |
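A flag table like this maps naturally onto `argparse`; a rough sketch of how the options above could be declared (the help strings and parser layout are assumptions, not the script's exact code):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Flags mirror the table above; help strings are illustrative.
    parser = argparse.ArgumentParser(
        description="Copy one DynamoDB table to another")
    parser.add_argument("-s", "--source", required=True,
                        help="Name of the source DynamoDB table")
    parser.add_argument("-t", "--target", required=True,
                        help="Name of the target DynamoDB table")
    parser.add_argument("-n", "--num-threads", type=int, default=5,
                        help="Parallel scanners on the source table")
    parser.add_argument("-c", "--create-table", action="store_true",
                        help="Create the target table if it does not exist")
    parser.add_argument("-v", "--verbose-copy", action="store_true",
                        help="Also copy tags, encryption and stream settings")
    return parser

# Parsing the example invocation below (without -v):
args = build_parser().parse_args(
    ["-s", "prod_table", "-t", "dev_table", "-n", "10", "-c"])
```

Note that `action="store_true"` gives `-c` and `-v` their documented behavior: they default to `False` when not passed.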
Example:

```bash
python copy_dynamodb_table.py -n 10 -c -v -s prod_table -t dev_table
```

or

```bash
python3 copy_dynamodb_table.py -n 10 -c -v -s prod_table -t dev_table
```

It is advisable to run this script on a powerful machine, because otherwise it will take a long time to finish on a larger table.
Running it on a powerful AWS EC2 instance helps a lot, since it reduces the network overhead.
This script consumes read capacity on your source table and write capacity on your target table.
So, running it against a large table can incur significant cost.
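As a back-of-the-envelope illustration of why large tables get expensive (the helper below is not part of the script; it only uses AWS's published capacity-unit sizes: one read capacity unit covers two eventually consistent 4 KB reads, one write capacity unit covers a single write of up to 1 KB):

```python
import math

def estimate_capacity_units(item_count: int, avg_item_kb: float) -> tuple:
    """Very rough capacity-unit estimate for one full copy of a table.

    Assumes eventually consistent reads (1 RCU = two 4 KB reads) and
    standard writes (1 WCU = one 1 KB write). Real consumption depends
    on actual item sizes and how the SDK batches requests.
    """
    read_units = item_count * math.ceil(avg_item_kb / 4) * 0.5
    write_units = item_count * math.ceil(avg_item_kb)
    return read_units, write_units

# e.g. copying 1 million items of ~1 KB each:
rcus, wcus = estimate_capacity_units(1_000_000, 1.0)
```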