Automated log cleanup and archival tool for EC2 fleets.
Finds old .log files, uploads them to S3, deletes the local copies, and emails a summary via AWS SES.
Application servers accumulate log files in /var/log over time, eventually consuming all available disk space. Log Archiver solves this by:
- Scanning `/var/log` for `.log` files older than 30 days
- Uploading each file to an S3 bucket (organised by instance ID)
- Deleting the local file only after a confirmed upload
- Sending an email report via AWS SES with the run summary
The script is designed to be executed across 50+ EC2 instances simultaneously using AWS EventBridge + Systems Manager (SSM) — no SSH or cron required.
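The scan step above can be sketched as a small helper. This is a simplified illustration, not the actual code in `log_archiver.py`; the function name and signature are hypothetical:

```python
import os
import time

def find_old_logs(log_dir="/var/log", max_age_days=30):
    """Return paths of .log files last modified more than max_age_days ago."""
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    old_logs = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        # Only regular .log files older than the cutoff qualify
        if name.endswith(".log") and os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            old_logs.append(path)
    return old_logs
```

The mtime comparison means a log that is still being appended to will never be archived, which is usually the behaviour you want.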
┌──────────────────────┐
│ EventBridge Rule │ ← runs every midnight: cron(0 0 * * ? *)
│ (scheduled) │
└────────┬─────────────┘
│ triggers
▼
┌──────────────────────┐
│ SSM Run Command │ ← sends the script command to the fleet
└────────┬─────────────┘
│ executes on
▼
┌──────────────────────┐
│ EC2 Instance Fleet │ ← 50+ servers, each runs log_archiver.py
│ (tagged targets) │
└────────┬─────────────┘
│ each instance:
▼
┌─────────────┐ ┌──────────┐ ┌──────────┐
│ Scan /var/log│──▶│ Upload S3│──▶│ Send SES │
│ for old logs │ │ + delete │ │ report │
└─────────────┘ └──────────┘ └──────────┘
See architecture.md for a detailed explanation of every design decision.
| Requirement | Detail |
|---|---|
| Python | 3.9 or later |
| AWS Account | With S3, SES, SSM, and EventBridge access |
| S3 Bucket | Created ahead of time (e.g. your-log-backup-bucket) |
| SES | Sender and recipient addresses verified in your SES region |
| IAM Role | Attached to every EC2 instance — see IAM Permissions |
| SSM Agent | Installed and running on each EC2 instance (Amazon Linux 2 / Ubuntu AMIs include it by default) |
# Clone the repo
git clone https://github.com/your-org/log-archiver.git
cd log-archiver
# Install Python dependencies
pip install -r requirements.txt

Edit config.py to match your environment:
S3_BUCKET = "your-log-backup-bucket" # Target S3 bucket
S3_REGION = "us-east-1"
SES_SENDER = "devops@example.com" # Verified SES sender
SES_RECIPIENT = "team@example.com" # Report recipient
LOG_DIR = "/var/log"
MAX_AGE_DAYS = 30

Note: Never hard-code AWS credentials. The script uses the IAM role attached to the EC2 instance.
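Given settings like these, the SES report step might look roughly as follows. This is a sketch: `send_report` and its `ses_client` parameter are illustrative names, not the script's actual API (the parameter exists so a stub client can be injected in tests):

```python
def send_report(sender, recipient, summary, region="us-east-1", ses_client=None):
    """Email the run summary via AWS SES, using the instance's IAM role
    for credentials -- nothing is hard-coded."""
    if ses_client is None:  # created lazily so tests can inject a stub
        import boto3
        ses_client = boto3.client("ses", region_name=region)
    ses_client.send_email(
        Source=sender,
        Destination={"ToAddresses": [recipient]},
        Message={
            "Subject": {"Data": "Log Archiver run summary"},
            "Body": {"Text": {"Data": summary}},
        },
    )
```

Both `Source` and the addresses in `ToAddresses` must be verified in SES if the account is still in the SES sandbox.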
sudo python3 log_archiver.py
`sudo` is needed because some files in `/var/log` are owned by root.
Use EventBridge + SSM Run Command to execute the script across all instances every midnight without SSH. See deployment.md for full setup instructions.
| Concern | Cron | EventBridge + SSM |
|---|---|---|
| Scale to 50+ instances | Must configure each one individually | One rule targets all tagged instances |
| New instance joins fleet | Must manually add cron entry | Auto-included if it has the tag |
| Instance replaced by ASG | Cron entry is lost | New instance inherits tag → auto-included |
| Audit trail | None | SSM logs every execution to CloudTrail |
| Visibility into results | Must SSH in to check | SSM console shows pass/fail per instance |
| Change the schedule | Edit 50+ crontabs | Change one EventBridge rule |
| Error handling | Silent failures | SSM captures stdout/stderr, can alert on failure |
| Component | Purpose |
|---|---|
| EventBridge Rule | The clock — fires at midnight UTC every day using cron(0 0 * * ? *) |
| SSM Run Command | The executor — sends a shell command to EC2 instances matching a tag filter |
| EC2 Tag | The targeting mechanism — instances tagged log-archiver: enabled are included |
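Before the first scheduled run, it can help to confirm exactly which instances the tag filter will match. A sketch using boto3 (the helper name is illustrative, and the `ec2_client` parameter exists so tests can inject a stub):

```python
def tagged_instance_ids(ec2_client=None):
    """Return the IDs of instances tagged log-archiver=enabled."""
    if ec2_client is None:  # created lazily so tests can inject a stub
        import boto3
        ec2_client = boto3.client("ec2")
    ids = []
    # Paginate: describe_instances returns at most one page of results per call
    paginator = ec2_client.get_paginator("describe_instances")
    for page in paginator.paginate(
        Filters=[{"Name": "tag:log-archiver", "Values": ["enabled"]}]
    ):
        for reservation in page["Reservations"]:
            ids.extend(inst["InstanceId"] for inst in reservation["Instances"])
    return ids
```

This uses the same `tag:log-archiver` filter syntax that SSM Run Command targets use, so the lists should agree.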
1. Tag your instances:
aws ec2 create-tags \
--resources i-0a1b2c3d4e5f67890 i-0b2c3d4e5f678901a \
--tags Key=log-archiver,Value=enabled

2. Create the EventBridge rule (midnight UTC daily):
aws events put-rule \
--name "log-archiver-midnight" \
--schedule-expression "cron(0 0 * * ? *)" \
--state ENABLED \
--description "Trigger log archiver on all tagged EC2 instances at midnight UTC"

3. Attach SSM Run Command as the target:
aws events put-targets \
--rule "log-archiver-midnight" \
--targets '[{
"Id": "LogArchiverTarget",
"Arn": "arn:aws:ssm:us-east-1::document/AWS-RunShellScript",
"RoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/EventBridgeSSMRole",
"RunCommandParameters": {
"RunCommandTargets": [{
"Key": "tag:log-archiver",
"Values": ["enabled"]
}],
"Parameters": {
"commands": ["cd /opt/log-archiver && sudo python3 log_archiver.py"]
}
}
}]'

Replace `<ACCOUNT_ID>` with your 12-digit AWS account ID.
Use SSM Run Command to push the script from an S3 deployment bucket to every tagged instance in a single command — no SSH needed:
aws ssm send-command \
--document-name "AWS-RunShellScript" \
--targets "Key=tag:log-archiver,Values=enabled" \
--parameters 'commands=[
"mkdir -p /opt/log-archiver",
"aws s3 cp s3://your-deployment-bucket/log-archiver/ /opt/log-archiver/ --recursive",
"pip3 install -r /opt/log-archiver/requirements.txt"
]'

- 00:00 UTC — EventBridge rule fires.
- EventBridge calls SSM `SendCommand`, targeting all instances with tag `log-archiver=enabled`.
- SSM Agent on each of the 50+ instances receives the command in parallel.
- Each instance executes `python3 /opt/log-archiver/log_archiver.py`.
- The script runs its full flow: find old logs → upload to S3 → delete locally → email report.
- SSM captures the output (all `logger.info` messages) — visible in the SSM console or CloudWatch.
- If any instance fails, you can see exactly which one and why.
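The delete-only-after-a-confirmed-upload guarantee can be sketched as follows. This is a simplified illustration; `upload_then_delete` and its `s3_client` parameter are hypothetical names (the parameter lets tests inject a stub client):

```python
import os

def upload_then_delete(path, bucket, instance_id, s3_client=None):
    """Upload one log file to S3 under <instance-id>/<filename>;
    delete the local copy only after the upload succeeds."""
    if s3_client is None:  # created lazily so tests can inject a stub
        import boto3
        s3_client = boto3.client("s3")
    key = f"{instance_id}/{os.path.basename(path)}"
    try:
        s3_client.upload_file(path, bucket, key)
    except Exception:
        return False  # upload failed: keep the local file for the next run
    os.remove(path)  # safe to delete -- the upload completed without raising
    return True
```

Because failures leave the file in place, a failed upload is retried automatically on the next scheduled run.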
For the full step-by-step setup (IAM roles, console walkthrough, verification, troubleshooting), see deployment.md.
Attach a role to every EC2 instance with these least-privilege policies:
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::your-log-backup-bucket",
"arn:aws:s3:::your-log-backup-bucket/*"
]
}

{
"Effect": "Allow",
"Action": "ses:SendEmail",
"Resource": "*"
}

Attach the AWS-managed policy:
arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
log-archiver/
├── log_archiver.py # Main script — scan, upload, delete, email
├── config.py # All tuneable settings in one place
├── requirements.txt # Python dependencies (boto3, requests)
├── README.md # This file
├── architecture.md # Design decisions explained for interviews
└── deployment.md # Step-by-step EventBridge + SSM setup
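`requests` appears in `requirements.txt`, presumably for the instance-ID lookup that names each S3 prefix. A sketch of that lookup using IMDSv2 (the function name and the injectable `http` parameter are illustrative, not the script's actual API):

```python
def get_instance_id(http=None, timeout=2):
    """Fetch this instance's ID from the EC2 metadata service (IMDSv2).

    The http parameter defaults to the requests module; tests can pass
    a stub with the same put/get interface.
    """
    if http is None:
        import requests as http
    # IMDSv2 requires a session token before any metadata request
    token = http.put(
        "http://169.254.169.254/latest/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
        timeout=timeout,
    ).text
    return http.get(
        "http://169.254.169.254/latest/meta-data/instance-id",
        headers={"X-aws-ec2-metadata-token": token},
        timeout=timeout,
    ).text
```

The short timeout matters: off EC2 the metadata address is unreachable, and the call should fail fast rather than hang.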
- Show the code — walk through `log_archiver.py` top-to-bottom.
- Run manually on a test EC2 instance: `sudo python3 log_archiver.py`
- Check S3 — verify files appear under `s3://your-log-backup-bucket/<instance-id>/`.
- Check email — show the SES summary report.
- Show EventBridge rule in the AWS Console and explain the SSM target.
- Explain IAM — open the EC2 instance role and walk through the permissions.
2026-03-06 00:00:01 [INFO] ==================================================
2026-03-06 00:00:01 [INFO] Log Archiver started
2026-03-06 00:00:01 [INFO] ==================================================
2026-03-06 00:00:01 [INFO] Running on EC2 instance: i-0a1b2c3d4e5f67890
2026-03-06 00:00:01 [INFO] Scanning /var/log for files older than 30 days...
2026-03-06 00:00:01 [INFO] Found 12 log file(s) older than 30 days in /var/log
2026-03-06 00:00:01 [INFO] Uploading logs to S3...
2026-03-06 00:00:02 [INFO] Uploaded -> s3://your-log-backup-bucket/i-0a1b2c3d4e5f67890/app.log
2026-03-06 00:00:02 [INFO] Deleted local file: /var/log/app.log
...
2026-03-06 00:00:05 [ERROR] Failed to upload auth.log: An error occurred (AccessDenied)
2026-03-06 00:00:06 [INFO] --------------------------------------------------
2026-03-06 00:00:06 [INFO] Summary:
2026-03-06 00:00:06 [INFO] Total logs scanned : 12
2026-03-06 00:00:06 [INFO] Successfully uploaded: 10
2026-03-06 00:00:06 [INFO] Deleted locally : 10
2026-03-06 00:00:06 [INFO] Failures : 2
2026-03-06 00:00:06 [INFO] --------------------------------------------------
2026-03-06 00:00:06 [INFO] Sending email summary via SES...
2026-03-06 00:00:07 [INFO] Email report sent to team@example.com
2026-03-06 00:00:07 [INFO] Log Archiver finished
MIT