Distributed System with Auto Failover

A highly available distributed web service deployed across 3 AWS regions, with automatic health monitoring, Round Robin load balancing, and zero-downtime failover.

Architecture

                    ┌─────────────────────┐
                    │   Load Balancer      │
                    │  (load_balancer.py)  │
                    └──────────┬──────────┘
                               │ Round Robin
              ┌────────────────┼────────────────┐
              ▼                ▼                ▼
   ┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
   │  US East         │  │  US West     │  │  EU West         │
   │  Virginia        │  │  Oregon      │  │  Ireland         │
   │  3.236.124.247   │  │100.23.127.116│  │  34.245.154.170  │
   └──────────────────┘  └──────────────┘  └──────────────────┘
              ▲                ▲                ▲
              └────────────────┼────────────────┘
                               │ Health Checks (every 30s)
                    ┌──────────┴──────────┐
                    │  AWS Route 53       │
                    │  (8 global regions) │
                    └─────────────────────┘

Features

  • Multi-region deployment: 3 EC2 instances across US East, US West, and EU West
  • Health monitoring: AWS Route 53 checks from 8 global locations every 30 seconds
  • Round Robin load balancing: requests are distributed evenly across healthy servers
  • Automatic failover: servers that fail a health check are removed from rotation
  • Auto recovery: servers rejoin rotation automatically once they pass health checks again
  • Process resilience: systemd restarts crashed processes automatically
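
The rotation, failover, and recovery behavior listed above can be sketched in a few lines of Python. This is a minimal illustration, not the code in load_balancer.py: the class name `RoundRobinBalancer`, its methods, and the short server labels are all assumptions made for this example.

```python
class RoundRobinBalancer:
    """Rotate over servers, skipping any that are currently marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)       # fixed rotation order
        self.healthy = set(self.servers)   # servers eligible for traffic
        self._cursor = 0                   # index of the next candidate

    def mark_down(self, server):
        """Take a server out of rotation (failover)."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Put a known server back into rotation (auto recovery)."""
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        """Return the next healthy server, or None if every server is down."""
        for _ in range(len(self.servers)):
            candidate = self.servers[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.servers)
            if candidate in self.healthy:
                return candidate
        return None
```

Because the cursor keeps advancing past servers that are down, a recovered server slots back into its original position in the rotation rather than being appended at the end.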

Tech Stack

| Component | Technology |
| --- | --- |
| Servers | AWS EC2 (t3.micro), 3 regions |
| Web service | Python http.server |
| Process management | systemd |
| Health monitoring | AWS Route 53 Health Checks |
| Load balancing | Python Round Robin algorithm |

Demo

Normal operation — requests rotate across all 3 servers:

✅ Healthy  US East (Virginia)
✅ Healthy  US West (Oregon)
✅ Healthy  EU West (Ireland)
Available servers: 3/3

Request #1 → US East: Hello from US East (Virginia)!
Request #2 → US West: Hello from US West (Oregon)!
Request #3 → EU West: Hello from EU West (Ireland)!
Request #4 → US East: Hello from US East (Virginia)!

Failover — US East goes down, traffic automatically reroutes:

❌ Down     US East (Virginia)
✅ Healthy  US West (Oregon)
✅ Healthy  EU West (Ireland)
Available servers: 2/3

Request #5 → US West: Hello from US West (Oregon)!
Request #6 → EU West: Hello from EU West (Ireland)!
Request #7 → US West: Hello from US West (Oregon)!

Auto recovery — US East comes back, automatically rejoins:

✅ Healthy  US East (Virginia)   ← back!
✅ Healthy  US West (Oregon)
✅ Healthy  EU West (Ireland)
Available servers: 3/3
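
The healthy/down status shown in the demo has to come from some kind of probe. One way it could be produced is sketched below, using only the standard library. The function names `is_healthy` and `refresh`, the timeout value, and the shape of the `servers` mapping are assumptions for illustration; the port each server listens on is not stated in this README, so the base URLs are left to the caller.

```python
import http.client
import json
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers 200 with status "healthy"."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            if resp.status != 200:
                return False
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a malformed body:
        # all of these count as "down" for routing purposes.
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"


def refresh(servers, probe=is_healthy):
    """Map each server name to its current health, as reported by `probe`."""
    return {name: probe(url) for name, url in servers.items()}
```

Treating every failure mode as "down" keeps the routing decision simple: a server that cannot prove it is healthy receives no traffic until a later probe succeeds.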

How to Run

# Clone the repo
git clone https://github.com/96528025/distributed-system.git
cd distributed-system

# Run the load balancer
python3 load_balancer.py

Project Structure

distributed-system/
├── load_balancer.py          # Load balancer with health checks and Round Robin
├── server.py                 # Web server running on each EC2 instance
└── README.md

Server Setup (EC2)

Each server runs a Python HTTP service managed by systemd:

# /etc/systemd/system/webserver.service
[Unit]
Description=Web Server
After=network.target

[Service]
ExecStart=/usr/bin/python3 /opt/server.py
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

The /health endpoint returns server status for health checks:

{"status": "healthy", "server": "us-east"}
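
For reference, a server exposing that endpoint could look roughly like the sketch below, built on Python's http.server as listed in the tech stack. It is not the repository's server.py: the constants `SERVER_NAME` and `GREETING`, the `make_server` helper, and the default port 8080 are assumptions chosen for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SERVER_NAME = "us-east"                       # assumption: set per instance
GREETING = "Hello from US East (Virginia)!"   # assumption: set per instance


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # JSON status document consumed by the health checks
            payload = {"status": "healthy", "server": SERVER_NAME}
            body = json.dumps(payload).encode("utf-8")
            content_type = "application/json"
        else:
            # Every other path returns the region greeting
            body = GREETING.encode("utf-8")
            content_type = "text/plain; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the journal quiet; drop this override to log each request


def make_server(host="0.0.0.0", port=8080):
    """Build the HTTP server without starting it."""
    return HTTPServer((host, port), Handler)
```

Calling `make_server().serve_forever()` starts the service. Under the systemd unit above, that call would sit at the bottom of the script that `ExecStart` points to.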

