A high-performance, scalable URL shortening service built with Java and Spring Boot, designed to handle millions of URLs with high availability and low latency. This project implements a distributed architecture featuring database sharding, caching, and load balancing, making it suitable for high-traffic environments comparable to services like bit.ly or tinyurl.com.
The URL Shortener Service provides a robust API to create short, unique aliases for long URLs and redirect users seamlessly. It mimics the architecture of large-scale systems by incorporating:
- Scalable ID Generation: Unique ID generation distributed across multiple machine instances.
- Database Sharding: MongoDB sharded cluster to handle massive data storage requirements.
- High-Performance Caching: Redis integration for sub-millisecond redirect latency.
- Load Balancing: Nginx to distribute traffic across application instances.
- Containerization: Full Docker support for easy deployment and orchestration.
The system follows a distributed architecture designed for horizontal scalability:
- Load Balancer (Nginx): Entry point that distributes incoming HTTP traffic across multiple application instances.
- Application Layer (Spring Boot):
  - URL Shortening: Generates unique IDs using a custom distributed algorithm (Snowflake-inspired) and Base62 encoding.
  - Redirection: Handles lookup of original URLs with a read-through cache strategy.
  - Analytics: Tracks link usage asynchronously.
- Caching Layer (Redis): Stores frequently accessed URL mappings to reduce database load and improve response times.
- Storage Layer (MongoDB Sharded Cluster): Persists URL mappings and analytics data. Data is sharded to ensure write scalability and storage capacity.
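The read-through lookup used for redirection can be illustrated with a minimal sketch. It is not the project's actual implementation: the class name `ReadThroughUrlCache` is hypothetical, an in-memory map stands in for Redis, and a plain function stands in for the MongoDB query.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: the map plays the role of Redis, the function the role of MongoDB.
class ReadThroughUrlCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, Optional<String>> store;
    private final AtomicInteger storeLookups = new AtomicInteger(); // counts cache misses

    ReadThroughUrlCache(Function<String, Optional<String>> store) {
        this.store = store;
    }

    // Returns the long URL for a short key, consulting the store only on a cache miss.
    Optional<String> resolve(String shortKey) {
        String cached = cache.get(shortKey);
        if (cached != null) {
            return Optional.of(cached); // cache hit: no store access
        }
        storeLookups.incrementAndGet();
        Optional<String> loaded = store.apply(shortKey);
        loaded.ifPresent(url -> cache.put(shortKey, url)); // populate the cache for later reads
        return loaded;
    }

    int storeLookups() {
        return storeLookups.get();
    }
}
```

Repeated calls to `resolve` with the same key reach the store once; unknown keys are not cached in this sketch, so they hit the store on every call.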
- API Gateway / Load Balancer: Nginx
- Service Instances: 3x Spring Boot Applications (simulating distributed nodes)
- Data Store: MongoDB (Config Server + 3 Shards + Mongos Router)
- Cache: Redis
- Language: Java 17
- Framework: Spring Boot 3.5.10
- Database: MongoDB (v7.0) with Sharding
- Cache: Redis (v7.0)
- Containerization: Docker & Docker Compose
- Build Tool: Maven
- Testing: JUnit 5, Testcontainers, Rest Assured, Awaitility
Accepts a long URL and returns a unique short alias.
- URL: /api/urls/shorten
- Method: POST
- Body: { "longUrl": "https://www.example.com/very/long/path/to/resource" }
- Response: { "shortUrl": "http://localhost/a1B2c3" }
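As an illustration, a client could build the shorten request with the JDK's built-in `java.net.http` API. This is a sketch only: the class name `ShortenRequestExample` is hypothetical, and the `Content-Type: application/json` header is an assumption, since the endpoint description above does not state which headers are required.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper for POST /api/urls/shorten.
class ShortenRequestExample {
    // Builds the JSON body, escaping backslashes and double quotes in the URL.
    static String toJson(String longUrl) {
        String escaped = longUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"longUrl\":\"" + escaped + "\"}";
    }

    // Builds (but does not send) the request against the given base URL.
    static HttpRequest build(String baseUrl, String longUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/urls/shorten"))
                .header("Content-Type", "application/json") // assumed, not confirmed by the API description
                .POST(HttpRequest.BodyPublishers.ofString(toJson(longUrl)))
                .build();
    }
}
```

With the stack running, the request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, and the response body should carry the `shortUrl` field.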
Redirects the user to the original URL associated with the short alias.
- URL: /{shortUrl}
- Method: GET
- Response: 302 Found (Location header set to the original URL)
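To inspect the redirect rather than follow it, a client can disable automatic redirects and read the Location header. The sketch below uses the JDK's `java.net.http` API; the class name `RedirectProbe` and its method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

// Hypothetical helper for observing the 302 response of GET /{shortUrl}.
class RedirectProbe {
    // A client that never follows redirects, so the 302 itself is visible.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
    }

    // Builds the lookup request for a short key such as "a1B2c3".
    static HttpRequest lookup(String baseUrl, String shortKey) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + shortKey)).GET().build();
    }

    // Extracts the redirect target when the response is a 302, otherwise returns empty.
    static Optional<String> target(HttpResponse<?> response) {
        return response.statusCode() == 302
                ? response.headers().firstValue("Location")
                : Optional.empty();
    }
}
```

With the stack running, `target(newClient().send(lookup("http://localhost", "a1B2c3"), HttpResponse.BodyHandlers.discarding()))` would yield the original URL if the alias exists.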
- Docker and Docker Compose installed on your machine.
- Java 17 (optional, only for local development without Docker).
This project comes with a fully configured compose.yaml to spin up the entire infrastructure.
- Clone the repository:
  git clone <repository-url>
  cd URL_Shortener
- Start the services:
  docker-compose up -d --build
  This command will start:
  - 3 MongoDB Shards, 1 Config Server, 1 Mongos Router.
  - 1 Redis instance.
  - 3 Application instances (ports 8081, 8082, 8083).
  - 1 Nginx Load Balancer (port 80).
- Initialize Database Sharding: The mongo-init.sh script runs automatically to configure the sharding environment in MongoDB.
- Access the Application: The service is typically available at http://localhost.
If you wish to run the application locally without Docker for the app instances:
- Start Dependencies: You still need MongoDB and Redis. You can use the Docker Compose file to start only the infrastructure services, leaving out the application instances:
  docker-compose up -d mongo-config mongo-shard1 mongo-shard2 mongo-shard3 mongos redis-master mongo-setup
- Build and Run:
  ./mvnw spring-boot:run
The project includes a comprehensive test suite using Testcontainers to ensure integration reliability.
Run unit and integration tests:
./mvnw test
- Base62 Encoding: We use Base62 (A-Z, a-z, 0-9) to encode unique integer IDs into short strings. This provides over 56 billion combinations with just 6 characters ($62^6$).
- Distributed ID Generation: To handle high concurrency, each application instance is assigned a unique Machine ID. This ID is embedded into the generated unique numbers, ensuring no collisions occur even when multiple servers generate IDs simultaneously.
- Sharding Strategy: MongoDB is sharded to distribute data. The shard key is carefully chosen to balance load across shards.
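The first two ideas can be sketched together. This is an illustration under stated assumptions, not the project's actual code: the class names, the alphabet ordering (`0-9`, `A-Z`, `a-z`), the bit layout (10 machine bits, 12 sequence bits, as in the original Snowflake design), and the custom epoch are all hypothetical choices.

```java
// Hypothetical Base62 codec; the character order is an assumption.
class Base62 {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static String encode(long value) {
        if (value < 0) throw new IllegalArgumentException("value must be non-negative");
        if (value == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(ALPHABET.charAt((int) (value % 62))); // least significant digit first
            value /= 62;
        }
        return sb.reverse().toString();
    }

    static long decode(String text) {
        long result = 0;
        for (char c : text.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + c);
            result = result * 62 + digit;
        }
        return result;
    }
}

// Hypothetical Snowflake-style generator: timestamp | machine ID | per-millisecond sequence.
class MachineIdGenerator {
    private static final int MACHINE_BITS = 10;   // assumed: up to 1024 instances
    private static final int SEQUENCE_BITS = 12;  // assumed: up to 4096 IDs per millisecond
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long machineId;
    private final long epochMillis;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    MachineIdGenerator(long machineId, long epochMillis) {
        if (machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
            throw new IllegalArgumentException("machineId out of range");
        }
        this.machineId = machineId;
        this.epochMillis = epochMillis;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) now = lastTimestamp; // clock moved backwards: reuse last timestamp
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) { // sequence exhausted: wait for the next millisecond
                while ((now = System.currentTimeMillis()) <= lastTimestamp) Thread.onSpinWait();
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - epochMillis) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Because the machine ID occupies its own bit range, two instances with different machine IDs cannot produce the same number even within the same millisecond. Note that the length of the encoded alias depends on the magnitude of the ID: with the bit layout assumed here, IDs exceed $62^6$, so `Base62.encode(generator.nextId())` yields more than six characters. Six-character aliases require IDs below $62^6 = 56{,}800{,}235{,}584$.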
Contributions are welcome! Please fork the repository and submit a Pull Request.