This project was developed for the DevOps course of the Information Technology Management program at PUC-PR.
This service, implemented in Clojure, exposes REST and gRPC endpoints and processes millions of events per second to estimate quantiles (p50, p90, p99, etc.) in real time, using T‑Digest, Count‑Min Sketch, and functional techniques (pure functions + controlled immutability). Each time series consumes only a few MB of memory, ensuring high performance and low cost.
In high‑scale distributed systems (public APIs, microservices, IoT), storing every numeric measurement incurs high cost and query latency. The goals of this project are:
- Collect numeric samples from high‑throughput streams.
- Maintain a compact summary (T‑Digest) per key, supporting sliding windows.
- Serve quantile estimates in milliseconds for dashboards and alerts.

Typical use cases:

- Latency monitoring (APM): estimating response-time percentiles of services.
- IoT metrics: analyzing sensor values in sliding windows.
- Business KPIs: transaction volume and value on operational dashboards.
```proto
service QuantileService {
  rpc IngestSample (Sample) returns (Ack) {}
  rpc QueryQuantile (QuantileRequest) returns (QuantileResponse) {}
}

message Sample {
  string key = 1;
  double value = 2;
  int64 timestamp = 3; // epoch ms
}

message QuantileRequest {
  string key = 1;
  double q = 2;        // 0.0–1.0
  int32 windowSec = 3; // window in seconds
}

message QuantileResponse {
  double estimate = 1;
  int64 count = 2; // number of samples in window
}
```

- `POST /samples` → body as `Sample`.
- `GET /quantile?key={key}&q={q}&window={windowSec}` → returns `{ "estimate": <value>, "count": <n> }`
- A `core.async` `chan` (10,000 buffer) drives samples through a background `go-loop`.
- `clojure.spec` validates each incoming sample in the loop.
- A single atom (`states`) maps each `key` to a `StateData`.
- Each `StateData` contains:
  - `:cms` → a Count-Min Sketch instance
  - `:events` → a vector of maps `{:value … :timestamp …}`
- Pure function `add-sample->state` takes an existing `StateData` and a sample and returns a new `StateData`.
- The global `states` atom is updated via `(swap! states update key add-sample->state sample)`.
- Window logic is applied on read: `estimate-quantile` filters `:events` by timestamp ≥ now − windowSec.
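The ingestion path described above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the namespace, spec names, and `StateData` shape are assumptions.

```clojure
(ns quantile.ingest-sketch
  (:require [clojure.core.async :as async :refer [chan go-loop <!]]
            [clojure.spec.alpha :as s]))

;; Spec for an incoming sample (hypothetical field names, mirroring the proto).
(s/def ::key string?)
(s/def ::value double?)
(s/def ::timestamp int?)
(s/def ::sample (s/keys :req-un [::key ::value ::timestamp]))

;; Buffered channel provides back-pressure: producers park when it is full.
(def samples-ch (chan 10000))

;; key -> StateData; each state keeps the raw windowed events.
(def states (atom {}))

;; Pure function: existing state + sample -> new state.
(defn add-sample->state [state {:keys [value timestamp]}]
  (update (or state {:events []})
          :events conj {:value value :timestamp timestamp}))

(defn start-ingest-loop! []
  (go-loop []
    (when-some [sample (<! samples-ch)]
      (if (s/valid? ::sample sample)
        (swap! states update (:key sample) add-sample->state sample)
        (println "invalid sample:" (s/explain-str ::sample sample)))
      (recur))))
```

Keeping `add-sample->state` pure means the only side effect on the hot path is the single `swap!`, which `atom` retries safely under contention.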
- Function `estimate-quantile [state q windowSec]`:
  - Filters events by the timestamp cutoff
  - Builds a fresh TDigest
  - Adds each relevant value to the digest
  - Returns `{:estimate (.quantile digest q) :count total-events}`
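A hedged sketch of how such a function could look, using the `com.tdunning.math.stats.TDigest` class named below; the namespace and the compression value are assumptions, not the project's actual choices:

```clojure
(ns quantile.query-sketch
  (:import [com.tdunning.math.stats TDigest]))

(defn estimate-quantile
  "Filters :events to the window, folds them into a fresh T-Digest,
   and returns the quantile estimate plus the in-window count."
  [state q window-sec]
  (let [cutoff (- (System/currentTimeMillis) (* 1000 window-sec))
        events (filter #(>= (:timestamp %) cutoff) (:events state))
        digest (TDigest/createDigest 100.0)] ; compression = 100 (illustrative)
    (doseq [{:keys [value]} events]
      (.add digest value))
    {:estimate (when (seq events) (.quantile digest q))
     :count    (count events)}))
```

Building a fresh digest per query trades a little read-side CPU for exact window semantics, since the digest itself cannot "forget" old samples.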
- T‑Digest (`com.tdunning.math.stats.TDigest`) → quantile estimation
- Count‑Min Sketch (`com.clearspring.analytics.stream.frequency.CountMinSketch`) → approximate frequency
- Back-pressure via a core.async channel buffer of 10,000
- Key sharding via independent entries in the `states` atom
- Window filtering done dynamically on each query
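To illustrate the Count‑Min Sketch role, a small interop sketch against the class listed above; the constructor arguments here are illustrative, not the project's values:

```clojure
;; Count-Min Sketch usage sketch, assuming com.clearspring.analytics/stream
;; is on the classpath.
(import '[com.clearspring.analytics.stream.frequency CountMinSketch])

;; (eps, confidence, seed): sketch dimensions are derived from the
;; target error bound and confidence level.
(def cms (CountMinSketch. 0.001 0.99 42))

(.add cms "foo" 1) ; record one occurrence of item "foo"
(.add cms "foo" 1)
(.estimateCount cms "foo") ; approximate count; may overestimate, never underestimates
```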
- Pedestal HTTP (`io.pedestal.http`) for the REST server; `body-params` and `json-body` interceptors for parsing/serializing JSON
- gRPC Java + `io.grpc.netty.shaded` + generated stubs + a `proxy` implementation
- `mount.core` for lifecycle management of the ingest loop, HTTP server, and gRPC server
- `clojure.core.async` for asynchronous ingestion
- `clojure.spec.alpha` for request validation
- `com.tdunning/t-digest` for T-Digest
- `com.clearspring.analytics/stream` for Count-Min Sketch
- `clojure.tools.build` (`build.clj`) for compiling stubs and Clojure code and generating an uberjar
- REPL helpers in `dev/user.clj` (`start!`, `stop!`, `restart!`)
- Error logging via `clojure.spec/explain-str` for invalid samples
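For reference, a minimal Pedestal service map for the two REST routes might look like the sketch below; the handler names and bodies are placeholders, not the project's real implementations:

```clojure
(ns quantile.server-sketch
  (:require [io.pedestal.http :as http]
            [io.pedestal.http.body-params :refer [body-params]]))

;; Placeholder handlers; the real ones would enqueue onto the ingest
;; channel and call estimate-quantile respectively.
(defn ingest-sample [_request]
  {:status 200 :body {:ack true}})

(defn query-quantile [_request]
  {:status 200 :body {:estimate 0.0 :count 0}})

;; Table-syntax routes: body-params parses the JSON request body,
;; json-body serializes response maps to JSON.
(def routes
  #{["/samples"  :post [(body-params) http/json-body ingest-sample]
     :route-name :ingest-sample]
    ["/quantile" :get  [http/json-body query-quantile]
     :route-name :query-quantile]})

(def service
  {::http/routes routes
   ::http/type   :jetty
   ::http/port   8080})

;; (http/start (http/create-server service))
```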
-
  Kaocha
  An advanced Clojure test runner that organizes unit, integration, load, and end-to-end suites.
  - Readable reports, tag support, parallel execution, and optional coverage reporting via plugins (e.g. `kaocha-cloverage`).
  - Makes it easy to group and filter tests by type, ensuring each layer is validated in isolation.
  - Documentation here.
-
  Kibit
  A static code analyzer for Clojure that suggests idiomatic refactorings and stylistic improvements.
  - Helps maintain a consistent style and avoid common “code smells.”
  - Simple integration via `clj -X:kibit`, failing the build if any warnings are emitted.
  - Documentation here.
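A `tests.edn` along these lines would give Kaocha the suite split described above; the suite ids and test paths are assumptions, not necessarily the project's actual configuration:

```clojure
;; tests.edn (hypothetical layout)
#kaocha/v1
{:tests [{:id :unit        :test-paths ["test/unit"]}
         {:id :integration :test-paths ["test/integration"]}
         {:id :load        :test-paths ["test/load"]}
         {:id :e2e         :test-paths ["test/e2e"]}]}
```

With suites declared this way, a single suite can be selected from the command line, e.g. `clj -M:kaocha :unit`.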
All tests and lint checks run automatically in the Docker/CI pipeline. Any failure in lint or tests immediately aborts the build.
- Docker Engine
- Docker Compose
- `git` (optional, for cloning the repo)
- Clone the repo (or unpack the project folder):

  ```bash
  git clone https://your-repo/pucpr-quantile-service.git
  ```

- Access the project root folder:

  ```bash
  cd pucpr-quantile-service
  ```

- Build the prod image:

  ```bash
  docker compose build --no-cache prod
  ```

- Run the container in prod mode:

  ```bash
  docker compose up -d prod
  ```
That's it!! 🚀🚀🚀
The application JAR file will be generated and the services should start on ports 8080 (HTTP) and 50051 (gRPC).
Now, you can test the endpoints.
- Clone the repo (or unpack the project folder):

  ```bash
  git clone https://your-repo/pucpr-quantile-service.git
  ```

- Access the project root folder:

  ```bash
  cd pucpr-quantile-service
  ```

- Build the dev container:

  ```bash
  docker compose build --no-cache dev
  ```

- Run the container in dev mode:

  ```bash
  docker compose run -i --service-ports dev
  ```

- Start the service: at the `user=>` prompt, run `(start!)`. You should see:

  ```
  [mount] Starting ingest loop
  [mount] Starting HTTP server on port 8080
  [mount] Starting gRPC server on port 50051
  Quantile service running on HTTP:8080 and gRPC:50051
  ```

That's it!
The endpoints should be available on ports 8080 (HTTP) and 50051 (gRPC).
Now you can test the endpoints.

- Run the linter:

  ```bash
  clj -X:kibit
  ```

- Run the unit tests:

  ```bash
  clj -M:kaocha :unit
  ```

- Run the integration tests:

  ```bash
  clj -M:kaocha :integration
  ```
-
  HTTP Endpoints

  Ingest a sample:

  ```bash
  TIMESTAMP_MS=$(($(date +%s)*1000))
  curl -X POST http://localhost:8080/samples \
    -H 'Content-Type: application/json' \
    -d '{"key":"foo","value":42.0,"timestamp":'"$TIMESTAMP_MS"'}'
  ```

  Expected HTTP 200 with a JSON body, e.g.:

  ```json
  { "ack": true, "ingestedAt": 1682001234567 }
  ```

  Query a quantile:

  ```bash
  curl -i -G http://localhost:8080/quantile \
    --data-urlencode "key=foo" \
    --data-urlencode "q=0.5" \
    --data-urlencode "window=60"
  ```

  Possible responses:

  - 200 OK if data exists:

    ```json
    { "estimate": { "estimate": 42.0, "count": 1 } }
    ```

  - 404 Not Found if the key is missing:

    ```json
    { "error": "Key not found" }
    ```

  - 400 Bad Request on invalid params:

    ```json
    { "error": "Missing key, q or window" }
    ```
-
  gRPC Services

  Install `grpcurl` or use any other gRPC client. Pay attention to the `-proto` flag: it must point to the correct path of the proto file, which depends on the directory from which you run the `grpcurl` command.

  IngestSample:

  ```bash
  TIMESTAMP_MS=$(($(date +%s)*1000))
  grpcurl -plaintext \
    -proto proto/quantile_service.proto \
    -d '{"key":"foo","value":42.0,"timestamp":'"$TIMESTAMP_MS"'}' \
    localhost:50051 \
    quantile.QuantileService/IngestSample
  ```

  Response:

  ```json
  { "success": true }
  ```

  QueryQuantile:

  ```bash
  grpcurl -plaintext \
    -proto proto/quantile_service.proto \
    -d '{"key":"foo","q":0.5,"windowSec":60}' \
    localhost:50051 \
    quantile.QuantileService/QueryQuantile
  ```

  Response:

  ```json
  { "estimate": 42.0, "count": 1 }
  ```
This project leverages GitHub Actions to automate both continuous integration and continuous deployment. The workflow is defined in .github/workflows/ci-cd.yml and includes the following stages:
-
Checkout & Setup
- Checks out the repository.
- Installs JDK 19 and the Clojure CLI.
-
  Linting
  - Runs `clj -X:kibit` to enforce idiomatic Clojure and fails the build on any warnings.
-
Testing
- Executes the full Kaocha test suite (unit, integration, load, and end-to-end) inside the Docker builder stage.
- Aborts immediately if any test fails.
-
Docker Build
- Builds a multi-stage Docker image using the provided `Dockerfile`.
- The base stage installs dependencies and tools; the builder stage compiles the code and generates the uberjar; the prod stage produces a lean runtime image with just the JRE and the jar.
-
Image Tag & Push
- Tags the image with the Git commit SHA (e.g. `quantile-service:${{ github.sha }}`).
- Pushes the image to the Docker Hub registry.
-
Deployment
- Uses the AWS CLI (configured via GitHub Secrets) to update the running service on AWS.
- Performs a rolling deployment so that new containers replace old ones without downtime.
All steps are configured to fail fast, ensuring only lint- and test-verified artifacts make it to production.
The service runs in a Kubernetes-managed environment on Amazon EKS, with fully automated deployments via GitHub Actions.
- Deployed into a dedicated VPC across multiple Availability Zones for high availability.
- Uses managed node groups to run pods.
- Ingress traffic is handled by a Kubernetes Ingress, routing HTTP (port 8080) and gRPC (port 50051) requests to the `quantile-service` Deployment.
- IAM permissions for the service account are scoped via an AWS IAM Role for Service Accounts (IRSA), limiting access to only the required AWS resources.
This EKS-based setup keeps the quantile aggregation service highly available, scalable, and continuously delivered from code push to production without manual intervention.