Commit 76e7f85

docs(readme): update main README for Phase 5 completion (#13)
1 parent 07cd01f commit 76e7f85

1 file changed: +21 -153 lines changed

README.md

Lines changed: 21 additions & 153 deletions
@@ -1,170 +1,38 @@
11
# Spring Boot Security & Observability Lab
22

3-
This repository is a hands-on lab designed to demonstrate the architectural evolution of a modern Java application. We will build a system from the ground up, starting with a secure monolith and progressively refactoring it into a fully observable, distributed system using cloud-native best practices.
3+
This repository is an advanced, hands-on lab demonstrating the architectural evolution of a modern Java application. We will build a system from the ground up, starting with a secure monolith and progressively refactoring it into a fully observable, distributed system using cloud-native best practices.
44

55
---
66

7-
## Lab Progress: Phase 5 - Correlated Logs & Access Auditing
7+
## Workshop Guide: The Evolutionary Phases
88

9-
The `main` branch currently represents the completed state of **Phase 5**.
9+
This lab is structured in distinct, self-contained phases. The `main` branch always represents the latest completed phase. To explore a previous phase's code and detailed documentation, use the links below.
1010

11-
* **Git Tag for this Phase:** `v5.0-correlated-logs-auditing`
12-
13-
### Objective
14-
15-
The goal of this phase was to complete the "three pillars of observability" by introducing a centralized, structured logging pipeline. We have also added a critical security layer by implementing a non-invasive, AOP-based audit logging mechanism. The system is now not only fully observable (metrics, traces, and logs), but all three pillars are correlated, allowing for seamless navigation from a distributed trace directly to the logs generated during that specific transaction.
16-
17-
### Key Concepts Demonstrated
18-
19-
* **Centralized Logging:** Introducing Grafana Loki as a scalable, efficient log aggregation system.
20-
* **Unified Telemetry Collection:** Adopting Grafana Alloy as the modern, state-of-the-art agent for collecting **both logs and traces**, replacing older, single-purpose agents.
21-
* **Docker Service Discovery:** Configuring Alloy to use the Docker socket to automatically discover and scrape logs from all running containers, creating a "zero-touch" logging pipeline that scales automatically.
22-
* **Trace-to-Log Correlation:** Configuring Grafana to provide one-click navigation from a trace span in Tempo to the exact logs in Loki that correspond to that trace ID.
23-
* **Structured JSON Logging:** Ensuring all application logs are emitted as single-line, machine-readable JSON, a critical prerequisite for reliable parsing and querying.
24-
* **Aspect-Oriented Programming (AOP):** Creating a shared `lab-aspects` module to implement cross-cutting concerns without modifying business logic.
25-
* **Custom Audit Logs & Metrics:** Building an `@Auditable` aspect that generates rich, structured audit logs and corresponding Micrometer metrics (`Counter` and `Timer`) for security monitoring and alerting.
26-
* **Multi-Module Maven Project:** Refactoring the build to support a shared library module and creating robust, multi-module-aware `Dockerfiles`.
27-
28-
### Architecture Overview
29-
30-
Phase 5 enriches our distributed system with a complete, correlated observability pipeline managed by Grafana Alloy.
31-
32-
```mermaid
33-
graph TD
34-
subgraph "User's Machine"
35-
B[Browser]
36-
end
37-
38-
subgraph "Docker Compose Network (lab-net)"
39-
P[Traefik Proxy]
40-
41-
subgraph "Application Services"
42-
WC[Web Client]
43-
RS[Resource Server]
44-
end
45-
46-
subgraph "Identity Services"
47-
KC[Keycloak]
48-
DB[(PostgreSQL)]
49-
end
50-
51-
subgraph "Observability Stack"
52-
A[Alloy Agent]
53-
L[Loki]
54-
T[Tempo]
55-
Prom[Prometheus]
56-
G[Grafana]
57-
end
58-
end
59-
60-
%% User and Service Flows (Unchanged)
61-
B -- "User Interaction" --> P --> WC;
62-
WC -- "Backend API Call" --> RS;
63-
RS -- "Token Validation" --> KC;
64-
65-
%% NEW: Observability Data Flow
66-
subgraph "Telemetry Collection"
67-
RS -- "1a. Emits Traces (OTLP)" --> A;
68-
WC -- "1b. Emits Traces (OTLP)" --> A;
69-
RS -- "2a. Writes Logs (stdout)" --> Docker;
70-
WC -- "2b. Writes Logs (stdout)" --> Docker;
71-
Docker -- "3. Scraped by" --> A;
72-
end
73-
74-
subgraph "Telemetry Processing & Storage"
75-
A -- "4a. Forwards Traces" --> T;
76-
A -- "4b. Forwards Logs" --> L;
77-
RS -- "5. Exposes Metrics" --> Prom;
78-
WC -- "6. Exposes Metrics" --> Prom;
79-
end
80-
81-
subgraph "Visualization"
82-
G -- "Queries Traces" --> T;
83-
G -- "Queries Logs" --> L;
84-
G -- "Queries Metrics" --> Prom;
85-
end
86-
```
87-
88-
1. **[Grafana Loki](config/loki/loki-config.yml):** The new log storage backend. It is configured to run in a simple, single-tenant mode and stores its data in a persistent Docker volume.
89-
2. **[Grafana Alloy](config/alloy/alloy-config.river):** The new heart of our collection pipeline. It performs two critical functions:
90-
* **Log Collection:** It connects to the Docker socket to discover our running application containers, scrapes their `stdout` log streams, and forwards them to Loki.
91-
* **Trace Collection:** It acts as an OTLP endpoint, receiving traces from our applications' Java agents and forwarding them to Tempo.
11+
| Phase | Description & Key Concepts | Code & Docs (at tag) | Key Pull Requests |
12+
|:-----------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
13+
| **1. The Secure Monolith** | A standalone service that issues and validates its own JWTs. Concepts: `AuthenticationManager`, custom `JwtAuthenticationFilter`, `jjwt` library, and a foundational CI pipeline. | [`v1.0-secure-monolith`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v1.0-secure-monolith) | [#2](https://github.com/apenlor/spring-boot-security-observability-lab/pull/2), [#3](https://github.com/apenlor/spring-boot-security-observability-lab/pull/3), [#4](https://github.com/apenlor/spring-boot-security-observability-lab/pull/4) |
14+
| **2. Observing the Monolith** | The service is containerized and orchestrated via `docker-compose`. Concepts: Micrometer, Prometheus, Grafana, custom metrics, and automated dashboard provisioning. | [`v2.0-observable-monolith`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v2.0-observable-monolith) | [#6](https://github.com/apenlor/spring-boot-security-observability-lab/pull/6) |
15+
| **3. Evolving to Federated Identity** | The system is refactored into a multi-service architecture with an external IdP. Concepts: Keycloak, OIDC, OAuth2 Client (`web-client`) vs. Resource Server, Traefik reverse proxy, service-to-service security. | [`v3.0-federated-identity`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v3.0-federated-identity) | [#8](https://github.com/apenlor/spring-boot-security-observability-lab/pull/8) |
16+
| **4. Tracing a Distributed System** | Services are instrumented with the OpenTelemetry agent to generate traces. Concepts: Tempo, agent-based instrumentation, W3C Trace Context, Service Graphs, and a hybrid PUSH/PULL metrics architecture. | [`v4.0-distributed-tracing`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v4.0-distributed-tracing) | [#10](https://github.com/apenlor/spring-boot-security-observability-lab/pull/10) |
17+
| **5. Correlated Logs & Access Auditing** | The three pillars of observability are complete (metrics, traces, logs). Alloy is the unified collection agent. Concepts: Loki, Grafana Alloy, Docker service discovery, structured JSON logs, AOP-based auditing, trace-to-log correlation, and detailed audit metrics. | [`v5.0-correlated-logs-auditing`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v5.0-correlated-logs-auditing) | [#12](https://github.com/apenlor/spring-boot-security-observability-lab/pull/12) |
18+
| **6. Proactive Alerting** | _Upcoming..._ | - | - |
19+
| **7. Continuous Security Integration** | _Upcoming..._ | - | - |
20+
| **8. Advanced Secret Management** | _Upcoming..._ | - | - |
9221

9322
---
9423

95-
### Key Configuration Details
24+
## How to Follow This Lab
9625

97-
#### 1. Grafana Alloy & Docker Service Discovery
98-
99-
To achieve a fully automated logging pipeline, the `alloy` service in our `docker-compose.yml` mounts the host's Docker socket (`/var/run/docker.sock`) in read-only mode.
100-
101-
This is a privileged operation, and the decision to use it is a deliberate architectural trade-off, as documented in the `docker-compose.yml`'s security disclaimer.
102-
* **Benefit:** Alloy can query the Docker API to automatically discover every container on our project's network. It gets rich metadata like the `container_name` for free, which it uses to create labels in Loki. This means we can add new services, and our logging pipeline will **automatically start collecting their logs** with zero configuration changes.
103-
* **Mitigation:** The risk is managed by using the official, minimalist Grafana Alloy image and mounting the socket as **read-only**. Even so, this setup is not recommended for production environments.
104-
105-
The [Alloy configuration](config/alloy/alloy-config.river) is written in the River (`.river`) language and defines a clear pipeline: discover Docker containers, filter them by our project's network, relabel them with a clean `container_name`, and forward their logs to Loki.
106-
107-
#### 2. AOP-based Audit Logging
108-
109-
To handle security auditing as a cross-cutting concern, we introduced a new, shared Maven module: [`lab-aspects`](lab-aspects).
110-
* This module contains a custom `@Auditable` annotation and the `AuditLogAspect`.
111-
* The aspect intercepts any method marked with `@Auditable` and performs two actions:
112-
1. **Logs a Structured Event:** It uses SLF4J's Fluent API to create a rich, nested JSON object containing detailed context about the event (principal, roles, outcome, duration, sanitized request details, and exception info). These are logged to a dedicated `AUDIT` logger.
113-
2. **Emits Metrics:** It records a `Counter` (`app.audit.events.total`, queried in Prometheus as `app_audit_events_total`) and a `Timer` (`app_audit_events_duration_seconds`) for every audit event. These metrics are tagged with low-cardinality labels (`method`, `outcome`), which keeps the number of time series small and makes them well suited to dashboards and alerts.
114-
115-
This implementation is fully tested with its own integration test suite, which validates every feature, including the metric emission and context handling.
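
The interception pattern described above can be sketched without any framework. This is not the lab's `AuditLogAspect`: it swaps Spring AOP for a plain JDK dynamic proxy so that it runs with no dependencies, and it replaces SLF4J and Micrometer with an in-memory list and map. The `DataService` interface, the `getSecureData` method, and the `duration_us` field are hypothetical names chosen for illustration; `getAdminData`, `logger_name`, and `outcome` mirror names used elsewhere in this README.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the lab's @Auditable annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Auditable {}

// Hypothetical service interface, used only for this illustration.
interface DataService {
    @Auditable
    String getSecureData(String user);

    @Auditable
    String getAdminData(String user);
}

class DemoDataService implements DataService {
    public String getSecureData(String user) {
        return "secure data for " + user;
    }

    public String getAdminData(String user) {
        if (!"lab-admin".equals(user)) {
            throw new SecurityException("access denied");
        }
        return "admin data for " + user;
    }
}

class AuditSketch {
    // Counter keyed by "method|outcome": two low-cardinality tags only.
    static final Map<String, AtomicLong> COUNTERS = new ConcurrentHashMap<>();
    // Single-line JSON audit events; a real aspect would send these to a logger.
    static final List<String> EVENTS = new ArrayList<>();

    // Wraps target so that every @Auditable interface method is audited.
    @SuppressWarnings("unchecked")
    static <T> T audited(T target, Class<T> iface) {
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, (proxy, method, args) -> {
                if (!method.isAnnotationPresent(Auditable.class)) {
                    return method.invoke(target, args);
                }
                long start = System.nanoTime();
                String outcome = "SUCCESS";
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    outcome = "FAILURE";
                    throw e.getCause(); // rethrow the business exception unchanged
                } finally {
                    long micros = (System.nanoTime() - start) / 1_000;
                    record(method.getName(), outcome, micros);
                }
            });
    }

    static synchronized void record(String method, String outcome, long micros) {
        COUNTERS.computeIfAbsent(method + "|" + outcome, k -> new AtomicLong())
                .incrementAndGet();
        EVENTS.add("{\"logger_name\":\"AUDIT\",\"method\":\"" + method
            + "\",\"outcome\":\"" + outcome + "\",\"duration_us\":" + micros + "}");
    }

    static long count(String method, String outcome) {
        AtomicLong counter = COUNTERS.get(method + "|" + outcome);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        DataService service = audited(new DemoDataService(), DataService.class);
        service.getSecureData("lab-user");
        try {
            service.getAdminData("lab-user");
        } catch (SecurityException expected) {
            // The denial is audited before the exception reaches the caller.
        }
        EVENTS.forEach(System.out::println);
    }
}
```

The business methods contain no auditing code: the wrapper decides the outcome from whether the call returns or throws, which is the same separation the annotation-driven aspect provides.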
26+
1. **Start with the `main` branch** to see the latest state of the project.
27+
2. To go back in time, use the **"Code & Docs" link** for a specific phase. This will show you the `README.md` for that phase, which contains the specific instructions and examples for that version of the code.
28+
3. To understand the *"why"* behind the changes, review the **Key Pull Requests** for each phase.
11629

11730
---
11831

119-
## Local Development & Quick Start
120-
121-
The prerequisites and setup are the same as in previous phases.
122-
123-
1. **Configure Local Hostnames (One-Time Setup, if not already done):**
124-
Edit your local `hosts` file to add:
125-
```
126-
127.0.0.1 keycloak.local
127-
```
128-
2. **Create and Configure Your Environment File:**
129-
```bash
130-
cp .env.example .env
131-
# ...then edit .env to add your WEB_CLIENT_SECRET from Keycloak.
132-
```
133-
3. **Build and run the entire stack:**
134-
```bash
135-
docker-compose up --build -d
136-
```
137-
4. **Access the Services:**
138-
* **Web Client Application:** [http://localhost:8082](http://localhost:8082) (Login with `lab-user`/`lab-user` or
139-
`lab-admin`/`lab-admin`)
140-
* **Keycloak Admin Console:** [http://keycloak.local](http://keycloak.local) (Login with `admin`/`admin`)
141-
* **Traefik Dashboard:** [http://localhost:8080](http://localhost:8080)
142-
* **Prometheus UI:** [http://localhost:9090](http://localhost:9090)
143-
* **Grafana UI:** [http://localhost:3000](http://localhost:3000) (Login with `admin`/`admin`)
144-
---
145-
146-
## Validating the New Observability Features
147-
148-
1. **Generate Traffic:** Log in to the web client as `lab-user`/`lab-user` and click the "Call Secure API" and "Call Admin API" buttons several times.
149-
150-
2. **Validate Audit Logs:**
151-
* In Grafana, go to Explore -> Loki.
152-
* Run the query: `{container_name="resource-server"} | json | logger_name="AUDIT"`
153-
* Inspect the logs. You will see the structured `audit` object with `outcome="SUCCESS"` for successful calls and `outcome="FAILURE"` for the denied admin call.
154-
155-
3. **Validate Audit Metrics:**
156-
* In Grafana, go to Explore -> Prometheus.
157-
* Run the query: `rate(app_audit_events_total{outcome="FAILURE"}[1m])`
158-
* You should see the rate of failed audit events for the `getAdminData` method.
32+
## Running the Project
15933

160-
4. **Validate Trace-to-Log Correlation:**
161-
* Find a trace in Tempo for a `GET /fetch-data` operation.
162-
* Click on the span for the `resource-server`.
163-
* In the span details panel, a **blue "Logs for this span" button** will be visible.
164-
* Clicking it will open Loki and show you the exact logs—including the audit log—for that specific trace.
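
Conceptually, the correlation step above is a filter on the trace ID shared by a span and its log lines. A minimal sketch of that idea follows, assuming each JSON log line carries a `trace_id` field; the field name, the sample log lines, and the `TraceLogLookup` class are assumptions made for illustration, and the real lookup is performed by Grafana querying Loki, not by application code.

```java
import java.util.List;
import java.util.stream.Collectors;

class TraceLogLookup {
    // Returns the log lines whose JSON text contains the given trace ID.
    // Plain substring matching keeps the sketch dependency-free; a real
    // pipeline would parse the JSON instead.
    static List<String> logsForTrace(List<String> jsonLines, String traceId) {
        String needle = "\"trace_id\":\"" + traceId + "\"";
        return jsonLines.stream()
            .filter(line -> line.contains(needle))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "{\"trace_id\":\"abc123\",\"message\":\"request received\"}",
            "{\"trace_id\":\"def456\",\"message\":\"unrelated request\"}",
            "{\"trace_id\":\"abc123\",\"logger_name\":\"AUDIT\",\"message\":\"audit event\"}");
        logsForTrace(lines, "abc123").forEach(System.out::println);
    }
}
```

Because the audit event is emitted inside the same traced request, it carries the same trace ID as the ordinary application logs and appears in the same result set.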
34+
To run the application and see usage examples for the **current phase**, please refer to the detailed instructions in its tagged `README.md` file.
16535

166-
#### Stop the Environment
36+
**[>> Go to instructions for the current phase: `v5.0-correlated-logs-auditing` <<](https://github.com/apenlor/spring-boot-security-observability-lab/blob/v5.0-correlated-logs-auditing/docs/phase-5-readme.md#local-development--quick-start)**
16737

168-
```bash
169-
docker-compose down -v
170-
```
38+
As the lab progresses, this link will always be updated to point to the latest completed phase.
