Commit 13c6c70

docs(readme): update main README for Phase 6 completion (#15)
1 parent fe2c50e commit 13c6c70

1 file changed

+21
-105
lines changed

README.md

Lines changed: 21 additions & 105 deletions
Original file line number | Diff line number | Diff line change
@@ -1,122 +1,38 @@
11
# Spring Boot Security & Observability Lab
22

3-
This repository is a hands-on lab designed to demonstrate the architectural evolution of a modern Java application. We will build a system from the ground up, starting with a secure monolith and progressively refactoring it into a fully observable, distributed system using cloud-native best practices.
3+
This repository is an advanced, hands-on lab demonstrating the architectural evolution of a modern Java application. We will build a system from the ground up, starting with a secure monolith and progressively refactoring it into a fully observable, distributed system using cloud-native best practices.
44

55
---
66

7-
## Lab Progress: Phase 6 - Proactive Alerting with Alertmanager
7+
## Workshop Guide: The Evolutionary Phases
88

9-
The `main` branch currently represents the completed state of **Phase 6**.
9+
This lab is structured in distinct, self-contained phases. The `main` branch always represents the latest completed phase. To explore a previous phase's code and detailed documentation, use the links below.
1010

11-
* **Git Tag for this Phase:** `v6.0-proactive-alerting`
12-
13-
### Objective
14-
15-
The goal of this phase was to transition our monitoring strategy from passive (dashboards) to **proactive**. We have integrated the Prometheus Alertmanager into our stack to create a system that can automatically detect and route notifications about problems, without requiring a human to be watching a screen. This demonstrates the completion of a production-grade monitoring feedback loop.
16-
17-
### Key Concepts Demonstrated
18-
19-
* **Prometheus Alerting Pipeline:** Understanding the distinct roles of Prometheus (which evaluates rules and generates alerts) and Alertmanager (which receives, de-duplicates, groups, and routes alerts).
20-
* **Declarative Alerting Rules:** Defining alerting conditions as code using PromQL expressions in a version-controlled YAML file.
21-
* **Alerting on Technical & Security Metrics:** Creating two distinct types of alerts:
22-
1. A **technical alert** (`ApiServerErrorRateHigh`) that fires on infrastructure-level signals like a spike in 5xx server errors.
23-
2. A **security alert** (`UnauthorizedAdminAccessSpike`) that fires on application-level signals, such as an abnormal rate of `4xx` errors on a privileged endpoint.
24-
* **Alert Lifecycle:** Observing the full lifecycle of an alert: `Inactive` -> `Pending` -> `Firing` -> `Resolved`.
25-
* **UI-Driven Test Harness:** Building a dedicated "Alerting Test Panel" in our web application to reliably trigger alert conditions on demand, proving the entire pipeline works end-to-end.
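
The alert lifecycle listed above can be illustrated with a small state machine. This is a minimal sketch, not code from the lab: the class `AlertLifecycleSketch`, its `evaluate` method, and the 30-second `for` duration are hypothetical names and values chosen for illustration. It assumes the usual Prometheus behaviour in which a rule whose expression is true stays `Pending` until it has been true for the rule's `for` duration, then becomes `Firing`. "Resolved" is modelled here simply as the return to `INACTIVE` once the condition clears.

```java
import java.time.Duration;

/** Hypothetical sketch of Prometheus-style alert states for one rule. */
class AlertLifecycleSketch {
    enum State { INACTIVE, PENDING, FIRING }

    private final Duration forDuration;          // assumed equivalent of the rule's `for` clause
    private State state = State.INACTIVE;
    private Duration activeFor = Duration.ZERO;  // how long the condition has held so far

    AlertLifecycleSketch(Duration forDuration) {
        this.forDuration = forDuration;
    }

    /** Applies one rule evaluation and returns the resulting state. */
    State evaluate(boolean conditionTrue, Duration sinceLastEvaluation) {
        if (!conditionTrue) {
            // Condition cleared: a firing alert is now resolved and the rule is inactive again.
            state = State.INACTIVE;
            activeFor = Duration.ZERO;
            return state;
        }
        if (state == State.INACTIVE) {
            // First evaluation at which the expression is true.
            activeFor = Duration.ZERO;
            state = forDuration.isZero() ? State.FIRING : State.PENDING;
        } else {
            activeFor = activeFor.plus(sinceLastEvaluation);
            if (activeFor.compareTo(forDuration) >= 0) {
                state = State.FIRING;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        AlertLifecycleSketch alert = new AlertLifecycleSketch(Duration.ofSeconds(30));
        Duration interval = Duration.ofSeconds(15); // assumed evaluation interval
        boolean[] conditionPerEvaluation = {true, true, true, false};
        for (boolean conditionTrue : conditionPerEvaluation) {
            System.out.println(alert.evaluate(conditionTrue, interval));
        }
    }
}
```

A short `for` duration, as used in a lab setting, shortens the time spent in `PENDING`, which is why repeated errors move an alert to `FIRING` quickly.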
26-
27-
### Architecture Overview
28-
29-
Phase 6 introduces Alertmanager and connects it to our existing Prometheus instance. The data flow for alerting is now a core part of our observability stack.
30-
31-
```mermaid
32-
graph TD
33-
subgraph "Application Services"
34-
RS[Resource Server]
35-
WC[Web Client]
36-
end
37-
38-
subgraph "Observability Stack"
39-
Prom[Prometheus] -->|1. Scrapes Metrics| RS
40-
Prom -->|1. Scrapes Metrics| WC
41-
42-
subgraph "Alerting Pipeline"
43-
Rules[alerts.yml] -->|2. Evaluates| Prom
44-
Prom -->|3. Sends Firing Alerts| AM[Alertmanager]
45-
end
46-
47-
G[Grafana]
48-
end
49-
50-
subgraph "Operators / External Systems"
51-
AM -->|4. Routes Notifications| Notif[Email, Slack, etc.]
52-
Ops[Operator] -->|Views & Manages Alerts| AM
53-
Ops -->|Views Dashboards| G
54-
end
55-
```
56-
57-
1. **[Prometheus](config/prometheus/prometheus.yml):** Its role is expanded. It is now configured to load a [rule file](config/prometheus/alerts.yml) and to send any alerts that become "Firing" to the Alertmanager service. The `--web.external-url` flag is set to ensure backlinks are generated with a browser-resolvable hostname.
58-
2. **[Alertmanager](config/alertmanager/alertmanager.yml):** The new central hub for all alerts. It receives alerts from Prometheus, groups them to reduce noise, and would (in a production setup) route them to configured receivers. For this lab, we use a "null" receiver.
11+
| Phase | Description & Key Concepts | Code & Docs (at tag) | Key Pull Requests |
12+
|:-----------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
13+
| **1. The Secure Monolith** | A standalone service that issues and validates its own JWTs. Concepts: `AuthenticationManager`, custom `JwtAuthenticationFilter`, `jjwt` library, and a foundational CI pipeline. | [`v1.0-secure-monolith`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v1.0-secure-monolith) | [#2](https://github.com/apenlor/spring-boot-security-observability-lab/pull/2), [#3](https://github.com/apenlor/spring-boot-security-observability-lab/pull/3), [#4](https://github.com/apenlor/spring-boot-security-observability-lab/pull/4) |
14+
| **2. Observing the Monolith** | The service is containerized and orchestrated via `docker-compose`. Concepts: Micrometer, Prometheus, Grafana, custom metrics, and automated dashboard provisioning. | [`v2.0-observable-monolith`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v2.0-observable-monolith) | [#6](https://github.com/apenlor/spring-boot-security-observability-lab/pull/6) |
15+
| **3. Evolving to Federated Identity** | The system is refactored into a multi-service architecture with an external IdP. Concepts: Keycloak, OIDC, OAuth2 Client (`web-client`) vs. Resource Server, Traefik reverse proxy, service-to-service security. | [`v3.0-federated-identity`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v3.0-federated-identity) | [#8](https://github.com/apenlor/spring-boot-security-observability-lab/pull/8) |
16+
| **4. Tracing a Distributed System** | Services are instrumented with the OpenTelemetry agent to generate traces. Concepts: Tempo, agent-based instrumentation, W3C Trace Context, Service Graphs, and a hybrid PUSH/PULL metrics architecture. | [`v4.0-distributed-tracing`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v4.0-distributed-tracing) | [#10](https://github.com/apenlor/spring-boot-security-observability-lab/pull/10) |
17+
| **5. Correlated Logs & Access Auditing** | The three pillars of observability are complete (metrics, traces, logs). Alloy is the unified collection agent. Concepts: Loki, Grafana Alloy, Docker service discovery, structured JSON logs, AOP-based auditing, trace-to-log correlation, and detailed audit metrics. | [`v5.0-correlated-logs-auditing`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v5.0-correlated-logs-auditing) | [#12](https://github.com/apenlor/spring-boot-security-observability-lab/pull/12) |
18+
| **6. Proactive Alerting** | The system transitions from passive to proactive monitoring. Concepts: Alertmanager, declarative PromQL alert rules, alerting on technical vs. security metrics, and a UI-driven test harness. | [`v6.0-proactive-alerting`](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v6.0-proactive-alerting) | [#14](https://github.com/apenlor/spring-boot-security-observability-lab/pull/14) |
19+
| **7. Continuous Security Integration** | _Upcoming..._ | - | - |
20+
| **8. Advanced Secret Management** | _Upcoming..._ | - | - |
5921

6022
---
6123

62-
### Key Configuration Details
24+
## How to Follow This Lab
6325

64-
#### 1. Prometheus Alert Rules
65-
66-
The core of this phase is the [alerts.yml](config/prometheus/alerts.yml) file. We have defined two rules that are specifically tailored for our application and optimized for a lab environment with short `for` durations for rapid testing.
67-
68-
* **`ApiServerErrorRateHigh`:** This rule fires when the rate of `5xx` status codes from the `resource-server` exceeds 0 for a continuous period. It is designed to be triggered by our `ChaosController`.
69-
* **`UnauthorizedAdminAccessSpike`:** This security-focused rule fires when the rate of `4xx` status codes on the specific `/api/secure/admin` endpoint exceeds 0. This is more robust than checking for just `403` as it captures any client-side error on this privileged endpoint, signaling a potential issue.
70-
71-
#### 2. UI-Driven Test Harness
72-
73-
To validate the entire alerting pipeline, we implemented a dedicated "Alerting Test Panel" in the `web-client`.
74-
* The `ChaosController` in the `resource-server` was enhanced with a guaranteed-failure endpoint (`/api/chaos/error`).
75-
* The `WebController` in the `web-client` was updated with two new `POST` endpoints that call the backend to generate `5xx` and `4xx` errors.
76-
77-
---
78-
79-
## Local Development & Quick Start
80-
81-
The prerequisites and setup are the same as in previous phases.
82-
83-
1. **Configure Local Hostnames (One-Time Setup, if not already done):**
84-
Edit your local `hosts` file to add:
85-
```
86-
127.0.0.1 keycloak.local
87-
```
88-
2. **Create and Configure Your Environment File:**
89-
```bash
90-
cp .env.example .env
91-
# ...then edit .env to add your WEB_CLIENT_SECRET from Keycloak.
92-
```
93-
3. **Build and run the entire stack:**
94-
```bash
95-
docker-compose up --build -d
96-
```
97-
4. **Access the Services:**
98-
* **Web Client Application:** [http://localhost:8082](http://localhost:8082) (Login with `lab-user`/`lab-user` or `lab-admin`/`lab-admin`)
99-
* **Keycloak Admin Console:** [http://keycloak.local](http://keycloak.local) (Login with `admin`/`admin`)
100-
* **Prometheus UI:** [http://localhost:9090](http://localhost:9090)
101-
* **Alertmanager UI:** [http://localhost:9093](http://localhost:9093)
102-
* **Grafana UI:** [http://localhost:3000](http://localhost:3000)
26+
1. **Start with the `main` branch** to see the latest state of the project.
27+
2. To go back in time, use the **"Code & Docs" link** for a specific phase. This will show you the `README.md` for that phase, which contains the specific instructions and examples for that version of the code.
28+
3. To understand the *"why"* behind the changes, review the **Key Pull Requests** for each phase.
10329

10430
---
10531

106-
## Validating the New Alerting Features
107-
108-
1. **Confirm Rules are Loaded:**
109-
* Navigate to the Prometheus UI's "Alerts" tab ([http://localhost:9090/alerts](http://localhost:9090/alerts)).
110-
* Verify that both new alerts are present and in the green "Inactive" state.
32+
## Running the Project
11133

112-
2. **Trigger the Alerts via the UI:**
113-
* Log in to the Web Client as **`lab-user` / `lab-user`**.
114-
* In the "Alerting Test Panel", repeatedly click the buttons to generate `403` and `5xx` errors.
115-
* Watch the Prometheus Alerts UI. The alerts will transition from `Inactive` to `Pending` (yellow) and then to `Firing` (red).
116-
* Once firing, the alerts will appear in the Alertmanager UI.
34+
To run the application and see usage examples for the **current phase**, please refer to the detailed instructions in its tagged `README.md` file.
11735

118-
#### Stop the Environment
36+
**[>> Go to instructions for the current phase: `v6.0-proactive-alerting` <<](https://github.com/apenlor/spring-boot-security-observability-lab/tree/v6.0-proactive-alerting?tab=readme-ov-file#local-development--quick-start)**
11937

120-
```bash
121-
docker-compose down -v
122-
```
38+
As the lab progresses, this link will always be updated to point to the latest completed phase.
