incident review 20251017 by peterbjohnson · Pull Request #23 · lambda-feedback/user-documentation

peterbjohnson · 2025-10-21T12:28:25Z

No description provided.

timothee-alby · 2025-10-22T12:33:34Z

docs/releases/status.md


 The Severity of incidents is the product of number of users affected (for 100 users, N = 1), magnitude of the effect (scale 1-5 from workable to no service), and the duration (in hours). Severity below 1 is LOW, between 1 and 100 is SIGNIFICANT, and above 100 is HIGH. The severity is used to decide how much we invest in preventative measures, detection, mitigation plans, and rehearsals.

+## 2025 October 20th: AWS Outage in US East (No effects, brief review)


I don't think it's relevant to mention a non-outage?
Some infra goes down every day somewhere around the world and we don't mention it; this is not different from my perspective.

It's a fair point: where do we draw the line? I can take this out.

timothee-alby · 2025-10-22T12:34:04Z

docs/releases/status.md

+
+Handwriting in response areas (but not in the canvas) did not return a preview and could not be submitted. Users received an error in a toast saying that the service would not work. All other services remained operational.
+
+### Timeline (UK / BST)


Style: I don't know why, but this title isn't picked up as Markdown?

Will fix on next push.

timothee-alby · 2025-10-22T12:35:09Z

docs/releases/status.md

+- Monitoring immediately after pushes, and approximately an hour after pushes, should be standard procedure.
+- Integration tests would help, although they are considered outside the scope of this project at the current stage due to the resource required to continually maintain those tests
+
+N=0.2, effect = 2, duration = 5. Severity = 2 (SIGNIFICANT.)


Should that line be there?
(Great to see how you're using maths to pick a severity level!)

I agree it's not a perfect place for them, but I'll leave them for now for transparency.
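
For reference, the severity figure quoted in this thread follows from the formula stated in docs/releases/status.md (users affected in hundreds, times effect, times duration in hours). Below is a minimal sketch of that arithmetic. The function names `severity` and `classify` are hypothetical and do not come from the repository, and treating a score of exactly 1 or exactly 100 as SIGNIFICANT is an assumption, since the text does not say which band the boundaries belong to.

```python
def severity(n: float, effect: int, duration_hours: float) -> float:
    """Severity as the product described in status.md.

    n: users affected in units of 100 (100 users -> n = 1)
    effect: magnitude on the 1-5 scale (workable -> no service)
    duration_hours: duration of the incident in hours
    """
    if not 1 <= effect <= 5:
        raise ValueError("effect must be on the 1-5 scale")
    return n * effect * duration_hours


def classify(score: float) -> str:
    """Map a severity score to a band.

    Assumption: the boundary values 1 and 100 count as SIGNIFICANT.
    """
    if score < 1:
        return "LOW"
    if score <= 100:
        return "SIGNIFICANT"
    return "HIGH"


# Values quoted in the review: N=0.2, effect=2, duration=5.
score = severity(n=0.2, effect=2, duration_hours=5)
label = classify(score)
print(f"{score:g}", label)
```

With the values from the diff this reproduces the stated result of a severity of 2 in the SIGNIFICANT band.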

incident review 20251017

2dd280b

peterbjohnson requested a review from timothee-alby October 21, 2025 12:28

timothee-alby reviewed Oct 22, 2025

timothee-alby assigned peterbjohnson Oct 22, 2025

incident 20251017 updates following review

4d5b73b

peterbjohnson merged commit 25b5e4e into main Oct 22, 2025
1 check passed

timothee-alby deleted the incident-review-20251017 branch October 23, 2025 08:26

peterbjohnson merged 2 commits into main from incident-review-20251017
