Anatomy of a Production Incident

What distinguishes a predictable production incident from an unavoidable one?

THE SHORT ANSWER

Predictable incidents are the deferred cost of known architecture shortcuts; unavoidable ones are true black swan events.

Flashcards

Q1

What is the primary cause of degraded mean time to recovery (MTTR)?

Alert fatigue and lack of observability, which force engineers to guess the root cause instead of diagnosing it.
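
To make the metric in this card concrete, here is a minimal sketch of how MTTR could be computed, assuming each incident is recorded as a (detected, resolved) timestamp pair. The function name `mean_time_to_recovery` and the sample timestamps are illustrative assumptions, not part of any particular incident tool.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (resolved - detected) over a list of incidents.

    `incidents` is a hypothetical list of (detected, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    durations = [resolved - detected for detected, resolved in incidents]
    # sum() needs a timedelta start value to add timedeltas together
    return sum(durations, timedelta()) / len(durations)


# Illustrative data: one 30-minute incident and one 90-minute incident
sample_incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 15, 30)),
]
mttr = mean_time_to_recovery(sample_incidents)
```

Tracking this value over time is one way to notice the degradation the card describes: when engineers have to guess at the root cause, the gap between detection and resolution tends to widen.
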
Q2

How does alert fatigue contribute to Sev-1 incidents?

When non-critical alerts fire constantly, critical alerts are ignored or muted by exhausted on-call engineers.
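
One way to spot the pattern described above is to measure how often each alert fires without anyone needing to act on it. The sketch below assumes a hypothetical log of (alert name, was it actionable) pairs; the function name `noisy_alerts`, the alert names, and the 0.8 threshold are all illustrative assumptions.

```python
from collections import Counter


def noisy_alerts(firings, threshold=0.8):
    """Return alert names whose non-actionable fraction is >= threshold.

    `firings` is a hypothetical iterable of (alert_name, actionable) pairs.
    """
    total = Counter()
    ignored = Counter()
    for name, actionable in firings:
        total[name] += 1
        if not actionable:
            ignored[name] += 1
    return sorted(name for name in total if ignored[name] / total[name] >= threshold)


# Illustrative data: a disk warning that is almost never actionable,
# and a database alert that always is
sample_firings = (
    [("disk_70pct", False)] * 9
    + [("disk_70pct", True)]
    + [("db_down", True)] * 2
)
candidates = noisy_alerts(sample_firings)
```

Alerts returned by a check like this are candidates for demotion from a page to a ticket or dashboard, so that the alerts still paging on-call engineers are the ones that matter.
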
Q3

Why do "temporary" fixes become permanent risks?

Because the immediate pressure is relieved, and product roadmaps rarely allocate time to replace working code with clean code.

Chaos Stack Field Notes FAQs

What are Chaos Stack Field Notes?

Chaos Stack Field Notes are technical flashcards that explain core engineering concepts quickly.

How are these different from topics?

Topics are broad thematic hubs that connect characters, episodes, and environments. Field Notes are short, direct Q&A flashcards for quick technical alignment.
