The Chaos Stack

Incidents where complex systems behave exactly as badly as the organization accidentally designed them to behave.

"The system did not fail suddenly. It failed according to its incentives."

What this stack means

This is the foundational stack of predictable chaos, where Conway's Law and technical debt combine to create inevitable outages.

Why this stack exists

Because organizations optimize for speed and localized metrics over systemic resilience and shared understanding.

Common Failure Patterns

  • Conway's Law in production
  • normalization of deviance
  • invisible dependencies
  • cascading fallback failures
  • hero-culture reliance

Prevention Checklist

  • Map dependencies before adding new services.
  • Enforce strict boundaries between domains.
  • Design for failure, assuming downstream services will eventually disappear.
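
As a minimal sketch of the first and third checklist items, a dependency map can be kept as plain data and queried before a new service is added. The service names and the `DEPENDENCIES` map below are hypothetical, chosen only for illustration; a real map would more likely be generated from service manifests or tracing data.

```python
from collections import deque

# Hypothetical dependency map: service -> services it calls directly.
DEPENDENCIES = {
    "checkout": ["payments", "inventory"],
    "payments": ["ledger"],
    "inventory": ["cache"],
    "ledger": [],
    "cache": [],
}


def transitive_dependencies(service, graph):
    """Return every service reachable from `service`, excluding itself."""
    seen = set()
    queue = deque(graph.get(service, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(graph.get(current, []))
    seen.discard(service)  # a cycle can lead back to the starting service
    return seen


def dependents_of(target, graph):
    """Return every service affected if `target` disappears."""
    return {s for s in graph if target in transitive_dependencies(s, graph)}
```

On this toy map, `dependents_of("ledger", DEPENDENCIES)` returns `payments` and `checkout`, which is the set of services that need a fallback plan for the day `ledger` is unavailable.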

Detection Signals

  • Alert storms where the root cause is obscured by secondary failures.
  • Engineers relying on tribal knowledge to resolve incidents.
  • Deployments requiring coordinated manual steps across multiple teams.
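
One way to cut through an alert storm, sketched here under the assumption that a direct dependency map already exists, is to keep only the alerting services whose own direct dependencies are quiet. The map and service names are hypothetical, and the heuristic is deliberately simple: it will miss failures in unmonitored dependencies.

```python
# Hypothetical dependency map: service -> services it calls directly.
DEPENDENCIES = {
    "checkout": ["payments", "inventory"],
    "payments": ["ledger"],
    "inventory": ["cache"],
    "ledger": [],
    "cache": [],
}


def root_cause_candidates(alerting, graph):
    """Return alerting services that have no alerting direct dependency.

    A service whose dependency is also alerting is treated as a likely
    secondary failure and filtered out.
    """
    alerting = set(alerting)
    return sorted(
        service
        for service in alerting
        if not alerting & set(graph.get(service, []))
    )
```

If `checkout`, `payments`, and `ledger` are all firing, this returns only `ledger`, since the other two alerts can be explained by a failing dependency.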

Incidents in The Chaos Stack

Video
EP2 | The Data Truth Stack | Data and Source of Truth

Cache Guy Delivers a Fast Answer

"Caching is not a substitute for an optimized database query; it is a complex distributed state problem."

Pattern: cache invalidation drift
Read Incident →
Video
EP3 | The Data Truth Stack | Data and Source of Truth

Agent A Takes Initiative

"AI capability is not approval; autonomous agents require strict API boundaries and blast-radius limits."

Pattern: cache invalidation drift
Read Incident →
Reference
The Delivery Theater Stack | Incident Humor

Rollback Never Tested

"A plan is only valid until it hits production."

Pattern: predictable chaos
Read Incident →
Reference
The Observability Stack | Observability and Dashboard Failures

Dashboard Green Nobody Asked

"The chaos was predictable."

Pattern: green-dashboard blindness
Read Incident →
Reference
The Data Truth Stack | Data and Source of Truth

Cache Expired During Demo

"The chaos was predictable."

Pattern: cache invalidation drift
Read Incident →
Reference
The Delivery Theater Stack | Incident Humor

Diagram Solved Nothing

"A plan is only valid until it hits production."

Pattern: predictable chaos
Read Incident →
Reference
The Legacy Gravity Stack | Incident Humor

Queue Fine Until Everyone Joined

"You cannot delete complexity, you can only move it."

Pattern: predictable chaos
Read Incident →
Reference
The Observability Stack | Observability and Dashboard Failures

Monitoring Tool Had Feelings

"The chaos was predictable."

Pattern: green-dashboard blindness
Read Incident →
Reference
The Legacy Gravity Stack | Incident Humor

Team Deleted the Wrong Complexity

"You cannot delete complexity, you can only move it."

Pattern: predictable chaos
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

Cloud Bill Learned Multiplication

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Delivery Theater Stack | Incident Humor

Junior Developer Found the Real Requirement

"A plan is only valid until it hits production."

Pattern: predictable chaos
Read Incident →
Reference
The Observability Stack | Observability and Dashboard Failures

CTO Asked for One Number

"The chaos was predictable."

Pattern: green-dashboard blindness
Read Incident →
Reference
The Observability Stack | Observability and Dashboard Failures

Number Was Not Real

"The chaos was predictable."

Pattern: green-dashboard blindness
Read Incident →
Reference
The Data Truth Stack | Data and Source of Truth

The Cache Was Correct Yesterday

"The chaos was predictable."

Pattern: cache invalidation drift
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

The Token Goblin Found a Loop

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

The Loop Found the Budget

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Incident Humor Stack | Incident Humor

The Problem Kept Its Original Name

"The chaos was predictable."

Pattern: postmortem without ownership
Read Incident →
Reference
The Data Truth Stack | Incident Humor

The Architecture Was Eventually Consistent

"Eventual consistency usually means immediate confusion."

Pattern: predictable chaos
Read Incident →
Reference
The Data Truth Stack | Incident Humor

The Team Wanted Strong Consistency Later

"Eventual consistency usually means immediate confusion."

Pattern: predictable chaos
Read Incident →
Reference
The Observability Stack | Incident Humor

The Load Test Was Too Honest

"Ignoring a failing test does not make the system faster, it just makes the outage a surprise."

Pattern: predictable chaos
Read Incident →
Reference
The Observability Stack | Incident Humor

The Load Test Got Ignored

"Ignoring a failing test does not make the system faster, it just makes the outage a surprise."

Pattern: predictable chaos
Read Incident →
Reference
The Delivery Theater Stack | Incident Humor

The Go Live Checklist Was Aspirational

"A plan is only valid until it hits production."

Pattern: predictable chaos
Read Incident →
Reference
The Delivery Theater Stack | Incident Humor

The Hypercare Channel Became Permanent

"A plan is only valid until it hits production."

Pattern: predictable chaos
Read Incident →
Reference
The Platform Ownership Stack | Incident Humor

The Team Remembered a Different One

"If everyone owns the system, no one owns the incident."

Pattern: predictable chaos
Read Incident →
Reference
The FinOps Stack | Incident Humor

The Incident Was Reproducible in Finance

"The architecture diagram always looks cheaper than the monthly invoice."

Pattern: predictable chaos
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

The Cost Center Had Architecture Opinions

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

The Cloud Region Was Chosen by Vibes

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Cloud Cost Stack | Cloud, GPU and FinOps

The Latency Had Geography

"The chaos was predictable."

Pattern: cost visibility lag
Read Incident →
Reference
The Data Truth Stack | Data and Source of Truth

The CDN Solved the Wrong Problem

"The chaos was predictable."

Pattern: cache invalidation drift
Read Incident →
Reference
The Observability Stack | Observability and Dashboard Failures

The Edge Case Lived at the Edge

"The chaos was predictable."

Pattern: green-dashboard blindness
Read Incident →
Reference
The Cloud Cost Stack | Incident Humor

The Worker Pool Had Boundaries

"Infinite scale means infinite invoices if there are no boundaries."

Pattern: predictable chaos
Read Incident →
Reference
The Security and Governance Stack | Incident Humor

The Risk Opened a Ticket

"Tickets do not prevent risks; they just document the negligence."

Pattern: predictable chaos
Read Incident →
Reference
The Incident Humor Stack | Incident Humor

The Mascot Knew Too Much

"The chaos was predictable."

Pattern: postmortem without ownership
Read Incident →
Reference
The Incident Humor Stack | Incident Humor

The Postmortem Found the Premortem

"The chaos was predictable."

Pattern: postmortem without ownership
Read Incident →
Reference
The Chaos Stack | Incident Humor

The System Was Working as Designed

"The system did not break, it operated exactly according to the conflicting incentives it was given."

Pattern: predictable chaos
Read Incident →
Reference
The Chaos Stack | Incident Humor

The Design Was the Incident

"The system did not break, it operated exactly according to the conflicting incentives it was given."

Pattern: predictable chaos
Read Incident →
Reference
The Chaos Stack | Incident Humor

The Team Finally Read the Notes

"The system did not break, it operated exactly according to the conflicting incentives it was given."

Pattern: predictable chaos
Read Incident →
Reference
The Chaos Stack | Incident Humor

Tiny CTO Explains the Pattern

"The system did not break, it operated exactly according to the conflicting incentives it was given."

Pattern: predictable chaos
Read Incident →
Video
EP4 | The Data Truth Stack | Data and Source of Truth

Mono Remembers Everything

"Legacy code is often the only reliable documentation of historical business rules and edge cases."

Pattern: cache invalidation drift
Read Incident →
Video
EP9 | The Cloud Cost Stack | Cloud, GPU and FinOps

The Token Budget Was Fine Until the Agent Started Thinking

"The core technical takeaway from 'The Token Budget Was Fine Until the Agent Started Thinking' is that isolated decisions scale poorly."

Pattern: cost visibility lag
Read Incident →
Reference
EP15 | The Data Truth Stack | Data and Source of Truth

The Cache Expired During the Demo

"The core technical takeaway from 'The Cache Expired During the Demo' is that isolated decisions scale poorly."

Pattern: cache invalidation drift
Read Incident →
Reference
EP35 | The Cloud Cost Stack | Cloud, GPU and FinOps

The Cloud Bill Learned Multiplication

"The core technical takeaway from 'The Cloud Bill Learned Multiplication' is that isolated decisions scale poorly."

Pattern: cost visibility lag
Read Incident →
Reference
EP45 | The Data Truth Stack | Data and Source of Truth

The Cache Was Correct Yesterday

"The core technical takeaway from 'The Cache Was Correct Yesterday' is that isolated decisions scale poorly."

Pattern: cache invalidation drift
Read Incident →
Reference
EP72 | The Cloud Cost Stack | Cloud, GPU and FinOps

The Cost Center Had Architecture Opinions

"The core technical takeaway from 'The Cost Center Had Architecture Opinions' is that isolated decisions scale poorly."

Pattern: cost visibility lag
Read Incident →
Reference
EP73 | The Cloud Cost Stack | Cloud, GPU and FinOps

The Cloud Region Was Chosen by Vibes

"The core technical takeaway from 'The Cloud Region Was Chosen by Vibes' is that isolated decisions scale poorly."

Pattern: cost visibility lag
Read Incident →

The Chaos Stack - Frequently Asked Questions

What is this stack?

The Chaos Stack is where predictable failures hide in plain sight.
