Good intentions, bad outcomes

The AWS Outage That Broke Half the Internet: Your Cloud Isn't as Safe as You Think

Xodiac Season 1 Episode 16

Ever moved to the cloud thinking you'd finally eliminate those dreaded outages? In this episode, Gino Marckx and Wayne Hetherington break down what happened when AWS went down and took half the world's services with it.

The intent behind cloud migration is solid. Move off your own hardware, get better reliability, scale as needed, and never worry about infrastructure again. The cloud provider handles redundancy, right? Except that when AWS goes down, so does everything running on it. You've just traded one single point of failure for another.

We walk through why this keeps happening. Most organizations assume the cloud provider has built-in redundancy across regions and availability zones. And it does, but only within its own system. If you're only on AWS, or only on Azure, or only on Google Cloud, you're still vulnerable when that one provider has issues.

The solution? Multi-cloud architecture. Spread your critical services across different providers. Yes, it costs more. Yes, it adds complexity. But if uptime actually matters for your business, it's the only real answer.
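To get a rough feel for why spreading services across providers helps, here is a minimal sketch of the arithmetic. It assumes provider outages are fully independent and that failover between providers works perfectly, which real systems rarely achieve. The function name and the 99.9% figure are illustrative assumptions, not numbers from the episode or from any provider's SLA.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of a service that stays up while at least one provider is up.

    Assumes outages are statistically independent and failover is instant,
    so this is an optimistic upper bound rather than a prediction.
    """
    p_all_down = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        # Probability that this provider is down at a given moment.
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


if __name__ == "__main__":
    single = 0.999  # hypothetical "three nines" for one provider
    dual = combined_availability(single, single)
    print(f"one provider:  {single:.4%}")
    print(f"two providers: {dual:.6%}")
```

Under these assumptions, two providers at three nines each would combine to roughly six nines. Shared dependencies such as DNS, a common deployment pipeline, or the failover logic itself can pull the real number well below that.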

We also talk about when it's okay to accept the risk. A pet grooming appointment booking site can probably survive a few hours down per year. Medical services or air traffic control? That's a different calculation. It comes down to understanding how many nines of uptime you actually need and what you're willing to pay for it.
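As a quick way to put numbers on "how many nines", the sketch below converts an availability target into allowed downtime per year. The helper names are hypothetical, and the calculation assumes a 365-day year with downtime counted around the clock.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability_for_nines(nines: int) -> float:
    """Return the availability fraction for a number of nines, e.g. 3 -> 0.999."""
    return 1.0 - 10.0 ** (-nines)


def downtime_hours_per_year(availability: float) -> float:
    """Hours per year a service may be down at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for nines in range(2, 6):
        availability = availability_for_nines(nines)
        hours = downtime_hours_per_year(availability)
        print(f"{nines} nines ({availability:.5%}): "
              f"{hours:.2f} hours, or {hours * 60:.1f} minutes, per year")
```

Three nines works out to just under nine hours of downtime per year, which a booking site might tolerate. Each extra nine cuts the budget by a factor of ten, and that is where the cost conversation usually starts.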

Timestamps: 

0:00 - Introduction 

0:33 - AWS outage hits half the world 

1:16 - Why organizations move to cloud in the first place 

2:50 - The promise of always-available infrastructure 

4:39 - So why did everything go down? 

5:28 - You still have a single point of failure 

6:25 - The assumption of built-in redundancy 

7:21 - Building real backup plans across providers

8:40 - How unlikely is a multi-cloud failure?

9:41 - The challenge of keeping environments consistent 

10:58 - Cost vs. redundancy: the eternal tradeoff 

11:34 - How many nines do you actually need? 

12:39 - Making the right choice for your situation 

13:24 - Wrap-up

Contact us at feedback@goodintentionsbadoutcomes.org

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.

Definitely, Maybe Agile

Peter Maddison and Dave Sharrock