The Scalable Thread

Why are Event-Driven Systems Hard?
Understanding the Core Challenges of Asynchronous Architectures
Sep 14, 2025
Why "What Happened First?" Is One of the Hardest Questions in Large-Scale Systems
Understanding Why Exact Ordering of Events is Hard in Large Scale Systems
Aug 30, 2025
How to Keep Services Running During Failures?
Strategies for Graceful Degradation in Large Scale Distributed Systems
Aug 16, 2025
Most Popular
What is Event Sourcing?
Feb 14, 2025
What is Saga Pattern in Distributed Systems?
Feb 21, 2025
How to Improve Performance of Your Database?
May 9, 2025
What is Service Discovery?
Feb 7, 2025
How to Build Idempotent APIs?
Apr 25, 2025
What is the Claim-Check Pattern in Event-Driven Systems?
Mar 7, 2025

Starter

How Tool Calling Works in LLMs
Understanding the Internals of Tool Calling in Large Language Models
Jun 20, 2025
What is Function Sharding in Serverless Computing?
Understanding How Data Computation Can be Divided-and-Conquered in Serverless Architecture
Jan 17, 2025
Sidecar Pattern for Single Node Multi-Container Applications
Understanding Sidecar Design Pattern for Containerized Applications
Jan 3, 2025
One well-researched system design article simplified like you're five, every two weeks!

Intermediate

How to Keep Services Running During Failures?
Strategies for Graceful Degradation in Large Scale Distributed Systems
Aug 16, 2025
How to Optimize Performance with Cache Warming?
Optimizing Performance and User Experience in Large-Scale Distributed Systems
Aug 1, 2025
How Feature Flags Enable Safer, Faster, and Controlled Rollouts
Understanding Effective Rollouts Using Feature Flags in Distributed Systems
Jun 7, 2025
How to Improve Performance of Your Database?
Strategies for Scaling Databases in Distributed Systems
May 9, 2025

Advanced

Why are Event-Driven Systems Hard?
Understanding the Core Challenges of Asynchronous Architectures
Sep 14, 2025
Why "What Happened First?" Is One of the Hardest Questions in Large-Scale Systems
Understanding Why Exact Ordering of Events is Hard in Large Scale Systems
Aug 30, 2025
How to Handle Concurrency with Optimistic Locking?
Understanding How Distributed Systems Avoid Race Conditions
May 17, 2025
How Failover Works in Single Leader Databases
Strategies for Handling Failover in Single Leader Architectures
May 2, 2025

Production Case Studies

How Nginx Handles Thousands of Concurrent Requests
Understanding Event-driven Non-blocking Architecture of Nginx
Nov 29, 2024
How Amazon Route 53 Handles DDoS Attacks with Shuffle Sharding
Understanding How to Provide Clients Single Tenant Experience in a Shared Cluster
Nov 22, 2024
How Canva Handles Billions of Events to Track Content Usage
Understanding The Evolvement of Canva's Content Usage Counting Service Architecture
Nov 15, 2024
How Grab Stores and Processes Millions of Orders Every Day
Understanding the Distributed Data Solution That Powers the Grab Orders Platform
Nov 1, 2024
© 2026 Sid