How Load Balancing Works in Microservices

This post explains how load balancing works in microservices, with simple, real-world examples.

In microservices, multiple instances of the same service run to handle high traffic.
A load balancer distributes incoming requests across these service instances so no single instance gets overloaded.
⭐ Types of Load Balancing in Microservices
There are three main types:
1️⃣ Client-Side Load Balancing
The client decides which instance to call.
How it works:
- The client queries the service registry (for example, Eureka or Consul).
- The registry returns a list of available service instances.
- The client picks one instance using an algorithm (round-robin, random, etc.).
- The client sends the request directly to the chosen instance.
Example in Spring Cloud:
- Ribbon (legacy; in maintenance mode and removed from recent Spring Cloud releases)
- Spring Cloud LoadBalancer (the current replacement)
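Code Example (Spring Cloud LoadBalancer)

    import org.springframework.cloud.client.loadbalancer.LoadBalanced;
    import org.springframework.context.annotation.Bean;
    import org.springframework.web.client.RestTemplate;

    // Declare this bean inside a @Configuration class.
    @Bean
    @LoadBalanced // resolves logical service names through the load balancer
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

When calling, use the logical service name as the host:

    restTemplate.getForObject("http://payment-service/pay", String.class);

The load balancer resolves payment-service to one of the available instances automatically.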
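To make the mechanism concrete, here is a minimal plain-Java sketch of what a client-side balancer does with a URL like the one above: look up the logical host in a registry, pick an instance, and rewrite the URL. This is an illustration only, not how Spring Cloud LoadBalancer is implemented. The class name `ClientSideResolver`, the in-memory registry map, and the instance addresses used with it are all hypothetical.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client-side resolver; the map stands in for a registry such as Eureka or Consul.
class ClientSideResolver {
    private final Map<String, List<String>> registry;
    // One shared counter keeps the sketch short; a real balancer would track one per service.
    private final AtomicInteger counter = new AtomicInteger();

    ClientSideResolver(Map<String, List<String>> registry) {
        this.registry = registry;
    }

    // Rewrites "http://<service-name>/path" to "http://<host:port>/path".
    // Query strings are ignored in this sketch.
    String resolve(String url) {
        URI uri = URI.create(url);
        String serviceName = uri.getHost();
        if (serviceName == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        List<String> candidates = registry.getOrDefault(serviceName, List.of());
        if (candidates.isEmpty()) {
            throw new IllegalStateException("No instances registered for " + serviceName);
        }
        // Round-robin: each call moves on to the next instance in the list.
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return uri.getScheme() + "://" + candidates.get(index) + uri.getRawPath();
    }
}
```

With two registered instances, successive calls to `resolve("http://payment-service/pay")` alternate between them, and the calling code never hard-codes an address.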
2️⃣ Server-Side Load Balancing
The API Gateway or Load Balancer does the balancing.
Clients do NOT know where services are hosted.
Flow:
Client → API Gateway → Load Balancer → Service Instance
Tools:
- NGINX
- HAProxy
- AWS ALB/ELB
- Envoy
- Kong / Zuul / Spring Cloud Gateway
Why is it useful?
- Centralized routing
- Security enforcement
- Reliable failover
- Good for external traffic
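The "reliable failover" point can be sketched in a few lines: the balancer keeps a list of backends, skips any that are marked unhealthy, and the client never notices. This is a simplified illustration under stated assumptions, not the behavior of any specific product listed above. The class `FailoverBalancer` and its `markDown`/`markUp` methods are hypothetical; in a real load balancer, health checks would drive those calls.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical server-side balancer that routes around unhealthy backends.
class FailoverBalancer {
    private final List<String> backends;
    private final Set<String> down = ConcurrentHashMap.newKeySet();
    private final AtomicInteger cursor = new AtomicInteger();

    FailoverBalancer(List<String> backends) {
        this.backends = List.copyOf(backends);
    }

    // In practice a periodic health check would call these two methods.
    void markDown(String backend) { down.add(backend); }
    void markUp(String backend) { down.remove(backend); }

    // Round-robin over the backends, skipping any that are marked down.
    String route() {
        for (int attempt = 0; attempt < backends.size(); attempt++) {
            int index = Math.floorMod(cursor.getAndIncrement(), backends.size());
            String candidate = backends.get(index);
            if (!down.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No healthy backend available");
    }
}
```

Because the health state lives in one central place, every client benefits from a failover decision without any client-side changes.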
3️⃣ Sidecar / Service Mesh Load Balancing
Here, each service instance runs alongside a sidecar proxy (for example, Envoy) that handles:
✔ Load balancing
✔ Traffic routing
✔ TLS/mTLS
✔ Circuit breaking
✔ Retries
You don’t write LB logic in the service.
Technologies:
- Istio
- Linkerd
- Consul Connect
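To give a feel for the kind of logic a sidecar takes off your hands, here is a rough sketch of a count-based circuit breaker. It is not how Envoy or any mesh implements circuit breaking; it only illustrates the idea of failing fast after repeated errors. The class `SimpleCircuitBreaker`, its parameters, and the injected clock are all assumptions made for this example.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical circuit breaker: opens after N consecutive failures, then fails fast for a while.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so the timing can be controlled in tests
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit has not been opened

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && clock.getAsLong() - openedAt < openMillis;
    }

    // Runs the action unless the circuit is open; returns the fallback on rejection or failure.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        if (isOpen()) {
            return fallback; // fail fast without touching the struggling instance
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // a success closes the circuit again
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // open (or re-open) the circuit
            }
            return fallback;
        }
    }
}
```

In a mesh, this behavior is configured on the proxy rather than coded in the service, which is the point of the sidecar model.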
⚙️ Load Balancing Algorithms Used
| Algorithm | How it works | When used |
|---|---|---|
| Round Robin | Each request goes to the next instance in turn | Default choice when instances are similar |
| Random | Picks an instance at random | Simple, stateless spreading of requests |
| Least Connections | Sends the request to the instance with the fewest active connections | Long-lived or uneven requests |
| Weighted Round Robin | Sends proportionally more requests to higher-weight instances | Heterogeneous servers |
| Consistent Hashing | Same key (for example, a client ID) → same instance | Caching, session stickiness |
Most load balancers support these.
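Two of the less obvious rows in the table, least connections and consistent hashing, can be sketched as follows. These are minimal illustrations, not production implementations. The class names are hypothetical, the number of virtual nodes is a parameter you would tune, and CRC32 is used as the hash purely for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Least connections: pick the instance with the fewest in-flight requests.
class LeastConnectionsBalancer {
    private final Map<String, Integer> active = new LinkedHashMap<>();

    LeastConnectionsBalancer(List<String> instances) {
        for (String instance : instances) active.put(instance, 0);
    }

    // Chooses the least-loaded instance (first listed wins ties) and counts the new request.
    synchronized String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> e : active.entrySet()) {
            if (best == null || e.getValue() < active.get(best)) best = e.getKey();
        }
        if (best == null) throw new IllegalStateException("No instances configured");
        active.merge(best, 1, Integer::sum);
        return best;
    }

    // Call when the request finishes so the instance's count drops again.
    synchronized void release(String instance) {
        active.computeIfPresent(instance, (k, v) -> Math.max(0, v - 1));
    }
}

// Consistent hashing: a key maps to the first instance at or after its hash on a ring.
class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> instances, int virtualNodes) {
        for (String instance : instances) {
            // Several virtual nodes per instance spread it around the ring.
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(instance + "#" + i), instance);
            }
        }
    }

    String pick(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("No instances on the ring");
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        // Wrap around to the start of the ring when the key hashes past the last node.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    private static long hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

The useful property of the ring is that removing one instance only remaps the keys that instance owned; every other key keeps its instance, which is why the technique suits caches and sticky sessions.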
Example: Load Balancing in a Real Microservices System
Let’s say you have a Payment Service with 3 instances:
- payment-1 → 10.0.1.15
- payment-2 → 10.0.1.16
- payment-3 → 10.0.1.17
The client calls:
http://payment-service/pay
Behind the scenes:
Client-Side LB
The client selects one of the 3 IPs and calls it directly.
Server-Side LB
Gateway or ALB forwards requests:
| Request # | Instance |
|---|---|
| 1 | payment-1 |
| 2 | payment-2 |
| 3 | payment-3 |
| 4 | payment-1 |
| … | cycle repeats (round robin) |
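The pattern in the table is plain modular arithmetic: with 3 instances, request n goes to the instance at index (n - 1) mod 3. A tiny sketch, where the class name `RoundRobinTrace` is hypothetical and the instance names are taken from the example above:

```java
import java.util.List;

// Stateless illustration of the round-robin mapping shown in the table.
class RoundRobinTrace {
    static final List<String> INSTANCES = List.of("payment-1", "payment-2", "payment-3");

    // Request numbers start at 1, so request n maps to index (n - 1) mod the instance count.
    static String instanceFor(int requestNumber) {
        return INSTANCES.get(Math.floorMod(requestNumber - 1, INSTANCES.size()));
    }
}
```

Real balancers keep a running counter instead of numbering requests, but the resulting order is the same.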
Service Mesh
Sidecar proxies handle routing automatically.
⭐ Why Load Balancing Is Critical in Microservices
✔ Ensures high availability
✔ Prevents overload of a single instance
✔ Allows auto-scaling (add/remove instances dynamically)
✔ Supports zero-downtime deployments (rolling updates)
✔ Enables traffic shaping and rate limiting
Final Summary
| Type | Used In | Who chooses the instance |
|---|---|---|
| Client-Side LB | Spring Cloud, Netflix OSS | Client |
| Server-Side LB | API Gateways, AWS ALB | Gateway/LB |
| Service Mesh LB | Istio, Consul Connect | Sidecar proxy |