Load Balancer Strategies
Contents
Intro
- balance requests across a pool of servers
- the choice of strategy depends on the type of service or application being served
- the status of the network and servers at the time of the request
- server health / pre-defined conditions
Strategies (algorithms)
Network Layer Algorithms
1. Round Robin
- simplest and most used
- requests are distributed in rotation: each incoming request is handed to the next server in a circular list of available servers
- assumes all servers have the same capacity
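A minimal sketch of plain round robin in Python; the server addresses are placeholder assumptions:

```python
from itertools import cycle

# Hypothetical backend pool; the addresses are placeholders.
SERVERS = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]

# cycle() walks the list in a circle, one server per incoming request.
_rotation = cycle(SERVERS)

def next_server() -> str:
    return next(_rotation)

if __name__ == "__main__":
    for request_id in range(6):
        print(request_id, "->", next_server())
```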
2. Weighted Round Robin
- based on application server characteristics
- each server is assigned a weight based on its capabilities
- higher weight gets more requests
- Dynamic Round Robin is a variant where the weight is calculated in real time
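A sketch of one common weighted rotation, the smooth variant popularised by nginx; the server names and weights are made up:

```python
# Smooth weighted round robin: higher-weighted servers are picked more
# often, but selections are interleaved rather than bursty.
WEIGHTS = {"app-1": 5, "app-2": 3, "app-3": 1}

current = {name: 0 for name in WEIGHTS}

def next_server() -> str:
    total = sum(WEIGHTS.values())
    # Raise every server's running score by its static weight...
    for name, weight in WEIGHTS.items():
        current[name] += weight
    # ...pick the highest score, then push that server back down by the total.
    chosen = max(current, key=current.get)
    current[chosen] -= total
    return chosen

if __name__ == "__main__":
    # Over 9 requests: app-1 gets 5, app-2 gets 3, app-3 gets 1.
    print([next_server() for _ in range(9)])
```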
3. Least Connection
- dynamic allocation
- given to server with least number of active connections
- helps when some servers handle longer lived connections
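A least-connection sketch; real connection counts would come from the proxy's live state, a plain dict stands in here and the server names are illustrative:

```python
# Track active connections per backend and always pick the emptiest one.
active = {"app-1": 0, "app-2": 0, "app-3": 0}

def acquire() -> str:
    # Server with the fewest open connections receives the new request.
    server = min(active, key=active.get)
    active[server] += 1
    return server

def release(server: str) -> None:
    active[server] -= 1

if __name__ == "__main__":
    first = acquire()   # app-1
    acquire()           # app-2
    release(first)      # app-1 drops back to zero connections
    print(acquire())    # app-1 again (ties are broken by dict order)
```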
4. Weighted Least Connection
- mix of active connections and the capabilities of each server based on the weight assigned to it
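The same idea with weights folded in; the names, weights and connection counts below are illustrative:

```python
# Weighted least connection: rank servers by active connections divided by
# the weight (capacity) assigned to them.
weights = {"app-1": 4, "app-2": 2, "app-3": 1}
active  = {"app-1": 6, "app-2": 2, "app-3": 2}

def pick() -> str:
    # Fewer connections per unit of weight means more spare capacity.
    return min(active, key=lambda s: active[s] / weights[s])

print(pick())  # app-2: 2/2 = 1.0 beats app-1 (6/4 = 1.5) and app-3 (2/1 = 2.0)
```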
5. Resource Based (Adaptive) **[Custom load]**
- Monitoring agent on each server reports its current load to load balancer
- based on the reports the queries are distributed
- used in enterprise environments where traffic is predictable
- not suitable for uneven or sudden traffic
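A rough sketch of the agent-report loop, with hypothetical server names and load scores:

```python
# Each backend's monitoring agent pushes a load score (e.g. a blend of CPU
# and memory utilisation) to the load balancer; requests follow the lowest
# reported load. In reality the reports arrive over the network on a schedule.
reported_load = {"app-1": 0.0, "app-2": 0.0, "app-3": 0.0}

def agent_report(server: str, load: float) -> None:
    reported_load[server] = load

def pick() -> str:
    # Route the next request to the least-loaded server per the latest reports.
    return min(reported_load, key=reported_load.get)

if __name__ == "__main__":
    agent_report("app-1", 0.72)
    agent_report("app-2", 0.35)
    agent_report("app-3", 0.90)
    print(pick())  # app-2, the least-loaded server
```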
6. Resource Based (SDN Adaptive)
- knowledge from Layers 2, 3, 4 and 7 and input from SDN Controller
- status of the servers and the applications running on them
- health of network infrastructure and congestion levels
7. Fixed Weighted
- Server with the highest weight receives all requests
- if the highest-weighted server fails, all traffic is directed to the next highest-weighted server
- why: effectively an active/standby setup, keeping traffic on a single primary and holding the other servers purely as backups
- weightings such as 100/0 or 80/20 might be used
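A failover-style sketch of fixed weighting; the names, weights and health map are assumptions:

```python
# Fixed weighted: the highest-weighted healthy server takes all traffic;
# lower weights only matter once the servers above them fail.
weights = {"primary": 100, "standby-1": 80, "standby-2": 60}
healthy = {"primary": True, "standby-1": True, "standby-2": True}

def pick() -> str:
    for server in sorted(weights, key=weights.get, reverse=True):
        if healthy[server]:
            return server
    raise RuntimeError("no healthy servers")

print(pick())              # primary
healthy["primary"] = False
print(pick())              # standby-1 takes over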
8. Weighted Response Time
- the response time of each server to regular health checks is used to assign its weight
- determines which server receives the next request
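One possible way to turn health-check response times into weights (inverse-proportional, then a weighted random pick); the probe times are invented:

```python
import random
from collections import Counter

# Latest health-check response times (seconds) per server.
probe_rtt = {"app-1": 0.020, "app-2": 0.050, "app-3": 0.200}

def pick() -> str:
    # Faster health-check responses earn a proportionally larger weight.
    weights = [1.0 / rtt for rtt in probe_rtt.values()]
    return random.choices(list(probe_rtt), weights=weights, k=1)[0]

if __name__ == "__main__":
    # app-1 should receive roughly 10x the share of app-3.
    print(Counter(pick() for _ in range(10_000)))
```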
9. Source IP Hash
- source and destination IP addresses are combined to generate a unique hash key
- if a session is disconnected, the key can be regenerated so the client reconnects to the same server and resumes the session
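A hashing sketch; the pool and the IP addresses are examples:

```python
import hashlib

# The hash of (source IP, destination IP) pins a client to the same backend
# across reconnects, with no session table needed.
SERVERS = ["app-1", "app-2", "app-3"]

def pick(src_ip: str, dst_ip: str) -> str:
    key = f"{src_ip}->{dst_ip}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return SERVERS[digest % len(SERVERS)]

# The same client/VIP pair always maps to the same server.
print(pick("203.0.113.7", "198.51.100.10"))
print(pick("203.0.113.7", "198.51.100.10"))  # identical result
```

Note that plain modulo hashing remaps most clients when the pool size changes; consistent hashing is the usual refinement.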
10. URL hash
- each read request is served by a specific server, chosen from the content (object) in the URL
- improves backend cache performance on each server
- write requests are distributed evenly to all servers holding the object
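The same hashing idea keyed on the URL instead of the client; the node names and paths are placeholders:

```python
import hashlib

# URL hashing: every request for the same object lands on the same backend,
# so each backend only has to cache its own slice of the content.
CACHES = ["cache-1", "cache-2", "cache-3"]

def pick(url: str) -> str:
    digest = int.from_bytes(hashlib.sha256(url.encode()).digest()[:8], "big")
    return CACHES[digest % len(CACHES)]

print(pick("/images/logo.png"))   # always the same cache node for this object
print(pick("/images/logo.png"))
print(pick("/videos/intro.mp4"))  # a different object may map elsewhere
```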
11. Least Response Time
- the server with the least number of active connections and the lowest average response time is selected
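One way to combine the two signals from the note above (connections first, response time as the tie-breaker); the stats are made up:

```python
# Least response time: rank servers by (active connections, average response time).
stats = {
    "app-1": {"active": 4, "avg_rt": 0.120},
    "app-2": {"active": 4, "avg_rt": 0.045},
    "app-3": {"active": 7, "avg_rt": 0.030},
}

def pick() -> str:
    # Fewest connections wins; average response time breaks the tie.
    return min(stats, key=lambda s: (stats[s]["active"], stats[s]["avg_rt"]))

print(pick())  # app-2: ties app-1 on connections but responds faster
```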
12. Least Bandwidth / Least Packets
- the server currently serving the least traffic (Mbps), or the fewest packets over a recent measurement window, is selected
Application Layer Algorithms
- distribution is based on the content of the request, including session cookies and the HTTP headers / message body
- allows for intelligent distribution
- response data also used to monitor server performance and active load
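A small layer-7 routing sketch; the pool names, cookie name and path prefixes are illustrative only:

```python
from typing import Mapping

# Inspect the request itself (a session cookie and a path prefix here)
# before choosing a backend.
API_POOL    = ["api-1", "api-2"]
STATIC_POOL = ["static-1", "static-2"]

def pick(path: str, cookies: Mapping[str, str]) -> str:
    # Sticky sessions: a client that already carries a session cookie keeps
    # hitting the backend encoded in it.
    if "backend" in cookies:
        return cookies["backend"]
    # Otherwise route by content: static assets and API calls go to different
    # pools (per-pool rotation omitted for brevity, first node used).
    return STATIC_POOL[0] if path.startswith("/static/") else API_POOL[0]

print(pick("/static/logo.png", {}))
print(pick("/api/orders", {"backend": "api-2"}))
```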
1. Least Pending Requests (LPR)
- adapts to abruptly changing traffic
- monitors pending requests and efficiently distributes them to the most available servers
- Accurate load distribution
- Request specific distribution
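An LPR-style sketch tracking in-flight (sent but not yet answered) requests; the server names are placeholders:

```python
# Route each new request to the server with the fewest pending requests;
# a real proxy would keep these counters itself.
pending = {"app-1": 0, "app-2": 0, "app-3": 0}

def dispatch() -> str:
    server = min(pending, key=pending.get)
    pending[server] += 1          # request sent, response not yet received
    return server

def complete(server: str) -> None:
    pending[server] -= 1          # response arrived

if __name__ == "__main__":
    first = dispatch()
    dispatch()
    complete(first)               # the first server frees up immediately
    print(dispatch())             # the next request follows the spare capacity
```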