Matt Harlow edited this page Mar 13, 2024 · 2 revisions
  • Vertical Scaling
  • In vertical scaling, when our server is overwhelmed we simply buy or build a larger server. This strategy only goes so far, however, because there is an upper limit on how powerful a single server can be.

  • Horizontal Scaling
  • In horizontal scaling, when our server is overwhelmed we buy or build more servers, and then split the incoming requests among them.

  • Load Balancing
  • When we use horizontal scaling, we face an additional problem: deciding which server handles each request. We solve it with a load balancer, a separate component (dedicated hardware or software) that intercepts incoming requests and assigns each one to one of our servers. There are a number of methods for deciding which server receives which request; here are a few:

    • Random: In this simple method, the load balancer will decide randomly which server it should assign a request to.
    • Round-Robin: In this method, the load balancer cycles through the servers in a fixed order. If we have three servers, the first request might go to server A, the second to server B, the third to server C, and the fourth back to server A.
    • Fewest Connections: In this method, the load balancer looks for the server that is currently handling the fewest requests and assigns the incoming request to that server. This keeps any one server from being overworked, but tracking how many requests each server is handling takes the load balancer more work than simply choosing a server at random.
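
The three methods above can be sketched in a few lines of Python. This is a toy illustration only, not how any particular load balancer is implemented: the `LoadBalancer` class, its method names, and the server labels are all hypothetical, and a real load balancer would also forward traffic and handle server failures.

```python
import itertools
import random


class LoadBalancer:
    """Toy load balancer (hypothetical): it only decides which server gets a request."""

    def __init__(self, servers):
        self.servers = list(servers)
        # Number of in-flight requests per server, used by fewest-connections.
        self.active = {server: 0 for server in self.servers}
        # Endless A, B, C, A, B, C, ... iterator used by round-robin.
        self._cycle = itertools.cycle(self.servers)

    def pick_random(self):
        # Random: every server is equally likely, regardless of its load.
        return random.choice(self.servers)

    def pick_round_robin(self):
        # Round-Robin: hand out servers in a fixed, repeating order.
        return next(self._cycle)

    def pick_fewest_connections(self):
        # Fewest Connections: scan the counts and take the least busy server.
        # Ties go to whichever server appears first in the list.
        return min(self.servers, key=lambda server: self.active[server])

    def start_request(self, server):
        # Call when a request is assigned to a server.
        self.active[server] += 1

    def finish_request(self, server):
        # Call when a server finishes handling a request.
        self.active[server] -= 1
```

For example, with three servers, round-robin wraps back to the first server on the fourth request:

```python
lb = LoadBalancer(["A", "B", "C"])
print([lb.pick_round_robin() for _ in range(4)])  # ['A', 'B', 'C', 'A']
```

Note that the fewest-connections method is the only one that needs the extra bookkeeping in `active`, which is the added cost described above.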
  • Autoscaling

  • Database Scaling

  • Caching
