HAProxy load balancer. Part 2: Backend section and the algorithms

This is the second article about HAProxy. The last one was about basic terms and layers. Here I’m going to tell you about balancing algorithms.

The backend section of the configuration file is where you choose the balancing algorithm. This article covers nine of them.

Roundrobin

The first and simplest algorithm on the list is Round Robin. You can turn it on by entering “balance roundrobin” in the backend section.

With this option, HAProxy cycles through the servers in turn and loads your “farm” evenly. For example:

backend web_servers
    # "web_servers" is an arbitrary name; HAProxy requires every backend to have one
    balance roundrobin
    server srv1 luna-1:80
    server srv2 luna-2:80
* In my case luna-1 is an alias inside the private network. In your case it could be an IP address.

If you add the weight parameter to the server lines, you get a more advanced version of the algorithm called weighted round robin:

backend web_servers
    balance roundrobin
    server srv1 luna-1:80 weight 2
    server srv2 luna-2:80 weight 1

With different weights, the balancer can distribute the load according to the physical capabilities of the servers. In the example above, the server with weight 2 gets twice as many requests as the server with weight 1.

The pros and cons of the algorithm

Artem Zaytsev

Evil marketer

+

The algorithm gives clear, predictable and stable balancing. Every server gets a fair share of requests in proportion to its weight.

-

However, Round Robin has one drawback that makes it a poor fit for long sessions: it ignores the number of active connections. Even if one node is fully loaded, it will still receive new requests.

Static-rr

I will be brief about this algorithm. Static-rr is essentially the same as Round Robin, with one exception: changing a server’s weight on the fly has no effect. On the other hand, Static-rr has no limit on the number of servers, in contrast to Round Robin, which can work with up to 4095 active servers per backend.

You can turn it on by entering “balance static-rr” in the backend section. 

backend web_servers
    balance static-rr
    server srv1 luna-1:80
    server srv2 luna-2:80
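
By contrast, in a roundrobin backend the weight can be changed while HAProxy is running. Here is a minimal sketch of how that might look. The socket path /var/run/haproxy.sock and the backend name web_servers are assumptions made for this example, and the shell command in the comment assumes socat is installed:

global
    # expose the runtime API on a local UNIX socket with admin rights
    stats socket /var/run/haproxy.sock mode 600 level admin

backend web_servers
    balance roundrobin
    server srv1 luna-1:80 weight 2
    server srv2 luna-2:80 weight 1

# then, from a shell:
#   echo "set weight web_servers/srv1 50%" | socat stdio /var/run/haproxy.sock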

Least Connections

The “least connections” algorithm counts the number of active connections to each server, and every new request goes to the server with the fewest. You can turn the algorithm on by entering “balance leastconn” in the backend section.

backend web_servers
    balance leastconn
    server srv1 luna-1:80
    server srv2 luna-2:80

This algorithm is dynamic, which means that server weights can be adjusted on the fly, for instance for slow starts.
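
A slow start could be configured roughly like this. It is only a sketch: the backend name db_servers and the 30-second ramp-up are assumptions chosen for illustration. The slowstart parameter applies when a server comes back up after being marked down, so health checks have to be enabled with check:

backend db_servers
    balance leastconn
    # "check" enables health checks; "slowstart 30s" ramps the server up
    # gradually over 30 seconds after it comes back online
    server srv1 luna-1:80 check slowstart 30s
    server srv2 luna-2:80 check slowstart 30s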

The pros and cons of the least connections algorithm

+

“Least connections” suits tasks with long-lived connections, for example load balancing between database servers. Nodes that already hold many active connections are passed over in favour of less loaded ones.

-

It makes little sense to use the algorithm for tasks with short sessions, such as HTTP. For these purposes Round Robin is the better option.

First

The algorithm has been available in HAProxy since version 1.5. With it, the balancer fills the free connection slots of the first server in the list and moves on to the next server only when the previous one has reached its maxconn limit.

To turn the algorithm on, enter “balance first” in the backend section.

backend web_servers
    balance first
    server srv1 luna-1:80 maxconn 1000
    server srv2 luna-2:80 maxconn 2000

The pros and cons of the algorithm “First”

+

The main goal of “First” is to use the fewest servers possible. This lets you turn off the extra servers during quiet hours.

To get the full effect, set up a controller that regularly checks the servers in the farm, turning off unused servers and bringing additional ones online at high load.

-

I would not call it a real minus, given the nature of the algorithm, but “First” ignores server weights and balances the load according to the maxconn value only.

Source

With this algorithm, the source IP address is hashed and divided by the total weight of the running servers to determine which server receives the request. As a result, the same source IP address always reaches the same server, as long as no server goes down or comes up. You can enable this option using “balance source”.
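
A minimal sketch of such a backend, assuming TCP mode and the hypothetical backend name app_servers. The hash-type line is optional and is shown only as one way to soften the effect of servers going up or down:

backend app_servers
    mode tcp
    balance source
    # optional: consistent hashing moves fewer clients to another server
    # when a server goes down or comes back up
    hash-type consistent
    server srv1 luna-1:80
    server srv2 luna-2:80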

The pros and cons of the algorithm

+

The algorithm is usually used for plain TCP traffic, where it is impossible to set cookies.

-

If the source has a dynamic IP address, the algorithm cannot keep its session on the same server.

URI

The algorithm selects a server based on a hash of the page address (the request URI), so the same server always handles the same page addresses. You can enable the algorithm using the “balance uri” option.
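
For a pair of caching servers this might look as follows. The backend name cache_servers and the server names are assumptions for the example, and hash-type is again optional:

backend cache_servers
    mode http
    balance uri
    # optional: consistent hashing keeps most URIs on their current cache
    # node when the set of servers changes
    hash-type consistent
    server cache1 luna-1:80
    server cache2 luna-2:80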

The pros and cons of the algorithm

+

The algorithm is used for load balancing between caching servers. With other algorithms, the same page ends up cached on several nodes, which inflates the total cache size. With the URI algorithm, a request for a specific address is sent to the server that already holds the cache for that page.

-

The disadvantage of the algorithm is its narrow range of application. In my opinion, caching servers are the only case where it is applicable.

URL parameter

The URL parameter algorithm chooses a server based on the value of a GET parameter in the query string. If you add the “check_post” modifier, HAProxy also searches the body of a POST request when the parameter is not found in the URL. For clarity, here are two examples:

 balance url_param userid
 balance url_param session_id check_post 64

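In the first example, the algorithm routes a user’s requests to the same server based on the value of their userid. In the second example, it does the same with session_id and falls back to the POST body when the parameter is missing from the URL. The number 64 is not a session id: in older HAProxy versions (1.4) it is the number of bytes of the body to wait for before searching it.

If the parameter is missing or has no value, the algorithm balances the load like Round Robin.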

The pros and cons of the algorithm

+

URL parameter can be useful if you want to pin authenticated users to specific servers and balance anonymous users with plain Round Robin.

-

The algorithm only works in HTTP mode. It is useless for TCP traffic.

HDR

The algorithm selects a server based on the value of an HTTP request header. If the header is absent or empty, it works like the Round Robin algorithm.

balance hdr(User-Agent)

In this case, HAProxy hashes the value of the User-Agent request header (for example, a string starting with “Mozilla/5.0”), so requests with the same User-Agent value go to the same server.
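
Note that hdr() takes only the header name in parentheses, and the hash is computed from that header’s value. Here is a fuller sketch that balances on the Host header instead. The backend name web_servers is an assumption, and use_domain_only is an optional modifier:

backend web_servers
    mode http
    # hash the Host header; use_domain_only reduces the host name
    # to its main domain part before hashing
    balance hdr(Host) use_domain_only
    server srv1 luna-1:80
    server srv2 luna-2:80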

The pros and cons of the algorithm

+

The algorithm is useful if you need, for example, to tie users to servers by browser type, requested address and so on.

-

On the other hand, the algorithm is suitable only for very specific, narrow tasks.

rdp-cookie

The algorithm is equivalent to the ‘req_rdp_cookie()’ ACL of the frontend section. It hashes the cookie sent by a Remote Desktop (RDP) client, and its purpose is to tie the same user to the same server by cookie identification.
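
A sketch of a TCP backend for Remote Desktop servers, modelled on the kind of setup the HAProxy documentation describes. The backend name rdp_servers, the 5-second inspection delay and the use of port 3389 on my luna hosts are assumptions for illustration:

backend rdp_servers
    mode tcp
    # wait up to 5 seconds for the RDP cookie to arrive in the request
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    # hash the RDP cookie ("mstshash" if no name is given) to pick a server
    balance rdp-cookie
    server srv1 luna-1:3389
    server srv2 luna-2:3389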

The pros and cons of the algorithm

+

The algorithm is suitable for tying sessions with certain cookies to specific servers.

-

If the client sends no cookie, the algorithm works as plain Round Robin.

Conclusion

Usually, two algorithms from the list are used: Round Robin and Leastconn. The first is useful for HTTP traffic and the second for database connections. The remaining algorithms are used depending on the situation.

If you have several cache servers, the “URI” algorithm will be useful. If you want to tie the source of requests to specific servers, you can use the “source” and “rdp-cookie” algorithms.

If you want to distribute requests depending on their properties, the “HDR” and “URL parameter” algorithms are for you.