"Scalability". The first time I heard this word, I never thought that one day it would haunt me so much that I would have a few sleepless nights. It all started when I was asked to do horizontal scale testing of our backend system. But don't worry, I won't lecture you on scalability testing; rather, I want to share a few new (and interesting) things that I learned while doing it.
So the scenario was something like this:
Our webapp (a supply chain system) is distributed across 12 VMs (test env), and I had to see how the system behaves if we add one more similar setup and put load balancers on top to distribute the load. Can it handle more load? Can it scale? I said, let's do it. The only problem was that I had no clue which load balancer I could use, how a load balancer works, or which one was best for me.
I googled and found three names that people use as load balancers: Apache webserver, Nginx and HAProxy. I did a little research on these three and tried all of them one by one. This post is all about the pros and cons of these three software load balancers (I am not benchmarking any of them here, just sharing how to configure and use them and what problems I faced).
So let's start with Nginx, the easiest to configure and a really good load balancer.
NGINX
Nginx is a webserver and a reverse proxy server for the HTTP, SMTP, IMAP and POP3 protocols, and it can also work as a software load balancer. Nginx is really fast when it comes to serving static content; it is said to scale up to 10000 req/sec. What makes it so fast is its event-driven architecture: it doesn't use an Apache-style process-per-connection or thread-per-connection model, and because of this it has a very small memory requirement.
Installation
Linux (Debian/Ubuntu): sudo apt-get install nginx
Configure
Suppose you have two backend servers, 192.168.1.2 and 192.168.1.3, and you installed nginx on 192.168.1.1.
Create a file /etc/nginx/sites-enabled/myloadbalancer.cfg with the following content:
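# Pool of backend servers; nginx distributes requests round-robin by default
upstream myservers {
    server 192.168.1.2;
    server 192.168.1.3;
}
server {
    listen 80;
    server_name localhost;
    access_log /var/log/nginx/access.log;
    location / {
        # Forward every request to the upstream pool defined above
        proxy_pass http://myservers;
        # Pass the original Host header on to the backend
        proxy_set_header Host $host;
    }
}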
And you are done. One important thing: if your application needs the hostname, you will have to set the Host header explicitly. (I needed it, and it took me 2 hours to figure out why our application suddenly started throwing bad hostname exceptions.)
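As a rough sketch of what "explicitly set the host header" can look like, here is a location block that forwards the Host header along with a few other headers that backends behind a proxy often expect. The extra X-Real-IP, X-Forwarded-For and X-Forwarded-Proto lines are my assumption about what an application might need; they were not part of the original setup. The upstream name myservers is the one from the config file in this post.

```nginx
location / {
    proxy_pass http://myservers;

    # Hostname the client originally asked for
    proxy_set_header Host $host;

    # Assumed extras, not part of the original setup:
    # the client's address as seen by nginx
    proxy_set_header X-Real-IP $remote_addr;
    # append the client address to any existing X-Forwarded-For chain
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # http or https, as received by nginx
    proxy_set_header X-Forwarded-Proto $scheme;
}
```

After editing the file, running sudo nginx -t checks the syntax before you reload nginx.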
Problems with Nginx
After configuring the load balancer I was happy and everything looked fine. The only problem was that some particular REST calls started failing, which was unexpected. After two days of debugging I finally found that some of the headers our application sets before making those calls were missing. Nginx was stripping off all the headers starting with X_. I googled and found that nginx strips certain types of headers as a security measure. So if your application needs headers starting with X_, or any header whose name has _ in it (it converts _ to -), then nginx is probably not a good idea for load balancing. There is a patch which prevents the _ (underscore) to - (dash) conversion, but in my case nginx was simply stripping the headers off, so it didn't help me.
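For what it's worth, the nginx documentation describes this as default behaviour rather than a bug: request header names containing underscores are treated as invalid and dropped unless the underscores_in_headers directive is turned on. A minimal sketch of enabling it, assuming the same server block as above, is shown here. I have not verified it against the setup described in this post, so treat it as something to try rather than a confirmed fix.

```nginx
server {
    listen 80;
    server_name localhost;

    # Accept client request headers whose names contain underscores
    # (off by default, in which case such headers are dropped as invalid)
    underscores_in_headers on;

    location / {
        proxy_pass http://myservers;
        proxy_set_header Host $host;
    }
}
```

According to the documentation, the directive is valid in the http and server contexts, and at the server level it only takes effect in the default server for that listen address. If several server blocks share a port, setting it in the http block may be the safer choice.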
And so my 3 days of work went down the drain, because I needed those headers. It forced me to move from nginx to Apache, the next easiest one to configure. Let's configure Apache in the next post.